doubi4491 2017-01-05 16:49 Acceptance rate: 0%
Views: 127
Accepted

How do I scrape the text of an h1 tag using Go?

Suppose this is an h1 tag:

<h1>FindMe</h1>

It sits in a huge webpage with many other h1 tags, but it is the first one. I am using the net/html package and scanning for the first StartTagToken. Once my program has found that token, how do I get the text inside the heading, i.e. FindMe in this case?

This is the code I have right now:

z := html.NewTokenizer(body)

for {
    tt := z.Next()

    if tt == html.ErrorToken {
        // End of input or a read error: stop scanning.
        return
    } else if tt == html.StartTagToken {
        tag := z.Token()

        if tag.Data == "h1" {
            fmt.Println("We found the title")
            // some code to find what is stored in the heading
        }
    }
}

How do I go about doing that?

EDIT: More specifically, which property of the variable tag gives me the text inside the heading? I may be using the wrong terms here, so please bear with me.

1 answer

  • dpu66046 2017-01-05 17:16

    What you have is the StartTagToken. The part you're interested in sits between it and the corresponding EndTagToken, as a TextToken. So you need to read the next token, and its Data should be the value you're after, something like:

    ...
    if tag.Data == "h1" {
        if tt = z.Next(); tt == html.TextToken {
            fmt.Println(z.Token().Data)
        }
    }
    
    Selected by the asker as the best answer.
