douyou7102 2016-07-16 11:16

Extracting an HTML element at the last occurrence of a pattern in Go

I have some HTML stored in a Go string, and I want to extract a particular element from it: the one in which a pattern occurs for the last time. To explain with an example:

    package main

    func main() {
            h := `
    <html>
     <body>
      <a name="0"> text </a>
      <a name="1"> abc </a>
      <a name="2"> def ghi jkl </a>
      <a name="3"> abc </a>
      <a name="4"> Some text </a>
     </body>
    </html>`

            pattern := "abc"

            // Now I want the name of <a name="3"> to be printed: when someone
            // searches for the pattern "abc", its last occurrence is in the <a>
            // element named "3". If the pattern is "def", then "2" should be
            // printed; if the pattern is "text", then "4" should be printed.

            _, _ = h, pattern // placeholders so the snippet compiles as-is
    }

Any idea how I can do this? I played around with the template and scanner packages but could not get it working.

2 answers

  • douyingmou1389 2016-07-16 18:17

    That depends on what the HTML input looks like. You may be able to get away with a regexp if the markup is simple and under your control, but for arbitrary HTML you will need a full HTML parser, such as https://godoc.org/golang.org/x/net/html.
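To make the regexp shortcut concrete, here is a minimal sketch that uses only the standard library. It assumes the markup is exactly as regular as the sample in the question: `name` is the only attribute and the link text contains no nested tags. The helper `lastAnchorContaining` and the variable `anchorRe` are names made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// anchorRe matches <a name="...">text</a> as written in the sample input.
// Assumption: name is the only attribute and the link text contains no tags.
var anchorRe = regexp.MustCompile(`<a name="([^"]*)">([^<]*)</a>`)

// lastAnchorContaining is a hypothetical helper: it returns the name of the
// last <a> whose text contains pattern, and false if there is none.
func lastAnchorContaining(h, pattern string) (string, bool) {
	name, found := "", false
	for _, m := range anchorRe.FindAllStringSubmatch(h, -1) {
		// m[1] is the name attribute, m[2] the text between the tags.
		if strings.Contains(m[2], pattern) {
			name, found = m[1], true
		}
	}
	return name, found
}

func main() {
	h := `
<html>
 <body>
  <a name="0"> text </a>
  <a name="1"> abc </a>
  <a name="2"> def ghi jkl </a>
  <a name="3"> abc </a>
  <a name="4"> Some text </a>
 </body>
</html>`

	for _, pattern := range []string{"abc", "def", "text"} {
		if name, ok := lastAnchorContaining(h, pattern); ok {
			fmt.Println(pattern, "->", name)
		}
	}
}
```

This only holds up while the markup stays this regular. Extra attributes, a different attribute order, or tags nested inside the link will make the expression miss elements, which is the point at which a real parser becomes the safer choice.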

    For example, using goquery (which uses x/net/html):

    package main
    
    import (
            "fmt"
            "strings"
    
            "github.com/PuerkitoBio/goquery"
    )
    
    func main() {
            h := `
    <html>
     <body>
      <a name="0"> text </a>
      <a name="1"> abc </a>
      <a name="2"> def ghi jkl </a>
      <a name="3"> abc </a>
      <a name="4"> Some text </a>
     </body>
    </html>`
    
            pattern := "abc"
    
            doc, err := goquery.NewDocumentFromReader(strings.NewReader(h))
            if err != nil {
                    panic(err)
            }
    
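            // Remember the name of the last <a> whose text contains the pattern.
            var last string
            var found bool
            doc.Find("a").Each(func(i int, s *goquery.Selection) {
                    if !strings.Contains(s.Text(), pattern) {
                            return
                    }
                    if name, ok := s.Attr("name"); ok {
                            last, found = name, true
                    }
            })
            if found {
                    fmt.Println(last) // prints 3 for pattern "abc"
            }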
    
    }
    

    EDIT: Alternatively, instead of the doc.Find(...).Each part, you may be able to use a :contains selector, depending on your actual input:

    // Don't do this if pattern is arbitrary user input
    name, ok := doc.Find(fmt.Sprintf("a:contains(%s)", pattern)).Last().Attr("name")
    if ok {
            fmt.Println(name)
    }
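The warning in the comment above exists because fmt.Sprintf pastes the pattern into the selector string unescaped, so any selector syntax inside the pattern becomes part of the selector instead of being searched for as text. The sketch below only builds the strings and does not call goquery; `buildSelector` is a hypothetical helper that mirrors the Sprintf call.

```go
package main

import "fmt"

// buildSelector is a hypothetical helper that mirrors the fmt.Sprintf call
// used to build the :contains selector.
func buildSelector(pattern string) string {
	return fmt.Sprintf("a:contains(%s)", pattern)
}

func main() {
	// A plain word produces the selector you would expect.
	fmt.Println(buildSelector("abc"))
	// prints: a:contains(abc)

	// A pattern that carries selector syntax of its own closes the first
	// :contains(...) early and adds a second selector to the group.
	fmt.Println(buildSelector("abc), b:contains(x"))
	// prints: a:contains(abc), b:contains(x)
}
```

Depending on the pattern, the resulting string may select unintended elements or may not be a valid selector at all. Filtering in Go code, as the Each loop in the full program does, sidesteps this because the pattern is compared as plain text and never parsed as a selector.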
    
    This answer was accepted by the asker as the best answer.