drgdn82648 2014-01-29 03:50
Answer accepted

Using html.ParseFragment in a generic way

Using the experimental code.google.com/p/go.net/html package, we can use ParseFragment to parse some sub-section of an HTML document.

Like this:

var s = `
    <option id="foo">first</option>
    <option Class="tester">second</option>
    <option>third</option>
`
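// ParseFragment returns the fragment's top-level nodes ([]*html.Node),
// not a single document node.
nodes, err := html.ParseFragment(strings.NewReader(s), &html.Node{
    Type:     html.ElementNode,
    Data:     "body",
    DataAtom: atom.Body, // must agree with Data
})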

This works fine for most elements. But it doesn't seem to work when certain elements are at the root position of the HTML, like tbody, tr, and td (and perhaps others, not sure). It simply ignores the tags and only gives the text content.

This can be remedied by providing the semantically correct parent instead of atom.Body, but that requires that we know in advance what the HTML will be.

I'd hoped there was a generic root like atom.DocumentFragment, but I don't see that. So is there some way to use this in such a manner that it'll work with any arbitrary HTML fragment?

1 answer

  • dongyuandou2521 2014-01-30 16:34

    ParseFragment is always context-sensitive because it follows the HTML5 fragment-parsing algorithm. That algorithm is designed for implementing the DOM innerHTML property, and the correct tree to generate from a given innerHTML string depends on the surrounding context (especially whether the context is in a table or not).

    So the html package has no way to parse an HTML fragment independently of its context.

    If you need more information about how the parsing depends on the context, see http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#reset-the-insertion-mode-appropriately

    Accepted by the asker as the best answer.
