doulei3488 2010-06-19 06:53

How to remove nodes found with xpath->query from a string containing an HTML document, using PHP

The use case is quite simple. I would like to find nodes via an XPath expression in a string (!) that contains an HTML document, and then delete them.

I know how to find the nodes with PHP. It basically goes like this: create a new DOMDocument, call loadHTML (or loadXML), create a new DOMXPath, and then call its "query" or "evaluate" method. Done.
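
The lookup steps described here could be sketched roughly as follows. The sample HTML string and the XPath expression `//div[@class="ad"]` are made up for illustration and are not taken from the original question.

```php
<?php
// Hypothetical input; any string holding an HTML document would do.
$html = '<html><body><div class="ad">Ad</div><p>Keep me</p><div class="ad">Another ad</div></body></html>';

// Collect libxml parse warnings internally instead of emitting PHP warnings,
// since real-world HTML is often not well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Run the XPath expression against the parsed tree.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="ad"]');

// Print how many nodes matched, then the text of each one.
echo $nodes->length, "\n";
foreach ($nodes as $node) {
    echo $node->textContent, "\n";
}
```

`query` returns a DOMNodeList, while `evaluate` can also return a typed result (for example a number from `count(...)`), so which of the two fits depends on the expression.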

However, deleting is the tricky part. One would think that you just remove the nodes with a few statements (ending with parentNode->removeChild) and save the result back into the string with saveHTML. Unfortunately, this operation almost always transforms "too many things" in the original HTML string.

So my question is: how can I delete the nodes returned by xpath->query($query) without using saveHTML or saveXML, and without writing my own parser?

Hope it was clear enough :-)

Thanks for looking at this!


2 answers

  • douci2015 2010-06-19 07:49

    First of all, make sure you remove the found nodes from the bottom up, so that child nodes are removed before their parent nodes.
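
    A minimal sketch of bottom-up removal, assuming the node list comes back in document order (which is what DOMXPath typically returns for a path expression), so that walking it backwards reaches descendants before their ancestors. The sample markup and the expression `//div[@class="x"]` are invented for the example.

    ```php
    <?php
    // Hypothetical input with a matching node nested inside another match.
    $html = '<html><body><div class="x">outer<div class="x">inner</div></div><p>Keep me</p></body></html>';

    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[@class="x"]');

    // Iterate from the last match to the first so children go before parents.
    for ($i = $nodes->length - 1; $i >= 0; $i--) {
        $node = $nodes->item($i);
        // Skip anything that is already detached from the tree.
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Count what is left after the removal.
    $remaining = $xpath->query('//div[@class="x"]')->length;
    $paragraphs = $xpath->query('//p')->length;
    echo $remaining, ' ', $paragraphs, "\n";
    ```

    The `parentNode` check is only a guard; it matters if the same list is processed in a different order or a node has already been detached elsewhere.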

    Second, what do you mean by "transforms too many things"? PHP's DOM XML will parse the document into a DOM node tree. Then you work on the tree, and when you are done it will convert the DOM tree back into XML/HTML. You may very well lose indentation, attributes may change places, and so on. The important thing is that the document means exactly the same thing, i.e. it is an exact XML/HTML representation of the DOM tree.
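
    To illustrate the round trip, here is a small sketch that parses a deliberately sloppy fragment (invented for this example) and serializes it again. The exact output depends on the libxml version, so none is shown; the point is only that the string changes while the parsed content stays the same.

    ```php
    <?php
    // Hypothetical input: upper-case tags, unquoted attributes, no wrapper elements.
    $html = '<P class=a id=b>Hello<BR>world';

    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Serialize the tree back to a string; the parser has normalized the markup.
    $out = $doc->saveHTML();

    // Parse the serialized output again and compare the text content of both trees.
    $again = new DOMDocument();
    $again->loadHTML($out);
    libxml_clear_errors();

    $sameText = $doc->documentElement->textContent === $again->documentElement->textContent;

    echo $out;
    var_dump($out === $html);  // whether the string survived unchanged
    var_dump($sameText);       // whether the content survived
    ```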
