doulei3488 2010-06-19 06:53

How to remove nodes found with xpath->query from a string containing an HTML document, using PHP

The use case is quite simple. I have a string (!) that contains an HTML document. I would like to find nodes in it via an XPath expression and delete them.

I know how to find the nodes with PHP. It basically goes like this: create a new DOMDocument, call loadHTML (or loadXML), create a new DOMXPath, and then call its "query" or "evaluate" method. Done.
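The lookup steps listed above might be sketched as follows. This is only an illustration: the sample markup and the query `//p[@class="ad"]` are assumptions, not taken from the original post.

```php
<?php
// Hypothetical sample input; any string holding an HTML document would do.
$html = '<div><p class="ad">Buy now</p><p>Real content</p></div>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Hypothetical query: every <p> element whose class attribute is "ad".
$nodes = $xpath->query('//p[@class="ad"]');

foreach ($nodes as $node) {
    echo $node->nodeName, ': ', $node->textContent, PHP_EOL;
}
```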

However, deleting is the tricky part. One would think that you just delete the nodes with a few statements (ending with parentNode->removeChild) and save the result back into the string with saveHTML. Unfortunately, this round trip almost always changes "too many things" in the original HTML string.
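A minimal sketch of the straightforward approach described above, assuming a hypothetical HTML fragment and query. Comparing `$result` with `$html` shows the kind of change the round trip introduces: with default options, `loadHTML` and `saveHTML` typically wrap a fragment in a doctype, `<html>`, and `<body>`.

```php
<?php
// Hypothetical input: a fragment without <html> or <body> wrappers.
$html = '<p class="ad">Buy now</p><p>Real content</p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Copy the matches into an array before modifying the tree (a defensive habit).
$matches = iterator_to_array($xpath->query('//p[@class="ad"]'));
foreach ($matches as $node) {
    $node->parentNode->removeChild($node);
}

// Serialize the modified tree back into a string.
$result = $doc->saveHTML();
echo $result;
```

The matched node is gone from `$result`, but the rest of the string is no longer byte-for-byte identical to the input, which is the side effect the question is about.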

So my question is: how can I delete the nodes returned by xpath->query($query) without using saveHTML or saveXML, and without writing my own parser?

Hope it was clear enough :-)

Thanks for looking at this!


2 answers

  • douci2015 2010-06-19 07:49

    First of all, make sure you remove the found nodes from the bottom up, so that child nodes are removed before their parent nodes.

    Second, what do you mean by "transforms too many things"? PHP's DOM extension parses the document into a DOM node tree. You then work on the tree, and when you are done it converts the DOM tree back into XML/HTML. You may very well lose indentation, attributes may change places, and so on. The important thing is that the document still means exactly the same thing, i.e. it is an exact XML/HTML representation of the DOM tree.
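The bottom-up removal suggested in the answer above could be sketched like this. The nested markup and the query are hypothetical, and the sketch assumes that `query()` returns matches in document order (ancestors before descendants), which is the usual behavior for a path expression.

```php
<?php
// Hypothetical input in which one matching node is nested inside another.
$html = '<div class="x">outer<div class="x">inner</div></div><p>keep</p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="x"]');

// Walk the result list backwards so descendants are removed before ancestors.
for ($i = $nodes->length - 1; $i >= 0; $i--) {
    $node = $nodes->item($i);
    if ($node->parentNode !== null) {
        $node->parentNode->removeChild($node);
    }
}

// Count how many matching nodes are still attached to the document.
$remaining = $xpath->query('//div[@class="x"]')->length;
echo $remaining, PHP_EOL;
```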
