How to remove nodes found with xpath->query from a string containing an HTML document, using PHP

The use case is quite simple. I would like to find nodes via an XPath expression in a string (!) that basically contains an HTML document, and delete them.

I know how to find the nodes with PHP. It basically goes like this: create a new DOMDocument, call loadHTML (or loadXML), create a new DOMXPath, and then call its "query" or "evaluate" method. Done.
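For reference, the lookup steps described above can be sketched roughly as follows. The sample markup, the `ad` class name, and the XPath expression are made up for illustration and are not from the original question.

```php
<?php
// Hypothetical sample input; any HTML string would do.
$html = '<html><body><div id="keep">Keep me</div><div class="ad">Remove me</div></body></html>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is often not perfectly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Run the XPath query against the parsed document.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="ad"]');

// $nodes is a DOMNodeList; iterate over the matches.
foreach ($nodes as $node) {
    echo $node->nodeName, ': ', $node->textContent, PHP_EOL;
}
```

Finding the nodes is the easy half; the open question is what happens to the rest of the markup once the tree is written back to a string.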

However, deleting is the tricky part. One would think you could just delete the nodes with a few statements (ending with parentNode->removeChild) and save the result back into the string with saveHTML. Unfortunately, this operation almost always transforms "too many things" in the original HTML string.

So my question is: how can I delete the nodes returned by xpath->query($query) without using saveHTML or saveXML, and without writing my own parser?

Hope it was clear enough :-)

Thanks for looking at this!

douci2015 2010-06-19 07:49
First of all, make sure you remove the found nodes from the bottom up, so that child nodes are removed before their parent nodes.
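A minimal sketch of bottom-up removal, assuming the query result comes back in document order (which is what DOMXPath normally gives for a location path), so that walking the list backwards reaches descendants before their ancestors. The nested sample markup and the `ad` class are hypothetical.

```php
<?php
// Hypothetical input with a matching node nested inside another match.
$html = '<html><body><div class="ad">outer<div class="ad">inner</div></div><p>Keep me</p></body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="ad"]');

// Walk the list from last to first so that nested matches
// are removed before the elements that contain them.
for ($i = $nodes->length - 1; $i >= 0; $i--) {
    $node = $nodes->item($i);
    if ($node->parentNode !== null) {
        $node->parentNode->removeChild($node);
    }
}

// Re-run the query to confirm nothing matching is left in the tree.
$remaining = $xpath->query('//div[@class="ad"]')->length;
$paragraphs = $xpath->query('//p')->length;
```

The list returned by query() is not live, so removing nodes while looping over it does not shift the remaining indexes.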

Second, what do you mean by "transforms too many things"? PHP's DOM extension parses the document into a DOM node tree. You then work on the tree, and when you are done it converts the tree back into XML/HTML. You may well lose indentation, attributes may change order, and so on. The important thing is that the document still means exactly the same thing, i.e. it is an exact XML/HTML representation of the DOM tree.
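To make the parse-and-reserialize round trip concrete, here is a small sketch. The sample fragment is hypothetical. It also shows the `LIBXML_HTML_NOIMPLIED` and `LIBXML_HTML_NODEFDTD` load options, which are offered here as a possible way to reduce the extra markup and are not part of the original answer; they need a reasonably recent PHP and libxml2, and the exact output can vary between versions.

```php
<?php
// Hypothetical fragment with a single root element.
$html = '<div><p>Keep me</p><span class="ad">Remove me</span></div>';

libxml_use_internal_errors(true);

// Default round trip: the serializer writes out the whole tree it built,
// which normally includes a doctype and html/body wrappers.
$plain = new DOMDocument();
$plain->loadHTML($html);
$default = $plain->saveHTML();

// Assumption: asking libxml not to add implied elements or a default
// doctype keeps the output closer to the input string.
$doc = new DOMDocument();
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();

// Remove the unwanted nodes, last match first.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//span[@class="ad"]');
for ($i = $nodes->length - 1; $i >= 0; $i--) {
    $node = $nodes->item($i);
    $node->parentNode->removeChild($node);
}

$trimmed = trim($doc->saveHTML());

echo $default, PHP_EOL;
echo $trimmed, PHP_EOL;
```

Even with these options the result is still a fresh serialization of the tree, so whitespace and quoting may differ from the source string.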
