A little help with this XPath?

I am getting some info from an RSS feed.

<?php
$dom = new DOMDocument;
libxml_use_internal_errors(TRUE);
$dom->load('http://www.myrss.com');
libxml_clear_errors();

$xPath = new DOMXPath($dom);
$links = $xPath->query('xxxxx');
foreach($links as $link) {
    printf("%s\n", $link->nodeValue);
}
?>

I have managed to get the TITLE, LINK and DESCRIPTION with //item/title and so on; however, I want to get the text content and the image of the description separately.

Looking at the page source in Firefox, this is the markup I see for the image and the content. Both are inside <description></description>.

IMAGE

<div class="separator" style="clear: both; text-align: center;"><a href="LINK TO IMAGE" imageanchor="1" 
style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="192" 
src="LINK TO IMAGE" width="320" /></a></div>

CONTENT TEXT

<span class="Apple-style-span" style="font-family: 'Trebuchet MS', sans-serif;"> CONTENT TEXT IS HERE </span>

What XPath should I use to get this data? Thank you.


3 answers

dongquanyu5816 2011-03-25 17:15

If it is what it looks like and the content is HTML-encoded, you can't do it in one step. You must retrieve the text of each description and parse it into its own DOM (unless you want to resort to regex, which I would strongly discourage).
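To illustrate why one step is not enough, here is a minimal sketch. It assumes a made-up one-item feed in `$rss` (not the asker's feed) whose description markup is entity-escaped, which is what the question's output suggests.

```php
<?php
// Hypothetical one-item feed; the description markup is entity-escaped.
$rss = <<<'XML'
<?xml version="1.0"?>
<rss version="2.0"><channel><item>
<title>Example</title>
<description>&lt;div&gt;&lt;img src="http://example.com/a.jpg" /&gt;&lt;/div&gt;&lt;span class="Apple-style-span"&gt;Some text&lt;/span&gt;</description>
</item></channel></rss>
XML;

$dom = new DOMDocument;
$dom->loadXML($rss);
$xPath = new DOMXPath($dom);

// The feed's DOM contains no <img> element: <description> holds only text.
$imgCount = $xPath->query('//item/description//img')->length;

// nodeValue returns that text with the entities decoded, i.e. an HTML string.
$html = $xPath->query('//item/description')->item(0)->nodeValue;

printf("img elements in feed DOM: %d\n", $imgCount);
printf("description text: %s\n", $html);
```

The XPath query against the feed finds nothing inside the description, while `nodeValue` hands back a string of HTML that still has to be parsed separately.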

When in doubt, you can pass it through Tidy first. DOMDocument has loadHTML(), which is pretty resilient, but it is not guaranteed to load any HTML.
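One way this could look is sketched below. `loadFragment()` is a hypothetical helper, not part of the original answer; it assumes the Tidy extension may or may not be installed, and it uses the `<meta>` charset prefix, a common workaround to make `loadHTML()` treat the input as UTF-8.

```php
<?php
// Hypothetical helper: parse an HTML fragment into its own DOMDocument,
// repairing it with Tidy first when that extension is available.
function loadFragment(string $html): ?DOMDocument
{
    if (function_exists('tidy_repair_string')) {
        $repaired = tidy_repair_string($html, ['show-body-only' => true], 'utf8');
        if (is_string($repaired) && $repaired !== '') {
            $html = $repaired;
        }
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument;
    $ok = $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}

// Usage with a deliberately sloppy fragment (unclosed <p>).
$doc = loadFragment('<div><img src="http://example.com/a.jpg"></div><p>Unclosed paragraph');
$imgCount = $doc ? (new DOMXPath($doc))->query('//img')->length : 0;
printf("images found: %d\n", $imgCount);
```

Returning `null` on failure lets the caller skip an item whose description cannot be parsed rather than abort the whole feed.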

// beware, this is untested. it should give you an idea, though.

$dom = new DOMDocument;
libxml_use_internal_errors(TRUE);

$dom->load('http://www.myrss.com');
libxml_clear_errors();

$xPath = new DOMXPath($dom);
$items = $xPath->query('/rss/channel/item');

foreach($items as $item) {
    $descr = $xPath->query('./description', $item);
    // there should be at most one, but foreach gracefully
    // handles the case where there is no <description>
    foreach ($descr as $d) {
        $temp_dom = new DOMDocument();
        $temp_dom->loadHTML( $d->nodeValue );   // error handling/Tidy here!

        $temp_xpath = new DOMXPath($temp_dom);

        $img = $temp_xpath->query('//img');
        $txt = $temp_xpath->query('//span[@class="Apple-style-span"]');

        // now do something with $img and $txt
    }

}
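As a sketch of the "do something with $img and $txt" step, the snippet below pulls out the image URL and the text. The `$html` string is a made-up stand-in for one decoded description, modeled on the markup in the question; `$imageUrl` and `$text` are illustrative names.

```php
<?php
// Hypothetical decoded description, modeled on the markup in the question.
$html = '<div class="separator"><a href="http://example.com/a.jpg">'
      . '<img border="0" height="192" src="http://example.com/a.jpg" width="320" /></a></div>'
      . '<span class="Apple-style-span"> CONTENT TEXT IS HERE </span>';

libxml_use_internal_errors(true);
$temp_dom = new DOMDocument();
$temp_dom->loadHTML($html);
libxml_clear_errors();

$temp_xpath = new DOMXPath($temp_dom);
$img = $temp_xpath->query('//img');
$txt = $temp_xpath->query('//span[@class="Apple-style-span"]');

// Image URL: the src attribute of the first <img>, or null if there is none.
$imageUrl = $img->length > 0 ? $img->item(0)->getAttribute('src') : null;

// Text: the trimmed text content of every matching <span>, joined by spaces.
$parts = [];
foreach ($txt as $span) {
    $parts[] = trim($span->textContent);
}
$text = implode(' ', $parts);

printf("image: %s\ntext: %s\n", $imageUrl, $text);
```

Note that `@class="Apple-style-span"` only matches when that is the element's entire class attribute; a span carrying additional classes would need a `contains()` test instead.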

The asker selected this answer as the best answer.
