Parsing HTML tags inside XML in PHP

I'm trying to create my own RSS feed (learning purposes) using simplexml_load_string while parsing http://uk.news.yahoo.com/rss in PHP. I get stuck at reading the HTML tags inside the <description> tag.

My code so far looks like this:

$feed = file_get_contents('http://uk.news.yahoo.com/rss');
$rss = simplexml_load_string($feed);

//for each element in the feed
foreach ($rss->channel->item as $item) {
    echo '<h3>' . $item->title . '</h3>';

    foreach ($item->description as $desc) {

        // how to read the href from the a tag???

        // this does not work at all
        $tags = $item->xpath('//a');
        foreach ($tags as $tag) {
            echo $tag['href'];
        }
    }
}

Any ideas how to extract each HTML tag?

Thanks

dsbfbz75185 2013-07-09 07:28
The special characters in the description content are encoded, so the content is not treated as nodes within the XML; it is just a string. You can decode the special characters, then load the resulting HTML into DOMDocument and work with it from there. For example:

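foreach ($rss->channel->item as $item) {
    echo '<h3>' . $item->title . '</h3>';

    foreach ($item->description as $desc) {
        // The description is an escaped string, so decode it and parse it as HTML
        $dom = new DOMDocument();
        $dom->loadHTML(htmlspecialchars_decode((string) $desc));

        // Print the href of the first anchor, if the description contains one
        $anchors = $dom->getElementsByTagName('a');
        if ($anchors->length > 0) {
            echo $anchors->item(0)->getAttribute('href');
        }
    }
}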

XPath is also available for use with DOMDocument, see DOMXPath.
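
To illustrate the DOMXPath route, here is a minimal sketch that collects every href from an HTML fragment instead of only the first one. The helper name extractHrefs and the sample description string are made up for illustration; they are not part of the feed or of the code above. It assumes the description has already been decoded to plain HTML, and it silences libxml warnings because feed HTML is often not well-formed.

```php
<?php
// Hypothetical helper (not from the original answer): returns the href of
// every <a> element found in an HTML fragment.
function extractHrefs(string $html): array
{
    // Nothing to parse; also avoids passing an empty string to loadHTML()
    if (trim($html) === '') {
        return [];
    }

    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select only anchors that actually carry an href attribute
    $xpath = new DOMXPath($dom);
    $hrefs = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }

    return $hrefs;
}

// Made-up sample standing in for a decoded <description> value
$description = '<p><a href="http://example.com/story">'
    . '<img src="http://example.com/a.jpg" alt=""></a>Summary text</p>';

foreach (extractHrefs($description) as $href) {
    echo $href, PHP_EOL;
}
```

Inside the feed loop you would pass the decoded description string to the helper. The XPath expression can be changed to pick other tags or attributes, for example '//img/@src' for image URLs.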
This answer was accepted by the asker.
