doumei2023 2014-07-07 13:32
Accepted

Parsing a specific line from XML when several lines share the same name

I have an XML file with 10 records, each structured like this:

<entry>
<title>My Title</title>
<link rel="alternate" type="text/html" href="http://myweb.com/posts/one.html"/>
<published>2014-07-07T00:34:00+00:00</published>
<updated>2014-07-07T00:34:00+00:00</updated>
<id>http://myweb.com/posts/one.html</id>
<author>
<name>Myweb.com</name>
</author>
<content>
Some Content Here
</content>
<link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg" type="image/jpeg" length=""/>
</entry>

I am using the code below to parse it, and it's almost working, except that I can't fetch the image URL from the second link element, which has the same name as the first:

 <link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg" type="image/jpeg" length=""/>

My code is:

$url = "http://myweb.com/posts.xml";
$xml = simplexml_load_file($url);
foreach($xml->entry as $PRODUCT) {

$my_title = trim($PRODUCT->title);
$url = trim($PRODUCT->id);
$im = (string)$PRODUCT->xPath('//link[@rel="enclosure"]');

echo $my_title . " " . $url . " " . $im;
echo "<br>";

}

This: $im = (string)$PRODUCT->xPath('//link[@rel="enclosure"]'); returns "Array" and not the URL inside href.

Thanks


2 answers

  • dongmaobeng7145 2014-07-08 21:13

    This: $im = (string)$PRODUCT->xPath('//link[@rel="enclosure"]'); returns "Array" and not the URL inside href.

    Whenever you see a string containing the word "Array" in PHP, where you were expecting something else, you need to think "hm, I seem to have cast an array to a string, how did that happen?" (Similarly, if you unexpectedly see the string "A", consider the possibility that it's a one-letter substring of "Array").

    In this case, the reason why is quite simple: if you look up the manual page for the SimpleXMLElement::xpath() method, you'll see that it returns an array unless there is an error (not finding a match is not an error, and will give you an empty array).
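A minimal sketch of that behaviour, assuming a hypothetical one-entry document modelled on the question's XML (the inline string and the variable names are made up for illustration):

```php
<?php
// Hypothetical stand-in for the real feed; no XML namespace is declared here.
$xml = simplexml_load_string(
    '<feed><entry>'
    . '<link rel="alternate" href="http://myweb.com/posts/one.html"/>'
    . '<link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg"/>'
    . '</entry></feed>'
);

// xpath() returns a plain PHP array of SimpleXMLElement objects.
$matches = $xml->xpath('//link[@rel="enclosure"]');

// No match is not an error: the result is simply an empty array.
$none = $xml->xpath('//link[@rel="no-such-rel"]');

// Casting the array itself yields the literal string "Array".
// The @ only hides the "Array to string conversion" notice/warning PHP raises here.
$cast = @(string) $matches;

var_dump(is_array($matches));
var_dump(count($matches));
var_dump(count($none));
var_dump($cast);
```

The array has to be indexed (and, for an attribute, indexed again by name) before a string cast gives anything useful.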

    The only reason this is surprising is that most methods on that class return another instance of the same class, with magic overloads for things like the (string) cast. However, all of those objects represent a more-or-less coherent fragment of the XML document (e.g. one or more consecutive nodes, or siblings filtered by a particular tag name), and can never represent "nothing". An XPath result could be empty, or contain nodes of various types from all over the document; I don't know for sure, but I suspect this is why an array return was chosen here rather than another variety of SimpleXMLElement object.

    So $PRODUCT->xPath('//link[@rel="enclosure"]')[0] will give you the first result. If you can't rely on at least PHP 5.4 (which added indexing directly into a function's return value), or want to check in between that at least one node was matched, assign the result first: $xpath_results = $PRODUCT->xPath('//link[@rel="enclosure"]'); $im = $xpath_results[0];

    There are a few extra catches here, though:

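    • Namespaces: as ThW points out, Atom feeds often have an XML namespace declaration, and you need to handle this in your XPath query by registering a prefix, e.g. $product->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom'); and then using it in your XPath expression (e.g. //atom:link rather than //link).
    • You didn't specify that you wanted the href attribute: either change your XPath expression to select it (//link[@rel="enclosure"]/@href) or change your access to grab it from the SimpleXMLElement returned ($xpath_results[0]['href']).
    • Context: an expression starting with // searches the whole document, even when you call xpath() on a single entry, so inside your loop index [0] would be the first enclosure in the feed for every entry. Drop the leading // (link[@rel="enclosure"]) or start with a dot (.//link[@rel="enclosure"]) to search relative to the current entry.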
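The namespace and href points can be sketched as follows. This assumes a hypothetical document that declares the Atom namespace as its default; whether the asker's real feed does so is not shown in the question, and the variable names are illustrative only:

```php
<?php
// Hypothetical Atom document; the real feed may or may not declare this namespace.
$xml = simplexml_load_string(
    '<feed xmlns="http://www.w3.org/2005/Atom"><entry>'
    . '<link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg"/>'
    . '</entry></feed>'
);

// With a default namespace in play, the unprefixed name matches nothing.
$unprefixed = $xml->xpath('//link[@rel="enclosure"]');

// Bind a prefix of our choosing ("atom") to the namespace URI.
$xml->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');

// Option 1: select the element, then read the attribute from it.
$elements = $xml->xpath('//atom:link[@rel="enclosure"]');
$viaElement = (string) $elements[0]['href'];

// Option 2: select the attribute node directly in the expression.
$attributes = $xml->xpath('//atom:link[@rel="enclosure"]/@href');
$viaAttribute = (string) $attributes[0];

var_dump(count($unprefixed));
var_dump($viaElement === $viaAttribute);
```

The prefix is local to the query: it does not have to match any prefix used in the document, only the namespace URI has to match.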

    Stick it all together (and get rid of that ugly and unusual all-caps variable name), and the compact version (no error checking, minimum readability) would be either:

    $product->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
    // relative path (no leading //) so that only this entry's own links are searched
    $im = (string)$product->xpath('atom:link[@rel="enclosure"]')[0]['href'];
    

    or

    $product->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
    // relative path (no leading //) so that only this entry's own links are searched
    $im = (string)$product->xpath('atom:link[@rel="enclosure"]/@href')[0];
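For comparison, a longer version with the error checking put back in might look like the sketch below. The two-entry feed is a hypothetical stand-in for http://myweb.com/posts.xml, it assumes the Atom namespace is declared, and $rows, $hrefs and $image are illustrative names:

```php
<?php
// Hypothetical two-entry Atom feed; the second entry has no enclosure link.
$feed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>First</title>
    <id>http://myweb.com/posts/one.html</id>
    <link rel="alternate" type="text/html" href="http://myweb.com/posts/one.html"/>
    <link rel="enclosure" href="http://myweb.com/uploads/one.jpg" type="image/jpeg"/>
  </entry>
  <entry>
    <title>Second</title>
    <id>http://myweb.com/posts/two.html</id>
    <link rel="alternate" type="text/html" href="http://myweb.com/posts/two.html"/>
  </entry>
</feed>
XML;

$xml = simplexml_load_string($feed);
$rows = [];

foreach ($xml->entry as $entry) {
    // Each $entry is a separate object, so the prefix is registered per entry.
    $entry->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');

    // No leading "//": the path is relative to this entry, not the whole document.
    $hrefs = $entry->xpath('atom:link[@rel="enclosure"]/@href');

    // xpath() gives an empty array when nothing matches, so check before indexing.
    $image = !empty($hrefs) ? (string) $hrefs[0] : '';

    $rows[] = [trim((string) $entry->title), trim((string) $entry->id), $image];
}

foreach ($rows as $row) {
    echo implode(' ', $row), "<br>\n";
}
```

Collecting the values into $rows before printing keeps the parsing separate from the output; an entry without an enclosure simply ends up with an empty image string instead of triggering an undefined-offset error.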
    
    This answer was selected by the asker as the best answer.