dongxian3574 2015-12-07 23:46

How do I get the first <p> from each <description> in an XML file?

I'm parsing an RSS feed to get the raw data and manipulate it.

This is a WordPress RSS feed. I can find the title, link, description and publication date of each post by iterating over the SimpleXMLElement. The nodes are located in:

$title = $xml->channel->item[$i]->title;
$link = $xml->channel->item[$i]->link;
$description = $xml->channel->item[$i]->description;
$pubDate = $xml->channel->item[$i]->pubDate;

respectively.

The problem is that $description has two <p>s inside, and one of them, the second, is useless to me.

So how do I assign $description to only the first <p> of description?

Getting simply $xml->channel->item[$i]->description->p[0] won't work. It results in an internal server error.

My whole code looks like this:

<?php 
$html = "";
$url = "http://sntsh.com/posts/feed/";
$xml = simplexml_load_file($url);

for($i = 0; $i < 10; $i++){
    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $description = $xml->channel->item[$i]->description->children();
    $pubDate = $xml->channel->item[$i]->pubDate;

    $html .= "<a href='$link'><h3>$title</h3></a>";
    $html .= "$description";
    $html .= "<br />$pubDate";
}
echo $html;

1 answer

  • dte49889 2015-12-08 00:58

    You can get the children of an element using the children() method. If you can guarantee that the first child will always be the element that you need, you can use it this way:

    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $description = $xml->channel->item[$i]->description->children();
    $pubDate = $xml->channel->item[$i]->pubDate;
    

    children() returns a SimpleXMLElement holding all the child elements, which you can iterate over with foreach; used as a string or indexed with [0], it refers to the first child. See http://php.net/manual/en/simplexmlelement.children.php
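As a rough illustration of how children() behaves, here is a minimal sketch. The inline XML in $sample is a made-up fragment (not taken from the feed in the question), and it assumes <description> contains real child elements rather than CDATA:

```php
<?php
// Hypothetical fragment: <description> holds real <p> child elements (no CDATA).
$sample = '<item><description><p>First paragraph.</p><p>Second paragraph.</p></description></item>';
$item = simplexml_load_string($sample);

// children() exposes every child element; foreach visits them in document order.
$paragraphs = [];
foreach ($item->description->children() as $child) {
    $paragraphs[] = (string) $child;
}

// The first <p> can also be addressed directly by name and index.
$first = (string) $item->description->p[0];
```

Casting to string is what turns each SimpleXMLElement into its text content; without the cast you are still holding an object.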

    Edit
    It seems that the cause of the issue is the <![CDATA[ ]]> wrapper. Content inside CDATA is parsed as plain text, so description has no child elements and the resulting SimpleXMLElement is empty. Stripping the markers fixes it:

    $html = '';
    $src = file_get_contents('http://sntsh.com/posts/feed/');
    // Strip the CDATA markers so the HTML inside <description> is parsed as child elements.
    // Note: this only works if that HTML is itself well-formed XML.
    $data = str_replace(['<![CDATA[', ']]>'], '', $src);
    $xml = simplexml_load_string($data);

    foreach ($xml->channel->item as $item)
    {
        $title = $item->title;
        $link = $item->link;
        // As a string, children() yields the text of the first child
        $description = $item->description->children();
        // Or
        // $description = $item->description->p[0];
        $pubDate = $item->pubDate;

        $html .= "<a href='$link'><h3>$title</h3></a>";
        $html .= trim($description).'...';
        $html .= "<br />$pubDate";
    }
    echo $html;
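An alternative to stripping the CDATA markers is to leave the feed untouched, read the description as a string, and parse that string as HTML. The following is only a sketch under assumptions: the $sample feed fragment and the firstParagraph() helper are hypothetical names introduced for illustration, and a real feed may need extra error handling.

```php
<?php
// Hypothetical feed fragment; as in the question's feed, the HTML is wrapped in CDATA.
$sample = <<<XML
<rss><channel><item>
<title>Post</title>
<description><![CDATA[<p>Summary text.</p><p>The post first appeared on Example.</p>]]></description>
</item></channel></rss>
XML;

// Hypothetical helper: returns the text of the first <p> in an HTML fragment, or '' if none.
function firstParagraph(string $htmlFragment): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them for imperfect HTML.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix hints to libxml that the fragment is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $htmlFragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $p = $doc->getElementsByTagName('p')->item(0);
    return $p === null ? '' : trim($p->textContent);
}

$xml = simplexml_load_string($sample);
// Casting <description> to string returns the CDATA content as plain text.
$first = firstParagraph((string) $xml->channel->item[0]->description);
```

The design choice here is that the HTML parser tolerates markup that is not well-formed XML, which blanket CDATA stripping does not.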
    
    This answer was accepted by the asker.