doulan9188 2016-10-14 14:26

How can I parse malformed RSS feeds from third-party websites with PHP?

I'm trying to parse RSS feeds from several media outlets. My script works for most of them. The problem is that I need to aggregate all of them, even though some are malformed.

I can't get the description from these two feeds. How can I proceed anyway?

Here is my script:

<?php
function RSS_items ($url) {
    $i = 0;
    $doc = new DOMDocument();
    $doc->load($url);
    $channels = $doc->getElementsByTagName('channel');
    foreach($channels as $channel) {
        $items = $channel->getElementsByTagName('item');
        foreach($items as $item) {
            $i++;
            $y[$i]['title'] = $item->getElementsByTagName('title')->item(0)->firstChild->textContent;
            $y[$i]['link'] = $item->getElementsByTagName('link')->item(0)->firstChild->textContent;
            $y[$i]['updated'] = $item->getElementsByTagName('pubDate')->item(0)->firstChild->textContent;
            $y[$i]['description'] = $item->getElementsByTagName('description')->item(0)->firstChild->textContent;
        }
    }
    echo '<pre>';
    print_r ($y);
    echo '</pre>';
}
// the two malformed feeds
RSS_items ('http://www.lefigaro.fr/rss/figaro_actualites-a-la-une.xml');
RSS_items ('https://francais.rt.com/rss');
?>

1 answer

  • dongpanshi2839 2016-10-14 16:13

    The problem in your code is the use of the firstChild property, which selects only the first child node of an element. In the target XML, the description element has no child node that you would want to select as its first child. Remove firstChild from that line and read textContent from the element itself. The result should look like this:

    $item->getElementsByTagName('description')->item(0)->textContent;
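
A minimal sketch of the same idea, applied to every field. The helper name `rss_child_text` and the inline sample feed are assumptions made for illustration, not part of the original script. The sample assumes a description stored as a CDATA section preceded by whitespace, which is one way `firstChild` can end up pointing at a node other than the text; the two feeds in the question may be malformed differently.

```php
<?php
// Hypothetical helper (not from the original post): returns the text of the
// first <$tag> descendant of $item, or '' when the tag is missing.
function rss_child_text(DOMElement $item, string $tag): string
{
    $node = $item->getElementsByTagName($tag)->item(0);
    // textContent concatenates all descendant text and CDATA nodes,
    // so it does not matter which child actually holds the text.
    return $node === null ? '' : trim($node->textContent);
}

// Assumed sample feed: no <pubDate>, and the description's first child
// is a whitespace text node followed by a CDATA section.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel>
<item>
<title>Example title</title>
<link>http://example.com/a</link>
<description>
<![CDATA[Example <b>description</b>]]>
</description>
</item>
</channel></rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$items = [];
foreach ($doc->getElementsByTagName('item') as $item) {
    $items[] = [
        'title'       => rss_child_text($item, 'title'),
        'link'        => rss_child_text($item, 'link'),
        'updated'     => rss_child_text($item, 'pubDate'),
        'description' => rss_child_text($item, 'description'),
    ];
}
print_r($items);
```

Checking for a missing node before reading `textContent` also keeps one incomplete item from stopping the whole aggregation, which matters when the feeds cannot be trusted to be well formed.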
    
    This answer was selected as the best answer by the question's author.
