douyan4900 2013-09-28 23:26

PHP - Using DOMDocument - trying to find a feed's link

I have a function that uses the code below, and I am trying to figure out what I am doing wrong. I want to find a web page's RSS feed, if it has one. Right now it does not return any URL: it shows the type, but that is all, and the blog_url key never gets set in the array. Here is the code:

  $results = array();
  $doc = new DOMDocument();
  @$doc->preserveWhiteSpace = FALSE;
  $html = file_get_contents($url);
  $doc->loadHTML("$html");

  $links = $doc->getElementsByTagName('link');
  foreach ($links as $tag) {
    $type = $tag->getAttribute('type');
    if (preg_match("/(rss+xml|atom+xml')/si", $type))
      $href_text = $tag->nodeValue;
      if(preg_match("/('feed|journal|blog')/si", $href_text))
        $results['blog_url'] = $tag->getAttribute('href');
  }

1 answer

  • duanci8209 2013-10-15 08:30
    <?php
    
    $url = ''; // EDIT THIS
    
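    $doc = new DOMDocument;
    // The @ suppresses the warnings libxml emits for malformed real-world HTML.
    @$doc->loadHTMLFile($url);
    
    // Select <link rel="alternate"> elements in <head> that declare an
    // Atom or RSS type and carry an href attribute.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('head/link[@rel="alternate"][@type="application/atom+xml" or @type="application/rss+xml"][@href]');
    
    $result = array();
    
    // Keep the first feed URL whose href mentions feed, journal or blog.
    foreach ($nodes as $node) {
        $href = $node->getAttribute('href');
        if (preg_match('/(feed|journal|blog)/si', $href)) {
            $result['blog_url'] = $href;
            break;
        }
    }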
    
    print_r($result);
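
For comparison, here is a minimal sketch of the loop from the question with its likely problems addressed. Four things seem to stop it from ever setting blog_url: the + in rss+xml and atom+xml is a regex quantifier and has to be escaped to match a literal plus sign; both patterns contain stray ' characters; the first if has no braces, so the second check runs for every link; and nodeValue of a link element is empty, because the URL lives in the href attribute. The function name find_feed_url and the sample markup are made up for illustration, and the sketch parses an inline string so it runs without network access.

```php
<?php
// Hypothetical helper (not from the original post): returns the first feed
// URL declared by a <link> element in the HTML, or null if none is found.
function find_feed_url(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $tag) {
        $type = $tag->getAttribute('type');
        // "\+" matches a literal plus sign; a bare "+" is a quantifier.
        if (preg_match('/(rss\+xml|atom\+xml)/i', $type)) {
            // A <link> element has no text content; read the href attribute.
            $href = $tag->getAttribute('href');
            if ($href !== '') {
                return $href;
            }
        }
    }

    return null;
}

// Made-up sample markup, used only to exercise the function.
$sample = '<html><head>'
    . '<link rel="stylesheet" type="text/css" href="/style.css">'
    . '<link rel="alternate" type="application/rss+xml" href="https://example.com/feed/">'
    . '</head><body></body></html>';

echo find_feed_url($sample), PHP_EOL;
```

This version returns the first link whose type looks like RSS or Atom and does not filter the href on feed, journal or blog. That filter from the question can be added back inside the inner if when only blog-like feeds are wanted.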
    