dongqiu7365 2015-03-15 08:07
47 views
Accepted

How do I change simplexml_import_dom to DOMXPath when parsing a web page? [closed]

code1 extracts all the links on the web page http://php.net using simplexml_import_dom.

code1
<?php
$dom = new DOMDocument();
$dom->loadHTMLFile('http://php.net');
$xml = simplexml_import_dom($dom);
$nodes = $xml->xpath('//a[@href]');
foreach ($nodes as $node) {
    echo $node['href'], "<br />\n";
}
?>

Now I want to parse the web page with DOMXPath instead, so in code2 I replaced simplexml_import_dom with DOMXPath. There is a bug in code2. How do I fix it?

code2
<?php
$html = file_get_contents('http://php.net');
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//a[@href]');
foreach ($nodes as $node) {
    echo $node['href'], "<br />\n";
}
?>

2 answers

  • douxi2011 2015-03-15 08:33

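    The data returned from query() is a DOMNodeList of DOMElement objects, not arrays, so $node['href'] does not work. Read the attribute with $node->getAttribute("href") instead.

    If you also get a warning in the output like:

    Warning: DOMDocument::loadHTML(): Tag nav invalid in Entity
    

    it is because the document uses HTML5 tags that the parser does not recognize. You can suppress it by adding this line before the loadHTML() call:

    libxml_use_internal_errors(true);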
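
Suppressing the warnings does not have to mean throwing them away. The following is a minimal sketch, assuming a small inline HTML string (hypothetical, standing in for the downloaded page) so that it runs without network access. It collects whatever libxml reports via libxml_get_errors() and then restores the previous error-handling mode. Whether the `nav` tag actually produces a message depends on the installed libxml2 version, so the sketch does not rely on any particular error being present.

```php
<?php
// Hypothetical sample markup; the real page would come from file_get_contents().
$html = '<!DOCTYPE html><html><body><nav><a href="/docs.php">Docs</a></nav></body></html>';

// Route libxml messages into an internal buffer instead of PHP warnings,
// remembering the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Copy the buffered messages, then empty the buffer and restore the old mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Keep a readable summary of each message for logging or debugging.
$messages = [];
foreach ($errors as $error) {
    $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
}
```

After this runs, `$messages` holds any parser complaints as plain strings, and `$dom` is usable as before.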
    

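    Code:

    $html = file_get_contents('http://php.net');
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    // query() returns a DOMNodeList; each item here is a DOMElement.
    $nodes = $xpath->query('//a[@href]');
    foreach ($nodes as $node) {
        // Read the attribute through the object API, not array syntax.
        echo $node->getAttribute("href"), "<br />\n";
    }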
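
The same approach can be wrapped in a small function and tried offline. This is only a sketch under stated assumptions: `extractHrefs` and the `$sample` markup are hypothetical names introduced for illustration, and the inline string replaces the http://php.net download so the example runs without a network connection.

```php
<?php
/**
 * Return the href value of every <a> element that has one.
 * Hypothetical helper, not part of any library.
 */
function extractHrefs(string $html): array
{
    // Buffer parser warnings (for example about HTML5 tags) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    // query() yields DOMElement objects, so use getAttribute() on each one.
    foreach ($xpath->query('//a[@href]') as $node) {
        $hrefs[] = $node->getAttribute('href');
    }
    return $hrefs;
}

// Hypothetical sample page: two links with href, one anchor without.
$sample = '<html><body><nav>'
    . '<a href="/docs.php">Docs</a> '
    . '<a name="top">no href</a> '
    . '<a href="/downloads.php">Downloads</a>'
    . '</nav></body></html>';

foreach (extractHrefs($sample) as $href) {
    echo $href, "\n";
}
// prints:
// /docs.php
// /downloads.php
```

The anchor without an href is skipped because the XPath predicate `[@href]` only matches elements that carry that attribute.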
    
    This answer was selected by the asker as the best answer.