dongtieshang5429 2009-11-24 18:01

Using DOM to get the content of a div (including child tags)

I am using DOM to get the content of a div tag, but the inner HTML is not shown. The function is:

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect warnings from malformed HTML instead of printing them
$dom->loadHTMLFile($url);
libxml_use_internal_errors(false);
$xpath = new DOMXPath($dom);
$divTag = $xpath->query('//div[@id="post"]');
foreach ($divTag as $val) {
    // nodeValue yields text content only; the markup of child elements is not included
    echo $val->getAttribute('title') . ' - ' . $val->nodeValue . "<br />\n";
}

If the source of the page is (showing just the div):

<div id="post">Some text <img src="..." /> <table>some codes</table></div>

then the function returns just

"Some text " 

but I want to get all the HTML elements too, like this:

Some text <img src="..." /> <table>some codes</table>

Is there any way to do it? Thanks in advance.


  • doujia9204 2009-11-24 18:58

    If you're looking for the DOMDocument version of innerHTML in the browser DOM, the nearest is saveXML.

    echo $dom->saveXML($val) . "<br />\n";
    

    (Remember to pass the result through htmlspecialchars if you want the markup to actually appear as text on the page.)

    This gives you the outerHTML though. If you really need the innerHTML, you'd have to loop through each of the element's child nodes and pass them to saveXML, then implode them.
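The child-node loop described above could look roughly like the following. This is a minimal sketch, not a built-in: `innerXml` is a hypothetical helper name, and the sample markup (including `a.png` and the table cell) is made up for illustration.

```php
<?php
// Hypothetical helper (not part of DOMDocument): serialise only the children
// of a node, which approximates the browser's innerHTML in XML form.
function innerXml(DOMNode $node): string
{
    $parts = [];
    foreach ($node->childNodes as $child) {
        // saveXML() with a node argument serialises just that subtree.
        $parts[] = $node->ownerDocument->saveXML($child);
    }
    return implode('', $parts);
}

// Illustrative input; the real page would be loaded with loadHTMLFile($url).
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML('<div id="post">Some text <img src="a.png" /> <table><tr><td>cell</td></tr></table></div>');
libxml_use_internal_errors(false);

$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[@id="post"]') as $val) {
    // Prints the text and child markup, without the enclosing <div>.
    echo innerXml($val), "\n";
}
```

Because the pieces come from saveXML, the result is XML-style markup, so an element such as the img is written in self-closing form.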

    Note that this is XML serialisation, not HTML. saveHTML does exist, but before PHP 5.3.6 it could only save the whole document at once; from 5.3.6 onward it also accepts a node argument. If you stay with saveXML and it matters that the output works as legacy HTML, you might be able to get away with passing the LIBXML_NOEMPTYTAG option, so that empty elements such as <script src="..."></script> are not collapsed into self-closing tags that break in the browser.
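To illustrate the LIBXML_NOEMPTYTAG option mentioned above, here is a small sketch that serialises the same element with and without it. The document content (`app.js`) is an assumption chosen only to show the difference.

```php
<?php
// Illustrative document with an empty <script> element.
$dom = new DOMDocument;
$dom->loadXML('<div id="post"><script src="app.js"/></div>');
$div = $dom->documentElement;

// Default: the empty element is written in self-closing form.
$collapsed = $dom->saveXML($div);

// With LIBXML_NOEMPTYTAG: the empty element gets an explicit closing tag.
$expanded = $dom->saveXML($div, LIBXML_NOEMPTYTAG);

echo $collapsed, "\n";
echo $expanded, "\n";
```

One caveat with this approach: the option applies to every empty element, so void elements such as br or img also get explicit closing tags, which browsers may not handle the way you expect.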

    This answer was accepted by the asker as the best answer.