dtdb99743 2011-12-09 09:42
Accepted

Get a list of child nodes from an HTML tag using PHP

I am currently using the PHP DOM to get the BODY tag from HTML.

$doc = new DOMDocument();
$doc->loadHTML($HTML);    
$body = preg_replace("/.*<body[^>]*>|<\/body>.*/si", "", $HTML);

The code above gives me the complete HTML inside the body tag for a given HTML document.

Can I get the HTML tags inside $body as an array?


1 answer

  • dsznndq4912405 2011-12-09 10:12

    If possible, I would use DOM - it will make your solution a lot more reliable and cleaner to use.

    This should get you headed in the right direction (I'm not writing the solution for you, sorry):

    $html = file_get_contents("http://google.com");
    $dom = new DOMDocument();
    // Suppress the warnings that malformed real-world HTML triggers
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // Select every element in the document
    $elements = $xpath->query("//*");

    foreach ($elements as $element) {
        echo "<h1>" . $element->nodeName . "</h1>";

        // List each direct child node with its text content
        foreach ($element->childNodes as $node) {
            echo "<h2>" . $node->nodeName . "</h2>";
            echo $node->nodeValue . "\n";
        }
    }
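
To get closer to the array the question asks for, the same DOM approach can be narrowed to the direct children of the body tag. What follows is only a minimal sketch: the function name `bodyChildElements`, the `tag`/`html` array keys, and the sample markup are made up for illustration and are not part of the original answer. It also assumes PHP 5.3.6 or later, since it passes a node to `saveHTML()`.

```php
<?php
// Hypothetical helper (not from the original answer): returns the direct
// child elements of <body> as an array of ['tag' => ..., 'html' => ...].
function bodyChildElements(string $html): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return [];
    }

    $children = [];
    foreach ($body->childNodes as $node) {
        // Skip text nodes, comments, and so on; keep element nodes only.
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        $children[] = [
            'tag'  => $node->nodeName,
            'html' => $dom->saveHTML($node), // outer HTML of this child
        ];
    }

    return $children;
}

// Sample input, assumed for illustration only.
$sample = '<html><body><h1>Title</h1><p>Some <b>bold</b> text</p></body></html>';

foreach (bodyChildElements($sample) as $child) {
    echo $child['tag'], ': ', $child['html'], "\n";
}
```

Filtering on `XML_ELEMENT_NODE` is a design choice: without it, whitespace between tags shows up in the array as `#text` entries. If nested tags are wanted as well, the function could be called recursively on each child, or the XPath query `//body//*` could be used instead.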
    
    This answer was selected as the best answer by the asker.