dqqt31923 2014-10-29 04:18

DOMXPath / DOMDocument - get divs inside a comment block

Let's say I have this comment block containing HTML:

<html>
<body>

<code class="hidden">
<!-- 
    <div class="a">

        <div class="b">

            <div class="c">
                <a href="link">Link Test 1</a>
            </div>

            <div class="c">
                <a href="link">Link Test 2</a>
            </div>

            <div class="c">
                <a href="link">Link Test 3</a>
            </div>

        </div>

    </div>
-->
</code>

<code>
     <!-- test -->
</code>

</body>
</html>

Using DOMXPath for PHP, how do I get the links and text within the tag?

This is what I have so far:

    $dom = new DOMDocument();
    $dom->loadHTML("HTML STRING"); # not actually in code
    $xpath = new DOMXPath($dom);
    $query = '/html/body/code/comment()';
    $divs = $dom->getElementsByTagName('div')->item(0);

    $entries = $xpath->query($query, $divs);

    foreach($entries as $entry) {

        # shows entire text block
        echo $entry->textContent;

    }

How do I navigate so that I can get the "c" classes and then put the links into an array?

EDIT: Please note that there are multiple <code> tags within the page, so I can't just select the element by its code tag name.

1 answer

  • dongxiezhi0590 2014-10-29 04:26

    You can already target the comment that contains the links. Take the comment's text, load it into a second DOMDocument, and run another query on that. Example:

    $sample_markup = '<html>
    <body>
    
    <code class="hidden">
    <!--
        <div class="a">
    
            <div class="b">
    
                <div class="c">
                    <a href="link">Link Test 1</a>
                </div>
    
                <div class="c">
                    <a href="link">Link Test 2</a>
                </div>
    
                <div class="c">
                    <a href="link">Link Test 3</a>
                </div>
    
            </div>
    
        </div>
    -->
    </code>
    
    </body>
    </html>';
    $dom = new DOMDocument();
    $dom->loadHTML($sample_markup);
    $xpath = new DOMXPath($dom);
    $query = '/html/body/code/comment()';
    $entries = $xpath->query($query);
    foreach ($entries as $comment) {
        // the comment's text is itself HTML, so parse it as a separate document
        $value = $comment->nodeValue;
        $html_comment = new DOMDocument();
        $html_comment->loadHTML($value);
        $xpath_sub = new DOMXPath($html_comment);
        $links = $xpath_sub->query('//div[@class="c"]/a'); // target the links!
        // loop each link, do what you have to do
        foreach ($links as $link) {
            echo $link->getAttribute('href') . '<br/>';
        }
    }
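
The question also asks how to put the links into an array, and notes that the page has several <code> tags. The following is a minimal sketch of one way to do both, not part of the original answer. The function name `extractCommentLinks` and its `$commentQuery` parameter are hypothetical names introduced here for illustration, and the code assumes PHP 7 or later with the DOM extension enabled.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// returns the href and text of every <a> inside <div class="c">
// that is found within the HTML comments matched by $commentQuery.
function extractCommentLinks(string $html, string $commentQuery = '//code/comment()'): array
{
    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $links = [];
    foreach ($xpath->query($commentQuery) as $comment) {
        $fragment = trim($comment->nodeValue);
        if ($fragment === '') {
            continue; // loadHTML() does not accept an empty string
        }

        // The comment text is itself HTML, so parse it as a separate document
        $inner = new DOMDocument();
        $inner->loadHTML($fragment);
        $innerXpath = new DOMXPath($inner);

        foreach ($innerXpath->query('//div[@class="c"]/a') as $a) {
            $links[] = [
                'href' => $a->getAttribute('href'),
                'text' => trim($a->textContent),
            ];
        }
    }

    // Restore the previous libxml error handling
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $links;
}

// Usage with a shortened version of the markup from the question
$html = '<html><body>'
      . '<code class="hidden"><!-- <div class="a"><div class="b">'
      . '<div class="c"><a href="link">Link Test 1</a></div>'
      . '<div class="c"><a href="link">Link Test 2</a></div>'
      . '</div></div> --></code>'
      . '<code><!-- test --></code>'
      . '</body></html>';

print_r(extractCommentLinks($html));
```

To look only at one particular <code> block, a narrower query such as `//code[@class="hidden"]/comment()` can be passed as the second argument. Comments that contain no matching divs, like the `test` comment, simply contribute nothing to the array.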
    