dongqing314511
2017-05-20 12:21

How to find an h3 tag with a specific value

Accepted

I have an HTML file with the following structure:

<h3>Heading 1</h3>
  <table>
   <!-- contains a <thead> and a <tbody>, which in turn contain several columns/rows -->
  </table>
<h3>Heading 2</h3>
  <table>
   <!-- contains a <thead> and a <tbody>, which in turn contain several columns/rows -->
  </table>

I want to get just the first table with all its content, so I load the HTML file:

<?php 
  $dom = new DOMDocument();
  libxml_use_internal_errors(true);
  $dom->loadHTML(file_get_contents('http://www.example.com'));
  libxml_clear_errors();
?>

All tables share the same classes and have no specific IDs, so the only approach I could think of is to grab the h3 tag with the value "Heading 1". I already found an approach that works for me, but because other tables and headings could be added later, it is not a good long-term solution.
How can I grab the h3 tag with the value "Heading 1", and how can I then select the table that follows it?

EDIT#1: I don't have access to the HTML file, so I can't edit it.
EDIT#2: My solution for now (thanks to Martin Henriksen) is:

<?php
    $doc = new DOMDocument('1.0');
    libxml_use_internal_errors(true); // suppress warnings from malformed HTML
    $doc->loadHTML(file_get_contents('http://example.com'));
    libxml_clear_errors();
    foreach ($doc->getElementsByTagName('h3') as $element) {
      // Only process the heading whose text matches.
      if ($element->nodeValue == 'exampleString') {
        // The first sibling is the whitespace text node after </h3>; the second is the <table>.
        $table = $element->nextSibling->nextSibling;
        $innerHTML = '';
        // Serialize each child of the table (<thead>, <tbody>, ...).
        foreach ($table->childNodes as $child) {
          $innerHTML .= $child->ownerDocument->saveXML($child);
        }
        echo $innerHTML;
        file_put_contents("test.xml", $innerHTML);
      }
    }
  ?>
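
The nextSibling->nextSibling chain above relies on exactly one text node sitting between the heading and the table. As a possible alternative, here is a minimal sketch that uses DOMXPath to select the first table following the matching h3. The inline sample markup and the variable names are assumptions for illustration; the real page would be loaded from its URL as in the code above.

```php
<?php
// Hypothetical sample markup standing in for the remote page.
$html = <<<HTML
<h3>Heading 1</h3>
  <table><tbody><tr><td>first</td></tr></tbody></table>
<h3>Heading 2</h3>
  <table><tbody><tr><td>second</td></tr></tbody></table>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from malformed HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// First <table> sibling that follows the <h3> whose trimmed text is "Heading 1".
$query  = '//h3[normalize-space(.) = "Heading 1"]/following-sibling::table[1]';
$tables = $xpath->query($query);
$table  = $tables->item(0); // null when nothing matches

// Serialize the whole table element, including its own tag.
$tableHtml = $table !== null ? $doc->saveHTML($table) : '';
echo $tableHtml;
```

Because the query matches on the heading text rather than on position, it should keep working if further headings or tables are added to the page, as long as the heading text itself stays the same.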

2 answers

  • doupo5178 4 years ago

    You can fetch all DOMElements with the tag h3 and check the value each one holds by accessing nodeValue. Once you have found the right h3 tag, you can select the next node in the DOM tree via nextSibling.

    foreach($dom->getElementsByTagName('h3') as $element)
    {
        if($element->nodeValue == 'Heading 1')
            $table = $element->nextSibling;
    }
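
Note that nextSibling returns the next node of any kind, so when whitespace separates the tags it may be a text node rather than the table. A minimal sketch of one way to handle that is to walk the siblings until an element with the wanted tag name turns up. The helper findFollowingElement and the inline sample markup are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns the first
// following sibling that is an element with the given tag name, or null.
function findFollowingElement(DOMNode $node, string $tagName): ?DOMElement
{
    for ($sibling = $node->nextSibling; $sibling !== null; $sibling = $sibling->nextSibling) {
        // Skip text nodes, comments, and elements with a different tag name.
        if ($sibling instanceof DOMElement && strtolower($sibling->nodeName) === $tagName) {
            return $sibling;
        }
    }
    return null;
}

// Sample markup assumed for illustration; the real page is loaded from a URL.
$dom = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from malformed HTML
$dom->loadHTML("<h3>Heading 1</h3>\n  <table><tr><td>first</td></tr></table>\n"
             . "<h3>Heading 2</h3>\n  <table><tr><td>second</td></tr></table>");
libxml_clear_errors();

$table = null;
foreach ($dom->getElementsByTagName('h3') as $element) {
    // trim() tolerates stray whitespace around the heading text.
    if (trim($element->nodeValue) === 'Heading 1') {
        $table = findFollowingElement($element, 'table');
        break; // stop at the first matching heading
    }
}

$tableHtml = $table !== null ? $dom->saveHTML($table) : '';
echo $tableHtml;
```

If no matching heading or no following table exists, $table stays null and nothing is printed.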
    
  • douzhushen_9776 4 years ago

    You can find any tag in HTML using the simple_html_dom.php class. You can download this file from https://sourceforge.net/projects/simplehtmldom/?source=typ_redirect

    Then:

    <?php
    include_once('simple_html_dom.php');
    
    $htm  = "**YOUR HTML CODE**";
    $html = str_get_html($htm);
    // find() takes a CSS-style selector; index 0 returns the first matching element.
    $h3_tag = $html->find("h3", 0)->innertext;
    echo "HTML code in h3 tag: ";
    print_r($h3_tag);
    ?>
    