dongshou1856 2016-07-16 22:50

How to get the <a> tags in <body>, excluding the header and footer

If I have a webpage like this:

<body>
  <header>
    <a href='http://domain1.com'>link 1 text</a>
  </header>

  <a href='http://domain2.com'>link 2 text</a>

  <footer>
    <a href='http://domain3.com'>link 3 text</a>
  </footer>
</body>

How do I pull the <a> tags out of the <body> but exclude the links from <header> and <footer>?

In the real web page, there will be a lot of <a> tags in the <header> so I'd rather not have to cycle through ALL of them.

I want to pull out the URLs and anchor text from each of the <a> tags that are NOT inside the <header> or <footer> tags.

EDIT: this is how I find links in the header:

$header = $html->find('header', 0);
foreach ($header->find('a') as $a) {
  // do something with $a
}

I would like to do something like this (note the use of "!" to mean "not"):

$foo = $html->find('!header,!footer');
foreach ($foo->find('a') as $a) {
  // do something with $a
}

3 answers

  • duanli9591 2016-07-16 23:07

    Remove the header and footer from the DOM you are working with before looking for the links.

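    <?php
        include("simple_html_dom.php");

        // Sample page with links inside and outside <header>/<footer>.
        $source = <<<EOD
        <body>
            <header>
                <a href='http://domain1.com'>link 1 text</a>
            </header>

            <a href='http://domain2.com'>link 2 text</a>

            <a href='http://domain4.com'>link 4 text</a>

            <footer>
                <a href='http://domain3.com'>link 3 text</a>
            </footer>
        </body>
    EOD;

        $html = str_get_html($source);

        // Blank out the <header> and <footer> subtrees.
        foreach ($html->find('header, footer') as $unwanted) {
            $unwanted->outertext = "";
        }

        // Re-parse the modified markup so the blanked-out nodes are really gone.
        $html->load($html->save());

        // Only the links outside <header> and <footer> remain.
        foreach ($html->find("a") as $link) {
            print $link->href . " => " . $link->plaintext . "\n";
        }
    ?>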
    
    Selected by the asker as the best answer.
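
If modifying and re-parsing the document is undesirable, a different technique may also work: PHP's bundled DOM extension with an XPath query that skips any <a> having a <header> or <footer> ancestor. This is only a sketch and does not use simple_html_dom. The function name linksOutsideHeaderFooter and the array keys 'url' and 'text' are made up for illustration, and the sample markup is the one from the answer above.

```php
<?php
// Hypothetical helper: returns [['url' => ..., 'text' => ...], ...] for every
// <a href> that is not nested inside a <header> or <footer> element.
function linksOutsideHeaderFooter(string $source): array
{
    // libxml's HTML parser may warn about HTML5 tags such as <header>;
    // collect those warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // The ancestor axis does the exclusion, so no nodes have to be removed.
    $query = '//a[@href][not(ancestor::header) and not(ancestor::footer)]';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = [
            'url'  => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}

$source = <<<EOD
<body>
  <header>
    <a href='http://domain1.com'>link 1 text</a>
  </header>
  <a href='http://domain2.com'>link 2 text</a>
  <a href='http://domain4.com'>link 4 text</a>
  <footer>
    <a href='http://domain3.com'>link 3 text</a>
  </footer>
</body>
EOD;

foreach (linksOutsideHeaderFooter($source) as $link) {
    echo $link['url'], ' => ', $link['text'], PHP_EOL;
}
```

With this approach the links in the header are never iterated in user code; the XPath engine filters them out, which seems to match the wish not to cycle through all of them by hand.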