douqi5079 2018-04-08 13:22
Views: 197
Accepted

php - loadHTML() - every <p> until a certain class

I'm fetching some Wikipedia content in two different ways:

$html = file_get_contents('https://en.wikipedia.org/wiki/Sans-serif');

The first one gets the first paragraph:

$dom = new DomDocument();
@$dom->loadHTML($html);
$p = $dom->getElementsByTagName('p')->item(0)->nodeValue;
echo $p;

The second one gets the first paragraph within the element that has a specific $id:

$dom = new DOMDocument();
@$dom->loadHTML($html);
$p = $dom->getElementById($id)->getElementsByTagName('p')->item(0); // no quotes: '$id' in single quotes is the literal string "$id"
echo $p->nodeValue;

I'm looking for a third way that gets the whole first part of the article. My idea is to select every <p> that comes before the element with the id or class "toc", which is the id/class of the table of contents.

Any idea how to do that?

2 answers

  • doulai2573 2018-04-09 12:15

    You could use DOMDocument and DOMXPath with, for example, an XPath expression like:

    //div[@id="toc"]/preceding-sibling::p

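    $doc = new DOMDocument();
    // loadHTMLFile() uses the HTML parser (load() expects well-formed XML);
    // @ hides the warnings the parser emits for real-world markup.
    @$doc->loadHTMLFile("https://en.wikipedia.org/wiki/Sans-serif");
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[@id="toc"]/preceding-sibling::p');

    foreach ($nodes as $node) {
        echo $node->nodeValue;
    }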

    That would give you the content of the paragraphs that are preceding siblings of the div with id="toc".
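
The preceding-sibling idea can also be tried offline. Below is a minimal sketch, assuming a made-up HTML sample in place of the Wikipedia page; the helper name `leadParagraphs` and the `$stopId` parameter are hypothetical and only for illustration. It matches any element with the given id (not just a div), and it only finds paragraphs that share a parent with that element.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the text of
// every <p> that is a preceding sibling of the element with the given id.
function leadParagraphs(string $html, string $stopId = 'toc'): array
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $stopId is a plain id that contains no quote characters.
    $nodes = $xpath->query('//*[@id="' . $stopId . '"]/preceding-sibling::p');

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Made-up sample standing in for the real page.
$sample = '<html><body><div id="content">'
        . '<p>First intro paragraph.</p>'
        . '<p>Second intro paragraph.</p>'
        . '<div id="toc">Contents</div>'
        . '<p>Body paragraph after the table of contents.</p>'
        . '</div></body></html>';

print_r(leadParagraphs($sample));
```

If the paragraphs are nested at a different depth than the table of contents, the `preceding::p` axis may be a better fit than `preceding-sibling::p`, since it selects every earlier <p> in the document regardless of parent.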
