doulang1945 2016-07-21 23:06
Views: 35
Answer accepted

Elements missing from getElementsByTagName results

I'm trying to get all the links from this site: https://www.supremecourt.uk/cases/search-results.html?q=affidavit

with the following code:

// Suppress libxml warnings caused by malformed real-world HTML
libxml_use_internal_errors(true);

$html = file_get_contents("https://www.supremecourt.uk/cases/search-results.html?q=affidavit");

$docs = new DOMDocument();
$docs->loadHTML($html);

// Collect the href of every <a> element found in the downloaded markup
$anchors = $docs->getElementsByTagName('a');
$links = array();

foreach ($anchors as $anchor) {
    $href = $anchor->getAttribute('href');
    $links[] = $href;
    echo $href, '<br>';
}

but the returned links do not include links from the search results. Why is that, and how can I fix it?

1 answer

  • duangu9666 2016-07-21 23:17

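    The search results on this site are provided by Google CSE (Custom Search Engine) and are loaded through a JSONP request, so they are not part of the HTML that file_get_contents() downloads. That is why getElementsByTagName('a') never sees them. The request to Google includes a signature, so reproducing it from PHP would not be easy (I'm not sure it is impossible, as I have never tried to "break" CSE). The results probably cannot be obtained from PHP alone, or by any other approach that does not involve a headless browser or browser-automation tool able to run the page's JavaScript (PhantomJS, CasperJS, Selenium).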
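
A minimal sketch of why the parser comes up short, assuming the server sends only an empty results container plus a script. The markup in `$served` is made up for illustration and is not the real page source, and `extractLinks` is a hypothetical helper wrapping the same DOMDocument calls as in the question.

```php
<?php
// Hypothetical helper: return the href of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Assumed, simplified stand-in for the HTML the server returns.
// The results container is empty; a browser would fill it later by
// running the script, but PHP never executes JavaScript.
$served = <<<'HTML'
<html><body>
  <a href="/index.html">Home</a>
  <div id="results"></div>
  <script>
    /* a JSONP callback would add the result links to #results here */
  </script>
</body></html>
HTML;

$links = extractLinks($served);
print_r($links);
```

Only the link that is present in the static markup is returned. Anything the script would add in a browser never reaches DOMDocument, which matches the behaviour described in the question.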

    This answer was selected as the best answer by the asker.
