doulang1945 2016-07-21 23:06

Elements missing from getElementsByTagName

I'm trying to get all the links from this site: https://www.supremecourt.uk/cases/search-results.html?q=affidavit

with the following code:

// Suppress warnings triggered by malformed real-world HTML.
libxml_use_internal_errors(true);

$html = file_get_contents("https://www.supremecourt.uk/cases/search-results.html?q=affidavit");

$docs = new DOMDocument();
$docs->loadHTML($html);

$anchors = $docs->getElementsByTagName('a');

$links = array();

foreach ($anchors as $anchor) {
    // Collect each href and print it on its own line.
    $href = $anchor->getAttribute('href');
    $links[] = $href;
    echo $href, '<br>';
}

but the returned links do not include links from the search results. Why is that, and how can I fix it?

1 answer

  • duangu9666 2016-07-21 23:17
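    Search results on this site are provided by Google CSE via a JSONP request. They are inserted into the page by JavaScript after it loads, so they are not part of the HTML that file_get_contents() returns, and DOMDocument never sees them. The request to Google carries a signature, so reproducing it from PHP is certainly not easy (I'm not sure it is impossible, as I have never tried to "break" CSE). Most likely the results can only be obtained with a headless browser that executes the page's JavaScript (PhantomJS, CasperJS, Selenium).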
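To see the effect without touching the real site, here is a minimal offline sketch. The markup, the URLs, and the extractHrefs() helper are all made up for illustration; the only assumption is that the real page behaves similarly, with its result links created by a script rather than written into the HTML.

```php
<?php
// Hypothetical page: one static link plus a script that, in a browser,
// would add a second link. Markup and URLs are illustrative only.
$html = <<<'HTML'
<!DOCTYPE html>
<html><body>
<a href="/static-link.html">Static link</a>
<div id="results"></div>
<script>
  var a = document.createElement("a");
  a.href = "/injected-by-js.html";
  document.getElementById("results").appendChild(a);
</script>
</body></html>
HTML;

// Hypothetical helper: returns the href of every <a> element that exists
// in the raw markup. DOMDocument parses the HTML but never runs scripts.
function extractHrefs(string $html): array
{
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }
    return $hrefs;
}

$links = extractHrefs($html);
print_r($links); // only "/static-link.html" is found
```

The link created by the script never shows up because nothing executes the script. A headless browser runs it first and only then exposes the resulting DOM.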

    Accepted by the asker as the best answer.
