duanniu4106 2012-11-19 21:30
浏览 54
已采纳

如何从谷歌/雅虎图像搜索等获取更多链接(html-simple-dom)

I can get the source code of the search result page. So my question is about how to get MORE. For google, it only shows the first 20 image results in the source code I get, for Yahoo it's about 50. Because in both cases real people need to scroll down the page to see more search result.

Question: Is there anyway the script can do the "scroll down" for me so I can get more results?

The code I'm using:

require_once('simple_html_dom.php');
$url = "https://www.google.com/search?tbm=isch&q=cool+image";
$html = file_get_html($url);
foreach($html->find('img') as $element) {


    $image_url = $element->src; 

    echo $image_url, "<br />";}
  • 写回答

1条回答 默认 最新

  • doucong3048 2012-12-09 06:48
    关注

    I'll answer my own question. - -|||

    Google actually keeps the old version. To use that version, first search something, then scroll to the bottom and click "Switch to basic version".

    Now only 20 images are displayed on each page and the url contains page parameters.

    Because it's displaying 20 images each page, the second page's url has the parameter:

    start=20
    

    and the third page will be

    start=40
    

    This parameter: sout=1 is needed in the url to tell google you want the basic version.

    To conclude, the simplest google image search url with page number would be:

    $url = "https://www.google.com/search?tbm=isch&sout=1&start=" . ($pageNum -1)*20. "&q="  . $key_word ;
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 微信会员卡等级和折扣规则
  • ¥15 微信公众平台自制会员卡可以通过收款码收款码收款进行自动积分吗
  • ¥15 随身WiFi网络灯亮但是没有网络,如何解决?
  • ¥15 gdf格式的脑电数据如何处理matlab
  • ¥20 重新写的代码替换了之后运行hbuliderx就这样了
  • ¥100 监控抖音用户作品更新可以微信公众号提醒
  • ¥15 UE5 如何可以不渲染HDRIBackdrop背景
  • ¥70 2048小游戏毕设项目
  • ¥20 mysql架构,按照姓名分表
  • ¥15 MATLAB实现区间[a,b]上的Gauss-Legendre积分