duanniu4106 2012-11-19 21:30
Accepted

How to get more links from Google/Yahoo image search, etc. (html-simple-dom)

I can fetch the source of the search results page, so my question is how to get MORE results. For Google, the source I get contains only the first 20 image results; for Yahoo it's about 50. In both cases a real user has to scroll down the page to load more results.

Question: Is there any way for the script to do the "scroll down" for me so I can get more results?

The code I'm using:

require_once('simple_html_dom.php');

$url = "https://www.google.com/search?tbm=isch&q=cool+image";
$html = file_get_html($url);

// Print the src attribute of every <img> element on the page.
foreach ($html->find('img') as $element) {
    $image_url = $element->src;
    echo $image_url, "<br />";
}

1 answer

  • doucong3048 2012-12-09 06:48

    I'll answer my own question. - -|||

    Google actually still keeps the old version. To use it, first search for something, then scroll to the bottom and click "Switch to basic version".

    Now only 20 images are displayed per page, and the URL contains a paging parameter.

    Because each page displays 20 images, the second page's URL has the parameter:

    start=20
    

    and the third page will be

    start=40
    

    The parameter sout=1 is also needed in the URL to tell Google you want the basic version.

    To conclude, the simplest Google image search URL with a page number would be:

    // $pageNum starts at 1; the keyword is URL-encoded so spaces and symbols are safe.
    $url = "https://www.google.com/search?tbm=isch&sout=1&start=" . (($pageNum - 1) * 20) . "&q=" . urlencode($key_word);
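
    For anyone who wants to loop over several pages, here is a minimal sketch of the idea above. The helper names build_image_search_url and collect_image_urls are hypothetical (they are not part of simple_html_dom), and the sketch assumes the basic version still responds to sout=1 and start as described, with 20 images per page.

    ```php
    <?php
    // Hypothetical helper: builds the basic-version image search URL for a page.
    // $perPage = 20 matches the page size described in this answer.
    function build_image_search_url(string $keyword, int $pageNum, int $perPage = 20): string
    {
        if ($pageNum < 1) {
            throw new InvalidArgumentException('Page numbers start at 1.');
        }
        // http_build_query URL-encodes the keyword (spaces become "+").
        $query = http_build_query([
            'tbm'   => 'isch',
            'sout'  => 1,
            'start' => ($pageNum - 1) * $perPage,
            'q'     => $keyword,
        ]);
        return 'https://www.google.com/search?' . $query;
    }

    // Hypothetical helper: collects <img> src values from the first $pages pages.
    // Assumes simple_html_dom.php is on the include path.
    function collect_image_urls(string $keyword, int $pages): array
    {
        require_once 'simple_html_dom.php';
        $urls = [];
        for ($page = 1; $page <= $pages; $page++) {
            $html = file_get_html(build_image_search_url($keyword, $page));
            if ($html === false) {
                break; // stop if a page could not be fetched or parsed
            }
            foreach ($html->find('img') as $img) {
                $urls[] = $img->src;
            }
            $html->clear(); // release the parsed DOM before the next page
        }
        return $urls;
    }
    ```

    Calling collect_image_urls('cool image', 3) would then request the pages with start=0, start=20 and start=40 in turn.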
    
    This answer was selected as the best answer by the asker.