duanqiaoren9975 2017-04-22 23:53
Accepted

PHP: extract the first link from source code

I'm trying to extract the first occurrence of a link that starts like this:

https://encrypted-tbn3.gstatic.com/images?...

from the source code of a page. The link starts and ends with a double quote ("). Here is what I've got so far:

$search_query = $array[0]['Name'];
$search_query = urlencode($search_query);
$context = stream_context_create(array('http' => array('header' => 'User-Agent: Mozilla compatible')));
$response = file_get_contents( "https://www.google.com/search?q=$search_query&tbm=isch", false, $context);
$html = str_get_html($response);
$url = explode('"',strstr($html, 'https://encrypted-tbn3.gstatic.com/images?'[0]))

However, the output of $url is not the link I am trying to extract, but something very different. I have added an image.

Could anyone explain the output to me and how I would get the desired link? Thanks


2 answers

  • doujia1871 2017-04-23 00:14

    It seems that you're using PHP Simple HTML DOM Parser.
    I normally use DOMDocument, which is one of PHP's built-in classes.
    Here's a working example of what you need:

    $search_query = $array[0]['Name'];
    $search_query = urlencode($search_query);
    $context = stream_context_create(array('http' => array('header' => 'User-Agent: Mozilla compatible')));
    $response = file_get_contents( "https://www.google.com/search?q=$search_query&tbm=isch", false, $context);
    
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($response);
    
    foreach ($dom->getElementsByTagName('img') as $item) {
        $img_src =  $item->getAttribute('src');
        if (strpos($img_src, 'https://encrypted') !== false) {
            print $img_src."\n";
        }
    }
    

    Output:

    https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcSumjp6e37O_86nc36mlktuWpbFuCI4nkkkocoBCYW3qCOicqdu_KEK-MY
    https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcR_ttK8NlBgui_JndBj349UxZx0kHn0Z-Essswci-_5UQCmUOruY1PNl3M
    https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcSydaTpSDw2mvU2JRBGEYUOstTUl4R1VhRevv1Sdinf0fxRvU26l3pTuqo
    ...
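
As for the output in the question: the `[0]` there is applied to the string literal, so `'https://encrypted-tbn3.gstatic.com/images?'[0]` evaluates to the single character `h`. `strstr()` then returns everything from the first `h` in the page onward, and `explode()` splits that into an array of pieces, which is most likely why `$url` looks nothing like the link. Since only the first occurrence is wanted, the loop above can also stop at the first match. The following is a minimal sketch, not a definitive implementation: the helper name `first_img_src_with_prefix` and the sample markup are made up for illustration, and it assumes the link appears in the `src` attribute of an `img` tag, as in the output above.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the src of the
// first <img> whose src starts with $prefix, or null if no such image exists.
function first_img_src_with_prefix(string $html, string $prefix): ?string
{
    if ($html === '') {
        return null;                      // DOMDocument::loadHTML() rejects empty input
    }

    libxml_use_internal_errors(true);     // keep warnings about malformed HTML quiet
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if (strpos($src, $prefix) === 0) { // prefix match at position 0
            return $src;                   // stop at the first match
        }
    }
    return null;
}

// Made-up sample markup standing in for the downloaded page ($response).
$sample = '<html><body>'
    . '<img src="/logo.png">'
    . '<img src="https://encrypted-tbn3.gstatic.com/images?q=tbn:EXAMPLE1">'
    . '<img src="https://encrypted-tbn2.gstatic.com/images?q=tbn:EXAMPLE2">'
    . '</body></html>';

$first = first_img_src_with_prefix($sample, 'https://encrypted');
print $first . "\n";
```

With the real page, `$response` would be passed in place of `$sample`. The prefix is kept as `https://encrypted` rather than `https://encrypted-tbn3` because the host number varies between results, as the output above shows.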
    
    This answer was selected by the asker as the best answer.