dongyakui8675 2019-03-21 09:37
Accepted

PHP cURL cannot load a page

I am trying to open a specific page (https://www.yellowpages.com.au) from PHP. I have tried simplehtmldom and cURL, with different headers and with a CA certificate bundle added. I can open other pages, just not this one. How is the site blocking my access, and what can I do about it?

$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");


$certificate = "cacert-2019-01-23.pem";
curl_setopt($ch, CURLOPT_CAINFO, $certificate);
curl_setopt($ch, CURLOPT_CAPATH, $certificate);


$data = curl_exec($ch);
curl_close($ch);
echo $data;

Thanks

1 answer

  • dongyouzhui1969 2019-03-21 10:34

    The URL https://www.yellowpages.com.au does not serve a page itself. It redirects you (via an HTTP Location redirect) to another URL (specifically, Location: http://www.yellowpages.com.au/dataprotection?path=/) that actually serves a page. To load that URL with curl, you must tell curl to follow HTTP Location redirects, which is done with CURLOPT_FOLLOWLOCATION.
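
One way to confirm a redirect like this is to fetch the response with redirects left off and look at the status code and the Location header. The following is a minimal sketch, not part of the original answer: the helper names `extract_location` and `inspect_redirect` are hypothetical, and the status code the site returns is not stated here, so the sketch simply reports whatever comes back.

```php
<?php
// Hypothetical helper: pull the Location header out of a raw header block.
function extract_location(string $rawHeaders): ?string
{
    // Header names are case-insensitive; "m" lets ^ match at each line start.
    if (preg_match('/^Location:\s*(\S+)\s*$/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: fetch a URL WITHOUT following redirects and report
// the status code and redirect target (if any).
function inspect_redirect(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, true);          // include headers in the result
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the response instead of printing it
    curl_setopt($ch, CURLOPT_USERAGENT, 'libcurl/' . curl_version()['version']);
    $raw        = curl_exec($ch);
    $status     = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);

    // Only search the header portion, never the body.
    $headers = is_string($raw) ? substr($raw, 0, $headerSize) : '';

    return ['status' => $status, 'location' => extract_location($headers)];
}

// Usage (performs a real network request):
// print_r(inspect_redirect('https://www.yellowpages.com.au'));
```

If `location` comes back non-null, the URL is a redirect and CURLOPT_FOLLOWLOCATION is needed to reach the page behind it.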

    In addition, yellowpages.com.au blocks requests that lack a User-Agent header, and libcurl does not send one by default. You can set it with the CURLOPT_USERAGENT option. Here is a working example:

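    <?php
    $ch = curl_init('https://www.yellowpages.com.au');
    // Follow the HTTP Location redirect to the URL that actually serves a page.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // libcurl sends no User-Agent by default, so set one explicitly.
    curl_setopt($ch, CURLOPT_USERAGENT, 'libcurl/' . (curl_version()['version']) . ' PHP/' . PHP_VERSION);
    // Without CURLOPT_RETURNTRANSFER, curl_exec() prints the response body directly.
    curl_exec($ch);
    curl_close($ch);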
    

    Output:

    $ php foo4.php
    
    <!DOCTYPE html>
    <html lang="en" class="no-js">
    <head>
        <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no"/>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
        <meta http-equiv="X-UA-Compatible" content="IE=edge"/>
    
            <title>Yellow Pages&reg; | Data Protection</title>
    
        <link rel="shortcut icon" href="/favicon.ico?v=2" />
    
        <!--[if (lt IE 9)&!(IEMobile)]><script src="/assets/ie/respond.sensis-9575467dfbc008e5b0d486dc4f481624.js" type="text/javascript" ></script><![endif]-->
        <!--[if (lt IE 10)&!(IEMobile)]><script src="/assets/ie/custom-event-ie9.js" type="text/javascript" ></script><![endif]-->
        <!--[if (lt IE 10)&!(IEMobile)]><link rel="stylesheet" href="/assets/ie/gradient-hacks-ie89-12453d23f1fec3d9d46e56cc6e023576.css"/><![endif]-->
    
            <script src="https://www.google.com/recaptcha/api.js?" async defer></script>
            <meta name="ROBOTS" content="NOINDEX, NOFOLLOW"/>
    
    </head>
    <body style="border-width: 0;
                    background-color: #EDEDED;
                    font-size: 85%;
                    line-height: 1.3;
                    margin: 0;
                    font-family: Helvetica, sans-serif;" id="">
    
    
            <div style="padding: 10px 15px;
                        height: 70px;
                        min-height: 45px;
                        background-color: #ffce00;
                        background-image: linear-gradient(to right, #ffce00, #fedb55, #ffce00);
                        box-shadow: inset 0px -5px 7px -5px rgba(0, 0, 0, 0.35);">
                <div style="position: relative;
                            max-width: 1240px;
                            margin: 0 auto;">
                    <a href="/">
    

    (output truncated)
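
For completeness, here is a hedged sketch of how the redirect and User-Agent settings might be combined with the options from the question. Two details in the question's snippet are worth noting: setting CURLOPT_SSL_VERIFYPEER to false turns off certificate verification, which makes the CA bundle pointless, and CURLOPT_CAPATH expects a directory of certificates rather than a single .pem file (that is what CURLOPT_CAINFO is for). The helper names `build_options` and `fetch_page` and the redirect limit of 5 are assumptions for illustration only.

```php
<?php
// Hypothetical helper: collect all cURL options in one place.
function build_options(string $url, string $caBundle): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,      // follow HTTP Location redirects
        CURLOPT_MAXREDIRS      => 5,         // assumed limit, guards against redirect loops
        CURLOPT_SSL_VERIFYPEER => true,      // keep verification on so the CA bundle is used
        CURLOPT_CAINFO         => $caBundle, // single-file CA bundle (not CURLOPT_CAPATH)
        CURLOPT_USERAGENT      => 'libcurl/' . curl_version()['version'] . ' PHP/' . PHP_VERSION,
    ];
}

// Hypothetical helper: fetch a page and surface cURL errors instead of
// silently echoing an empty result.
function fetch_page(string $url, string $caBundle): string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_options($url, $caBundle));
    $data = curl_exec($ch);
    if ($data === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("cURL request failed: $error");
    }
    curl_close($ch);
    return $data;
}

// Usage (performs a real network request; the bundle path is the one from the question):
// echo fetch_page('https://www.yellowpages.com.au', 'cacert-2019-01-23.pem');
```

Checking the return value of curl_exec() against false and reading curl_error() usually makes it much easier to tell a TLS problem from a redirect or blocking problem.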

    This answer was accepted by the asker as the best answer.
