The URL https://www.yellowpages.com.au does not serve a page itself. It just redirects you (via an HTTP Location redirect) to another URL (specifically, Location: http://www.yellowpages.com.au/dataprotection?path=/) that actually serves a page. To load that URL with curl, you must tell curl to follow HTTP Location redirects, which can be done with CURLOPT_FOLLOWLOCATION.

In addition, yellowpages.com.au blocks requests that lack a User-Agent header, and libcurl sends no User-Agent header by default. You can set one with the CURLOPT_USERAGENT option. Here is a working example:
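<?php
$ch = curl_init('https://www.yellowpages.com.au');
// Follow HTTP Location redirects instead of stopping at the 3xx response.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// libcurl sends no User-Agent by default, and the site rejects requests without one.
curl_setopt($ch, CURLOPT_USERAGENT, 'libcurl/' . curl_version()['version'] . ' PHP/' . PHP_VERSION);
// Without CURLOPT_RETURNTRANSFER, curl_exec() prints the response body directly.
curl_exec($ch);
curl_close($ch);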
Output:
$ php foo4.php
<!DOCTYPE html>
<html lang="en" class="no-js">
<head>
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no"/>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=edge"/>
<title>Yellow Pages® | Data Protection</title>
<link rel="shortcut icon" href="/favicon.ico?v=2" />
<!--[if (lt IE 9)&!(IEMobile)]><script src="/assets/ie/respond.sensis-9575467dfbc008e5b0d486dc4f481624.js" type="text/javascript" ></script><![endif]-->
<!--[if (lt IE 10)&!(IEMobile)]><script src="/assets/ie/custom-event-ie9.js" type="text/javascript" ></script><![endif]-->
<!--[if (lt IE 10)&!(IEMobile)]><link rel="stylesheet" href="/assets/ie/gradient-hacks-ie89-12453d23f1fec3d9d46e56cc6e023576.css"/><![endif]-->
<script src="https://www.google.com/recaptcha/api.js?" async defer></script>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW"/>
</head>
<body style="border-width: 0;
background-color: #EDEDED;
font-size: 85%;
line-height: 1.3;
margin: 0;
font-family: Helvetica, sans-serif;" id="">
<div style="padding: 10px 15px;
height: 70px;
min-height: 45px;
background-color: #ffce00;
background-image: linear-gradient(to right, #ffce00, #fedb55, #ffce00);
box-shadow: inset 0px -5px 7px -5px rgba(0, 0, 0, 0.35);">
<div style="position: relative;
max-width: 1240px;
margin: 0 auto;">
<a href="/">
(output truncated)
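
If you want the page as a string rather than printed straight to stdout, the same two options can be combined with CURLOPT_RETURNTRANSFER. The following is only a sketch under some assumptions: the function name fetch_following_redirects, the redirect limit of 5, and the 10-second connect timeout are all hypothetical choices of mine, not something the site or libcurl requires.

```php
<?php
// Hypothetical helper: fetch a URL, follow Location redirects, and return
// the body plus some metadata instead of printing the response.
function fetch_following_redirects(string $url, int $maxRedirects = 5): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        // Follow HTTP Location redirects, up to an assumed limit.
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => $maxRedirects,
        // Return the body from curl_exec() instead of echoing it.
        CURLOPT_RETURNTRANSFER => true,
        // libcurl sends no User-Agent unless one is set explicitly.
        CURLOPT_USERAGENT      => 'libcurl/' . curl_version()['version'] . ' PHP/' . PHP_VERSION,
        // Assumed timeout so a dead host does not hang the script.
        CURLOPT_CONNECTTIMEOUT => 10,
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on transport errors (DNS, refused connection, ...).
        throw new RuntimeException('curl request failed: ' . curl_error($ch));
    }

    // The handle is released automatically when $ch goes out of scope.
    return [
        'status'    => curl_getinfo($ch, CURLINFO_RESPONSE_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'redirects' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),
        'body'      => $body,
    ];
}
```

Calling fetch_following_redirects('https://www.yellowpages.com.au') and inspecting the 'final_url' and 'redirects' entries should show where the redirect chain ended, which is handy for confirming that the redirect was actually followed.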