I am using basic web scraping to build a price-comparison database that will make searching easier for users. I have a few questions:
Should I use file_get_contents() or cURL to get the contents of the required web page?
$link = "http://xyz.com";
$res55 = curl_init($link);
curl_setopt($res55, CURLOPT_RETURNTRANSFER, true); // return the response instead of echoing it
curl_setopt($res55, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
$result = curl_exec($res55);
if ($result === false) {
    // handle the failure, e.g. log curl_error($res55)
}
curl_close($res55);
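For comparison, the same fetch can be done with file_get_contents() plus a stream context, which covers the timeout, redirect, and user-agent options the cURL snippet sets. This is a minimal sketch assuming allow_url_fopen is enabled in php.ini; the function name and user-agent string are my own placeholders:

```php
<?php
// Alternative to cURL: file_get_contents() with a stream context.
// Requires allow_url_fopen = On in php.ini.
function fetch_page(string $url, int $timeoutSeconds = 10)
{
    $context = stream_context_create([
        'http' => [
            'method'          => 'GET',
            'timeout'         => $timeoutSeconds,   // give up after this many seconds
            'follow_location' => 1,                 // like CURLOPT_FOLLOWLOCATION
            'user_agent'      => 'MyPriceBot/1.0',  // hypothetical UA string
        ],
    ]);
    // Returns the page body as a string, or false on failure.
    return file_get_contents($url, false, $context);
}
```

In practice cURL gives finer control (connection timeouts, proxies, parallel transfers via curl_multi), while file_get_contents() is shorter for simple GET requests; either works for basic scraping.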
Further, every time I crawl a web page I collect many more links to visit next, so a full crawl may take a long time (days for large sites such as eBay). In that case my PHP script will time out. What is the right way to automate this? Can I prevent PHP from timing out by changing server settings, or is there a better approach?
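One common approach to the timeout problem is to run the crawler from the CLI (where max_execution_time defaults to 0, i.e. no limit), triggered by cron, and to keep the URL queue in persistent storage so each run resumes where the last one stopped. Below is a sketch of such a resumable batch loop; the function names, batch size, and the naive link-extraction regex are illustrative assumptions, not a finished crawler:

```php
<?php
// Sketch of a resumable crawl loop intended to run from the CLI (e.g. via cron).
set_time_limit(0); // no-op on CLI; lifts the limit if run under a web server

function crawl_batch(array &$queue, array &$seen, callable $fetch_html, int $batchSize = 50): array
{
    $fetched = [];
    while ($queue && count($fetched) < $batchSize) {
        $url = array_shift($queue);
        if (isset($seen[$url])) {
            continue; // already visited in an earlier run
        }
        $seen[$url] = true;
        $html = $fetch_html($url);   // caller supplies the HTTP fetch (cURL etc.)
        if ($html === false) {
            continue;                // fetch failed; skip this URL
        }
        $fetched[] = $url;
        // Naive absolute-link extraction; a real crawler would parse the DOM.
        if (preg_match_all('#https?://[^\s"\'<>]+#', $html, $m)) {
            foreach ($m[0] as $link) {
                if (!isset($seen[$link])) {
                    $queue[] = $link; // enqueue unseen links for a later batch
                }
            }
        }
    }
    return $fetched; // URLs successfully processed in this batch
}
```

Between cron runs you would persist $queue and $seen (to a database table or a JSON file), so the crawl survives restarts and no single PHP process has to run for days.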