I am building a simple link checker to check thousands of direct file links on a site I currently manage. All files are hosted on archive_org. I made a textarea:
<table width="100%"> <tr><td>URLs to check:</td><td><textarea name="myurl" id="myurl" cols="100" rows="20"></textarea></td></tr>
<tr><td align="center" colspan="2"><br/><input class="text" type="submit" name="submitBtn" value="Check links"></td></tr> </table>
All links entered in it are stored in an array called $url (each URL is entered on its own line):
$url = explode("\n", $_POST['myurl']);
I printed the array with print_r, and the links in it look identical to what I entered, with no extra characters visible.
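print_r does not make whitespace or control characters visible, so a check along the following lines could show whether an element carries a hidden trailing byte. This is only a sketch: the helper name `describe_lines` and the sample input are hypothetical, and the CRLF line ending in the sample is an assumption about what the browser submits from the textarea.

```php
<?php
// Hypothetical helper: describes each piece of a raw textarea value so that
// control characters become visible.
function describe_lines($raw)
{
    $report = array();
    foreach (explode("\n", $raw) as $line) {
        $report[] = array(
            // addcslashes() renders bytes 0-31 as escapes, e.g. a CR as \r
            'escaped' => addcslashes($line, "\0..\37"),
            'length'  => strlen($line),
            // Length after stripping surrounding whitespace, for comparison
            'trimmed' => strlen(trim($line)),
        );
    }
    return $report;
}

// Assumed sample input: two placeholder URLs separated by CRLF.
$raw = "http://example.com/a.mp3\r\nhttp://example.com/b.mp3";

foreach (describe_lines($raw) as $i => $row) {
    echo $i, ': "', $row['escaped'], '" length=', $row['length'],
         ' trimmed=', $row['trimmed'], "\n";
}
```

If 'length' and 'trimmed' differ for an element, that element contains leading or trailing whitespace that print_r would not reveal.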
I checked the URLs using two methods, fopen() and the cURL functions. No matter how many links I enter, the program reports every link as broken except the last one. Only the last link in the array is checked correctly.
I then tried the get_headers() function and noticed that every link (except the last one) has an underscore (_) appended to it. The get_headers() code is:
for ($i = 0; $i < count($url); $i++) {
    $headers = @get_headers($url[$i]);
    $headers = (is_array($headers)) ? implode("\n", $headers) : $headers;
    print_r($headers);
    echo "<br /><br />";
}
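When a URL redirects, get_headers() returns the headers of every response in the chain in a single array, so a checker would need to look at the last status line rather than the first. A minimal sketch of that step, where the helper name `final_status_code` is hypothetical and the sample array is a shortened, hand-written stand-in for real get_headers() output:

```php
<?php
// Hypothetical helper: returns the status code of the last response found in
// a get_headers()-style array, or null if no status line is present.
function final_status_code($headers)
{
    $code = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.0 302 Moved Temporarily"
        if (is_string($line) && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1]; // later status lines overwrite earlier ones
        }
    }
    return $code;
}

// Shortened sample: a redirect followed by the final response.
$sample = array(
    'HTTP/1.0 302 Moved Temporarily',
    'Server: nginx/1.1.19',
    'HTTP/1.0 404 Not Found',
    'Connection: close',
);

echo final_status_code($sample), "\n"; // prints 404
```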
The headers returned for one of the links look like this:
HTTP/1.0 302 Moved Temporarily
Server: nginx/1.1.19
Date: Mon, 02 Sep 2013 10:46:40 GMT
Content-Type: text/html; charset=UTF-8
X-Powered-By: PHP/5.3.10-1ubuntu3.2
Accept-Ranges: bytes
Location: http://ia600308.us.archive[dot]org/23/items/historyofthedecl00731gut/1dfre012103.mp3_
X-Cache: MISS from Dataprolinks
X-Cache: MISS from AIMAN-DPL
X-Cache-Lookup: MISS from AIMAN-DPL:3128
Connection: close
HTTP/1.0 404 Not Found
Server: nginx/1.1.19
Date: Mon, 02 Sep 2013 10:46:41 GMT
Content-Type: text/html; charset=UTF-8
X-Powered-By: PHP/5.3.10-1ubuntu3.2
Set-Cookie: PHPSESSID=s2j3ct95vdji0ua89f32grd984; path=/; domain=.archive.org
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
X-Cache: MISS from Dataprolinks
X-Cache: MISS from AIMAN-DPL
X-Cache-Lookup: MISS from AIMAN-DPL:3128
Connection: close
An underscore is appended to the URL in the Location header for every link except the last one, whose headers contain no underscore. I suspect this underscore is what causes the check to fail.
Where am I making a mistake?