dongzhenjian5195 2013-09-02 10:41

An underscore is added to links when checking them

I am making a simple link checker to check thousands of direct links to files on a site I currently manage. All files are hosted on archive_org. I made a textarea:

<table width="100%"> <tr><td>URLs to check:</td><td><textarea name="myurl" id="myurl" cols="100" rows="20"></textarea></td></tr> 
<tr><td align="center" colspan="2"><br/><input class="text" type="submit" name="submitBtn" value="Check links"></td></tr> </table>

All links entered in it are stored in an array called $url (each URL is entered on its own line):

$url = explode("\n", $_POST['myurl']);

I printed it using print_r and links inside the array are the same as entered without any character added.

I checked the URLs using two methods, fopen() and the cURL functions. No matter how many links I enter, the program reports every link as broken except the last one. The last link in the array is the only one that is checked correctly.

I then used the get_headers() function and noticed that every link except the last one has an underscore (_) appended to it. The get_headers() code is:

for ($i = 0; $i < count($url); $i++) {
    $headers = @get_headers($url[$i]);
    $headers = is_array($headers) ? implode("\n", $headers) : $headers;
    print_r($headers);
    echo "<br /><br />";
}

In the headers I noticed the links are as such:

HTTP/1.0 302 Moved Temporarily
Server: nginx/1.1.19
Date: Mon, 02 Sep 2013 10:46:40 GMT
Content-Type: text/html; charset=UTF-8
X-Powered-By: PHP/5.3.10-1ubuntu3.2
Accept-Ranges: bytes
Location: http://ia600308.us.archive[dot]org/23/items/historyofthedecl00731gut/1dfre012103.mp3_
X-Cache: MISS from Dataprolinks
X-Cache: MISS from AIMAN-DPL
X-Cache-Lookup: MISS from AIMAN-DPL:3128
Connection: close
HTTP/1.0 404 Not Found
Server: nginx/1.1.19
Date: Mon, 02 Sep 2013 10:46:41 GMT
Content-Type: text/html; charset=UTF-8
X-Powered-By: PHP/5.3.10-1ubuntu3.2
Set-Cookie: PHPSESSID=s2j3ct95vdji0ua89f32grd984; path=/; domain=.archive.org
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
X-Cache: MISS from Dataprolinks
X-Cache: MISS from AIMAN-DPL
X-Cache-Lookup: MISS from AIMAN-DPL:3128
Connection: close

An underscore is added to the link in every response except the one for the last URL. I guess this underscore is what causes the check to fail.

Where am I making mistakes?

1 answer

  • donglu5000 2013-09-02 11:45

    In your case, I guess you POST the URLs from Windows, so when you press the Enter key to separate the links, each line break is sent as "\r\n". Splitting on "\n" leaves a "\r" at the end of every link except the last one. A URL must not contain "\r", so something along the way (PHP? cURL? I am not sure which) converts it into "_".

    <?php
    
    $urls = array();
    $urls[] = 'http://archive.org/download/historyofthedecl00731gut/1dfre011103.mp3';
    $urls[] = 'http://archive.org/download/historyofthedecl00731gut/1dfre000103.txt';
    $urls[] = 'http://archive.org/download/historyofthedecl00731gut/1dfre082103.mp3';
    $urls[] = 'http://archive.org/download/historyofthedecl00731gut/1dfre001103.txt';
    $urls[] = 'http://archive.org/download/historyofthedecl00731gut/1dfre141103.mp3';
    
    print("<pre>" .print_r($urls, 1). "</pre><br /><br />");
    
    foreach($urls as $url){
        // Wrap each URL in "_" so any hidden leading or trailing character shows up
        print("<pre>_" . $url . "_</pre>");
        $headers = @get_headers($url);
        print("<pre>" .print_r($headers, 1). "</pre><br /><br />");
    }
    
    ?>
    

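    You can use my code as a simple test: each link is printed with "_" at both the start and the end, which should confirm my explanation. How to fix: remove the "\r" (and any leftover "\n") from each URL before checking it, for example with trim($url). Note that strip_tags(nl2br($url)) will not do this, because nl2br() only inserts <br /> before each line break and leaves the break itself in place.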
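A minimal sketch of that fix, assuming the textarea content arrives with "\r\n" line breaks. The helper name parse_url_list() and the example.com URLs are hypothetical and not part of the original post. It first shows the stray "\r" left behind by explode("\n", ...), then splits on any line-break style and trims each entry:

```php
<?php
// Hypothetical helper (not from the original post): turn raw textarea
// input into a clean list of URLs, whatever line-break style was sent.
function parse_url_list($raw)
{
    // \R matches "\r\n", "\r" or "\n" as one line break.
    $lines = preg_split('/\R/', $raw);
    // trim() strips surrounding whitespace, including a stray "\r".
    $lines = array_map('trim', $lines);
    // Drop empty entries (e.g. from a trailing line break) and reindex.
    $lines = array_filter($lines, function ($line) {
        return $line !== '';
    });
    return array_values($lines);
}

// Simulated form input with Windows-style line breaks (placeholder URLs).
$raw = "http://example.com/a.mp3\r\nhttp://example.com/b.txt\r\n";

// The approach from the question keeps the "\r" at the end of the line.
$naive = explode("\n", $raw);
echo bin2hex(substr($naive[0], -1)), "\n"; // hex code of the last character

// The cleaned list contains only the URLs themselves.
$urls = parse_url_list($raw);
print_r($urls);
```

In the link checker this would replace the explode() call, for example $url = parse_url_list($_POST['myurl']);.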

    Simple result

    This answer was accepted by the asker.