douxiajia6720 2009-11-02 08:44
Accepted

Serving any web page on the internet from a PHP file

How can I create a simple PHP file that retrieves the HTML and the headers of any web page on the internet, rewrites image/resource URLs to their full URLs (for example, image.gif to http://www.google.com/image.gif), and then sends the result back as its response?


  • duanlushen8940 2009-11-02 11:43

    First, fetch the headers with PHP's get_headers function. Passing true as the second argument returns them as an associative array keyed by header name.

    <?php
    
    $url = "http://www.example.com/";
    $headers = get_headers($url, true);
    
    ?>
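
Not every header from the remote page should be sent back unchanged. Content-Length, for instance, no longer matches once the HTML has been rewritten. The sketch below filters the array that get_headers($url, true) returns down to an allow-list. The function name forwardable_headers and the allow-list itself are assumptions made for illustration, not part of the original answer, and the sample array is hand-built so the code runs without network access.

```php
<?php

/**
 * Pick the headers worth forwarding from a get_headers($url, true) result.
 * $allowed is a hypothetical allow-list chosen for this sketch.
 */
function forwardable_headers(
    array $headers,
    array $allowed = ['content-type', 'last-modified', 'etag']
): array {
    $lines = [];
    foreach ($headers as $name => $value) {
        // Integer keys hold status lines such as "HTTP/1.1 200 OK".
        if (is_int($name) || !in_array(strtolower($name), $allowed, true)) {
            continue;
        }
        // A header received more than once arrives as an array
        // (for example after a redirect); keep the last value.
        if (is_array($value)) {
            $value = end($value);
        }
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Hand-built sample in the shape get_headers($url, true) returns.
$sample = [
    0 => 'HTTP/1.1 200 OK',
    'Content-Type' => 'text/html; charset=UTF-8',
    'Set-Cookie' => ['a=1', 'b=2'],
    'Content-Length' => '1256',
];

$lines = forwardable_headers($sample);
```

In the proxy script, each entry of $lines would be passed to header() before any output is printed.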
    

    Then read the content of the page into a variable.

    <?php
    
    // Opening a URL with fopen() requires allow_url_fopen to be enabled.
    $handle = fopen($url, 'r');
    $text = '';
    while (!feof($handle)) {
        $text .= fread($handle, 8192);
    }
    fclose($handle);
    
    ?>
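
The read loop above assumes that fopen succeeded. A slightly more defensive version might look like the following sketch. The helper name read_all is hypothetical, and an in-memory stream stands in for the remote URL so the example runs offline.

```php
<?php

/**
 * Read everything from an open stream, or return null on failure.
 */
function read_all($handle, int $chunkSize = 8192): ?string
{
    // fopen() returns false when the URL cannot be opened.
    if (!is_resource($handle)) {
        return null;
    }
    $text = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            return null;
        }
        $text .= $chunk;
    }
    return $text;
}

// php://memory stands in for fopen($url, 'r') in this sketch.
$handle = fopen('php://memory', 'w+');
fwrite($handle, '<html><body>Hello</body></html>');
rewind($handle);

$text = read_all($handle);
fclose($handle);
```

If chunked reading is not needed, file_get_contents($url) reads the whole page into a string in a single call, subject to the same allow_url_fopen setting.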
    

    Next, scan the content for resource references and prepend the URL to any that are not already absolute. The following regex handles src attributes (e.g. images and JavaScript) and should give you a starting point for other resources, such as stylesheets, which are referenced with href="". The pattern's character class does not include a colon, so it skips any value that contains one; a colon is a good indicator that the value contains http:// and is therefore already absolute. PLEASE NOTE: this is by no means perfect and won't account for every weird and wonderful resource location, but it's a good start.

    <?php
    
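    // The quantifier sits inside the group so the whole path is captured.
    $pattern = '@src="([0-9A-Za-z_/.-]+)"@';
    preg_match_all($pattern, $text, $matches);
    
    // Replace each distinct relative src once, avoiding a double slash
    // when the path already starts with "/".
    foreach (array_unique($matches[1]) as $src) {
        $absolute = rtrim($url, '/') . '/' . ltrim($src, '/');
        $text = str_replace('src="' . $src . '"', 'src="' . $absolute . '"', $text);
    }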
    
    print($text);
    
    ?>
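
As a possible next step, the same idea can be extended to href attributes and to pages that do not sit at the site root. The sketch below is one way to do it under simplifying assumptions: attributes are double-quoted, "../" segments are not normalised, and the helper names absolutize and rewrite_resources are made up for this example. Note that it also rewrites ordinary links and any attribute whose name ends in src, such as data-src.

```php
<?php

/**
 * Turn a relative reference into an absolute URL against $pageUrl.
 * Simplified: "../" segments and query-only references are not normalised.
 */
function absolutize(string $ref, string $pageUrl): string
{
    // Empty, already absolute (has a scheme), protocol-relative,
    // or fragment-only references are left alone.
    if ($ref === '' || preg_match('@^([a-z][a-z0-9+.-]*:|//|#)@i', $ref)) {
        return $ref;
    }
    $parts = parse_url($pageUrl);
    $origin = $parts['scheme'] . '://' . $parts['host']
        . (isset($parts['port']) ? ':' . $parts['port'] : '');
    if ($ref[0] === '/') {
        return $origin . $ref; // root-relative path
    }
    // Otherwise resolve against the directory of the page.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $ref;
}

function rewrite_resources(string $html, string $pageUrl): string
{
    return preg_replace_callback(
        '@\b(src|href)="([^"]*)"@i',
        function (array $m) use ($pageUrl): string {
            return $m[1] . '="' . absolutize($m[2], $pageUrl) . '"';
        },
        $html
    );
}

$html = '<img src="image.gif"><script src="/js/app.js"></script>'
      . '<link href="css/site.css"><a href="http://other.example/x">x</a>';
$out = rewrite_resources($html, 'http://www.example.com/docs/index.html');
```

Here image.gif and css/site.css resolve under http://www.example.com/docs/, /js/app.js resolves against the site root, and the already absolute link is left untouched. For real-world HTML, a DOM parser is likely to be more reliable than a regex.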
    