douwen1213 2014-01-26 10:37

file_get_contents from multiple URLs

I want to save page content to files from multiple URLs.

To start, I have the site URLs in an array:

$site = array( 
        'url' => 'http://onesite.com/index.php?c='.$row['code0'].'&o='.$row['code1'].'&y='.$row['code2'].'&a='.$row['cod3'].'&sid=', 'selector' => 'table.tabel tr'
    );  

To save the files, I have tried:

foreach($site  as $n) {
$referer = 'reffername';


$header[] = "Accept: text/xml,application/xml,application/json,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";

$opts = array('http'=>array('method'=>"GET",
                            'header'=>implode("\r\n",$header)."\r\n".
                            "Referer: $referer\r\n",
                            'user_agent'=> "Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.9.0.2) Gecko/2008092313 Ubuntu/9.25 (jaunty) Firefox/3.8"));
$context = stream_context_create($opts);

$data = file_get_contents($site["url"], false, $context);

$file = md5('$id');

file_put_contents($file, $data);
$content = unserialize(file_get_contents($file));
}

1 answer

  • doujia2090 2014-01-26 11:22

    A basic cURL multi script:

    // Array of URLs to download
    $urls = array();
    
    // cURL multi-handle
    $mh = curl_multi_init();
    
    // Holds one cURL handle per URL
    $requests = array();
    
    $options = array(
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER    => true, 
        CURLOPT_USERAGENT      => 'paste your user agent string here',
        CURLOPT_HEADER         => false,
        CURLOPT_SSL_VERIFYPEER => false, // Disables certificate checks; avoid in production
        CURLOPT_RETURNTRANSFER => true
    );
    
    //Corresponding filestream array for each file
    $fstreams = array();
    
    $folder = 'content/';
    if (!file_exists($folder)){ mkdir($folder, 0777, true); }
    
    foreach ($urls as $key => $url)
    {
        // Add initialized cURL object to array
        $requests[$key] = curl_init($url);
    
        // Set cURL object options
        curl_setopt_array($requests[$key], $options);
    
        // Extract the filename from the URL and build the local path
        $path     = parse_url($url, PHP_URL_PATH);
        $filename = pathinfo($path, PATHINFO_FILENAME).'-'.$key; // Or whatever you want
        $filepath = $folder.$filename;
    
        // Open a filestream for each file and assign it to corresponding cURL object
        $fstreams[$key] = fopen($filepath, 'w');
        curl_setopt($requests[$key], CURLOPT_FILE, $fstreams[$key]);
    
        // Add cURL object to multi-handle
        curl_multi_add_handle($mh, $requests[$key]);
    }
    
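    // Run the transfers until all requests have completed
    do {
        $status = curl_multi_exec($mh, $active);
        // Wait for activity instead of spinning the CPU
        if ($active) {
            curl_multi_select($mh);
        }
    } while ($active && $status === CURLM_OK);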
    
    // Clean up every handle and close its file stream
    foreach ($requests as $key => $request) {

        // If you are not downloading into files, drop CURLOPT_FILE and $fstreams
        // and collect the body here instead:
        //$returned[$key] = curl_multi_getcontent($request);
        curl_multi_remove_handle($mh, $request);
        curl_close($request); // Must come after curl_multi_getcontent()
        fclose($fstreams[$key]);
    }
    
    curl_multi_close($mh);
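
If only a handful of pages are needed, the sequential file_get_contents approach from the question can also be made to work. The following is a minimal sketch, not a definitive implementation. The helper names local_filename() and save_pages(), the '.html' suffix, and the 10-second timeout are all assumptions made for illustration. It names each file after the md5 of the full URL, so URLs that differ only in their query string do not overwrite each other.

```php
<?php
// Hypothetical helper: derive a stable local filename for a URL.
// md5() of the full URL keeps query-string variants apart.
function local_filename(string $url, string $folder = 'content/'): string
{
    return rtrim($folder, '/') . '/' . md5($url) . '.html';
}

// Hypothetical helper: fetch each URL in turn and save the body to disk.
// Returns a map of URL => saved path (or false when the request failed).
function save_pages(array $urls, array $headers = [], string $folder = 'content/'): array
{
    if (!is_dir($folder)) {
        mkdir($folder, 0777, true);
    }

    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => implode("\r\n", $headers),
            'timeout' => 10, // seconds; an assumed value
        ],
    ]);

    $saved = [];
    foreach ($urls as $url) {
        // @ suppresses the warning; failure is reported via the return map
        $data = @file_get_contents($url, false, $context);
        if ($data === false) {
            $saved[$url] = false;
            continue;
        }
        $path = local_filename($url, $folder);
        file_put_contents($path, $data);
        $saved[$url] = $path;
    }
    return $saved;
}
```

Calling save_pages($urls, $header) with the header array from the question would fetch the pages one after another. The cURL multi version above is likely the better fit once the list grows, since it runs the transfers concurrently.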
    