dqwh26750 2018-12-18 19:06

How do I loop through a CSV file of websites and use curl to test whether they are online?

This PHP script runs as a cron job. It finds a CSV file on my server and loops through the URLs in it, using curl to check whether each site loads over https, loads over http, or is offline. I suspect the curl requests are taking too much time. The same check completes when I trigger it with a POST request through AJAX, but I need it to run from a cron job against a CSV file. Are there any other possible solutions?

Can you find a reason why it doesn't complete the task?

Any help would be great.

function url_test($url){

  $timeout = 20;
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_HEADER, true);          // we want headers
  curl_setopt($ch, CURLOPT_NOBODY, true);          // we don't need body (sends a HEAD request)
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the response instead of printing it
  curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);     // upper bound for the whole request, in seconds
  curl_exec($ch);
  $http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);  // integer; 0 when no response was received
  curl_close($ch);  // release the handle here, where $ch is in scope

  // treat 200 (OK) and 301 (Moved Permanently) as "online"
  return $http_code === 200 || $http_code === 301;

}

// run on each url

$offline = 0;
$fullcount = 0;

// open one database connection up front instead of one per offline site
try {
  $conn = new PDO("conn info here");

  // set the PDO error mode to exception
  $conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

  // bound parameters keep the url and csv values out of the SQL text
  $insert = $conn->prepare("INSERT INTO table (url,csv,related) VALUES (?, ?, 1)");
} catch (PDOException $e) {
  exit("Connection failed: " . $e->getMessage());
}

if (($handle = fopen("/pathtocsv/".$csv, "r")) !== FALSE) {
  while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
    foreach ($data as $site) {

      // strip all whitespace and lowercase the entry; skip empty cells
      $host = strtolower(preg_replace('/\s+/', '', (string) $site));
      if ($host === '') {
        continue;
      }

      $https = "https://".$host;
      $http = "http://".$host;

      if (url_test($https)) {
        $fullcount++;
        echo $https. " <br>";
      } else if (url_test($http)) {
        $fullcount++;
        echo $http. " <br>";
      } else {
        // neither scheme answered with 200 or 301: record the site as offline
        $offline++;
        echo $site. " <br>";

        try {
          $insert->execute([$site, $csv]);
          echo "New record created successfully";
        } catch (PDOException $e) {
          echo "Insert failed: " . $e->getMessage();
        }
      }
    }
  }
  fclose($handle);
}

1 answer

  • dongzhi4239 2018-12-18 19:19

    You can use the get_headers() function (see its entry in the PHP manual).

    It will return a response similar to:

    Array
    (
        [0] => HTTP/1.1 200 OK
        [1] => Date: Sat, 29 May 2004 12:28:13 GMT
        [2] => Server: Apache/1.3.27 (Unix)  (Red-Hat/Linux)
        [3] => Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
        [4] => ETag: "3f80f-1b6-3e1cb03b"
        [5] => Accept-Ranges: bytes
        [6] => Content-Length: 438
        [7] => Connection: close
        [8] => Content-Type: text/html
    )
    

    which you can use to validate as needed.
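As a rough illustration of that validation step, the sketch below reads the status code from the first header line and accepts the same codes as the question's url_test() (200 and 301). The helper names status_from_headers() and url_is_online() and the 10-second default timeout are assumptions made for this example, not part of the original script. It also assumes PHP 7.1 or later, where get_headers() accepts a stream context.

```php
<?php
// Hypothetical helper: pull the numeric status code out of the first
// header line, e.g. "HTTP/1.1 200 OK" gives 200. Returns null if the
// line is missing or does not look like an HTTP status line.
function status_from_headers(array $headers): ?int
{
    if (!isset($headers[0]) || !is_string($headers[0])) {
        return null;
    }
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $headers[0], $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: true when the URL answers with 200 or 301.
// The timeout default is an assumption; tune it for your list size.
function url_is_online(string $url, int $timeout = 10): bool
{
    $context = stream_context_create([
        'http' => [
            'method'          => 'HEAD',   // headers only, like CURLOPT_NOBODY
            'timeout'         => $timeout, // seconds
            'follow_location' => 0,        // keep a 301 visible instead of following it
            'ignore_errors'   => true,     // still return headers on 4xx/5xx
        ],
    ]);

    // get_headers() returns false when the request fails outright;
    // @ suppresses the accompanying warning.
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return false;
    }

    $code = status_from_headers($headers);
    return $code === 200 || $code === 301;
}
```

In the loop from the question, url_is_online($https) could then stand in for url_test($https). Whether it is faster than the curl version depends on the sites being checked, so it is worth timing both on a small sample of the CSV.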

    As for why the task you're running does not complete: have you checked the error logs?
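If no error log is configured for the cron run, a few lines at the top of the script can create one. This is only a sketch: the log path and the message text are assumptions, and any file the cron user can write to will do. Note also that with a 20-second curl timeout and up to two requests per entry, each unreachable site can cost up to 40 seconds, so a long CSV may run into an execution time limit if the script is invoked through a web server rather than the PHP CLI.

```php
<?php
// Assumed log location for this example; choose a path the cron user can write to.
$logFile = sys_get_temp_dir() . '/url_check_cron.log';

error_reporting(E_ALL);            // report every warning and notice
ini_set('display_errors', '0');    // cron has no browser to show them in
ini_set('log_errors', '1');        // write them to the log instead
ini_set('error_log', $logFile);

// 0 means no execution time limit. The PHP CLI already defaults to 0,
// so this mainly matters when the script is run through a web server.
set_time_limit(0);

// Mark the start of the run so a missing "finished" entry shows where it stopped.
error_log('url check started');

// ... CSV loop goes here ...

error_log('url check finished');
```

After the next cron run, the log shows whether the script reached the end and lists any warnings or fatal errors raised along the way.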
