douliang4858 2015-01-07 01:44
浏览 35

PHP或MySQL错误?

I am doing infinite curl via Webserver API to grep millions of data remotely. (Using PHP & MySQL)

The mechanism behind is that it went through for a few tables via the API and every round trip query numbers of data(as below trial) to store into local server, sleep for few seconds(as stated below) and redo the interval. So far all the process doesn't turn in a good result. It is either slow or failed.

What is exactly the show stoppers here? How can I smoothly execute this curl within a reasonable time frame?

Trial based as below:-

*s as in seconds.

------------------------------------------------------------------------------------------
| No. of Data Query |  Sleep  |   Results                                                |
------------------------------------------------------------------------------------------
|   100 per trip    |    5s   |    Smooth but slow (imagine, few millions of data to DL  |
------------------------------------------------------------------------------------------
|  1000 per trip    |    2s   |    After a few intervals it stopped for unknown reason.. |
------------------------------------------------------------------------------------------
| 10000 per trip    |    2s   |    After a few intervals it stopped for unknown reason...|
------------------------------------------------------------------------------------------
| 10000 per trip    |    5s   |    After a few intervals it stopped for unknown reason...|
------------------------------------------------------------------------------------------

Codes :-

<?php
ini_set('max_execution_time', 0); // We will set php execution time to unlimited

include_once __DIR__ . "/../bm/custom/xxx/xxx.php";
include_once  __DIR__ . "/../bm/util/xxx.php";
use bm\custom\xxx\xxx;
use bm\util\xxx;

$uclim = new xxx();
$client = new xxxx();

// Database Credentials
$host = 'localhost';
$user = 'xxx';
$password = 'xxxx';
$db = 'xxxx';
$db_conn = new mysqli($host, $user, $password, $db);

// Initialize result variable to be null
$result = null;

$allSensors = array(
    'SENSOR_xxx'=>'xxxx',
    'SENSOR_xxxa'=>'xxxx',
    'SENSOR_xxxb'=>'xxxx',
    'SENSOR_xxxc'=>'xxxx',
    'SENSOR_xxxd'=>'xxxx',
    'SENSOR_xxxe'=>'xxxx',
    'SENSOR_xxxf'=>'xxxx',
    'SENSOR_xxxg'=>'xxxx'
);

$configs = array(
        'API_KEY'=>'xxxxxx',
        'HOST'=>'https://xxx',
        'API_STR'=>'xxxx',
        'TYPE_STR_ORGANIZATION_LIST'=>'xxxx',
        'TYPE_STR_DATASTORE_SEARCH'=>'xxxx',
        'TYPE_STR'=>'xxxx' 
);

/* Here we specified which data to acquired */
foreach($allSensors as $tableName=>$tableResourceId){

 $url = null; 
 $offset = 0;

 while(true){ // Here we start infinite loop until the end of the results

     $data = array(
        'resource_id'=>$tableResourceId,
        'limit'=>1000, // Numbers of Query each interval.
        'offset'=>$offset
     );

     //if(empty($result->result->_links->next)){
      $url = $configs['HOST'] . $configs['API_STR'] . $configs['TYPE_STR'];
     //}

     $data = json_encode($data);

     $headers_in = array(
        'Authorization: ' . $configs['API_KEY'],
        'Content-Type: application/json', 
        'Content-Length: ' . strlen($data)
     );


     $result = $client->curlUsingPost($url, $data, $headers_in);
     $result = json_decode($result);


     if( !empty($result->result->_links->next) && empty($result->result->records)){ break; } // Here we check if empty records sent from the api then we will break the loops


     // We gather all the fields from the Sensors for database table creation
     if(!empty($result->result->fields)){
     $allFields = null;
     $i = 0;
     $len = count($result->result->fields);
     $checkPriKey = 0;
     foreach($result->result->fields as $fieldName){
        if($fieldName->id == "_id"){
          $checkPriKey = 1;
        }

        if($i == $len - 1){ // To check if it is last row
         $allFields .= $fieldName->id." ".$fieldName->type.", PRIMARY KEY (_id)";
        }else{
         $allFields .= $fieldName->id." ".$fieldName->type.",";
        }
        $i++;
     }

     // Merge the collected fields name into sql format
     $sql = "CREATE TABLE IF NOT EXISTS $tableName ( $allFields )";
     $db_conn->query($sql);
     }

     // We gather fieldnames pattern from records
     if(!empty($result->result->records)){
       $recordsField = array();
       $sqlTableField = null;
       $o = 0;
       $fieldsLen = count((array)$result->result->records[0]);
       foreach($result->result->records[0] as $key=>$value){
          array_push($recordsField, $key);
          if($o == $fieldsLen - 1){
            $sqlTableField .= $key;
          }else{
            $sqlTableField .= $key.",";
          }
          $o++;
       }

       // We will insert values from records into table.
       $multipleInsert = null;
       $recordCounter = 0;
       $recordLen = count((array)$result->result->records);
       foreach($result->result->records as $key=>$record){
          $insertStr = null;
          $u = 0;
          $recordsLen = count($recordsField);
          foreach($recordsField as $fieldName){
            if($u == $recordsLen - 1){
              $insertStr .= '"'.$record->$fieldName.'"';
            }else{
              $insertStr .= '"'.$record->$fieldName.'",';
            }
            $u++;
          }
        if($recordCounter == $recordLen - 1){
         $multipleInsert .= "(".$insertStr.");";
        }else{
         $multipleInsert .= "(".$insertStr."),";
        }
        $recordCounter++;
       }

       // Insert api requested value into DB 
       $compiledSql = "INSERT INTO $tableName ($sqlTableField) VALUES $multipleInsert";
       $db_conn->query($compiledSql);

     } // End of gather fieldnames pattern from records

   //Increament offset value
   $offset = $offset + 1000;

 } // End of infinite while loop
}


?>

PS: Please let me know if the above question need more elaboration.

  • 写回答

0条回答 默认 最新

    报告相同问题?

    悬赏问题

    • ¥15 装 pytorch 的时候出了好多问题,遇到这种情况怎么处理?
    • ¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
    • ¥15 手机接入宽带网线,如何释放宽带全部速度
    • ¥30 关于#r语言#的问题:如何对R语言中mfgarch包中构建的garch-midas模型进行样本内长期波动率预测和样本外长期波动率预测
    • ¥15 ETLCloud 处理json多层级问题
    • ¥15 matlab中使用gurobi时报错
    • ¥15 这个主板怎么能扩出一两个sata口
    • ¥15 不是,这到底错哪儿了😭
    • ¥15 2020长安杯与连接网探
    • ¥15 关于#matlab#的问题:在模糊控制器中选出线路信息,在simulink中根据线路信息生成速度时间目标曲线(初速度为20m/s,15秒后减为0的速度时间图像)我想问线路信息是什么