douwu5009 2019-02-01 17:50

PHP multi-cURL requests delayed until timeout


I have some PHP 5.4 code which fetches a batch of Facebook/Instagram photos in parallel using multi curl. This code has been working for years, and nothing has changed as far as I can tell.

I add multiple curl requests to a 'multi' request. Each curl request gets a CURLOPT_TIMEOUT. The problem I'm seeing is that, all of a sudden, some of my requests don't complete until this timeout is reached (no matter what timeout I set).


I do something like this (simplified):

do {
    // Drive all transfers; older libcurl versions may ask to be called again immediately
    while (CURLM_CALL_MULTI_PERFORM === curl_multi_exec($mh, $running));

    // Wait for activity on any curl connection (optional, reduces CPU)
    curl_multi_select($mh);

    // one or more requests may have just completed -- find out which
    while ($done = curl_multi_info_read($mh)) {
        $completedCurlRequest = $done['handle'];

        // save the file

        curl_multi_remove_handle($mh, $completedCurlRequest);
    }
} while ($running);

I use this script to run batches of about 40 parallel requests to fetch some images (from Facebook). Most of them take about 500ms to complete. However, a few of the requests "hang" (until the CURLOPT_TIMEOUT) before they arrive.

Basically the curl_multi_select step takes the entire timeout. Or, if I remove that curl_multi_select line, the outer loop spins (burning CPU) until the timeout.


  • It doesn't matter what the timeout is: if I set the timeout to 30s, they arrive after 30 seconds; if I set it to 1s, they arrive after 1s!

  • This is a really sudden change that does not correlate with any code release - it was all working fine up until 30th Jan 2019, but on the 31st it suddenly stopped working.

  • This isn't easy to reproduce, as it only affects an image once. If I repeat it for a batch of images I already fetched, it works fine the next time round.

  • It affects both Facebook and Instagram images, so I think the issue must be to do with my code or my server (and not Facebook or Instagram), as they wouldn't have both changed something simultaneously.


  1. Am I doing something wrong in my use of multi-curl that could cause this? (but if so, what's changed?)
  2. Have Facebook and Instagram changed anything that might cause this?
  3. Could something on my server have changed to trigger this?
  4. How can I debug this?

Update: Here is what I get back from a slow request when it finally completes:


"content_type": "image/jpeg",
"http_code": 200,
"header_size": 377,
"request_size": 180,
"total_time": 15.001012,    //<----- Total time == CURLOPT_TIMEOUT
"namelookup_time": 0.007149,
"connect_time": 0.12018,
"pretransfer_time": 0.441911,
"size_download": 40714,
"speed_download": 2714,
"download_content_length": -1,   //<------Not set


HTTP/2 200 
content-type: image/jpeg
x-haystack-needlechecksum: 3529661797
timing-allow-origin: *
access-control-allow-origin: *
cache-control: max-age=1209600, no-transform
date: Mon, 04 Feb 2019 14:04:17 GMT
access-control-expose-headers: X-FB-CEC-Video-Limit

It is missing the content-length header, but that always seems to be the case the first time a file is fetched. Only 1 or 2 of the 50 parallel requests are slow, yet all of the requests are missing their content length headers.
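
The pattern in the output above (a 200 response with data downloaded, no content length, and a total_time equal to CURLOPT_TIMEOUT) can be checked mechanically across a batch. The following is only a minimal sketch: the function name `looksStalledUntilTimeout` and the 0.05s tolerance are hypothetical, and the array is assumed to be what `curl_getinfo()` returns for a finished handle.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when a
// transfer looks like it received its data but then sat idle until
// CURLOPT_TIMEOUT expired.
function looksStalledUntilTimeout(array $info, $timeoutSeconds, $toleranceSeconds = 0.05)
{
    // total_time is (almost) exactly the configured timeout
    $hitTimeout = abs($info['total_time'] - $timeoutSeconds) <= $toleranceSeconds;
    // -1 means the server sent no content-length header
    $lengthUnknown = $info['download_content_length'] < 0;
    // the request itself succeeded and delivered bytes
    $gotData = $info['http_code'] == 200 && $info['size_download'] > 0;

    return $hitTimeout && $lengthUnknown && $gotData;
}

// Values copied from the slow request shown above (timeout was 15s).
$slow = array(
    'http_code'               => 200,
    'total_time'              => 15.001012,
    'size_download'           => 40714,
    'download_content_length' => -1,
);

var_dump(looksStalledUntilTimeout($slow, 15)); // bool(true)
```

Logging the URLs for which this returns true would show whether the slow requests share a host or only ever occur on a first fetch.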

If I fetch the same file again, it is much quicker, and this time I do see the content length being set:


"download_content_length": 52721,


content-length: 52721           

1 answer

  • dongshi1914 2019-02-04 09:28

    My current theory is that there is a bug in Facebook's file server: the connection is sometimes not closed even though all the data has been sent, so it stays open until the timeout. Because Facebook's file server does not send the (optional) content-length header, cURL cannot tell that the payload is complete, and so it hangs.

    My current solution is to 'prime' the fileserver by first making a request for the image without a body, like this:

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true); // headers only, no image body

    This is a pretty quick process, since there is no image being returned. I actually do this in the background using asynchronous multi curl, so I can get on with doing some other processing.

    After priming the fileserver, subsequent requests for the files are even quicker than before, as the content-length is known.
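
The priming step described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the code actually used: the function name `primeUrls`, the 5-second default timeout, and the returned url => status map are all hypothetical, and unlike the background approach mentioned above, this version blocks until every body-less request has finished.

```php
<?php
// Hypothetical helper: send body-less requests for every URL in parallel.
// Returns url => HTTP status code (0 if the request failed). Duplicate URLs
// collapse into a single key in the result.
function primeUrls(array $urls, $timeoutSeconds = 5)
{
    $mh = curl_multi_init();
    $handles = array();

    foreach ($urls as $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_NOBODY, true);          // headers only
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo anything
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    do {
        // Older libcurl versions may ask to be called again immediately.
        do {
            $status = curl_multi_exec($mh, $running);
        } while ($status === CURLM_CALL_MULTI_PERFORM);

        // Sleep until there is socket activity; select can return -1
        // immediately, so back off briefly to avoid a busy loop.
        if ($running && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000);
        }
    } while ($running && $status === CURLM_OK);

    $codes = array();
    foreach ($handles as $url => $ch) {
        $codes[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $codes;
}
```

Calling `primeUrls($imageUrls)` before the real batch and then fetching only the URLs that returned 200 would be one way to wire it in; `$imageUrls` here stands for whatever list the batch already uses.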

    This is a bit of a clumsy approach, but in the absence of any response from Facebook so far I'm not sure what else to do.

    This answer was selected by the asker as the best answer.


