dongshuql24533 2016-08-09 06:06
Viewed 115 times

PHP multi cURL performance is worse than sequential file_get_contents

I am writing an interface in which I must issue 4 HTTP requests to get some information.

I implemented the interface in 2 ways:

  1. using sequential file_get_contents.
  2. using multi curl.

I benchmarked the two versions with JMeter. The results show that multi curl is much better than sequential file_get_contents when there is only 1 thread in JMeter making requests, but much worse with 100 threads.

The question is: what could cause the poor performance of multi curl?

My multi curl code is as below:

$curl_handle_arr = array ();
$master = curl_multi_init();
foreach ( $call_url_arr as $key => $url )
{
    $curl_handle = curl_init( $url );
    $curl_handle_arr [$key] = $curl_handle;
    curl_setopt( $curl_handle , CURLOPT_RETURNTRANSFER , true );
    curl_setopt( $curl_handle , CURLOPT_POST , true );
    curl_setopt( $curl_handle , CURLOPT_POSTFIELDS , http_build_query( $params_arr [$key] ) );
    curl_multi_add_handle( $master , $curl_handle );
}
$running = null;
$mrc = null;
do
{
    $mrc = curl_multi_exec( $master , $running );
}
while ( $mrc == CURLM_CALL_MULTI_PERFORM );
while ( $running && $mrc == CURLM_OK )
{
    if (curl_multi_select( $master ) != - 1)
    {
        do
        {
            $mrc = curl_multi_exec( $master , $running );
        }
        while ( $mrc == CURLM_CALL_MULTI_PERFORM );
    }
}
foreach ( $call_url_arr as $key => $url )
{
    $curl_handle = $curl_handle_arr [$key];
    if (curl_error( $curl_handle ) == '')
    {
        $result_str_arr [$key] = curl_multi_getcontent( $curl_handle );
    }
    curl_multi_remove_handle( $master , $curl_handle );
}
curl_multi_close( $master );
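
For comparison, the sequential file_get_contents version is not shown in the question; it presumably looks something like the following sketch (reusing the same `$call_url_arr` / `$params_arr` structure as the cURL version above; the stream-context details are my assumption, not the asker's actual code):

```php
<?php
// Hypothetical sequential version: one blocking POST per URL.
$result_str_arr = array();
foreach ($call_url_arr as $key => $url) {
    $context = stream_context_create(array(
        'http' => array(
            'method'  => 'POST',
            'header'  => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query($params_arr[$key]),
        ),
    ));
    // Blocks until this request completes before starting the next one,
    // so total latency is roughly the sum of the four round trips.
    $result = file_get_contents($url, false, $context);
    if ($result !== false) {
        $result_str_arr[$key] = $result;
    }
}
```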

1 Answer

  • dounayan3643 2016-08-10 01:39

    1. Simple optimization

    • You should sleep about 2500 microseconds if curl_multi_select fails.
      In practice, it definitely fails sometimes on each execution.
      Without sleeping, your CPU resources get eaten up by lots of busy while (true) { } loops.
    • If you don't do anything while only some (not all) of the requests have finished,
      you should make the maximum timeout seconds larger.
    • Your code is written for old versions of libcurl. As of libcurl 7.20.0,
      the state CURLM_CALL_MULTI_PERFORM no longer appears.

    So, the following code

    $running = null;
    $mrc = null;
    do
    {
        $mrc = curl_multi_exec( $master , $running );
    }
    while ( $mrc == CURLM_CALL_MULTI_PERFORM );
    while ( $running && $mrc == CURLM_OK )
    {
        if (curl_multi_select( $master ) != - 1)
        {
            do
            {
                $mrc = curl_multi_exec( $master , $running );
            }
            while ( $mrc == CURLM_CALL_MULTI_PERFORM );
        }
    }
    

    should be

    curl_multi_exec($master, $running);
    do
    {
        if (curl_multi_select($master, 99) === -1)
        {
            usleep(2500);
            continue;
        }
        curl_multi_exec($master, $running);
    } while ($running);
    

    Note

    The timeout value of curl_multi_select should be tuned only if you want to do something like...

    curl_multi_exec($master, $running);
    do
    {
        if (curl_multi_select($master, $TIMEOUT) === -1)
        {
            usleep(2500);
            continue;
        }
        curl_multi_exec($master, $running);
        while ($info = curl_multi_info_read($master))
        {
            /* Do something with $info */
        }
    } while ($running);
    

    Otherwise, the value should be extremely large.
    (However, PHP_INT_MAX is too large; libcurl treats it as an invalid value.)

    2. Easy experiment in one PHP process

    I tested using my parallel cURL executor library: mpyw/co

    (The preposition for in the function names should really be by; sorry for my poor English xD)

    <?php 
    
    require 'vendor/autoload.php';
    
    use mpyw\Co\Co;
    
    function four_sequencial_requests_for_one_hundread_people()
    {
        for ($i = 0; $i < 100; ++$i) {
            $tasks[] = function () use ($i) {
                $ch = curl_init();
                curl_setopt_array($ch, [
                    CURLOPT_URL => 'example.com',
                    CURLOPT_FORBID_REUSE => true,
                    CURLOPT_RETURNTRANSFER => true,
                ]);
                for ($j = 0; $j < 4; ++$j) {
                    yield $ch;
                }
            };
        }
        $start = microtime(true);
        yield $tasks;
        $end = microtime(true);
        printf("Time of %s: %.2f sec\n", __FUNCTION__, $end - $start);
    }
    
    function requests_for_four_hundreds_people()
    {
        for ($i = 0; $i < 400; ++$i) {
            $tasks[] = function () use ($i) {
                $ch = curl_init();
                curl_setopt_array($ch, [
                    CURLOPT_URL => 'example.com',
                    CURLOPT_FORBID_REUSE => true,
                    CURLOPT_RETURNTRANSFER => true,
                ]);
                yield $ch;
            };
        }
        $start = microtime(true);
        yield $tasks;
        $end = microtime(true);
        printf("Time of %s: %.2f sec\n", __FUNCTION__, $end - $start);
    }
    
    Co::wait(four_sequencial_requests_for_one_hundread_people(), [
        'concurrency' => 0, // Zero means unlimited
    ]);
    
    Co::wait(requests_for_four_hundreds_people(), [
        'concurrency' => 0, // Zero means unlimited
    ]);
    

    I ran it five times and got the following results:

    [screenshot of the timing results]

    I also tried them in reverse order (the 3rd request was kicked xD):

    [screenshot of the timing results]

    These results show that too many concurrent TCP connections actually decrease throughput.

    3. Advanced optimization

    3-A. For different destinations

    If you want to optimize for both few and many concurrent requests, the following dirty solution may help you.

    1. Share the number of concurrent requesters using apcu_add / apcu_fetch / apcu_delete.
    2. Switch methods (sequential or parallel) based on the current value.
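
    A minimal sketch of that idea follows. The key name `requester_count`, the threshold `50`, and the two wrapper functions are made-up examples (the answer only names the APCu functions; I use apcu_inc / apcu_dec here for atomic counting). Requires the APCu extension.

```php
<?php
// Sketch: count in-flight PHP workers via APCu and pick a strategy.
apcu_add('requester_count', 0);          // no-op if the key already exists
$current = apcu_inc('requester_count');  // atomically increment and read

try {
    if ($current <= 50) {
        // Few concurrent workers: parallel multi-cURL wins.
        $result_str_arr = fetch_with_multi_curl($call_url_arr, $params_arr);
    } else {
        // Many concurrent workers: sequential requests avoid flooding
        // the destination with TCP connections.
        $result_str_arr = fetch_sequentially($call_url_arr, $params_arr);
    }
} finally {
    apcu_dec('requester_count');         // always release our slot
}
```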

    3-B. For the same destinations

    CURLMOPT_PIPELINING will help you. This option bundles all HTTP/1.1 connections for the same destination into one TCP connection.

    curl_multi_setopt($master, CURLMOPT_PIPELINING, 1);
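
    (An addition beyond the original answer: on newer stacks, the HTTP/2 counterpart of pipelining is connection multiplexing. This needs libcurl >= 7.43, PHP 7 for the constant, and an HTTP/2-capable server; note libcurl later deprecated HTTP/1.1 pipelining itself in 7.62.)

```php
// Multiplex concurrent requests to the same host over one HTTP/2 connection.
curl_multi_setopt($master, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);
```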
    
