1. Simple optimization
- You should sleep about 2500 microseconds if
curl_multi_select
failed.
Actually, it defintely fails sometimes for each execution.
Without sleeping, your CPU resources get occupied by lots of while (true) { }
loops.
- If you do nothing after some (not all) of the requests have finished,
you should let maximum timeout seconds larger.
- Your code is written for old libcurls. As of libcurl version 7.2,
the state CURLM_CALL_MULTI_PERFORM
does not appear anymore.
So, the following code
$running = null;
$mrc = null;
do
{
$mrc = curl_multi_exec( $master , $running );
}
while ( $mrc == CURLM_CALL_MULTI_PERFORM );
while ( $running && $mrc == CURLM_OK )
{
if (curl_multi_select( $master ) != - 1)
{
do
{
$mrc = curl_multi_exec( $master , $running );
}
while ( $mrc == CURLM_CALL_MULTI_PERFORM );
}
}
should be
curl_multi_exec($master, $running);
do
{
if (curl_multi_select($master, 99) === -1)
{
usleep(2500);
continue;
}
curl_multi_exec($master, $running);
} while ($running);
Note
The timeout value of curl_multi_select
should be tuned only if you want to do something like...
curl_multi_exec($master, $running);
do
{
if (curl_multi_select($master, $TIMEOUT) === -1)
{
usleep(2500);
continue;
}
curl_multi_exec($master, $running);
while ($info = curl_multi_info_read($master))
{
/* Do something with $info */
}
} while ($running);
Otherwise, the value should be extreamly large.
(However, PHP_INT_MAX
is too large; libcurl treats it as an invalid value.)
2. Easy experiment in one PHP process
I tested using my parallel cURL executor library: mpyw/co
(The prep. for
is improper and it should be by
, sorry for my poor English xD)
<?php
require 'vendor/autoload.php';
use mpyw\Co\Co;
function four_sequencial_requests_for_one_hundread_people()
{
for ($i = 0; $i < 100; ++$i) {
$tasks[] = function () use ($i) {
$ch = curl_init();
curl_setopt_array($ch, [
CURLOPT_URL => 'example.com',
CURLOPT_FORBID_REUSE => true,
CURLOPT_RETURNTRANSFER => true,
]);
for ($j = 0; $j < 4; ++$j) {
yield $ch;
}
};
}
$start = microtime(true);
yield $tasks;
$end = microtime(true);
printf("Time of %s: %.2f sec
", __FUNCTION__, $end - $start);
}
function requests_for_four_hundreds_people()
{
for ($i = 0; $i < 400; ++$i) {
$tasks[] = function () use ($i) {
$ch = curl_init();
curl_setopt_array($ch, [
CURLOPT_URL => 'example.com',
CURLOPT_FORBID_REUSE => true,
CURLOPT_RETURNTRANSFER => true,
]);
yield $ch;
};
}
$start = microtime(true);
yield $tasks;
$end = microtime(true);
printf("Time of %s: %.2f sec
", __FUNCTION__, $end - $start);
}
Co::wait(four_sequencial_requests_for_one_hundread_people(), [
'concurrency' => 0, // Zero means unlimited
]);
Co::wait(requests_for_four_hundreds_people(), [
'concurrency' => 0, // Zero means unlimited
]);
I tried for five times to get the following results:
I also tried in reverse order (The 3rd request was kicked xD):
These results represent too many concurrent TCP connections actually decrease throughputs.
3. Advanced optimization
3-A. For different destinations
If you want to optimize for both few and many concurrent requests, the following dirty solution may help you.
- Share the number of requesters using
apcu_add
/ apcu_fetch
/ apcu_delete
.
- Switch methods(sequencial or parallel) by current value.
3-B. For the same destinations
CURLMOPT_PIPELINING
will help you. This option bundles all HTTP/1.1 connections for the same destination into one TCP connection.
curl_multi_setopt($master, CURLMOPT_PIPELINING, 1);