I ran a benchmark on Elasticsearch using elasticsearch-php. It compares the time taken to index 10,000 documents one by one against indexing 10,000 documents in bulk requests of 1,000 documents each.
On my VPN server (3 cores, 2 GB of memory), the performance is about the same with or without bulk indexing.
My PHP code (inspired by a post):
<?php
set_time_limit(0); // no timeout
require 'vendor/autoload.php';
$es = new Elasticsearch\Client([
    'hosts' => ['127.0.0.1:9200']
]);
$max = 10000;

// ELASTICSEARCH BULK INDEX
$temps_debut = microtime(true);
for ($i = 0; $i <= $max; $i++) {
    $params['body'][] = array(
        'index' => array(
            '_index' => 'articles',
            '_type' => 'article',
            '_id' => 'cle' . $i
        )
    );
    $params['body'][] = array(
        'my_field' => 'my_value' . $i
    );
    if ($i % 1000) { // Every 1000 documents stop and send the bulk request
        $responses = $es->bulk($params);
        $params = array(); // erase the old bulk request
        unset($responses); // unset to save memory
    }
}
$temps_fin = microtime(true);
echo 'Elasticsearch bulk: ' . round($i / round($temps_fin - $temps_debut, 4)) . ' per sec <br>';

// ELASTICSEARCH WITHOUT BULK INDEX
$temps_debut = microtime(true);
for ($i = 1; $i <= $max; $i++) {
    $params = array();
    $params['index'] = 'my_index';
    $params['type'] = 'my_type';
    $params['id'] = "key".$i;
    $params['body'] = array('testField' => 'valeur'.$i);
    $ret = $es->index($params);
}
$temps_fin = microtime(true);
echo 'Elasticsearch One by one : ' . round($i / round($temps_fin - $temps_debut, 4)) . 'per sec <br>';
?>
Output:
Elasticsearch bulk: 1209 per sec
Elasticsearch One by one : 1197per sec
Is there something wrong with my bulk indexing code that prevents it from performing better?
Thanks
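One thing worth checking in the bulk loop: `if ($i % 1000)` is truthy for every `$i` that is not a multiple of 1000, so as written the loop sends a bulk request on almost every iteration, which may explain why both timings are so close. Below is a minimal sketch of the intended batching, not a definitive implementation. The helper `sendInBatches` and the stand-in sender are hypothetical names introduced for illustration; with a real client the sender would simply wrap the `$es->bulk($params)` call from the code above.

```php
<?php
// Hypothetical helper (not part of elasticsearch-php): builds bulk bodies and
// hands each batch to $send. Returns the number of requests issued.
function sendInBatches(int $total, int $batchSize, callable $send): int
{
    $params = ['body' => []];
    $requests = 0;
    for ($i = 1; $i <= $total; $i++) {
        // Action line followed by the document itself, as in the question.
        $params['body'][] = ['index' => ['_index' => 'articles', '_type' => 'article', '_id' => 'cle' . $i]];
        $params['body'][] = ['my_field' => 'my_value' . $i];

        // Flush only when $i is an exact multiple of the batch size.
        if ($i % $batchSize === 0) {
            $send($params);
            $requests++;
            $params = ['body' => []]; // start a fresh batch
        }
    }
    // Send whatever is left over from an incomplete final batch.
    if (!empty($params['body'])) {
        $send($params);
        $requests++;
    }
    return $requests;
}

// Stand-in sender that only records how many documents each batch holds.
// With a real client this would be:
//   function (array $p) use ($es) { return $es->bulk($p); }
$sizes = [];
$requests = sendInBatches(10000, 1000, function (array $p) use (&$sizes) {
    $sizes[] = intdiv(count($p['body']), 2); // two body entries per document
});

echo $requests . " bulk requests\n";
```

Starting the counter at 1 and comparing with `=== 0` keeps every batch at exactly the requested size, and the final check after the loop avoids silently dropping documents that do not fill a complete batch.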