I'm using Azure Table Storage from my PHP application and seeing very slow response times. The application gathers reports each day for each user and places them in a partition using the partition key $userID . date("dmY")
. During load testing one particular day had several thousand reports generated which requires multiple round trips to fetch due to the 1,000 entity limit in an Azure Table Storage Query. Each trip to fetch 1,000 entities can take up to 2 seconds.
Code:
for($i = 0; $i < $daysToGoBack; $i++)
{
$filter = "PartitionKey eq '" . $userId . date("dmY", strtotime("-$i days")) . "'";
$options = new QueryEntitiesOptions();
$options->addSelectField('created');
$options->setFilter(Filter::applyQueryString($filter));
$this->benchmark->mark('main_query');
$result = $this->tableRestProxy->queryEntities('reports', $filter, $options);
$this->benchmark->mark('main_query_end');
echo "Query: " . $this->benchmark->elapsed_time('main_query', 'main_query_end') . "<br/>";
$entities = array_merge($result->getEntities(), $entities);
$nextPartitionKey = $result->getNextPartitionKey();
$nextRowKey = $result->getNextRowKey();
while(!is_null($nextRowKey) && !is_null($nextPartitionKey))
{
$options = new QueryEntitiesOptions();
$options->setNextPartitionKey($nextPartitionKey);
$options->setNextRowKey($nextRowKey);
$options->addSelectField('created');
$options->setFilter(Filter::applyQueryString($filter));
$this->benchmark->mark('sub_query');
$newResult = $this->tableRestProxy->queryEntities('reports', $options);
$this->benchmark->mark('sub_query_end');
echo "Continuation: " . $this->benchmark->elapsed_time('sub_query', 'sub_query_end') . "<br/>";
$newEntities = $newResult->getEntities();
$entities = array_merge($newEntities, $entities);
$nextPartitionKey = $newResult->getNextPartitionKey();
$nextRowKey = $newResult->getNextRowKey();
}
Result:
Query: 1.8183
Continuation: 1.2479
Continuation: 0.2423
Continuation: 0.2619
Continuation: 0.2476
Continuation: 0.2836
Continuation: 0.2345
Continuation: 0.2482
Continuation: 0.2565
Continuation: 0.2187
Continuation: 0.2319
Continuation: 0.2389
Continuation: 0.2221
Query: 0.0320
Query: 0.0338
Query: 0.1038
Query: 0.1263
Query: 0.1841
Query: 0.0547
The results above are about the best I could get out of it. The first query has almost 13,000 reports to pull back so does the initial query and then 12 calls back with the continuation token. Most of the time the queries can take much longer on exactly the same data.
Query: 1.8273
Continuation: 1.2592
Continuation: 0.8160
Continuation: 0.8463
Continuation: 0.7474
Continuation: 0.7104
Continuation: 1.3987
Continuation: 1.4321
Continuation: 1.4526
Continuation: 1.3184
Continuation: 0.7390
Continuation: 0.7212
Continuation: 0.2610
Query: 0.0630
Query: 0.1221
Query: 0.0728
Query: 0.1250
Query: 0.1717
Query: 0.0568
Are these results expected from Azure or is there a more efficient way to query this data out?