I have an unusual mongodb cursor-timeout issue - I get a cursor not found
error even though the timeout of 10 minutes has not been reached.
My env:
- ubuntu 14.04 LTS
- PHP 7.1.13
- libmongoc bundled version 1.8.2
- MongoDB extension version 1.3.3
- mongodb/mongodb 1.2.0
Here is some code. In short, I loop over millions of users, for each user I perform a large aggregation, loop over results with a cursor, and then run a bunch of bulk-upserts.
<?php
/**
* THE AIM IS TO PERFORM A BIG AGGREGATION ON AN ENTIRE COLLECTION, MASSAGE THE RESULTS, AND WRITE INTO A PERMANENT COLLECTION.
*
* THE AGGREGATION IS BIG, SO IT WRITES TO DISK, AND WE ACCESS WITH A CURSOR.
*
* IN ORDER TO IMPROVE PERFORMANCE, WE BULK-UPSERT INTO THE PERMANENT COLLECTION ON EVERY 10k DOCUMENTS.
*
*/
class Analytics
{
const MAX_CURSOR_PAGE_SIZE = 10000;
public function performAnalytics($arr_user_ids)
{
foreach ($arr_user_ids as $item_user_id)
$this->performUserAnalytics($item_user_id->id);
}
private function performUserAnalytics($userId)
{
// AGGREGATION QUERY
$userItemsPage = $this->getUserItemsPage($userId);
// BULK UPSERT
$arrItemOps = [];
foreach ($userItemsPage as $item_user)
{
array_push($arrItemOps, [
'updateOne' => [
[ 'my_key' => $item_user->_id, ],
[ '$set' => [ 'item_ids' => $item_user->item_ids ] ],
[ 'upsert' => true ]
]
]);
// BULK-UPSERT FOR PERFORMANCE
if(count($arrItemOps) > self::MAX_CURSOR_PAGE_SIZE)
{
Log::info("BULK-UPSERTING USER ITEMS");
$permanent_collection->bulkUpsert($arrItemOps);
$arrItemOps = [];
}
}
// FINAL BULK-UPSERT
if(count($arrItemOps) > 0)
$permanent_collection->bulkUpsert($arrItemOps);
}
// BIG AGGREGATION
private function getUserItemsPage($userId)
{
return $items_collection->aggregate([
[
'$match' => [
'user_id' => $userId
]
],
[
'$group' => [
'_id' => '$user_id',
'item_ids' => [
'$addToSet' => '$item_id'
]
]
]
],
[ 'allowDiskUse' => true ]);
}
}
The code should takes ~2 days to run. After 1.3 days I get the cursor not found
error on the foreach ($userItemsPage as $item_user)
line.
The thing that's giving me a headache is that I see a:
BULK-UPSERTING USER ITEMS
log line every 50 seconds (45s to loop over 10k results, and 5s to bulk-upsert), and then the error line 1 minute after that. Every user-analytics takes between 1-120 minutes. I have not modified the default batch-size of 101 documents.
So I'm struggling to see where the timeout error is hiding.
I have gone through @Danziger 's excellent reply here (MongoDB - Error: getMore command failed: Cursor not found) but it seems unrelated to my scenario.
And there is another related but unanswered question (MongoDB - PHP - MongoCursorException 'Cursor not found')
Thanks,