douzhuo8871 2018-01-29 01:28

MongoDB - aggregate and batchSize - cursor not found after 1 minute

I have an unusual MongoDB cursor-timeout issue: I get a "cursor not found" error even though the 10-minute cursor timeout has not been reached.

My env:

  • Ubuntu 14.04 LTS
  • PHP 7.1.13
  • libmongoc bundled version 1.8.2
  • MongoDB extension version 1.3.3
  • mongodb/mongodb 1.2.0

Here is some code. In short, I loop over millions of users; for each user I run a large aggregation, iterate over the results with a cursor, and perform a series of bulk-upserts.

<?php

use MongoDB\Collection;

/**
 * THE AIM IS TO PERFORM A BIG AGGREGATION ON AN ENTIRE COLLECTION, MASSAGE THE RESULTS, AND WRITE INTO A PERMANENT COLLECTION.
 *
 * THE AGGREGATION IS BIG, SO IT WRITES TO DISK, AND WE ACCESS THE RESULTS WITH A CURSOR.
 *
 * TO IMPROVE PERFORMANCE, WE BULK-UPSERT INTO THE PERMANENT COLLECTION EVERY 10k DOCUMENTS.
 */
class Analytics
{
    const MAX_CURSOR_PAGE_SIZE = 10000;

    /** @var Collection SOURCE COLLECTION FOR THE AGGREGATION */
    private $items_collection;

    /** @var Collection TARGET COLLECTION FOR THE BULK-UPSERTS */
    private $permanent_collection;

    public function __construct(Collection $items_collection, Collection $permanent_collection)
    {
        $this->items_collection = $items_collection;
        $this->permanent_collection = $permanent_collection;
    }

    public function performAnalytics($arr_user_ids)
    {
        foreach ($arr_user_ids as $item_user_id)
            $this->performUserAnalytics($item_user_id->id);
    }

    private function performUserAnalytics($userId)
    {
        // AGGREGATION QUERY
        $userItemsPage = $this->getUserItemsPage($userId);

        // BULK UPSERT
        $arrItemOps = [];
        foreach ($userItemsPage as $item_user)
        {
            $arrItemOps[] = [
                'updateOne' => [
                    [ 'my_key' => $item_user->_id ],
                    [ '$set' => [ 'item_ids' => $item_user->item_ids ] ],
                    [ 'upsert' => true ]
                ]
            ];

            // FLUSH A FULL PAGE OF OPS IN ONE BULK-UPSERT FOR PERFORMANCE
            if (count($arrItemOps) >= self::MAX_CURSOR_PAGE_SIZE)
            {
                Log::info("BULK-UPSERTING USER ITEMS");
                $this->permanent_collection->bulkWrite($arrItemOps);
                $arrItemOps = [];
            }
        }

        // FINAL BULK-UPSERT OF ANY REMAINING OPS
        if (count($arrItemOps) > 0)
            $this->permanent_collection->bulkWrite($arrItemOps);
    }

    // BIG AGGREGATION
    private function getUserItemsPage($userId)
    {
        return $this->items_collection->aggregate([
            [
                '$match' => [
                    'user_id' => $userId
                ]
            ],
            [
                '$group' => [
                    '_id' => '$user_id',
                    'item_ids' => [
                        '$addToSet' => '$item_id'
                    ]
                ]
            ]
        ],
        [ 'allowDiskUse' => true ]);
    }
}
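
For context, this is roughly how the class gets wired up (the connection string, database and collection names below are placeholders, not my real ones):

<?php

require 'vendor/autoload.php';

// PLACEHOLDER WIRING - URI, DATABASE AND COLLECTION NAMES ARE ASSUMPTIONS
$client = new MongoDB\Client('mongodb://localhost:27017');
$db     = $client->selectDatabase('analytics');

$analytics = new Analytics(
    $db->selectCollection('items'),           // SOURCE OF THE AGGREGATION
    $db->selectCollection('permanent_items')  // TARGET OF THE BULK-UPSERTS
);

$analytics->performAnalytics($arr_user_ids);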

The code takes ~2 days to run. After 1.3 days I get the cursor-not-found error on the foreach ($userItemsPage as $item_user) line. The thing that's giving me a headache is that I see a:

BULK-UPSERTING USER ITEMS

log line every 50 seconds (45s to loop over 10k results, plus 5s for the bulk-upsert), and then the error line 1 minute after that. Each user's analytics takes between 1 and 120 minutes. I have not modified the default batch size of 101 documents.

So I'm struggling to see where the timeout error is hiding.

I have gone through @Danziger's excellent reply here (MongoDB - Error: getMore command failed: Cursor not found), but it seems unrelated to my scenario.

There is also another related but unanswered question (MongoDB - PHP - MongoCursorException 'Cursor not found').

Thanks,


1 answer

  • dongzhuo7291 2018-01-31 22:54

    OK, I think I understand what is going on now: find() and aggregate() behave differently with respect to batchSize.

    find() (https://docs.mongodb.com/manual/reference/method/cursor.batchSize):

        Specifies the number of documents to return in each batch

    aggregate():

        specify an initial batch size for the cursor

    So with aggregate(), the batchSize option only sets the size of the initial batch; the size of subsequent batches would have to be set on the returned cursor, not within the aggregation command. However, that is not possible in PHP - we do not have that level of control over the cursor object - which means the aggregation batch size cannot be controlled from PHP. I have raised a ticket for this with the fine folks at MongoDB (https://jira.mongodb.org/browse/PHPLIB-312).
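
    To make the difference concrete, here is a minimal PHP sketch (my own illustration; the collection and field names mirror the question): find() accepts a batchSize option that caps every batch, whereas the batchSize option on aggregate() only sizes the first batch, and the library exposes no cursor method to re-cap the later ones:

    <?php

    // find(): batchSize CAPS EVERY BATCH THE CURSOR FETCHES
    $cursor = $items_collection->find(
        [ 'user_id' => $userId ],
        [ 'batchSize' => 100 ]
    );

    // aggregate(): batchSize ONLY SIZES THE INITIAL BATCH; SUBSEQUENT getMore
    // BATCHES CAN GROW MUCH LARGER, AND mongodb/mongodb EXPOSES NO CURSOR
    // METHOD TO CAP THEM
    $cursor = $items_collection->aggregate(
        [
            [ '$match' => [ 'user_id' => $userId ] ],
            [ '$group' => [ '_id' => '$user_id', 'item_ids' => [ '$addToSet' => '$item_id' ] ] ],
        ],
        [ 'allowDiskUse' => true, 'batchSize' => 100 ]
    );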

    In my case, this manifested as MongoDB returning a single large batch (~130k documents) that took >10 minutes to process. I was upserting 10k documents at a time, but that did not mean I was reading batches of 10k from MongoDB at a time - so more than 10 idle minutes could pass between getMore calls, and the server timed out the cursor.
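
    One possible workaround (my own suggestion, not something from the ticket): materialize the aggregation into a temporary collection with $out, then read it back with find(), where batchSize is honoured for every batch. The collection name and the $database handle below are assumptions:

    <?php

    // 1) MATERIALIZE THE AGGREGATION RESULT ('tmp_user_items' IS AN ASSUMED NAME;
    //    $out MUST BE THE LAST STAGE AND REPLACES THE TARGET COLLECTION)
    $items_collection->aggregate(
        [
            [ '$match' => [ 'user_id' => $userId ] ],
            [ '$group' => [ '_id' => '$user_id', 'item_ids' => [ '$addToSet' => '$item_id' ] ] ],
            [ '$out'   => 'tmp_user_items' ],
        ],
        [ 'allowDiskUse' => true ]
    );

    // 2) ITERATE WITH find(), WHOSE batchSize APPLIES TO EVERY BATCH, SO THE
    //    GAP BETWEEN getMore CALLS STAYS WELL UNDER THE 10-MINUTE CURSOR TIMEOUT
    $cursor = $database->selectCollection('tmp_user_items')->find([], [ 'batchSize' => 1000 ]);
    foreach ($cursor as $item_user) {
        // BULK-UPSERT AS IN THE QUESTION
    }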

