dongzhan1948 2014-08-21 11:35

MongoDB: filter duplicate users out of the results

I have a mongo collection with activities in this format:

{
  "_id": 1,
  "user": 1,
  "time": 12345,
  "data": ...
}

Now I want to get the 5 latest entries (the whole entry) from this collection, but with only one entry per user when a user has more than one activity among the latest ones. I do not want to filter the result after the DB query; I hope there is a MongoDB way to do this on the DB server.

I would like to perform this query with Doctrine MongoDB ODM but I suspect that this is not possible with the provided methods. But a direct mongo query is fine too.

1 answer

  • dro80463 2014-08-26 15:46

    Your time field isn't a date value, so I'm going to assume "latest" means "largest value of time". To keep the example small, I'll fetch the top 2 latest entries rather than 5, with at most one per user; change the $limit to 5 for your case. The idea is that only the highest time matters for each user: $sort on time descending, $group by user and take each field from the $first document the group sees, $sort again because $group does not guarantee any output order, then $limit to the top 2 overall. Note that the _id of each result is the user value, not the _id of the original document. The example is in the mongo shell.

    > db.user.find()
    { "_id" : 1, "user" : 1, "time" : 12345, "data" : 48 }
    { "_id" : 2, "user" : 1, "time" : 12346, "data" : 32 }
    { "_id" : 3, "user" : 2, "time" : 347, "data" : 2 }
    { "_id" : 4, "user" : 2, "time" : 384, "data" : 99 }
    { "_id" : 5, "user" : 2, "time" : 384, "data" : 66 }
    { "_id" : 6, "user" : 3, "time" : 3384, "data" : 55 }
    { "_id" : 7, "user" : 3, "time" : 33844, "data" : 3 }
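    > db.user.aggregate([
        // newest first, so $first below picks each user's latest entry
        { "$sort" : { "time" : -1 } },
        { "$group" : {
            "_id" : "$user",
            "time" : { "$first" : "$time" },
            "data" : { "$first" : "$data" }
        } },
        // $group output order is unspecified, so sort again before limiting
        { "$sort" : { "time" : -1 } },
        { "$limit" : 2 }
    ])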
    { "_id" : 3, "time" : 33844, "data" : 3 }
    { "_id" : 1, "time" : 12346, "data" : 32 }
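
Since the question asks for the whole entry and for 5 results, here is a hedged variant of the pipeline above. It is a sketch under a few assumptions: the server is MongoDB 2.6 or later (needed for the `$$ROOT` system variable), the output field name `doc` is made up for illustration, and the secondary sort on `_id` is an added tie-breaker for entries that share the same time (such as the two `384` entries of user 2), not something the original answer specifies.

```javascript
// Variant of the pipeline above that keeps each user's full latest document.
// Assumptions: MongoDB 2.6+ ($$ROOT), and "doc" is an illustrative field name.
const pipeline = [
  // Newest first; _id breaks ties between entries with equal time (assumption).
  { $sort: { time: -1, _id: -1 } },
  // One group per user; $first keeps the first (newest) document seen.
  { $group: { _id: "$user", doc: { $first: "$$ROOT" } } },
  // $group output order is unspecified, so order the per-user winners again.
  { $sort: { "doc.time": -1 } },
  // The question asks for the 5 latest entries overall.
  { $limit: 5 }
];

// In the mongo shell this would be run as: db.user.aggregate(pipeline)
```

Each result then has the user value as `_id` and the original entry, including its own `_id`, `time` and `data`, nested under `doc`.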
    
    This answer was selected as the best answer by the asker.