dping1968 2018-04-20 11:40

Finding duplicates with aggregation and lookup using MongoDB and Golang

I need to find duplicates with aggregation and lookup with MongoDB and Golang. Here is my Event structure.

// Event describes the model of an Event
type Event struct {
    ID            string      `bson:"_id" json:"_id" valid:"alphanum,printableascii"`
    OldID         string      `bson:"old_id" json:"old_id" valid:"alphanum,printableascii"`
    ParentID      string      `bson:"_parent_id" json:"_parent_id" valid:"alphanum,printableascii"`
    Name          string      `bson:"name" json:"name"`
    Content       string      `bson:"content" json:"content"`
    Slug          string      `bson:"slug" json:"slug"`
    LocationID    string      `bson:"_location_id" json:"_location_id"`
    Price         string      `bson:"price" json:"price"`
    CreatedBy     string      `bson:"created_by" json:"created_by"`
    CreatedAt     time.Time   `bson:"created_at" json:"created_at"`
    ModifiedAt    time.Time   `bson:"modified_at" json:"modified_at"`
}

Here is the request I already have:

// Create the pipeline
    pipeline := []bson.M{
        bson.M{
            "$group": bson.M{
                "_id": bson.M{
                    "_location_id": "$_location_id",
                    "start_date":   "$start_date",
                },
                "docs":  bson.M{"$push": "$_id"},
                "count": bson.M{"$sum": 1},
            },
        },
        bson.M{
            "$match": bson.M{
                "count": bson.M{"$gt": 1.0},
            },
        },
    }

    // Do the request
    dupes := []bson.M{}
    err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&dupes)

No two events may share both the same start_date and the same _location_id. This is what I get so far:

/* 1 */
{
    "_id" : {
        "_location_id" : "4okPZllaoueYC3U2",
        "start_date" : ISODate("2018-04-22T18:00:00.000Z")
    },
    "count" : 2.0,
    "docs" : [ 
        "FFSC2sJcrWgj2FsU", 
        "lwHknTHFfVAzB8ui"
    ]
}

/* 2 */
{
    "_id" : {
        "_location_id" : "pC8rlLVao5c2CeBh",
        "start_date" : ISODate("2018-04-03T19:00:00.000Z")
    },
    "count" : 2.0,
    "docs" : [ 
        "jPRbkINiCExzh2tT", 
        "C8hx92QSZEl7HUIz"
    ]
}

Fine, it works, but I would like to obtain, directly from Mongo, an array of my Event type and, if possible, an array of arrays of Event: [][]*Event. In other words, an array of duplicate groups, each group holding the events that collide with one another.

For example:

// Pipeline
...

// Do the request
events := [][]*Event{}
err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&events)

Or do I need to perform the logic in Golang to achieve what I need?

The libraries I use are:

"gopkg.in/mgo.v2"
"gopkg.in/mgo.v2/bson"

Note: there is no need to worry about the _location_id; I look it up with a DTO inside my Golang logic.

EDIT: If I cannot look up the IDs, can I at least obtain the IDs as an array directly in the result? For example:

[ 
    "jPRbkINiCExzh2tT", 
    "C8hx92QSZEl7HUIz"
]

This is what I tried adding to the request: {$out: "uniqueIds"}. But it does not work.
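A note on the EDIT: `$out` writes the pipeline output to a collection named `uniqueIds`; it does not reshape the result. One possible way to get a single flat array of IDs is to append `$unwind` and a second `$group` with a nil `_id`. The sketch below only builds and prints that pipeline. `M` is a local stand-in for `bson.M`, and `FlatDuplicateIDsPipeline` and the `ids` field are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// M stands in for bson.M (also a map[string]interface{}),
// so this sketch runs without a database driver.
type M map[string]interface{}

// FlatDuplicateIDsPipeline (hypothetical helper) returns the question's
// pipeline plus two stages that merge every "docs" array into one "ids" array.
func FlatDuplicateIDsPipeline() []M {
	return []M{
		{"$group": M{
			"_id":   M{"_location_id": "$_location_id", "start_date": "$start_date"},
			"docs":  M{"$push": "$_id"},
			"count": M{"$sum": 1},
		}},
		{"$match": M{"count": M{"$gt": 1}}},
		// Emit one document per duplicate ID.
		{"$unwind": "$docs"},
		// A nil _id collapses all remaining documents into a single one.
		{"$group": M{"_id": nil, "ids": M{"$push": "$docs"}}},
	}
}

func main() {
	out, err := json.MarshalIndent(FlatDuplicateIDsPipeline(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With mgo, the same stages written as `bson.M` values should yield at most one document, so `Pipe(pipeline).One(&res)` with a struct holding `IDs []string` tagged `bson:"ids"` would be the natural way to read it. When there are no duplicates the pipeline produces nothing and `One` reports `mgo.ErrNotFound`.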

1 answer

  • dtn913117 2018-11-30 12:24

    Yes, you need to perform the logic in Golang to achieve what you need.

    You can do it like this:

        // DublesAgregate mirrors one document returned by the pipeline.
        type DublesAgregate struct {
            Id    IdStruct `bson:"_id"`
            Docs  []string `bson:"docs"`            // event _id values are plain strings, not ObjectIds
            Count int      `bson:"count,omitempty"` // $sum yields a number, not a string
        }

        // IdStruct is the compound key the events are grouped by.
        type IdStruct struct {
            Location_id string    `bson:"_location_id,omitempty"`
            Start_date  time.Time `bson:"start_date,omitempty"` // stored as a date (needs the time package)
        }

        // Create the pipeline
        pipeline := []bson.M{
            bson.M{
                "$group": bson.M{
                    "_id": bson.M{
                        "_location_id": "$_location_id",
                        "start_date":   "$start_date",
                    },
                    "docs":  bson.M{"$push": "$_id"},
                    "count": bson.M{"$sum": 1},
                },
            },
            bson.M{
                "$match": bson.M{
                    "count": bson.M{"$gt": 1.0},
                },
            },
        }

        // Do the request
        dupes := []DublesAgregate{}
        err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&dupes)

        // Flatten the Docs slices of every group into one slice of IDs
        result := []string{}
        for _, group := range dupes {
            result = append(result, group.Docs...)
        }
    