douhao1956 2019-06-28 23:08
Views: 215

What is the best way to implement aggregation in Go (like GROUP BY in SQL)?

Let's say I have a struct

type row struct {
    f1, f2, f3 string
    v int64
}

We can imagine it as a row in a table.

Also, I need to implement a function which does aggregation like this query:

SELECT f1, f2, f3, SUM(v) FROM table GROUP BY f1, f2, f3

So I have to implement a function:

type key struct {
    f1, f2, f3 string
}
func aggregate(t []row) map[key]int64

or, if possible,

func aggregate(t []row) map[string]row

where the map key is, for instance, f1+f2+f3. The signature

func aggregate(t []row) []row

also works, as long as the result contains one row per unique (f1, f2, f3) combination (as in DISTINCT f1, f2, f3).
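To make the []row form concrete, here is a minimal sketch. The name aggregateSlice and the index map are assumptions made for illustration; it only shows the shape of the result (one row per group, in first-seen order) and makes no claim about speed.

```go
package main

// row mirrors the struct from the question.
type row struct {
	f1, f2, f3 string
	v          int64
}

// key is the grouping key (f1, f2, f3).
type key struct {
	f1, f2, f3 string
}

// aggregateSlice (hypothetical name) sums v per (f1, f2, f3) group and
// returns one row per group, in the order the groups first appear.
func aggregateSlice(rows []row) []row {
	index := make(map[key]int) // group key -> position in out
	var out []row
	for _, r := range rows {
		k := key{r.f1, r.f2, r.f3}
		i, ok := index[k]
		if !ok {
			// First row of this group: append a zero-valued accumulator.
			i = len(out)
			index[k] = i
			out = append(out, row{f1: r.f1, f2: r.f2, f3: r.f3})
		}
		out[i].v += r.v
	}
	return out
}
```

Calling it with rows (a, b, c, 1), (x, y, z, 5), (a, b, c, 2) yields two rows: (a, b, c, 3) followed by (x, y, z, 5).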

I have two variants:

func aggregate1(t []row) map[key]int64 {
    res := map[key]int64{}
    for _, r := range t {
        res[key{r.f1, r.f2, r.f3}] += r.v
    }
    return res
}
// aggregate2 requires `import "strings"`.
func aggregate2(t []row) map[string]*row {
    res := map[string]*row{}
    for _, r := range t {
        // Build the composite key "f1#f2#f3".
        var sb strings.Builder
        sb.WriteString(r.f1)
        sb.WriteString("#")
        sb.WriteString(r.f2)
        sb.WriteString("#")
        sb.WriteString(r.f3)
        id := sb.String()
        agg := res[id]
        if agg == nil {
            // First row of this group: create its accumulator.
            agg = &row{f1: r.f1, f2: r.f2, f3: r.f3, v: 0}
            res[id] = agg
        }
        agg.v += r.v
    }
    return res
}

The first variant spends too much time in runtime.mapassign (https://golang.org/pkg/runtime/?m=all#mapassign).

The idea behind the second variant is to hit the faster runtime.mapassign_faststr (https://golang.org/pkg/runtime/?m=all#mapassign_faststr), but the cost of strings.Builder.WriteString cancels out whatever runtime.mapassign_faststr gains.
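One possible direction, sketched under assumptions rather than measured: build the key in a reusable []byte and look it up with res[string(buf)]. As far as I know, the standard gc compiler avoids allocating a string for a map lookup written in exactly that form, so a new key string would only be allocated when a group is inserted. The name aggregateBuf is hypothetical, and whether this beats either variant would have to be benchmarked on real data.

```go
package main

// row mirrors the struct from the question.
type row struct {
	f1, f2, f3 string
	v          int64
}

// aggregateBuf (hypothetical name) is a variation of aggregate2 that
// builds the "f1#f2#f3" key in a byte buffer reused across iterations.
func aggregateBuf(rows []row) map[string]*row {
	res := make(map[string]*row)
	var buf []byte
	for _, r := range rows {
		buf = buf[:0] // keep the capacity, drop the previous key
		buf = append(buf, r.f1...)
		buf = append(buf, '#')
		buf = append(buf, r.f2...)
		buf = append(buf, '#')
		buf = append(buf, r.f3...)

		// Lookup only: the conversion is written inline in the index
		// expression, which is the form the compiler can optimize.
		agg, ok := res[string(buf)]
		if !ok {
			agg = &row{f1: r.f1, f2: r.f2, f3: r.f3}
			// Insertion copies buf into a new string, once per group.
			res[string(buf)] = agg
		}
		agg.v += r.v
	}
	return res
}
```

With this shape the per-row work is appending to the buffer plus one lookup; the assignment path is only taken the first time a group is seen.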

So, can you suggest more ideas about how to implement this aggregation?

I am also thinking about how to compute the "id" in the second variant efficiently. It has to be unique per (f1, f2, f3) combination; mine is, because f1, f2 and f3 cannot contain the "#" character.
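The uniqueness argument can be checked with a small sketch. joinKey reproduces the "#"-separated id from the second variant; lenKey is a hypothetical alternative that prefixes each field with its byte length, shown only to illustrate what would be needed if the fields could contain "#".

```go
package main

import "strconv"

// joinKey builds the id the way the second variant does: "f1#f2#f3".
// It is unique only as long as no field contains "#".
func joinKey(f1, f2, f3 string) string {
	return f1 + "#" + f2 + "#" + f3
}

// lenKey (hypothetical) writes each field as "<byte length>:<field>".
// The length tells a reader where each field ends, so distinct
// (f1, f2, f3) triples always produce distinct keys.
func lenKey(f1, f2, f3 string) string {
	return strconv.Itoa(len(f1)) + ":" + f1 +
		strconv.Itoa(len(f2)) + ":" + f2 +
		strconv.Itoa(len(f3)) + ":" + f3
}
```

For example, if "#" were allowed, joinKey("a#", "b", "c") and joinKey("a", "#b", "c") would both give "a##b#c", while lenKey gives "2:a#1:b1:c" and "1:a2:#b1:c". Under the stated constraint the plain "#" separator is enough.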
