Let's say I have a struct:
type row struct {
    f1, f2, f3 string
    v          int64
}
We can imagine it as a row in a table.
I need to implement a function that performs the same aggregation as this query:
SELECT f1, f2, f3, SUM(v) FROM table GROUP BY f1, f2, f3
So I have to implement a function with this signature:
type key struct {
    f1, f2, f3 string
}
func aggregate(t []row) map[key]int64
or, if that is easier,
func aggregate(t []row) map[string]row
where the map key is, for instance, f1+f2+f3. Even
func aggregate(t []row) []row
also works, as long as the result contains each unique f1, f2, f3 combination exactly once (DISTINCT f1, f2, f3).
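To make the []row form concrete, here is a minimal sketch of what such a function could look like. The name aggregateRows and the index-map approach are only illustrative assumptions, not one of the variants being asked about, and it has not been benchmarked:

```go
package main

type row struct {
	f1, f2, f3 string
	v          int64
}

type key struct {
	f1, f2, f3 string
}

// aggregateRows is a hypothetical helper: it returns one row per distinct
// (f1, f2, f3) combination, with v holding the sum for that group.
// Groups appear in the order in which they are first seen in t.
func aggregateRows(t []row) []row {
	idx := make(map[key]int) // group key -> position of the group in out
	var out []row
	for _, r := range t {
		k := key{r.f1, r.f2, r.f3}
		i, ok := idx[k]
		if !ok {
			// First time this combination is seen: open a new group.
			i = len(out)
			idx[k] = i
			out = append(out, row{f1: r.f1, f2: r.f2, f3: r.f3})
		}
		out[i].v += r.v
	}
	return out
}
```

For the input {a, b, c, 1}, {x, y, z, 5}, {a, b, c, 2} this returns two rows: {a, b, c, 3} followed by {x, y, z, 5}.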
I have two variants:
func aggregate1(t []row) map[key]int64 {
    res := map[key]int64{}
    for _, r := range t {
        res[key{r.f1, r.f2, r.f3}] += r.v
    }
    return res
}
func aggregate2(t []row) map[string]*row {
    res := map[string]*row{}
    for _, r := range t {
        var sb strings.Builder
        sb.WriteString(r.f1)
        sb.WriteString("#")
        sb.WriteString(r.f2)
        sb.WriteString("#")
        sb.WriteString(r.f3)
        id := sb.String()
        t := res[id]
        if t == nil {
            t = &row{f1: r.f1, f2: r.f2, f3: r.f3, v: 0}
            res[id] = t
        }
        t.v += r.v
    }
    return res
}
The first variant spends too much time in runtime.mapassign (https://golang.org/pkg/runtime/?m=all#mapassign).
The idea of the second variant is to use the faster runtime.mapassign_faststr (https://golang.org/pkg/runtime/?m=all#mapassign_faststr) instead, but the cost of building the key with strings.Builder.WriteString cancels out whatever runtime.mapassign_faststr saves :(
So, can you suggest other ways to implement this aggregation?
In particular, I am looking for an efficient way to compute "id" in the second variant. It has to be unique for each f1, f2, f3 combination. My scheme is unique because f1, f2 and f3 can't contain the "#" character.
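One direction that might be worth measuring, as a rough and unbenchmarked sketch: build the id in a []byte scratch buffer that is reused across iterations, and look it up with res[string(buf)]. To my understanding the Go compiler avoids allocating a string for a map lookup written in exactly that form, so a new string would only be allocated when a new group is inserted. The name aggregate3 is hypothetical, and the sketch keeps the assumption that f1, f2 and f3 never contain "#":

```go
package main

type row struct {
	f1, f2, f3 string
	v          int64
}

// aggregate3 is a hypothetical variant of aggregate2 that builds the id
// in a reusable byte buffer instead of a fresh strings.Builder per row.
// Assumption: f1, f2 and f3 never contain '#', so the id is unique.
func aggregate3(t []row) map[string]*row {
	res := make(map[string]*row)
	var buf []byte // scratch buffer, reused for every row
	for i := range t {
		r := &t[i] // take a pointer to avoid copying the row
		buf = append(buf[:0], r.f1...)
		buf = append(buf, '#')
		buf = append(buf, r.f2...)
		buf = append(buf, '#')
		buf = append(buf, r.f3...)

		// Lookup keyed directly by string(buf); this form is expected
		// not to allocate a new string.
		agg, ok := res[string(buf)]
		if !ok {
			agg = &row{f1: r.f1, f2: r.f2, f3: r.f3}
			// The conversion here does allocate, but only once per group.
			res[string(buf)] = agg
		}
		agg.v += r.v
	}
	return res
}
```

Whether this actually beats aggregate1 depends on how many distinct groups there are relative to the number of rows, so it would need to be profiled on real data.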