douou6696 2017-11-09 02:05
Accepted

Explaining benchmarks for preallocating slices

I've been trying to understand slice preallocation with make and why it's a good idea. I noticed a large performance difference between preallocating a slice and appending to it vs just initializing it with 0 length/capacity and then appending to it. I wrote a set of very simple benchmarks:

package main

import "testing"

func BenchmarkNoPreallocate(b *testing.B) {
    for i := 0; i < b.N; i++ {
        // Don't preallocate our initial slice
        init := []int64{}
        init = append(init, 5)
    }
}

func BenchmarkPreallocate(b *testing.B) {
    for i := 0; i < b.N; i++ {
        // Preallocate our initial slice
        init := make([]int64, 0, 1)
        init = append(init, 5)
    }
}

and was a little puzzled with the results:

$ go test -bench=. -benchmem
goos: linux
goarch: amd64
BenchmarkNoPreallocate-4    30000000            41.8 ns/op         8 B/op          1 allocs/op
BenchmarkPreallocate-4      2000000000           0.29 ns/op        0 B/op          0 allocs/op

I have a couple of questions:

  • Why are there no allocations (it shows 0 allocs/op) in the preallocation benchmark case? Certainly we're preallocating, but the allocation had to have happened at some point.
  • I imagine this may become clearer after the first question is answered, but how is the preallocation case so much quicker? Am I misinterpreting this benchmark?

Please let me know if anything is unclear. Thank you!

1 answer

  • duanpiao6679 2017-11-09 02:52

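    Go has an optimizing compiler. Constants are evaluated at compile time; variables are evaluated at run time. The compiler can use constant values to optimize the code it generates. When the length and capacity passed to make are constants and the slice never leaves the function, the compiler can keep the backing array off the heap, or drop the work altogether, which is consistent with 0 allocs/op and well under a nanosecond per iteration. With variables, the size is only known at run time, so the benchmark below pays for a heap allocation. For example: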

    package main
    
    import "testing"
    
    func BenchmarkNoPreallocate(b *testing.B) {
        for i := 0; i < b.N; i++ {
            // Don't preallocate our initial slice
            init := []int64{}
            init = append(init, 5)
        }
    }
    
    func BenchmarkPreallocateConst(b *testing.B) {
        const (
            l = 0
            c = 1
        )
        for i := 0; i < b.N; i++ {
            // Preallocate our initial slice
            init := make([]int64, l, c)
            init = append(init, 5)
        }
    }
    
    func BenchmarkPreallocateVar(b *testing.B) {
        var (
            l = 0
            c = 1
        )
        for i := 0; i < b.N; i++ {
            // Preallocate our initial slice
            init := make([]int64, l, c)
            init = append(init, 5)
        }
    }
    

    Output:

    $ go test alloc_test.go -bench=. -benchmem
    BenchmarkNoPreallocate-4         50000000    39.3 ns/op     8 B/op    1 allocs/op
    BenchmarkPreallocateConst-4    2000000000     0.36 ns/op    0 B/op    0 allocs/op
    BenchmarkPreallocateVar-4        50000000    28.2 ns/op     8 B/op    1 allocs/op
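
The same effect can be checked outside a benchmark. The following is a minimal sketch, not part of the original answer, that uses testing.AllocsPerRun to count heap allocations per call. The function names and the package-level sink variable are hypothetical. Storing the slice in sink is one common way to force it to escape, so that the allocation being measured cannot be optimized away. The result for the variable-capacity case may differ between Go versions, so no value is claimed for it here.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable (hypothetical name). Assigning a slice
// to it makes the slice escape, so its backing array must live on the heap.
var sink []int64

// constCapAllocs measures allocations when the capacity is a compile-time
// constant and the slice never leaves the closure.
func constCapAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		s := make([]int64, 0, 1)
		s = append(s, 5)
		_ = s
	})
}

// varCapAllocs measures allocations when the capacity is only known at
// run time.
func varCapAllocs(c int) float64 {
	return testing.AllocsPerRun(1000, func() {
		s := make([]int64, 0, c)
		s = append(s, 5)
		_ = s
	})
}

// escapingAllocs uses a constant capacity but stores the result in sink,
// so the backing array has to be heap-allocated.
func escapingAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		s := make([]int64, 0, 1)
		s = append(s, 5)
		sink = s
	})
}

func main() {
	fmt.Println("constant capacity, not escaping:", constCapAllocs())
	fmt.Println("variable capacity, not escaping:", varCapAllocs(1))
	fmt.Println("constant capacity, escaping:    ", escapingAllocs())
}
```

If the goal is to measure the cost of the allocation itself, a benchmark would typically assign its result to something like sink. Otherwise the number mostly reflects what the compiler managed to remove.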
    

    Another interesting set of benchmarks, in which each iteration appends 8 * 1024 elements:

    package main
    
    import "testing"
    
    func BenchmarkNoPreallocate(b *testing.B) {
        const (
            l = 0
            c = 8 * 1024
        )
        for i := 0; i < b.N; i++ {
            // Don't preallocate our initial slice
            init := []int64{}
            for j := 0; j < c; j++ {
                init = append(init, 42)
            }
        }
    }
    
    func BenchmarkPreallocateConst(b *testing.B) {
        const (
            l = 0
            c = 8 * 1024
        )
        for i := 0; i < b.N; i++ {
            // Preallocate our initial slice
            init := make([]int64, l, c)
            for j := 0; j < cap(init); j++ {
                init = append(init, 42)
            }
        }
    }
    
    func BenchmarkPreallocateVar(b *testing.B) {
        var (
            l = 0
            c = 8 * 1024
        )
        for i := 0; i < b.N; i++ {
            // Preallocate our initial slice
            init := make([]int64, l, c)
            for j := 0; j < cap(init); j++ {
                init = append(init, 42)
            }
        }
    }
    

    Output:

    $ go test peter_test.go -bench=. -benchmem
    BenchmarkNoPreallocate-4       20000   75656 ns/op   287992 B/op   19 allocs/op
    BenchmarkPreallocateConst-4   100000   22386 ns/op    65536 B/op    1 allocs/op
    BenchmarkPreallocateVar-4     100000   22112 ns/op    65536 B/op    1 allocs/op
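
One way to read the output above: 65536 B is exactly 8 * 1024 int64 values at 8 bytes each, so both preallocating variants make a single allocation per iteration. The version without preallocation reallocates each time append outgrows the current backing array (19 allocs/op in the run shown). The sketch below is not part of the original answer. It counts how often the capacity changes while appending to an empty slice, and the helper name countGrowths is hypothetical. The exact count depends on the growth policy of the Go version in use, so no specific number is claimed.

```go
package main

import "fmt"

// countGrowths appends n elements to an empty slice and counts how many
// times append moved the data to a larger backing array, detected by a
// change in cap(s). It also returns the final capacity.
func countGrowths(n int) (growths int, finalCap int) {
	var s []int64
	lastCap := cap(s)
	for j := 0; j < n; j++ {
		s = append(s, 42)
		if cap(s) != lastCap {
			growths++
			lastCap = cap(s)
		}
	}
	return growths, cap(s)
}

func main() {
	growths, finalCap := countGrowths(8 * 1024)
	fmt.Println("capacity changes:", growths)
	fmt.Println("final capacity:  ", finalCap)
}
```

Each capacity change corresponds to a new, larger allocation plus a copy of the existing elements. That is where the extra time and the extra bytes per operation in BenchmarkNoPreallocate come from.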
    