douliangbian7323 2016-02-29 08:16
浏览 185
已采纳

声明非常大的数组并遍历stdin时,golang连续内存增加

The following code declares two arrays, and then iterates over stdin ( just blindly iterates over the file - no interaction with the arrays ).

This is causing continuous increase in memory.

However, if I just declare two arrays and sleep - there is no increase in memory.

Similarly, if I just iterate over stdin - there is no increase in memory.

But together ( apart from the memory allocated for the arrays) there is a continuous increase.

I measure this by looking at the RES memory using top tool.

I have commented out the first few lines in func doSomething() to show that there is no memory increase when it is commented. Uncommenting the lines and running will cause an increase.

NOTE: This was run on go 1.4.2, 1.5.3 and 1.6

NOTE: You will need to recreate this on a machine with at least 16GB RAM as I have observed it only on the array size of 1 billion.

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
)

type MyStruct struct {
    arr1 []int
    arr2 []int
}

func (ms *MyStruct) Init(size int, arr1 []int, arr2 []int) error {
    fmt.Printf("initializing mystruct arr1...
")
    ms.arr1 = arr1
    if ms.arr1 == nil {
        ms.arr1 = make([]int, size, size)
    }
    fmt.Printf("initializing mystruct arr2...
")
    ms.arr2 = arr2
    if ms.arr2 == nil {
        ms.arr2 = make([]int, size, size)
    }
    fmt.Printf("done initializing ...
")
    for i := 0; i < size; i++ {
        ms.arr1[i] = 0
        ms.arr2[i] = 0
    }
    return nil
}

func doSomething() error {
    fmt.Printf("starting...
")
    fmt.Printf("allocating
")
    /* NOTE WHEN UNCOMMENTED CAUSES MEMORY INCREASE 
    ms := &MyStruct{}
    size := 1000000000
    ms.Init(size, nil, nil)
    */

    fmt.Printf("finished allocating..%d %d
", len(ms.arr1), len(ms.arr2))

    fmt.Printf("reading from stdin...
")
    reader := bufio.NewReader(os.Stdin)

    var line string
    var readErr error
    var lineNo int = 0
    for {
        if lineNo%1000000 == 0 {
            fmt.Printf("read %d lines...
", lineNo)
        }
        lineNo++

        line, readErr = reader.ReadString('
')
        if readErr != nil {
            fmt.Printf("break at %s
", line)
            break
        }
    }

    if readErr == io.EOF {
        readErr = nil
    }

    if readErr != nil {
        return readErr
    }

    return nil
}

func main() {
    if err := doSomething(); err != nil {
        panic(err)
    }
    fmt.Printf("done...
")
}
  1. Is this an issue with my code ? Or is the go system doing something unintended ?
  2. If its the latter, how can I go about debugging this ?

To make it easier to replicate here are pastebin files for good case ( commented portion of the above code) and bad case ( with uncommented portion )

wget http://pastebin.com/raw/QfG22xXk -O badcase.go
yes "1234567890" | go run badcase.go

wget http://pastebin.com/raw/G9xS2fKy -O goodcase.go
yes "1234567890" | go run goodcase.go
  • 写回答

1条回答 默认 最新

  • douxing1850 2016-02-29 13:23
    关注

    Thank you Volker for your above comments. I wanted to capture the process of debugging this as an answer.

    The RES top / htop just tells you at a process level what is going on with memory. GODEBUG="gctrace=1" gives you more insight into how the memory is being handled.

    A simple run with gctrace set gives the following

    root@localhost ~ # yes "12345678901234567890123456789012" | GODEBUG="gctrace=1" go run badcase.go
    starting...
    allocating
    initializing mystruct arr1...
    initializing mystruct arr2...
    gc 1 @0.050s 0%: 0.19+0.23+0.068 ms clock, 0.58+0.016/0.16/0.25+0.20 ms cpu, 7629->7629->7629 MB, 7630 MB goal, 8 P
    done initializing ...
    gc 2 @0.100s 0%: 0.070+2515+0.23 ms clock, 0.49+0.025/0.096/0.24+1.6finished allocating..1000000000 1000000000
     ms cpu, 15258->15258reading from stdin...
    ->15258 MB, 15259read 0 lines...
     MB goal, 8 P
    gc 3 @2.620s 0%: 0.009+0.32+0.23 ms clock, 0.072+0/0.20/0.11+1.8 ms cpu, 15259->15259->15258 MB, 30517 MB goal, 8 P
    
    read 1000000 lines...
    read 2000000 lines...
    read 3000000 lines...
    read 4000000 lines...
    ....
    read 51000000 lines...
    read 52000000 lines...
    read 53000000 lines...
    read 54000000 lines...
    

    What does this mean ?

    As you can see, the gc hasn't been called for a while now. This means that all the garbage generated from reader.ReadString hasn't been collected and free'd.

    Why isn't the garbage collector collecting this garbage ?

    From The go gc

    Instead we provide a single knob, called GOGC. This value controls the total size of the heap relative to the size of reachable objects. The default value of 100 means that total heap size is now 100% bigger than (i.e., twice) the size of the reachable objects after the last collection.

    Since GOGC wasn't set - the default was 100%. So, it would have collected the garbage only when it reached ~32GB. ( Since initially the two arrays give you 16GB of heap space - only when heap doubles will the gc trigger ).

    How can I change this ? Try setting the GOGC=25.

    With the GOGC as 25

    root@localhost ~ # yes "12345678901234567890123456789012" | GODEBUG="gctrace=1" GOGC=25 go run badcase.go
    starting...
    allocating
    initializing mystruct arr1...
    initializing mystruct arr2...
    gc 1 @0.051s 0%: 0.14+0.30+0.11 ms clock, 0.42+0.016/0.31/0.094+0.35 ms cpu, 7629->7629->7629 MB, 7630 MB goal, 8 P
    done initializing ...
    finished allocating..1000000000 1000000000
    gc 2 @0.102s reading from stdin...
    12%: 0.058+2480+0.26 ms clock, 0.40+0.022/2480/0.10+1.8 ms cpu, 15258->15258->15258 MB, 15259 MB goal, 8 P
    read 0 lines...
    gc 3 @2.584s 12%: 0.009+0.20+0.22 ms clock, 0.075+0/0.24/0.046+1.8 ms cpu, 15259->15259->15258 MB, 19073 MB goal, 8 P
    read 1000000 lines...
    read 2000000 lines...
    read 3000000 lines...
    read 4000000 lines...
    ....
    read 19000000 lines...
    read 20000000 lines...
    gc 4 @6.539s 4%: 0.019+2.3+0.23 ms clock, 0.15+0/2.1/12+1.8 ms cpu, 17166->17166->15258 MB, 19073 MB goal, 8 P
    

    As you can see, another gc was triggered.

    But top/htop shows it stable at ~20 GB instead of the calculated 16GB.

    The garbage collector doesn't "have" to give it back to the OS. It will sometimes keep it to use efficiently for the future. It doesn't have to keep taking from the OS and giving back - The extra 4 gb is in its pool of free space to use before asking the OS again.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 求帮我调试一下freefem代码
  • ¥15 R语言Rstudio突然无法启动
  • ¥15 关于#matlab#的问题:提取2个图像的变量作为另外一个图像像元的移动量,计算新的位置创建新的图像并提取第二个图像的变量到新的图像
  • ¥15 改算法,照着压缩包里边,参考其他代码封装的格式 写到main函数里
  • ¥15 用windows做服务的同志有吗
  • ¥60 求一个简单的网页(标签-安全|关键词-上传)
  • ¥35 lstm时间序列共享单车预测,loss值优化,参数优化算法
  • ¥15 Python中的request,如何使用ssr节点,通过代理requests网页。本人在泰国,需要用大陆ip才能玩网页游戏,合法合规。
  • ¥100 为什么这个恒流源电路不能恒流?
  • ¥15 有偿求跨组件数据流路径图