I have a Go program that uses too much memory and gets killed as a result, so I want to keep its memory usage down. Here's a simplified, silly version of what I'm doing that reveals the problem:
package main
import (
"fmt"
"io/ioutil"
"log"
"os"
"os/exec"
"runtime"
"runtime/debug"
"strconv"
"time"
)
func main() {
source := "/tmp/1G.source"
repeats, _ := strconv.Atoi(os.Args[1])
m := &runtime.MemStats{}
err := exec.Command("dd", "if=/dev/zero", "of="+source, "bs=1073741824", "count=1").Run()
if err != nil {
log.Fatalf("failed to create 1GB file: %s\n", err)
}
fmt.Printf("created 1GB source file, %s\n", memory_usage(m))
// read it multiple times
switch os.Args[2] {
case "1":
fmt.Println("re-using a byte slice and emptying it each time")
// var data []byte
for i := 1; i <= repeats; i++ {
data, _ := ioutil.ReadFile(source)
if len(data) > 0 { // just so we use data
data = nil
}
fmt.Printf("did read %d, %s\n", i, memory_usage(m))
}
case "2":
fmt.Println("ignoring the return value entirely")
for i := 1; i <= repeats; i++ {
ioutil.ReadFile(source)
fmt.Printf("did read %d, %s\n", i, memory_usage(m))
}
case "3":
fmt.Println("ignoring the return value entirely, forcing memory freeing")
for i := 1; i <= repeats; i++ {
ioutil.ReadFile(source)
debug.FreeOSMemory()
fmt.Printf("did read %d, %s\n", i, memory_usage(m))
}
}
// wait in case garbage collection needs time to do something
<-time.After(5 * time.Second)
fmt.Printf("all done, %s\n", memory_usage(m))
os.Exit(0)
}
func memory_usage(m *runtime.MemStats) string {
runtime.ReadMemStats(m)
return fmt.Sprintf("system memory: %dMB; heap alloc: %dMB; heap idle-released: %dMB", int((m.Sys/1024)/1024), int((m.HeapAlloc/1024)/1024), int(((m.HeapIdle-m.HeapReleased)/1024)/1024))
}
If I call this with main 7 2
I get:
created 1GB source file, system memory: 2MB; heap alloc: 0MB; heap idle-released: 1MB
ignoring the return value entirely
did read 1, system memory: 4233MB; heap alloc: 3072MB; heap idle-released: 1024MB
did read 2, system memory: 4233MB; heap alloc: 3072MB; heap idle-released: 1024MB
did read 3, system memory: 4233MB; heap alloc: 3072MB; heap idle-released: 1024MB
did read 4, system memory: 4233MB; heap alloc: 3072MB; heap idle-released: 1023MB
did read 5, system memory: 6347MB; heap alloc: 3584MB; heap idle-released: 2559MB
did read 6, system memory: 6347MB; heap alloc: 3072MB; heap idle-released: 3071MB
did read 7, system memory: 6347MB; heap alloc: 3072MB; heap idle-released: 3071MB
all done, system memory: 6347MB; heap alloc: 3072MB; heap idle-released: 3071MB
Perhaps off-topic, but is it expected that reading in a 1GB file results in 4GB of system memory usage?
Anyway, ideally I want an unlimited number of identical loops to use a roughly constant amount of memory, instead of increasing from 4GB to 6GB.
So I thought forcing freeing of memory would help, but main 7 3
gives:
created 1GB source file, system memory: 1MB; heap alloc: 0MB; heap idle-released: 0MB
ignoring the return value entirely, forcing memory freeing
did read 1, system memory: 4237MB; heap alloc: 0MB; heap idle-released: 0MB
did read 2, system memory: 4237MB; heap alloc: 0MB; heap idle-released: 0MB
did read 3, system memory: 6351MB; heap alloc: 0MB; heap idle-released: 0MB
did read 4, system memory: 6351MB; heap alloc: 0MB; heap idle-released: 0MB
did read 5, system memory: 6351MB; heap alloc: 0MB; heap idle-released: 0MB
did read 6, system memory: 6351MB; heap alloc: 0MB; heap idle-released: 0MB
did read 7, system memory: 6351MB; heap alloc: 0MB; heap idle-released: 0MB
all done, system memory: 6351MB; heap alloc: 0MB; heap idle-released: 0MB
How can I keep the memory usage down for all loops?
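For comparison, here is a minimal sketch of what I mean by constant memory when I control the read myself: one buffer is allocated on the first pass and re-used on every later pass. The helper names readInto and writeTemp are made up for this illustration, and a small temporary file stands in for the 1GB source.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readInto reads the whole file at path into buf and returns the filled slice.
// It allocates only when buf's capacity is smaller than the file.
// (Hypothetical helper, not part of the original program.)
func readInto(path string, buf []byte) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return buf, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return buf, err
	}
	size := int(info.Size())
	if cap(buf) < size {
		buf = make([]byte, size) // the only allocation, on the first pass
	}
	buf = buf[:size]
	_, err = io.ReadFull(f, buf)
	return buf, err
}

// writeTemp creates a zero-filled temporary file of the given size and
// returns its path plus a cleanup function. (Hypothetical helper.)
func writeTemp(size int) (string, func(), error) {
	f, err := os.CreateTemp("", "reuse-*.source")
	if err != nil {
		return "", func() {}, err
	}
	cleanup := func() { os.Remove(f.Name()) }
	if _, err := f.Write(make([]byte, size)); err != nil {
		f.Close()
		cleanup()
		return "", func() {}, err
	}
	if err := f.Close(); err != nil {
		cleanup()
		return "", func() {}, err
	}
	return f.Name(), cleanup, nil
}

func main() {
	path, cleanup, err := writeTemp(1 << 20) // 1MB stand-in for the 1GB source
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()

	var buf []byte // allocated on the first read, re-used afterwards
	for i := 1; i <= 3; i++ {
		buf, err = readInto(path, buf)
		if err != nil {
			fmt.Println("read failed:", err)
			return
		}
		fmt.Printf("did read %d, len %d, cap %d\n", i, len(buf), cap(buf))
	}
}
```

This only works because the caller owns the buffer; ioutil.ReadFile allocates its own slice on every call, which is the situation I'm actually stuck with.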
Following suggestions in the comments, I tried a new case (it also needs "bufio" added to the imports):
case "4":
fmt.Println("doing a streaming read")
b := make([]byte, 10000)
for i := 1; i <= repeats; i++ {
f, _ := os.Open(source)
r := bufio.NewReader(f)
for {
_, err := r.Read(b)
if err != nil {
break
}
}
f.Close() // release the file descriptor before the next iteration
fmt.Printf("did read %d, %s\n", i, memory_usage(m))
}
}
But I still see memory usage increase with the number of loops:
created 1GB source file, system memory: 1MB; heap alloc: 0MB; heap idle-released: 0MB
doing a streaming read
did read 1, system memory: 1MB; heap alloc: 0MB; heap idle-released: 0MB
did read 2, system memory: 1MB; heap alloc: 0MB; heap idle-released: 0MB
did read 3, system memory: 1MB; heap alloc: 0MB; heap idle-released: 0MB
did read 4, system memory: 1MB; heap alloc: 0MB; heap idle-released: 0MB
did read 5, system memory: 2MB; heap alloc: 0MB; heap idle-released: 0MB
did read 6, system memory: 2MB; heap alloc: 0MB; heap idle-released: 0MB
did read 7, system memory: 2MB; heap alloc: 0MB; heap idle-released: 0MB
all done, system memory: 2MB; heap alloc: 0MB; heap idle-released: 0MB
To generalise the question: when you call third-party functions in a loop (i.e. functions whose internal memory use you cannot control) and do exactly the same thing on every iteration, is there any way to force Go to re-use the memory it has already allocated instead of requesting more from the OS?
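To make the generalised question concrete, this is a minimal sketch of the kind of thing I can do from outside a third-party call: lower the GC target with debug.SetGCPercent and force a collection with runtime.GC between iterations. The names thirdParty, heapAllocMB and callAndCollect are hypothetical; thirdParty only simulates a library that allocates internally.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// thirdParty is a hypothetical stand-in for a library call that allocates a
// large temporary buffer internally and gives the caller no control over it.
func thirdParty(size int) int {
	buf := make([]byte, size)
	for i := range buf {
		buf[i] = 1 // touch the memory so it is really used
	}
	return len(buf)
}

// heapAllocMB reports the bytes of allocated heap objects, in MB.
func heapAllocMB() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc / 1024 / 1024
}

// callAndCollect runs the hypothetical third-party call, forces a collection,
// and returns the heap allocation that remains afterwards.
func callAndCollect(size int) uint64 {
	thirdParty(size)
	runtime.GC() // blocks until the collection has finished
	return heapAllocMB()
}

func main() {
	// Assumption: a GC target below the default of 100 makes the collector
	// run after less heap growth, trading CPU time for a smaller heap.
	old := debug.SetGCPercent(10)
	defer debug.SetGCPercent(old)

	for i := 1; i <= 5; i++ {
		fmt.Printf("call %d, heap alloc after GC: %dMB\n", i, callAndCollect(64<<20))
	}
}
```

This bounds the live heap between calls, but as far as I can tell it does not by itself guarantee that the memory obtained from the OS (the "system memory" figure above) stops growing, which is the part I'm asking about.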