I am working on a command-line tool in Go called redis-mass that converts a list of Redis commands into Redis protocol format.
The first step was to port the Node.js version to Go almost literally. I used ioutil.ReadFile(inputFileName) to read the whole file into a string and then returned the encoded string as output.
When I ran this on a file with 2,000,000 Redis commands, it took about 8 seconds, compared to about 16 seconds with the Node version. I guessed that it was only twice as fast because it read the whole file into memory first, so I changed my encoding function to accept a pair (raw io.Reader, enc io.Writer), and it now looks like this:
    func EncodeStream(raw io.Reader, enc io.Writer) {
        var args []string
        var length int
        scanner := bufio.NewScanner(raw)
        for scanner.Scan() {
            command := strings.TrimSpace(scanner.Text())
            args = parse(command)
            length = len(args)
            if length > 0 {
                io.WriteString(enc, fmt.Sprintf("*%d\r\n", length))
                for _, arg := range args {
                    io.WriteString(enc, fmt.Sprintf("$%d\r\n%s\r\n", len(arg), arg))
                }
            }
        }
    }
However, this took 12 seconds on the 2-million-line file. I used github.com/pkg/profile to see how it was using memory, and the number of allocations looks huge:
# Alloc = 3162912
# TotalAlloc = 1248612816
# Mallocs = 46001048
# HeapAlloc = 3162912
Can I constrain the io.Writer to use a fixed-size buffer and avoid all those allocations?
More generally, how can I avoid excessive allocations in this function? Here's the full source for more context.