We've been decoding a lot of XML lately using Go and encoding/xml. We noticed that, after quite a few files, our boxes run out of memory, start swapping, and generally die an unhappy death. So we wrote a test program. Here it is:
package main

import (
	"encoding/xml"
	"io/ioutil"
	"log"
	"time"
)

// message mirrors the XML returned when reading AWS SQS messages.
type message struct {
	Body          []string `xml:"ReceiveMessageResult>Message>Body"`
	ReceiptHandle []string `xml:"ReceiveMessageResult>Message>ReceiptHandle"`
}

func main() {
	// m is declared once and reused for every decode.
	var m message
	readTicker := time.NewTicker(5 * time.Millisecond)

	body, err := ioutil.ReadFile("test.xml")
	if err != nil {
		log.Fatal(err)
	}

	// Decode the same document every 5ms, forever.
	for {
		select {
		case <-readTicker.C:
			err = xml.Unmarshal(body, &m)
			if err != nil {
				log.Println(err)
			}
		}
	}
}
All it does is decode the same XML file over and over again. It shows the same symptom as our boxes: the memory usage of the binary grows without bound, until the box starts swapping.
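One thing that may be worth checking, offered as a diagnostic sketch rather than a confirmed diagnosis: the encoding/xml documentation says Unmarshal fills a slice field by extending the slice, so decoding into the same struct repeatedly could make Body and ReceiptHandle longer on every pass. The snippet below decodes a small document several times into one struct and records len(m.Body) after each decode. The sampleXML constant and the bodyLensAfterDecodes helper are made up for illustration; they stand in for test.xml and the ticker loop.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// sampleXML is a hypothetical stand-in for test.xml containing one message.
const sampleXML = `<ReceiveMessageResponse>
  <ReceiveMessageResult>
    <Message>
      <Body>hello</Body>
      <ReceiptHandle>abc123</ReceiptHandle>
    </Message>
  </ReceiveMessageResult>
</ReceiveMessageResponse>`

// message is the same struct as in the test program.
type message struct {
	Body          []string `xml:"ReceiveMessageResult>Message>Body"`
	ReceiptHandle []string `xml:"ReceiveMessageResult>Message>ReceiptHandle"`
}

// bodyLensAfterDecodes (hypothetical helper) decodes data n times into the
// same struct and returns len(m.Body) as observed after each decode.
func bodyLensAfterDecodes(data []byte, n int) ([]int, error) {
	var m message // reused across decodes, as in the test program
	lens := make([]int, 0, n)
	for i := 0; i < n; i++ {
		if err := xml.Unmarshal(data, &m); err != nil {
			return nil, err
		}
		lens = append(lens, len(m.Body))
	}
	return lens, nil
}

func main() {
	lens, err := bodyLensAfterDecodes([]byte(sampleXML), 3)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(lens)
}
```

If the printed lengths climb from one decode to the next, the growth is in the reused struct rather than inside the decoder, and declaring a fresh message for each decode would be the obvious thing to try.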
We also added some profiling code, which fires 20s into the above program, and got the following from pprof's top100:
(pprof) top100
Total: 56.0 MB
55.0 98.2% 98.2% 55.0 98.2% encoding/xml.copyValue
1.0 1.8% 100.0% 1.0 1.8% cnew
0.0 0.0% 100.0% 0.5 0.9% bytes.(*Buffer).WriteByte
0.0 0.0% 100.0% 0.5 0.9% bytes.(*Buffer).grow
0.0 0.0% 100.0% 0.5 0.9% bytes.makeSlice
0.0 0.0% 100.0% 55.5 99.1% encoding/xml.(*Decoder).Decode
...
Running this later on, before the box runs out of memory, yields a higher total but pretty much the same percentages. Can anyone help us out? What are we missing?
Thanks in advance!