We have transaction log files in which each transaction is a single line in JSON format. We often need to take selected parts of the data, perform a single time conversion, and feed results into another system in a specific format. I wrote a Python script that does this as we need, but I hoped that Go would be faster, and would give me a chance to start learning Go. So, I wrote the following:
package main
import "encoding/json"
import "fmt"
import "time"
import "bufio"
import "os"
func main() {
sep := ","
reader := bufio.NewReader(os.Stdin)
for {
data, _ := reader.ReadString('
')
byt := []byte(data)
var dat map[string]interface{}
if err := json.Unmarshal(byt, &dat); err != nil {
break
}
status := dat["status"].(string)
a_status := dat["a_status"].(string)
method := dat["method"].(string)
path := dat["path"].(string)
element_uid := dat["element_uid"].(string)
time_local := dat["time_local"].(string)
etime, _ := time.Parse("[02/Jan/2006:15:04:05 -0700]", time_local)
fmt.Print(status, sep, a_status, sep, method, sep, path, sep, element_uid, sep, etime.Unix(), "
")
}
}
That compiles without complaint, but I'm surprised at the lack of performance improvement. To test, I placed 2,000,000 lines of logs into a tmpfs (to ensure that disk I/O would not be a limitation) and compared the two versions of the script. My results:
$ time cat /mnt/ramdisk/logfile | ./stdin_conv > /dev/null
real 0m51.995s
$ time cat /mnt/ramdisk/logfile | ./stdin_conv.py > /dev/null
real 0m52.471s
$ time cat /mnt/ramdisk/logfile > /dev/null
real 0m0.149s
How can this be made faster? I have made some rudimentary efforts. The ffjson project, for example, proposes to create static functions that make reflection unnecessary; however, I have failed so far to get it to work, getting the error:
Error: Go Run Failed for: /tmp/ffjson-inception810284909.go
STDOUT:
STDERR:
/tmp/ffjson-inception810284909.go:9:2: import "json_parse" is a program, not an importable package
:
Besides, wouldn't what I have above be considered statically typed? Possibly not-- I am positively dripping behind the ears where Go is concerned. I have tried selectively disabling different attributes in the Go code to see if one is especially problematic. None have had an appreciable effect on performance. Any suggestions on improving performance, or is this simply a case where compiled languages have no substantial benefit over others?