While rewriting a simple program from C# to Go, I found that the resulting executable is 3 to 4 times slower. In particular, the Go version uses 3 to 4 times more CPU. This is surprising because the code does a lot of I/O and is not supposed to consume a significant amount of CPU.
I made a very simple version that only does sequential writes, and benchmarked it. I ran the same benchmarks on Windows 10 and Linux (Debian Jessie). The times can't be compared (not the same systems, disks, ...), but the result is interesting.
I'm using the same Go version on both platforms: 1.6.
On Windows, os.File.Write goes through cgo (see runtime.cgocall in the profile below); on Linux it does not. Why?
Here is the disk.go program:
package main

import (
	"crypto/rand"
	"fmt"
	"os"
	"time"
)

const (
	// size of the test file, in bytes
	fullSize = 268435456
	// size of read/write per call, in bytes
	partSize = 128
	// path of temporary test file
	filePath = "./bigfile.tmp"
)

func main() {
	buffer := make([]byte, partSize)
	seqWrite := func() error {
		return sequentialWrite(filePath, fullSize, buffer)
	}
	err := fillBuffer(buffer)
	panicIfError(err)
	duration, err := durationOf(seqWrite)
	panicIfError(err)
	fmt.Printf("Duration : %v\n", duration)
}

// It's just a test ;)
func panicIfError(err error) {
	if err != nil {
		panic(err)
	}
}

// durationOf runs f and returns how long it took.
func durationOf(f func() error) (time.Duration, error) {
	startTime := time.Now()
	err := f()
	return time.Since(startTime), err
}

// fillBuffer fills buffer with random bytes.
func fillBuffer(buffer []byte) error {
	_, err := rand.Read(buffer)
	return err
}

func sequentialWrite(filePath string, fullSize int, buffer []byte) error {
	desc, err := os.OpenFile(filePath, os.O_WRONLY|os.O_CREATE, 0666)
	if err != nil {
		return err
	}
	// Close and delete the temporary file when done.
	defer func() {
		desc.Close()
		err := os.Remove(filePath)
		panicIfError(err)
	}()
	// Write the buffer repeatedly until fullSize bytes have been written.
	var totalWrote int
	for totalWrote < fullSize {
		wrote, err := desc.Write(buffer)
		totalWrote += wrote
		if err != nil {
			return err
		}
	}
	return nil
}
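For scale, the constants above mean that a single run issues about two million Write calls, so any fixed per-call overhead is multiplied accordingly. A minimal sketch of that arithmetic, assuming every call writes the full buffer (the helper name `writeCalls` is hypothetical and not part of disk.go):

```go
package main

import "fmt"

// writeCalls returns how many Write calls the loop in sequentialWrite
// needs to reach fullSize when each call writes partSize bytes.
// Hypothetical helper for illustration; it assumes no short writes.
func writeCalls(fullSize, partSize int) int {
	return (fullSize + partSize - 1) / partSize // round up
}

func main() {
	const (
		fullSize = 268435456 // same value as in disk.go (256 MiB)
		partSize = 128       // same value as in disk.go
	)
	fmt.Println(writeCalls(fullSize, partSize)) // prints 2097152
}
```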
The benchmark test (disk_test.go):
package main

import (
	"testing"
)

// go test -bench SequentialWrite -cpuprofile=cpu.out
// Windows : go tool pprof -text -nodecount=10 ./disk.test.exe cpu.out
// Linux : go tool pprof -text -nodecount=10 ./disk.test cpu.out
func BenchmarkSequentialWrite(b *testing.B) {
	buffer := make([]byte, partSize)
	// Repeat the measured work b.N times, as the testing package expects.
	for i := 0; i < b.N; i++ {
		err := sequentialWrite(filePath, fullSize, buffer)
		panicIfError(err)
	}
}
The Windows result (with cgo):
11.68s of 11.95s total (97.74%)
Dropped 18 nodes (cum <= 0.06s)
Showing top 10 nodes out of 26 (cum >= 0.09s)
      flat  flat%   sum%        cum   cum%
    11.08s 92.72% 92.72%     11.20s 93.72%  runtime.cgocall
     0.11s  0.92% 93.64%      0.11s  0.92%  runtime.deferreturn
     0.09s  0.75% 94.39%     11.45s 95.82%  os.(*File).write
     0.08s  0.67% 95.06%      0.16s  1.34%  runtime.deferproc.func1
     0.07s  0.59% 95.65%      0.07s  0.59%  runtime.newdefer
     0.06s   0.5% 96.15%      0.28s  2.34%  runtime.systemstack
     0.06s   0.5% 96.65%     11.25s 94.14%  syscall.Write
     0.05s  0.42% 97.07%      0.07s  0.59%  runtime.deferproc
     0.04s  0.33% 97.41%     11.49s 96.15%  os.(*File).Write
     0.04s  0.33% 97.74%      0.09s  0.75%  syscall.(*LazyProc).Find
The Linux result (without cgo):
5.04s of 5.10s total (98.82%)
Dropped 5 nodes (cum <= 0.03s)
Showing top 10 nodes out of 19 (cum >= 0.06s)
      flat  flat%   sum%        cum   cum%
     4.62s 90.59% 90.59%      4.87s 95.49%  syscall.Syscall
     0.09s  1.76% 92.35%      0.09s  1.76%  runtime/internal/atomic.Cas
     0.08s  1.57% 93.92%      0.19s  3.73%  runtime.exitsyscall
     0.06s  1.18% 95.10%      4.98s 97.65%  os.(*File).write
     0.04s  0.78% 95.88%      5.10s   100%  _/home/sam/Provisoire/go-disk.sequentialWrite
     0.04s  0.78% 96.67%      5.05s 99.02%  os.(*File).Write
     0.04s  0.78% 97.45%      0.04s  0.78%  runtime.memclr
     0.03s  0.59% 98.04%      0.08s  1.57%  runtime.exitsyscallfast
     0.02s  0.39% 98.43%      0.03s  0.59%  os.epipecheck
     0.02s  0.39% 98.82%      0.06s  1.18%  runtime.casgstatus
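To put the two profiles in per-call terms, the flat time of the top row can be divided by the number of Write calls. This is only a rough sketch: it assumes each profile covers exactly one pass of sequentialWrite (2097152 calls of 128 bytes), and the helper name `perCallNanos` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// perCallNanos converts a flat profile time given in milliseconds into
// an average cost per call in nanoseconds (integer division).
// Hypothetical helper for illustration only.
func perCallNanos(flatMillis, calls int64) int64 {
	return flatMillis * 1000000 / calls
}

func main() {
	// Assumption: one pass of sequentialWrite per profile,
	// i.e. fullSize / partSize Write calls.
	const calls = 268435456 / 128

	windows := perCallNanos(11080, calls) // flat time of runtime.cgocall: 11.08s
	linux := perCallNanos(4620, calls)    // flat time of syscall.Syscall: 4.62s

	fmt.Println("Windows:", time.Duration(windows))
	fmt.Println("Linux:  ", time.Duration(linux))
}
```

Under these assumptions the average comes out at roughly 5 µs per Write on Windows and roughly 2 µs on Linux. Since the two machines differ, these figures only illustrate that the time is spent in per-call overhead; they are not a like-for-like comparison.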