I've run into some mysterious behavior with the Go scheduler, and I'm very curious about what's going on. The gist is that runtime.Gosched()
doesn't work as expected in Linux unless it is preceded by a log.Printf()
call, but it works as expected in both cases on OS X. Here's a minimal setup that reproduces the behavior:
The main goroutine sleeps for 1000 periods of 1ms, and after each sleep pushes a dummy message onto another goroutine via a channel. The second goroutine listens for new messages, and every time it gets one it does 10ms of work. So without any runtime.Gosched()
calls, the program will take 10 seconds to run.
When I add periodic runtime.Gosched()
calls in the second goroutine, as expected the program runtime shrinks down to 1 second on my Mac. However, when I try running the same program on Ubuntu, it still takes 10 seconds. I made sure to set runtime.GOMAXPROCS(1)
in both cases.
Here's where it gets really strange: if I just add a logging statement before the the runtime.Gosched()
calls, then suddenly the program runs in the expected 1 second on Ubuntu as well.
package main
import (
"time"
"log"
"runtime"
)
func doWork(c chan int) {
for {
<-c
// This outer loop will take ~10ms.
for j := 0; j < 100 ; j++ {
// The following block of CPU work takes ~100 microseconds
for i := 0; i < 300000; i++ {
_ = i * 17
}
// Somehow this print statement saves the day in Ubuntu
log.Printf("donkey")
runtime.Gosched()
}
}
}
func main() {
runtime.GOMAXPROCS(1)
c := make(chan int, 1000)
go doWork(c)
start := time.Now().UnixNano()
for i := 0; i < 1000; i++ {
time.Sleep(1 * time.Millisecond)
// Queue up 10ms of work in the other goroutine, which will backlog
// this goroutine without runtime.Gosched() calls.
c <- 0
}
// Whole program should take about 1 second to run if the Gosched() calls
// work, otherwise 10 seconds.
log.Printf("Finished in %f seconds.", float64(time.Now().UnixNano() - start) / 1e9)
}
Additional details: I'm running go1.10 darwin/amd64, and compiling the linux binary with
env GOOS=linux GOARCH=amd64 go build ...
I've tried a few simple variants:
- Just making a log.Printf() call, without the Gosched()
- Making two calls to Gosched()
- Keeping the Gosched() call but replacing the log.Printf() call to a dummy function call
All of these are ~10x slower than calling log.Printf() and then Gosched().
Any insights would be appreciated! This example is of course very artificial, but the issue came up while writing a websocket broadcast server which led to significantly degraded performance.
EDIT: I got rid of the extraneous bits in my example to make things more transparent. I've discovered that without the print statement, the runtime.Gosched()
calls are still getting run, it's just that they seem to be delayed by a fixed 5ms, leading to a total runtime of almost exactly 5seconds in the example below, when the program should finish instantaneously (and does on my Mac, or on Ubuntu with the print statement).
package main
import (
"log"
"runtime"
"time"
)
func doWork() {
for {
// This print call makes the code run 20x faster
log.Printf("donkey")
// Without this line, the program never terminates (as expected). With this line
// and the print call above it, the program takes <300ms as expected, dominated by
// the sleep calls in the main goroutine. But without the print statement, it
// takes almost exactly 5 seconds.
runtime.Gosched()
}
}
func main() {
runtime.GOMAXPROCS(1)
go doWork()
start := time.Now().UnixNano()
for i := 0; i < 1000; i++ {
time.Sleep(10 * time.Microsecond)
runtime.Gosched()
}
log.Printf("Finished in %f seconds.", float64(time.Now().UnixNano() - start) / 1e9)
}