I've tried to parallelize some calculations for range of uint32 using all CPU cores on my Raspberry PI 2:
...
numcpu := runtime.NumCPU()
runtime.GOMAXPROCS(numcpu) //4 cores
c := make(chan int, numcpu) //buffered channel
results := make([]int, numcpu)
part := math.MaxUint32/numcpu
for i:=0; i<numcpu; i++ {
go calculate(c,i*part,(i+1)*part,i)
}
for i:=0; i<numcpu; i++ {
results = append(results,<-c)
}
...
func calculate(c chan int, from, to, i int) {
... //calculating result
c<-result
println("Goroutine #", i, " stopped") //just for debug
}
But during execution htop shows me that program has 4 subthreads. One of them has no processing time and others are suspended. Could it be that all goroutines migrated to main thread? If it is, how can I prevent it?
Update:
When I cross-compiled binary by the Windows using gox, the problem was solved, despite the fact that performance has rapidly decreased. But why when I using binary built by go build on my RaspPI these subthreads are suspending?
Update:
After updating from standard apt-get
ed golang v1.03 for Debian Wheezy to golang v1.5 the problems seems to be solved (despite the fact that all.bash cgo-tests was failed). Thank you all for optimization tips. I'm beginner in golang and don't know much about version-specific optimization.
But if you still want to inspect code: pastebin