I'm working on picking up a few of concurrency patterns of Go. I looked at implementing background workers using goroutines and input/output channels, and noticed that when I sending new jobs to the receiving channel (essentially enqueuing new jobs) I have to do it in a goroutine or the scheduling gets messed up. Meaning:
This crashes:
for _, jobData := range(dataSet) {
input <- jobData
}
This works:
go func() {
for _, jobData := range(dataSet) {
input <- jobData
}
}()
For something more concrete, I played with some nonsense code (here it is in go playground):
package main
import (
"log"
"runtime"
)
func doWork(data int) (result int) {
// ... some 'heavy' computation
result = data * data
return
}
// do the processing of the input and return
// results on the output channel
func Worker(input, output chan int) {
for data := range input {
output <- doWork(data)
}
}
func ScheduleWorkers() {
input, output := make(chan int), make(chan int)
for i := 0 ; i < runtime.NumCPU() ; i++ {
go Worker(input, output)
}
numJobs := 20
// THIS DOESN'T WORK
// and crashes the program
/*
for i := 0 ; i < numJobs ; i++ {
input <- i
}
*/
// THIS DOES
go func() {
for i := 0 ; i < numJobs ; i++ {
input <- i
}
}()
results := []int{}
for i := 0 ; i < numJobs ; i++ {
// read off results
result := <-output
results = append(results, result)
// do stuff...
}
log.Printf("Result: %#v
", results)
}
func main() {
ScheduleWorkers()
}
I'm trying to wrap my head around this subtle difference - help is appreciated. Thanks.