I am implementing a worker pool that takes jobs from a channel. After it kept timing out, I realised that when a panic occurs inside a worker function, the worker does not return to the pool, even though I have implemented a recovery mechanism.
I was able to reproduce the issue in the Go Playground.
Modified code for the playground:
package main

import (
	"fmt"
	"log"
	"time"
)

func recovery(id int, results chan<- int) {
	if r := recover(); r != nil {
		log.Print("IN RECOVERY FUNC - Failed worker: ", id)
		results <- 0
	}
}

func worker(id int, jobs <-chan int, results chan<- int) {
	for j := range jobs {
		defer recovery(id, results)
		if id == 1 {
			panic("TEST")
		}
		fmt.Println("worker", id, "started job", j)
		time.Sleep(time.Second)
		fmt.Println("worker", id, "finished job", j)
		results <- j * 2
	}
}

func main() {
	jobs := make(chan int, 100)
	results := make(chan int, 100)
	for w := 1; w <= 3; w++ {
		go worker(w, jobs, results)
	}
	for j := 1; j <= 10; j++ {
		jobs <- j
	}
	close(jobs)
	for a := 1; a <= 10; a++ {
		<-results
	}
}
For testing, I trigger a panic whenever worker 1 picks up a job. When run, the worker panics as expected and enters the recovery function as expected (the job's computed value is never pushed to the results channel either). However, worker 1 never seems to come back.
Output without panic:
worker 3 started job 1
worker 1 started job 2
worker 2 started job 3
worker 1 finished job 2
worker 1 started job 4
worker 3 finished job 1
worker 3 started job 5
worker 2 finished job 3
worker 2 started job 6
worker 3 finished job 5
worker 3 started job 7
worker 1 finished job 4
worker 1 started job 8
worker 2 finished job 6
worker 2 started job 9
worker 1 finished job 8
worker 1 started job 10
worker 3 finished job 7
worker 2 finished job 9
worker 1 finished job 10
Output with panic:
worker 3 started job 1
2009/11/10 23:00:00 RECOVERY Failed worker: 1
worker 2 started job 3
worker 2 finished job 3
worker 2 started job 4
worker 3 finished job 1
worker 3 started job 5
worker 3 finished job 5
worker 3 started job 6
worker 2 finished job 4
worker 2 started job 7
worker 2 finished job 7
worker 2 started job 8
worker 3 finished job 6
worker 3 started job 9
worker 3 finished job 9
worker 3 started job 10
worker 2 finished job 8
worker 3 finished job 10
How do I return worker 1 to the pool after recovery (or as part of the recovery process)?
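One possible direction, sketched under assumptions rather than offered as a definitive fix: a deferred call only runs when the function that deferred it returns, and after a recovered panic that function returns to its caller instead of resuming. In the code above that function is `worker`, so the goroutine leaves its `for` loop for good. Moving the per-job work into its own function scopes the defer/recover to a single job, so the loop carries on with the next one. The names `processJob` and `runPool`, the `sync.WaitGroup`, and the `j%4 == 0` failure condition are hypothetical and exist only to make the example deterministic.

```go
package main

import (
	"fmt"
	"log"
	"sort"
	"sync"
)

// processJob (hypothetical helper) runs a single job. The deferred recover is
// scoped to this call, so a panic aborts only this job, not the worker's loop.
func processJob(id, j int) (result int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("worker %d, job %d: %v", id, j, r)
		}
	}()
	if j%4 == 0 { // assumed failure condition, for illustration only
		panic("TEST")
	}
	return j * 2, nil
}

// worker keeps ranging over jobs even when an individual job panics.
func worker(id int, jobs <-chan int, results chan<- int, wg *sync.WaitGroup) {
	defer wg.Done()
	for j := range jobs {
		res, err := processJob(id, j)
		if err != nil {
			log.Print("recovered: ", err)
			results <- 0 // placeholder result, as in the original recovery func
			continue
		}
		results <- res
	}
}

// runPool (hypothetical helper) feeds jobs 1..numJobs to numWorkers workers
// and returns all results in sorted order.
func runPool(numWorkers, numJobs int) []int {
	jobs := make(chan int, numJobs)
	results := make(chan int, numJobs) // buffered so workers never block on send
	var wg sync.WaitGroup
	for w := 1; w <= numWorkers; w++ {
		wg.Add(1)
		go worker(w, jobs, results, &wg)
	}
	for j := 1; j <= numJobs; j++ {
		jobs <- j
	}
	close(jobs)
	wg.Wait()
	close(results)
	var out []int
	for r := range results {
		out = append(out, r)
	}
	sort.Ints(out)
	return out
}

func main() {
	fmt.Println(runPool(3, 10))
}
```

With this structure every worker stays in the pool: jobs 4 and 8 panic and yield the placeholder 0, while the remaining eight jobs produce their doubled values regardless of which worker handled them.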