douqian1296 2016-08-25 01:26

Ensuring goroutine cleanup: best practices

I have a fundamental problem understanding how to make sure that spawned goroutines are "closed" properly in long-running processes. I have watched talks on the topic and read about best practices. To understand my question, please refer to the video "Advanced Go Concurrency Patterns".

For the following, if you run the code on your machine, please export the environment variable GOTRACEBACK=all so that you can see the goroutine states after the panic.

I put the code for the original example here: naive (it does not execute on the Go playground, I guess because a time statement is used. Please copy the code and execute it locally.)

The result of the panic of the naive implementation after execution is

    panic: show me the stacks

    goroutine 1 [running]:
    panic(0x48a680, 0xc4201d8480)
        /usr/lib/go/src/runtime/panic.go:500 +0x1a1
    main.main()
        /home/flx/workspace/go/go-rps/playground/ball-naive.go:18 +0x16b

    goroutine 5 [chan receive]:
    main.player(0x4a4ec4, 0x2, 0xc42006a060)
        /home/flx/workspace/go/go-rps/playground/ball-naive.go:23 +0x61
    created by main.main
        /home/flx/workspace/go/go-rps/playground/ball-naive.go:13 +0x76

    goroutine 6 [chan receive]:
    main.player(0x4a4ec6, 0x2, 0xc42006a060)
        /home/flx/workspace/go/go-rps/playground/ball-naive.go:23 +0x61
    created by main.main
        /home/flx/workspace/go/go-rps/playground/ball-naive.go:14 +0xad
    exit status 2

That demonstrates the underlying problem of leaving dangling goroutines on the system, which is especially bad for long running processes.

So for my personal understanding I tried two slightly more sophisticated variants to be found here:

for-select with default

generator pattern with quit channel

(again, not executable on the playground, because "process takes too long")

The first solution is not fitting for various reasons, even leading to non-determinism in executed steps, depending on goroutine execution speed.

Now I thought (and here, finally, comes the question) that the second solution with the quit channel would be appropriate to eliminate all traces of execution from the system before exiting. However, "sometimes" the program exits too fast and the panic reports an additional goroutine in the runnable state still residing on the system. The panic output:

    panic: show me the stacks

    goroutine 1 [running]:
    panic(0x48d8e0, 0xc4201e27c0)
        /usr/lib/go/src/runtime/panic.go:500 +0x1a1
    main.main()
        /home/flx/workspace/go/go-rps/playground/ball-perfect.go:20 +0x1a9

    goroutine 20 [runnable]:
    main.player.func1(0xc420070060, 0x4a8986, 0x2, 0xc420070120)
        /home/flx/workspace/go/go-rps/playground/ball-perfect.go:27 +0x211
    created by main.player
        /home/flx/workspace/go/go-rps/playground/ball-perfect.go:36 +0x7f
    exit status 2

My question is: that should not happen, right? I do use a quit channel to clean up state before moving on to the panic.

I did a final try of implementing safe cleanup behavior here: artificial wait time for runnables to close

However, that solution does not feel right, and it may not be applicable to large numbers of goroutines.

What would be the recommended and most idiomatic pattern to ensure correct cleanup?

Thanks for your time


1 answer

  • dongshi2836 2016-08-25 07:17

    You are being fooled by the output: your "generator pattern with quit channel" works perfectly fine, and the two goroutines actually are terminated properly.

    You see them in the trace because you panic too early. Remember: you have two goroutines running concurrently with main. main "stops" these goroutines by signaling on the quit channel. After the two sends on lines 18 and 19, the two receives on line 32 have happened. And nothing more! You still have three goroutines running: main is between lines 19 and 20, and the player goroutines are between lines 32 and 33. If the panic in main now happens before the return in player, the player goroutines are still there and are shown in the panic stack trace. These goroutines would have ended a few milliseconds later if only the scheduler had had time to execute the return on line 33 (which it did not, because you killed the process by panicking).
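The point above can be sketched in isolation. This is a minimal illustration, not the asker's code: worker, shutdown and the done channel are hypothetical names introduced here. A send on an unbuffered quit channel only guarantees that the matching receive has happened; if main needs to know that the goroutine got past its cleanup, the goroutine has to say so explicitly, for example over a second channel.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for a player goroutine.
func worker(quit <-chan bool, done chan<- string) {
	<-quit // pairs with main's send; main may continue right after this
	// ... any cleanup would run here, concurrently with main ...
	done <- "worker finished cleanup" // explicit acknowledgement (an assumption, not in the original code)
}

// shutdown signals the worker and waits for its acknowledgement.
func shutdown() string {
	quit := make(chan bool)
	done := make(chan string)
	go worker(quit, done)

	quit <- true  // returns once the worker has received, and nothing more
	return <-done // blocks until the worker reports that its cleanup is over
}

func main() {
	fmt.Println(shutdown())
}
```

Without the receive from done, main would be free to panic while the worker is still between its receive and its return, which is the situation described above.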

    This is an instance of the "main ends too early to see concurrent goroutines do work" problem that is asked about once a month here. You do see the concurrent goroutines doing work, just not all of their work. You might try sleeping 2 milliseconds before the panic; your player goroutines will then have time to execute the return and everything is fine.
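As an alternative to sleeping for a fixed time, a common approach is to wait with sync.WaitGroup. The sketch below is not the asker's code and swaps the per-goroutine quit sends for a single close(quit); player and stopPlayers are hypothetical stand-ins, and the exited counter exists only so the result can be checked.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// player bounces the ball over table until quit is closed.
// Hypothetical stand-in for the question's player function.
func player(table chan int, quit <-chan struct{}, wg *sync.WaitGroup, exited *int32) {
	defer wg.Done()                  // deferred first, so it runs last
	defer atomic.AddInt32(exited, 1) // runs before Done: count the clean exit
	for {
		select {
		case <-quit:
			return
		case ball := <-table:
			select {
			case table <- ball + 1: // hit the ball back
			case <-quit: // nobody has to take the ball once we are quitting
				return
			}
		}
	}
}

// stopPlayers starts two players, serves one ball, shuts them down and
// returns how many players had returned by the time Wait unblocked.
func stopPlayers() int32 {
	table := make(chan int)
	quit := make(chan struct{})
	var wg sync.WaitGroup
	var exited int32

	wg.Add(2)
	go player(table, quit, &wg, &exited)
	go player(table, quit, &wg, &exited)

	table <- 0  // serve the ball
	close(quit) // broadcast: every receive on quit now succeeds
	wg.Wait()   // block until both players have called Done
	return atomic.LoadInt32(&exited)
}

func main() {
	fmt.Println("players stopped:", stopPlayers()) // prints "players stopped: 2"
}
```

Wait returns only after both players have run their deferred Done, so no timing guess is needed and the approach scales to many goroutines. Note that Done runs just before the goroutine actually exits, so a stack dump taken immediately after Wait may, in principle, still list a goroutine that is a few instructions away from being gone; what is guaranteed is that its work is finished.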
