douzhan8303 2013-07-25 20:45

How can my Go program keep all the CPU cores busy?

Goroutines are light-weight processes that are automatically time-sliced onto one or more operating system threads by the Go runtime. (This is a really cool feature of Go!)

Suppose I have a concurrent application like a webserver. There is plenty of stuff happening concurrently in my hypothetical program, with only a small non-concurrent (Amdahl's Law) fraction.

It seems that the default number of operating system threads in use is currently 1. Does this mean that only one CPU core gets used?

If I start my program with

runtime.GOMAXPROCS(runtime.NumCPU())

will that give reasonably efficient use of all the cores on my PC?

Is there any "parallel slackness" benefit from having even more OS threads in use, e.g. via some heuristic

runtime.GOMAXPROCS(runtime.NumCPU() * 2)

?

3 answers

  • doujia8801 2013-07-25 21:04

    From the Go FAQ:

    Why doesn't my multi-goroutine program use multiple CPUs?

    You must set the GOMAXPROCS shell environment variable or use the similarly-named function of the runtime package to allow the run-time support to utilize more than one OS thread.

    Programs that perform parallel computation should benefit from an increase in GOMAXPROCS. However, be aware that concurrency is not parallelism.

    (UPDATE 8/28/2015: Go 1.5 is set to make the default value of GOMAXPROCS the same as the number of CPUs on your machine, so this shouldn't be a problem anymore)

    And

    Why does using GOMAXPROCS > 1 sometimes make my program slower?

    It depends on the nature of your program. Problems that are intrinsically sequential cannot be sped up by adding more goroutines. Concurrency only becomes parallelism when the problem is intrinsically parallel.

    In practical terms, programs that spend more time communicating on channels than doing computation will experience performance degradation when using multiple OS threads. This is because sending data between threads involves switching contexts, which has significant cost. For instance, the prime sieve example from the Go specification has no significant parallelism although it launches many goroutines; increasing GOMAXPROCS is more likely to slow it down than to speed it up.

    Go's goroutine scheduler is not as good as it needs to be. In future, it should recognize such cases and optimize its use of OS threads. For now, GOMAXPROCS should be set on a per-application basis.

    In short: it is very difficult to get Go to make "efficient use of all your cores". Simply spawning a billion goroutines and increasing GOMAXPROCS is just as likely to degrade your performance as to speed it up, because the runtime will be switching thread contexts all the time. If you have a large program that is parallelizable, then increasing GOMAXPROCS to the number of parallel components works fine. If you have a parallel problem embedded in a largely non-parallel program, it may speed up, or you may have to make creative use of functions like runtime.LockOSThread() to ensure the runtime distributes everything correctly (generally speaking, Go just dumbly spreads currently non-blocking goroutines haphazardly and evenly among all active threads).
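To make the "large program that is parallelizable" case concrete, here is a minimal sketch, assuming a CPU-bound workload that splits into independent chunks. The helper name parallelSum and the chunking scheme are illustrative and do not come from the FAQ. Each goroutine works on its own chunk and reports a single result, so almost no time is spent communicating between threads.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum is a hypothetical helper: it sums the squares of nums,
// splitting the slice into at most `workers` chunks, one goroutine each.
func parallelSum(nums []int, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers) // one result slot per worker
	chunk := (len(nums) + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := w * chunk
		if lo >= len(nums) {
			break // fewer items than workers: nothing left to hand out
		}
		hi := lo + chunk
		if hi > len(nums) {
			hi = len(nums)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			sum := 0
			for _, n := range nums[lo:hi] {
				sum += n * n
			}
			partial[w] = sum // each goroutine writes only its own slot
		}(w, lo, hi)
	}
	wg.Wait()

	total := 0
	for _, s := range partial {
		total += s
	}
	return total
}

func main() {
	nums := make([]int, 1000)
	for i := range nums {
		nums[i] = i + 1
	}
	// Use as many workers as the runtime may run in parallel.
	fmt.Println(parallelSum(nums, runtime.GOMAXPROCS(0)))
}
```

Sizing the number of workers from runtime.GOMAXPROCS(0) is one reasonable choice for CPU-bound work; whether it actually runs faster than a sequential loop depends on how much work each chunk contains.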

    Also, GOMAXPROCS limits how many OS threads can execute Go code at the same time, so it effectively sets the number of CPU cores in use. A value greater than NumCPU is accepted rather than clamped to NumCPU, but it cannot buy more parallelism than there are cores. GOMAXPROCS isn't strictly equal to the number of threads. I'm not 100% sure of exactly when the runtime decides to spawn new threads, but one instance is when the number of blocking goroutines using runtime.LockOSThread() is greater than or equal to GOMAXPROCS: it will spawn more threads than cores so it can keep the rest of the program running sanely.

    Basically, it's quite simple to increase GOMAXPROCS and make Go use all cores of your CPU. It's quite another thing at this point in Go's development to actually get it to smartly and efficiently use all cores of your CPU; that requires a lot of program design and finagling to get right.

    This answer was selected by the asker as the best answer.