Is this an idiomatic worker thread pool in Go?

I'm attempting to write a simple worker pool with goroutines.

Is the code I wrote idiomatic? If not, then what should change?
I want to be able to set the maximum number of worker threads to 5 and block until a worker becomes available if all 5 are busy. How would I extend this to only have a pool of 5 workers max? Do I spawn the static 5 goroutines, and give each the work_channel?

code:

package main

import (
    "fmt"
    "math/rand"
    "sync"
    "time"
)

func worker(id string, work string, o chan string, wg *sync.WaitGroup) {
    defer wg.Done()
    sleepMs := rand.Intn(1000)
    fmt.Printf("worker '%s' received: '%s', sleep %dms\n", id, work, sleepMs)
    time.Sleep(time.Duration(sleepMs) * time.Millisecond)
    o <- work + fmt.Sprintf("-%dms", sleepMs)
}

func main() {
    var work_channel = make(chan string)
    var results_channel = make(chan string)

    // create goroutine per item in work_channel
    go func() {
        var c = 0
        var wg sync.WaitGroup
        for work := range work_channel {
            wg.Add(1)
            go worker(fmt.Sprintf("%d", c), work, results_channel, &wg)
            c++
        }
        wg.Wait()
        fmt.Println("closing results channel")
        close(results_channel)
    }()

    // add work to the work_channel
    go func() {
        for c := 'a'; c < 'z'; c++ {
            work_channel <- fmt.Sprintf("%c", c)
        }
        close(work_channel)
        fmt.Println("sent work to work_channel")
    }()

    for x := range results_channel {
        fmt.Printf("result: %s\n", x)
    }
}

dongwusang0314 2016-07-03 16:54
Your solution is not a worker goroutine pool in any sense: your code does not limit concurrent goroutines, and it does not "reuse" goroutines (it always starts a new one when a new job is received).

Producer-consumer pattern

As posted at Bruteforce MD5 Password cracker, you can make use of the producer-consumer pattern. You could have a designated producer goroutine that would generate the jobs (things to do / calculate), and send them on a jobs channel. You could have a fixed pool of consumer goroutines (e.g. 5 of them) which would loop over the channel on which jobs are delivered, and each would execute / complete the received jobs.

The producer goroutine could simply close the jobs channel when all jobs were generated and sent, properly signalling consumers that no more jobs will be coming. The for ... range construct on a channel handles the "close" event and terminates properly. Note that all jobs sent before closing the channel will still be delivered.
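To illustrate the last point, here is a minimal sketch (the helper name drain is made up for this example): the channel is closed before anything has been received, yet for ... range still delivers every queued job and only then terminates.

```go
package main

import "fmt"

// drain (hypothetical helper) receives from jobs until the channel is
// closed and empty, returning everything it received, in order.
func drain(jobs <-chan string) []string {
	var got []string
	for j := range jobs { // ends once jobs is closed AND fully drained
		got = append(got, j)
	}
	return got
}

func main() {
	jobs := make(chan string, 3) // buffered, so the sends below do not block
	jobs <- "a"
	jobs <- "b"
	jobs <- "c"
	close(jobs) // closed before any value was received

	fmt.Println(drain(jobs)) // prints [a b c]
}
```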

This results in a clean design with a fixed (but arbitrary) number of goroutines, and for CPU-bound jobs it can keep every core busy as long as there are at least as many goroutines as CPU cores. It also has the advantage that it can be "throttled" by choosing the channel capacity (buffered channel) and the number of consumer goroutines appropriately.
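A rough sketch of the throttling effect, assuming an unbuffered jobs channel so that a send blocks until one of the workers is free (which is what the question asks for). The helper runPool and its peak counter are hypothetical and exist only to make the limit observable:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runPool (hypothetical helper) feeds n jobs to a fixed number of workers
// over an unbuffered channel and reports the highest number of jobs that
// were ever in progress at the same time.
func runPool(workers, n int) int32 {
	jobs := make(chan int) // unbuffered: a send blocks until a worker receives
	var active, peak int32
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				cur := atomic.AddInt32(&active, 1)
				// Record a new peak with a compare-and-swap loop.
				for {
					p := atomic.LoadInt32(&peak)
					if cur <= p || atomic.CompareAndSwapInt32(&peak, p, cur) {
						break
					}
				}
				time.Sleep(5 * time.Millisecond) // simulated work
				atomic.AddInt32(&active, -1)
			}
		}()
	}

	for i := 0; i < n; i++ {
		jobs <- i // blocks while all workers are busy
	}
	close(jobs)
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("peak concurrency:", runPool(5, 26))
}
```

The peak can never exceed the number of workers, because only that many goroutines ever receive from the channel. Giving the channel a capacity would let the producer run ahead by that many jobs without changing the concurrency limit.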

Note that having a single designated producer goroutine is not mandatory. You could have multiple goroutines producing jobs too, but then you must synchronize them so that the jobs channel is closed only when all producer goroutines are done producing jobs - otherwise an attempt to send another job on the already closed jobs channel results in a runtime panic. Producing jobs is usually cheap, and jobs can be produced at a much quicker rate than they can be executed, so producing them in 1 goroutine while many goroutines consume / execute them works well in practice.
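If you do need several producers, one possible way to synchronize the close is a separate sync.WaitGroup for the producers. This is only a sketch; the helper produceAll and the job naming are invented for the example:

```go
package main

import (
	"fmt"
	"sync"
)

// produceAll (hypothetical helper) starts one goroutine per producer, each
// sending perProducer jobs, and closes jobs only after all have finished.
func produceAll(jobs chan<- string, producers, perProducer int) {
	var pwg sync.WaitGroup
	for p := 0; p < producers; p++ {
		pwg.Add(1)
		go func(p int) {
			defer pwg.Done()
			for i := 0; i < perProducer; i++ {
				jobs <- fmt.Sprintf("p%d-job%d", p, i)
			}
		}(p)
	}
	// Close exactly once, after every producer is done sending.
	go func() {
		pwg.Wait()
		close(jobs)
	}()
}

func main() {
	jobs := make(chan string)
	produceAll(jobs, 3, 4)

	count := 0
	for range jobs {
		count++
	}
	fmt.Println("jobs received:", count) // prints jobs received: 12
}
```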

Handling results:

If jobs have results, you may choose to have a designated result channel on which results could be delivered ("sent back"), or you may choose to handle the results in the consumer when the job is completed / finished. This latter may even be implemented by having a "callback" function that handles the results. The important thing is whether results can be processed independently or they need to be merged (e.g. map-reduce framework) or aggregated.
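A small sketch of the callback variant, under the assumption that results need to be merged into one slice. The names consumeWith, process and collect are hypothetical. Since the callback is invoked from several consumer goroutines at once, it has to do its own locking:

```go
package main

import (
	"fmt"
	"sync"
)

// consumeWith (hypothetical) hands each finished job to the onResult
// callback instead of sending it on a results channel.
func consumeWith(jobs <-chan string, onResult func(string), wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range jobs {
		onResult(job + "-done")
	}
}

// process (hypothetical) runs work through `workers` consumers and merges
// the results via a callback guarded by a mutex.
func process(work []string, workers int) []string {
	jobs := make(chan string, len(work))
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		results []string
	)
	collect := func(r string) {
		mu.Lock() // called concurrently by all consumers
		defer mu.Unlock()
		results = append(results, r)
	}
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go consumeWith(jobs, collect, &wg)
	}
	for _, w := range work {
		jobs <- w
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	fmt.Println(len(process([]string{"a", "b", "c"}, 2)), "results collected")
}
```

If results can be handled independently (e.g. just printed), the mutex is unnecessary and the callback can be as simple as a call to fmt.Println.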

If you go with a results channel, you also need a goroutine that receives values from it, so that consumers do not get blocked (which would happen once the buffer of the results channel fills up).

With results channel

Instead of sending simple string values as jobs and results, I would create a wrapper type which can hold any additional info and so it is much more flexible:

type Job struct {
    Id     int
    Work   string
    Result string
}

Note that the Job struct also wraps the result, so when we send back the result, it also contains the original Job as the context - often very useful. Also note that it pays to send pointers (*Job) on the channels instead of Job values: there is no need to make "countless" copies of Jobs, and the size of the Job struct value becomes irrelevant.

Here is what this producer-consumer could look like:

I would use 2 sync.WaitGroup values; their roles are explained below:

var wg, wg2 sync.WaitGroup

The producer is responsible for generating the jobs to be executed:

func produce(jobs chan<- *Job) {
    // Generate jobs:
    id := 0
    for c := 'a'; c <= 'z'; c++ {
        id++
        jobs <- &Job{Id: id, Work: fmt.Sprintf("%c", c)}
    }
    close(jobs)
}

When done (no more jobs), the jobs channel is closed which signals consumers that no more jobs will arrive.

Note that produce() sees the jobs channel as send-only, because sending is all the producer needs to do with it (besides closing it, which is also permitted on a send-only channel). An accidental receive in the producer would be caught early, as a compile-time error.

The consumer's responsibility is to receive jobs as long as jobs can be received, and execute them:

func consume(id int, jobs <-chan *Job, results chan<- *Job) {
    defer wg.Done()
    for job := range jobs {
        sleepMs := rand.Intn(1000)
        fmt.Printf("worker #%d received: '%s', sleep %dms\n", id, job.Work, sleepMs)
        time.Sleep(time.Duration(sleepMs) * time.Millisecond)
        job.Result = job.Work + fmt.Sprintf("-%dms", sleepMs)
        results <- job
    }
}

Note that consume() sees the jobs channel as receive only; consumer only needs to receive from it. Similarly the results channel is send only for the consumer.

Also note that the results channel cannot be closed here: there are multiple consumer goroutines, and only the first attempt to close it would succeed; any further attempt would cause a runtime panic. The results channel can (and must) be closed after all consumer goroutines have ended, because only then can we be sure that no further values (results) will be sent on it.
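One common way to arrange this, shown here as a sketch rather than as the code used in this answer, is a small "closer" goroutine that waits for all consumers and then closes the results channel, so the caller can simply range over the results. The function squareAll and its squaring "work" are made up for the example:

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll (hypothetical) lets `workers` consumers square the inputs and
// send them on results; results is closed once, after all consumers returned.
func squareAll(inputs []int, workers int) int {
	jobs := make(chan int, len(inputs))
	results := make(chan int)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				results <- n * n // consumers only send, they never close
			}
		}()
	}
	for _, n := range inputs {
		jobs <- n
	}
	close(jobs)

	// Single closer: runs after every consumer has called wg.Done().
	go func() {
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results { // ends when the closer closes results
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}, 3)) // prints 30
}
```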

We have results which need to be analyzed:

func analyze(results <-chan *Job) {
    defer wg2.Done()
    for job := range results {
        fmt.Printf("result: %s\n", job.Result)
    }
}

As you can see, this also receives results as long as they may come (until results channel is closed). The results channel for the analyzer is receive only.

Please note the use of channel types: whenever it is sufficient, use only a unidirectional channel type to detect and prevent errors early, at compile time. Only use bidirectional channel type if you do need both directions.
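A tiny standalone sketch of the same idea (the functions send and sum are invented for illustration): a bidirectional channel converts implicitly to either directional type at the call site, and each function can then only use the channel in the permitted direction.

```go
package main

import "fmt"

// send can only send on (or close) ch; a receive such as `<-ch` in this
// function would be rejected by the compiler.
func send(ch chan<- int, n int) {
	for i := 1; i <= n; i++ {
		ch <- i
	}
	close(ch)
}

// sum can only receive from ch; `ch <- 1` or `close(ch)` in this function
// would be rejected by the compiler.
func sum(ch <-chan int) int {
	total := 0
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	ch := make(chan int) // bidirectional; converted implicitly at each call
	go send(ch, 10)
	fmt.Println(sum(ch)) // prints 55
}
```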

And this is how all these are glued together:

func main() {
    jobs := make(chan *Job, 100)    // Buffered channel
    results := make(chan *Job, 100) // Buffered channel

    // Start consumers:
    for i := 0; i < 5; i++ { // 5 consumers
        wg.Add(1)
        go consume(i, jobs, results)
    }
    // Start producing
    go produce(jobs)

    // Start analyzing:
    wg2.Add(1)
    go analyze(results)

    wg.Wait() // Wait all consumers to finish processing jobs

    // All jobs are processed, no more values will be sent on results:
    close(results)

    wg2.Wait() // Wait analyzer to analyze all results
}

Example output:

As you can see, results arrive and get analyzed before all the jobs have completed:

worker #4 received: 'e', sleep 81ms
worker #0 received: 'a', sleep 887ms
worker #1 received: 'b', sleep 847ms
worker #2 received: 'c', sleep 59ms
worker #3 received: 'd', sleep 81ms
worker #2 received: 'f', sleep 318ms
result: c-59ms
worker #4 received: 'g', sleep 425ms
result: e-81ms
worker #3 received: 'h', sleep 540ms
result: d-81ms
worker #2 received: 'i', sleep 456ms
result: f-318ms
worker #4 received: 'j', sleep 300ms
result: g-425ms
worker #3 received: 'k', sleep 694ms
result: h-540ms
worker #4 received: 'l', sleep 511ms
result: j-300ms
worker #2 received: 'm', sleep 162ms
result: i-456ms
worker #1 received: 'n', sleep 89ms
result: b-847ms
worker #0 received: 'o', sleep 728ms
result: a-887ms
worker #1 received: 'p', sleep 274ms
result: n-89ms
worker #2 received: 'q', sleep 211ms
result: m-162ms
worker #2 received: 'r', sleep 445ms
result: q-211ms
worker #1 received: 's', sleep 237ms
result: p-274ms
worker #3 received: 't', sleep 106ms
result: k-694ms
worker #4 received: 'u', sleep 495ms
result: l-511ms
worker #3 received: 'v', sleep 466ms
result: t-106ms
worker #1 received: 'w', sleep 528ms
result: s-237ms
worker #0 received: 'x', sleep 258ms
result: o-728ms
worker #2 received: 'y', sleep 47ms
result: r-445ms
worker #2 received: 'z', sleep 947ms
result: y-47ms
result: u-495ms
result: x-258ms
result: v-466ms
result: w-528ms
result: z-947ms

Try the complete application on the Go Playground.

Without a results channel

Code simplifies significantly if we don't use a results channel but the consumer goroutines handle the result right away (print it in our case). In this case we don't need 2 sync.WaitGroup values (the 2nd was only needed to wait for the analyzer to complete).

Without a results channel the complete solution is like this:

package main

import (
    "fmt"
    "math/rand"
    "sync"
    "time"
)

var wg sync.WaitGroup

type Job struct {
    Id   int
    Work string
}

func produce(jobs chan<- *Job) {
    // Generate jobs:
    id := 0
    for c := 'a'; c <= 'z'; c++ {
        id++
        jobs <- &Job{Id: id, Work: fmt.Sprintf("%c", c)}
    }
    close(jobs)
}

func consume(id int, jobs <-chan *Job) {
    defer wg.Done()
    for job := range jobs {
        sleepMs := rand.Intn(1000)
        fmt.Printf("worker #%d received: '%s', sleep %dms\n", id, job.Work, sleepMs)
        time.Sleep(time.Duration(sleepMs) * time.Millisecond)
        fmt.Printf("result: %s\n", job.Work+fmt.Sprintf("-%dms", sleepMs))
    }
}

func main() {
    jobs := make(chan *Job, 100) // Buffered channel

    // Start consumers:
    for i := 0; i < 5; i++ { // 5 consumers
        wg.Add(1)
        go consume(i, jobs)
    }
    // Start producing
    go produce(jobs)

    wg.Wait() // Wait all consumers to finish processing jobs
}

The output looks much like that of the version with a results channel (but of course the execution / completion order is random).

Try this variant on the Go Playground.