dongpao1905 2013-11-27 00:39

Go process hitting the maximum thread count?

I'm trying out Go for doing some filesystem use analysis and I went for making the code as fast as possible by spawning almost everything off as a goroutine and relying on the Go VM (and GOMAXPROCS) to manage it. I was watching this code run (pretty quickly) until it just stopped dead. I checked top and it listed my process as having 1500 threads.

I thought maybe I had hit some limit and the process was therefore deadlocked waiting on the OS. I checked my OS (FreeBSD) limits, and sure enough it was listed as 1500 threads max per process.

Surprised, I checked the Go docs, and they say GOMAXPROCS only limits the number of running threads; blocked threads don't count.

So my questions:

  • Is it fair to say I can't rely on the Go VM as a global pool to prevent hitting OS limits of these kinds?

  • Is there an idiomatic way to handle this (be nice, it's only my second day using Go)?

    • In particular, I haven't found a great way other than sync to close a channel when I'm done using it. Is there a better way?

    • I'd like to abstract away the boilerplate (parallel mapping with goroutines and closing the channel when done). Is there a type-safe way to do this without generics?

Here's my current code:

func AnalyzePaths(paths chan string) chan AnalyzedPath {
    analyzed := make(chan AnalyzedPath)
    go func() {
        group := sync.WaitGroup{}
        for path := range paths {
            group.Add(1)
            go func(path string) {
                defer group.Done()
                analyzed <- Analyze(path)
            }(path)
        }
        group.Wait()
        close(analyzed)
    }()
    return analyzed
}

func GetPaths(roots []string) chan string {
    globbed := make(chan string)
    go func() {
        group := sync.WaitGroup{}
        for _, root := range roots {
            group.Add(1)
            go func(root string) {
                defer group.Done()
                for _, path := range glob(root) {
                    globbed <- path
                }
            }(root)
        }
        group.Wait()
        close(globbed)
    }()
    return globbed
}

func main() {
    paths := GetPaths(patterns)
    for analyzed := range AnalyzePaths(paths) {
        fmt.Println(analyzed)
    }
}

1 answer

  • douxi8759 2013-11-27 02:47

    About two months ago (or more), the language developers talked about introducing control over the thread count (and some other limits), so we can expect to see it soon. A month or more ago I looked into this and found that on my Linux machine GOMAXPROCS never exceeds 256: if I set it to 300 or more, the result was always 256. But I also found that goroutines are not threads. Many goroutines can live in one thread.

    As for idiomatic synchronization, I don't think you need much of it. In my code I usually follow the idea that goroutines communicate only through channels, and that those channels are passed to the goroutines as parameters.

    func main() {
        ch1 := make(chan SomeType1)
        ch2 := make(chan SomeType2)
        go generator(ch1, ch2)
        go processor(ch1, ch2)
        // main blocks here until it has received both "finished" signals on ch2
        <-ch2
        <-ch2
        // we usually don't need the values of these signals,
        // so we discard them
    }

    func generator(ch1 chan SomeType1, ch2 chan SomeType2) {
        for YOUR_CONDITION {
            // generate something
            // ...
            // send it to the channel
            ch1 <- someValueOfType1
        }
        // tell the processor that no more values are coming
        ch1 <- magicStopValue
        ch2 <- weAreFinishedSignal1
    }

    func processor(ch1 chan SomeType1, ch2 chan SomeType2) {
        // read the first value from ch1
        value := <-ch1
        for value != magicStopValue {
            // do some processing
            // ...
            // get the next value from ch1 and repeat
            value = <-ch1
        }
        // signal that this goroutine has finished
        ch2 <- weAreFinishedSignal2
    }
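
The pattern above leaves the number of concurrent goroutines open. One common way to cap it, and with it the number of threads that can be blocked in system calls at once, is a fixed set of workers reading from a shared channel. The following is a sketch under assumptions, not the asker's code: analyze is a hypothetical stand-in for the real per-path work, and the worker count of 2 is arbitrary. It also closes the channel instead of sending a magic stop value, so the receivers can simply use range.

```go
package main

import (
	"fmt"
	"sync"
)

// analyze is a hypothetical stand-in for the real per-path work.
func analyze(path string) string {
	return "analyzed:" + path
}

// generator sends every path and then closes the channel; closing takes the
// place of a magic stop value.
func generator(paths []string, out chan<- string) {
	defer close(out)
	for _, p := range paths {
		out <- p
	}
}

// processPaths runs a fixed number of workers over in, so at most `workers`
// calls to analyze are in flight at any time. The returned channel is closed
// once every worker has finished.
func processPaths(in <-chan string, workers int) <-chan string {
	out := make(chan string)
	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for p := range in { // loop ends when in is closed and drained
				out <- analyze(p)
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	in := make(chan string)
	go generator([]string{"/a", "/b", "/c"}, in)
	for result := range processPaths(in, 2) {
		fmt.Println(result) // order of results is not guaranteed
	}
}
```

This still uses sync.WaitGroup to decide when to close the output channel, but only one goroutine per stage does the waiting, and the number of goroutines no longer grows with the number of paths.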
    

    Goroutines that run on the same thread communicate faster. In my experience channel performance is far from great, but it is good enough for many purposes.

    Accepted as the best answer by the asker.
