Anonymous user 2017-10-04 08:21

Why does my code work when wg.Wait() runs inside a goroutine?

I have a list of URLs that I am scraping. I want to send all of the successfully scraped page data into a channel and, when I am done, dump it into a slice. I don't know how many fetches will succeed, so I cannot specify a fixed length. I expected the code to reach wg.Wait() and then wait until every wg.Done() had been called, but it never reached the close(queue) statement. While looking for a similar problem, I came across this SO answer:

https://stackoverflow.com/a/31573574/5721702

where the author does something similar:

ports := make(chan string)
toScan := make(chan int)
var wg sync.WaitGroup

// make 100 workers for dialing
for i := 0; i < 100; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for p := range toScan {
            ports <- worker(*host, p)
        }
    }()
}

// close our receiving ports channel once all workers are done
go func() {
    wg.Wait()
    close(ports)
}()

As soon as I wrapped my wg.Wait() inside the goroutine, close(queue) was reached:

urls := getListOfURLS()
activities := make([]Activity, 0, limit)
queue := make(chan Activity)
for i, activityURL := range urls {
    wg.Add(1)
    go func(i int, url string) {
        defer wg.Done()
        activity, err := extractDetail(url)
        if err != nil {
            log.Println(err)
            return
        }
        queue <- activity
    }(i, activityURL)
}
// calling it like this without the goroutine causes the execution to hang
// wg.Wait()
// close(queue)

// calling it like this successfully waits
go func() {
    wg.Wait()
    close(queue)
}()
for a := range queue {
    // each iteration blocks until an activity is sent on queue;
    // the loop ends once queue has been closed
    activities = append(activities, a)
}

Why does the code not reach close(queue) if I don't use a goroutine for wg.Wait()? I would have thought that all of the deferred wg.Done() calls run, so it would eventually clear up once execution gets to wg.Wait(). Does it have to do with receiving values from my channel?


  • duanlinzhen7235 2017-10-04 08:43

    You need to wait for the goroutines to finish in a separate goroutine, because queue needs to be read from in the meantime. When you do the following:

    queue := make(chan Activity)
    for i, activityURL := range urls {
        wg.Add(1)
        go func(i int, url string) {
            defer wg.Done()
            activity, err := extractDetail(url)
            if err != nil {
                log.Println(err)
                return
            }
            queue <- activity // nothing is reading data from queue.
        }(i, activityURL)
    }
    
    wg.Wait() 
    close(queue)
    
    for a := range queue {
        activities = append(activities, a)
    }
    

    Each goroutine blocks at queue <- activity, since queue is unbuffered and nothing is reading from it: the range loop on queue runs in the main goroutine only after wg.Wait returns.

    wg.Wait only unblocks once all the goroutines return. But, as mentioned, every goroutine that fetched successfully is blocked at the channel send, so its deferred wg.Done never runs and wg.Wait never returns.
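One way to check that the blocked sends are what keep wg.Wait from returning is to remove the blocking and leave the order unchanged. The sketch below is an illustration, not the asker's program: fetchAll is a hypothetical stand-in for the scraping loop (a non-empty string counts as a successful fetch), and the channel is given a buffer of len(urls), which assumes the number of URLs is known up front.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll is a hypothetical stand-in for the scraping loop:
// every non-empty string counts as a successful fetch.
func fetchAll(urls []string) []string {
	var wg sync.WaitGroup
	// Capacity len(urls) guarantees that no send ever has to wait.
	queue := make(chan string, len(urls))

	for _, u := range urls {
		wg.Add(1)
		go func(url string) {
			defer wg.Done()
			if url == "" {
				return // simulated failed fetch
			}
			queue <- url // never blocks: there is always room in the buffer
		}(u)
	}

	wg.Wait()    // safe in the main goroutine: senders do not need a reader
	close(queue) // the range below drains the buffered values, then stops

	var results []string
	for r := range queue {
		results = append(results, r)
	}
	return results
}

func main() {
	fmt.Println(len(fetchAll([]string{"a", "", "b"}))) // 2
}
```

With a buffer that large, wg.Wait followed by close works in the original order. The trade-off is that every result is held in the channel until the workers finish, so the separate waiting goroutine is usually the more general fix.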

    When you use a separate goroutine to wait, code execution actually reaches the range loop on queue.

    // wg.Wait now blocks this helper goroutine, not the main goroutine.
    go func() {
        wg.Wait()
        close(queue)
    }()
    

    The main goroutine then starts reading from queue, which unblocks the goroutines at the queue <- activity statement and lets them run to completion. Each of them calls wg.Done on its way out.

    Once the waiting goroutine gets past wg.Wait, queue is closed and the main goroutine exits the range loop on it.
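Putting the pieces together, the whole pattern can be sketched as one runnable program. The Activity fields and the extractDetail stub below are assumptions made for illustration (the real ones are not shown in the question), and collect is a hypothetical wrapper name.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
)

// Activity stands in for the asker's type; its field is assumed.
type Activity struct {
	URL string
}

// extractDetail is a hypothetical stub: empty URLs fail, others succeed.
func extractDetail(url string) (Activity, error) {
	if url == "" {
		return Activity{}, errors.New("empty url")
	}
	return Activity{URL: url}, nil
}

// collect fetches every URL concurrently and returns the successes.
func collect(urls []string) []Activity {
	var wg sync.WaitGroup
	queue := make(chan Activity) // unbuffered, as in the question

	for _, u := range urls {
		wg.Add(1)
		go func(url string) {
			defer wg.Done()
			activity, err := extractDetail(url)
			if err != nil {
				log.Println(err)
				return
			}
			queue <- activity // unblocked by the range loop below
		}(u)
	}

	// Wait in a helper goroutine so the main goroutine is free to read.
	go func() {
		wg.Wait()
		close(queue)
	}()

	activities := make([]Activity, 0, len(urls))
	for a := range queue { // ends when the helper closes queue
		activities = append(activities, a)
	}
	return activities
}

func main() {
	got := collect([]string{"a", "", "b", "c"})
	fmt.Println(len(got)) // 3
}
```

The order of the results is not deterministic, since the workers send in whatever order they finish.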

    This answer was selected by the asker as the best answer.