douguanya6399 2017-09-11 03:49
Accepted

Channel explanation in the 'A Tour of Go' web crawler exercise

I'm going through 'A Tour of Go' and have been editing most of the lessons to make sure I fully understand them. I have a question regarding an answer provided to the following exercise: https://tour.golang.org/concurrency/10 which can be found here: https://github.com/golang/tour/blob/master/solutions/webcrawler.go

I have a question regarding the following section:

done := make(chan bool)
for i, u := range urls {
    fmt.Printf("-> Crawling child %v/%v of %v : %v.\n", i, len(urls), url, u)
    go func(url string) {
        Crawl(url, depth-1, fetcher)
        done <- true
    }(u)
}
for i, u := range urls {
    fmt.Printf("<- [%v] %v/%v Waiting for child %v.\n", url, i, len(urls), u)
    <-done
}
fmt.Printf("<- Done with %v\n", url)

What purpose does sending true into the channel done, receiving it back, and running the two separate for loops serve? Is it just to block until the goroutines finish? I know this is an example exercise, but doesn't that kind of defeat the point of spinning off a new goroutine in the first place?

Why can't you just call go Crawl(url, depth-1, fetcher) without the 2nd for loop and the done channel? Is it because of the shared memory space for all the variables?
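Here is a minimal sketch of what I mean (startWork is a made-up name standing in for a bare go Crawl(...) call): without something for main to block on, main reaches its end while the goroutine is still running, and the program would exit and discard it.

```go
package main

import (
	"fmt"
	"time"
)

// startWork (made-up name) launches some background work and returns
// immediately; the returned channel is closed when the work finishes.
func startWork() <-chan struct{} {
	done := make(chan struct{})
	go func() {
		time.Sleep(50 * time.Millisecond) // pretend to crawl
		close(done)
	}()
	return done
}

func main() {
	done := startWork()
	fmt.Println("main got here before the work finished")
	// If main returned right here, the program would exit and the
	// runtime would discard the still-running goroutine.
	<-done // this receive is the "block until finished" part
	fmt.Println("work finished")
}
```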

Thanks!


1 answer

  • doujiu5464 2017-09-11 04:05

    The first for loop iterates over the slice of urls and schedules one goroutine per url.

    The second loop blocks once per url, waiting until one of the Crawl() invocations has completed. All the Crawl()ers run and do their work in parallel, and each one blocks on its send to done until the main goroutine receives a message from the channel, once for each url.

    In my opinion, a better way to implement this is to use a sync.WaitGroup. Also note that the log in the second loop can be misleading: it prints the url it happens to be iterating over, not the url whose Crawl() actually finished, since completion order depends on how long each Crawl() invocation takes (unless fetcher serializes the calls).
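    A sketch of the WaitGroup approach, where fetch and crawlAll are illustrative stand-ins for the tour's fetcher and Crawl():

    ```go
    package main

    import (
    	"fmt"
    	"sync"
    )

    // fetch is a stand-in for the tour's fetcher; assumed for illustration.
    func fetch(url string) string { return "body of " + url }

    // crawlAll fetches every url concurrently and returns only once all
    // goroutines have finished.
    func crawlAll(urls []string) int {
    	var wg sync.WaitGroup
    	for _, u := range urls {
    		wg.Add(1) // register one more goroutine before starting it
    		go func(url string) {
    			defer wg.Done() // signal completion even if fetch panics
    			_ = fetch(url)
    		}(u)
    	}
    	wg.Wait() // block until every goroutine has called Done()
    	return len(urls)
    }

    func main() {
    	n := crawlAll([]string{"https://a.example/", "https://b.example/"})
    	fmt.Printf("finished %d crawls\n", n)
    }
    ```

    This replaces both the done channel and the second loop: Add/Done/Wait do the counting for you, and there is no per-url "waiting" log to get out of order.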

    If you want to be sure of the url that finished Crawl()ing, you could change the type of the done channel to string and send the url instead of true upon a Crawl() completion. Then, we could receive the url in the second loop.

    Example:

    done := make(chan string)
    for _, u := range urls {
        fmt.Printf("-> Crawling %s\n", u)
        go func(url string) {
            Crawl(url, depth-1, fetcher)
            done <- url
        }(u)
    }
    for range urls {
        fmt.Printf("<- Waiting for next child\n")
        u := <-done
        fmt.Printf("  Done... %s\n", u)
    }
    
