douduan2272 2019-02-13 03:17

Exercise: Web Crawler - print doesn't work

I'm new to Go and currently working on the Tour of Go exercise "Exercise: Web Crawler".

I simply put the keyword 'go' before every call to func Crawl, hoping to make it run in parallel, but now fmt.Printf prints nothing. Nothing else in the original code was changed. Could someone give me a hand?

func Crawl(url string, depth int, fetcher Fetcher) {
    // TODO: Fetch URLs in parallel.
    // TODO: Don't fetch the same URL twice.
    // This implementation doesn't do either:
    if depth <= 0 {
        return
    }
    body, urls, err := fetcher.Fetch(url)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Printf("found: %s %q\n", url, body)
    for _, u := range urls {
        go Crawl(u, depth-1, fetcher)
    }
    return
}

func main() {
    go Crawl("https://golang.org/", 4, fetcher)
}

1 answer

  • dougu4448 2019-02-13 03:49

    According to the Go spec:

    Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.

    Therefore, main() has to wait explicitly for the other goroutines to finish before it returns.

    One way is simply to add time.Sleep() at the end of main(), sleeping for as long as you expect the other goroutines to need (e.g. maybe 1 second in this case).

    A cleaner way is to use sync.WaitGroup, as follows:

    func Crawl(wg *sync.WaitGroup, url string, depth int, fetcher Fetcher) {
        defer wg.Done()
        if depth <= 0 {
            return
        }
        body, urls, err := fetcher.Fetch(url)
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Printf("found: %s %q\n", url, body)
        for _, u := range urls {
            wg.Add(1)
            go Crawl(wg, u, depth-1, fetcher)
        }
        return
    }
    
    func main() {
        wg := &sync.WaitGroup{}
        wg.Add(1)
        // The first call does not need to be a goroutine: it starts goroutines for the links it finds, and wg.Wait() below waits for all of them.
        Crawl(wg, "https://golang.org/", 4, fetcher)
        //time.Sleep(1000 * time.Millisecond)
        wg.Wait()
    }
    

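    The WaitGroup holds a counter: wg.Add() increments it, wg.Done() decrements it, and wg.Wait() blocks until it reaches zero. Calling wg.Add(1) before each go statement ensures the counter cannot drop to zero while a crawl is still about to start.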

    You can try it in the Go Playground: https://play.golang.org/p/WqQBqe6iFLp
