douduan2272 2019-02-13 03:17
Accepted

Exercise: Web Crawler - printing doesn't work

I'm new to Go and am currently working on the Tour of Go exercise "Exercise: Web Crawler".

I simply put the keyword go in front of every call to Crawl, hoping that would make it run in parallel, but now fmt.Printf prints nothing. Nothing else in the original code was changed. Could someone give me a hand?

func Crawl(url string, depth int, fetcher Fetcher) {
    // TODO: Fetch URLs in parallel.
    // TODO: Don't fetch the same URL twice.
    // This implementation doesn't do either:
    if depth <= 0 {
        return
    }
    body, urls, err := fetcher.Fetch(url)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Printf("found: %s %q\n", url, body)
    for _, u := range urls {
        go Crawl(u, depth-1, fetcher)
    }
    return
}

func main() {
    go Crawl("https://golang.org/", 4, fetcher)
}

1 answer

  • dougu4448 2019-02-13 03:49

    According to the Go spec:

    Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.

    Therefore you have to wait explicitly in main() for the other goroutines to finish.

    One way is simply to add a time.Sleep() call at the end of main(), long enough that you expect the other goroutines to have finished (e.g. maybe 1 second in this case).

    A cleaner way is to use sync.WaitGroup, as follows:

    func Crawl(wg *sync.WaitGroup, url string, depth int, fetcher Fetcher) {
        defer wg.Done()
        if depth <= 0 {
            return
        }
        body, urls, err := fetcher.Fetch(url)
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Printf("found: %s %q\n", url, body)
        for _, u := range urls {
            wg.Add(1)
            go Crawl(wg, u, depth-1, fetcher)
        }
        return
    }
    
    func main() {
        wg := &sync.WaitGroup{}
        wg.Add(1)
        // The first call does not need to be a goroutine, since it starts goroutines for the URLs it finds.
        Crawl(wg, "https://golang.org/", 4, fetcher)
        //time.Sleep(1000 * time.Millisecond)
        wg.Wait()
    }
    

    This code keeps a counter in the WaitGroup: wg.Add() increments it, wg.Done() decrements it, and wg.Wait() blocks until it reaches zero.
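To see the Add/Done/Wait counter on its own, here is a minimal sketch separate from the crawler. runWorkers is a hypothetical helper, and the mutex-protected counter is there only so the result can be observed.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper: it starts n goroutines, waits for
// all of them with a WaitGroup, and returns how many actually ran.
func runWorkers(n int) int {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	for i := 0; i < n; i++ {
		wg.Add(1) // increment the counter before starting the goroutine
		go func() {
			defer wg.Done() // decrement when this goroutine returns
			mu.Lock()
			done++
			mu.Unlock()
		}()
	}
	wg.Wait() // blocks until the counter is back to zero
	return done
}

func main() {
	fmt.Println(runWorkers(5)) // prints 5
}
```

Calling wg.Add(1) before the go statement, as the crawler code above also does, matters: if Add ran inside the new goroutine, Wait could see a zero counter and return before the goroutine had been counted.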

    You can confirm this in the Go Playground: https://play.golang.org/p/WqQBqe6iFLp

