douduan2272 2019-02-13 03:17
Accepted

Exercise: Web Crawler - printing does not work

I'm a Go newbie and am currently working on Exercise: Web Crawler.

I simply put the keyword 'go' before every call to func Crawl, hoping that would run the calls in parallel, but now fmt.Printf prints nothing. Nothing else in the original code was changed. Could someone give me a hand?

func Crawl(url string, depth int, fetcher Fetcher) {
    // TODO: Fetch URLs in parallel.
    // TODO: Don't fetch the same URL twice.
    // This implementation doesn't do either:
    if depth <= 0 {
        return
    }
    body, urls, err := fetcher.Fetch(url)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Printf("found: %s %q\n", url, body)
    for _, u := range urls {
        go Crawl(u, depth-1, fetcher)
    }
    return
}

func main() {
    go Crawl("https://golang.org/", 4, fetcher)
}

  • dougu4448 2019-02-13 03:49

    According to the spec:

    Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.

    Therefore, you have to explicitly wait in main() for the other goroutines to finish.

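    One way is simply to add time.Sleep() at the end of main() and sleep for as long as you expect the other goroutines to need (e.g. maybe 1 second in this case). This is only a guess: if the goroutines take longer than the sleep, their output is still lost.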
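
    A minimal sketch of the time.Sleep() approach, for illustration only. The names crawlStub, printed and runWithSleep are hypothetical and not part of the exercise; crawlStub stands in for Crawl, and the atomic counter is there only so the result can be inspected safely from another goroutine.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// printed counts finished crawlStub calls. It is updated atomically because
// one goroutine writes it while another reads it.
var printed int32

// crawlStub is a hypothetical stand-in for Crawl: it prints and records the call.
func crawlStub(url string) {
	fmt.Printf("found: %s\n", url)
	atomic.AddInt32(&printed, 1)
}

// runWithSleep starts crawlStub in a goroutine, sleeps for ms milliseconds,
// and reports how many calls finished within that time.
func runWithSleep(ms int) int32 {
	atomic.StoreInt32(&printed, 0)
	go crawlStub("https://golang.org/")
	// The sleep is a guess: nothing guarantees the goroutine is done by now.
	time.Sleep(time.Duration(ms) * time.Millisecond)
	return atomic.LoadInt32(&printed)
}

func main() {
	fmt.Println("finished calls:", runWithSleep(1000))
}
```

    The sleep length has to be chosen by hand, which is why this approach suits quick experiments better than real programs.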

    A cleaner way is to use sync.WaitGroup, as follows:

    func Crawl(wg *sync.WaitGroup, url string, depth int, fetcher Fetcher) {
        defer wg.Done()
        if depth <= 0 {
            return
        }
        body, urls, err := fetcher.Fetch(url)
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Printf("found: %s %q\n", url, body)
        for _, u := range urls {
            wg.Add(1)
            go Crawl(wg, u, depth-1, fetcher)
        }
        return
    }
    
    func main() {
        wg := &sync.WaitGroup{}
        wg.Add(1)
        // The first call does not need to be a goroutine, since the calls it makes run as goroutines.
        Crawl(wg, "https://golang.org/", 4, fetcher)
        //time.Sleep(1000 * time.Millisecond)
        wg.Wait()
    }
    

    This code keeps a counter in the WaitGroup: wg.Add() increments it, wg.Done() decrements it, and wg.Wait() blocks until it reaches zero.
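
    The counter pattern can be tried without the exercise's Fetcher. The following is a self-contained sketch under assumptions: the links map is a made-up three-page site, and crawl and crawlAll are hypothetical names. A mutex guards the shared result slice because several goroutines append to it.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// links is a hypothetical in-memory site map standing in for a real Fetcher.
var links = map[string][]string{
	"a": {"b", "c"},
	"b": {},
	"c": {},
}

// crawl records url, then crawls each linked page in its own goroutine.
func crawl(wg *sync.WaitGroup, mu *sync.Mutex, found *[]string, url string, depth int) {
	defer wg.Done() // counter -1 when this call returns
	if depth <= 0 {
		return
	}
	next, ok := links[url]
	if !ok {
		return // unknown page
	}
	mu.Lock()
	*found = append(*found, url)
	mu.Unlock()
	for _, u := range next {
		wg.Add(1) // counter +1 before the goroutine starts
		go crawl(wg, mu, found, u, depth-1)
	}
}

// crawlAll blocks until every crawl goroutine has finished and returns the
// visited pages in sorted order.
func crawlAll(root string, depth int) []string {
	var wg sync.WaitGroup
	var mu sync.Mutex
	var found []string
	wg.Add(1) // accounts for the first, synchronous call
	crawl(&wg, &mu, &found, root, depth)
	wg.Wait() // returns once the counter is back to zero
	sort.Strings(found)
	return found
}

func main() {
	fmt.Println(crawlAll("a", 2)) // prints [a b c]
}
```

    Calling wg.Add(1) before the go statement, rather than inside the new goroutine, matters: otherwise wg.Wait() could observe a zero counter before the goroutine has registered itself.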

    You can confirm it in the Go Playground: https://play.golang.org/p/WqQBqe6iFLp
