dongrang9300
2018-01-30 21:13
Viewed 30 times
Accepted

Blocking and non-blocking usage of goroutines

I am trying to understand how go-routines work. Here is some code:

//parallelSum.go
package main

import "log"

func sum(a []int, c chan<- int, func_id string) {
    sum := 0
    for _, n := range a {
        sum += n
    }
    log.Printf("func_id %v is DONE!", func_id)
    c <- sum
}   
func main() {
    ELEM_COUNT := 10000000
    test_arr := make([]int, ELEM_COUNT)
    for i := 0; i < ELEM_COUNT; i++ {
        test_arr[i] = i * 2 
    }
    c1 := make(chan int)
    c2 := make(chan int)
    go sum(test_arr[:len(test_arr)/2], c1, "1")
    go sum(test_arr[len(test_arr)/2:], c2, "2")
    x := <-c1
    y := <-c2

    //x, y := <-c, <-c
    log.Printf("x= %v, y = %v, sum = %v", x, y, x+y)
}   

The above program runs fine and produces the expected output. I also have an iterative version of the same program:

//iterSum.go
package main

import "log"

func sumIter(a []int, c *int, func_id string) {
    sum := 0
    log.Printf("entered the func %s", func_id)
    for _, n := range a { 
        sum += n
    }   
    log.Printf("func_id %v is DONE!", func_id)
    *c = sum 
}
func main() {
    ELEM_COUNT := 10000000
    test_arr := make([]int, ELEM_COUNT)
    for i := 0; i < ELEM_COUNT; i++ {
        test_arr[i] = i * 2
    }
    var (
        i1 int
        i2 int
    )   
    sumIter(test_arr[:len(test_arr)/2], &i1, "1")
    sumIter(test_arr[len(test_arr)/2:], &i2, "2")
    x := i1
    y := i2

    log.Printf("x= %v, y = %v, sum = %v", x, y, x+y)
} 

I ran each program 20 times and averaged the run times. The averages come out almost equal. Shouldn't parallelizing make things faster? What am I doing wrong?

Here is the Python program that runs each of them 20 times:

import json
import shlex
import subprocess
import time

iterCmd = 'go run iterSum.go'
parallelCmd = 'go run parallelSum.go'

runCount = 20


def analyzeCmd(cmd, runCount):
    runData = []
    print("running cmd (%s) for (%s) times" % (cmd, runCount))
    for i in range(runCount):
        start_time = time.time()
        cmd_out = subprocess.check_call(shlex.split(cmd))
        run_time = time.time() - start_time
        curr_data = {'iteration': i, 'run_time': run_time}
        runData.append(curr_data)

    return runData

iterOut = analyzeCmd(iterCmd, runCount)
parallelOut = analyzeCmd(parallelCmd, runCount)

print("iter cmd data -->")
print(iterOut)

with open('iterResults.json', 'w') as f:
    json.dump(iterOut, f)

print("parallel cmd data -->")
print(parallelOut)

with open('parallelResults.json', 'w') as f:
    json.dump(parallelOut, f)

avg = lambda results: sum(i['run_time'] for i in results) / len(results)
print("average time for iterSum = %3.2f" % (avg(iterOut)))
print("average time for parallelSum = %3.2f" % (avg(parallelOut)))

Here is the output of one run:

average time for iterSum = 0.27
average time for parallelSum = 0.29

1 Answer

  • du5407 2018-01-30 21:26
    Accepted

    So, several problems here. First, your channels aren't buffered in the concurrent example, which means each send blocks until main is ready to receive from that channel (see the sketch below). Second, concurrent doesn't mean parallel. Are you sure these goroutines are actually running in parallel, and not simply being scheduled onto the same OS thread?
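
    Here is a minimal sketch of the same fan-out using a single buffered channel, so neither send can block; the runtime.NumCPU check and the simplified signature are my own additions, not from your post:

    //bufferedSum.go
    package main

    import (
        "log"
        "runtime"
    )

    func sum(a []int, c chan<- int) {
        total := 0
        for _, n := range a {
            total += n
        }
        c <- total // never blocks here: the channel has room for both results
    }

    func main() {
        // A parallel speedup is only possible with more than one CPU.
        log.Printf("CPUs available: %d", runtime.NumCPU())
        a := make([]int, 10000000)
        for i := range a {
            a[i] = i * 2
        }
        c := make(chan int, 2) // buffered: both sends complete immediately
        go sum(a[:len(a)/2], c)
        go sum(a[len(a)/2:], c)
        x, y := <-c, <-c
        log.Printf("x = %v, y = %v, sum = %v", x, y, x+y)
    }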

    That said, your main problem here is that your Python script invokes go run for each iteration, which means the vast majority of your recorded "run time" is actually the compilation of your code (go run compiles and then runs the specified file, and by design it caches none of that work). If you want to measure run time, use Go's built-in benchmark system (see the sketch below) rather than a hand-rolled harness; you'll get far more accurate results. Beyond the compilation overhead, there is also no way to tell how much of a bottleneck the Python timing code itself introduces.
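
    A minimal sketch of such a benchmark, using a log-free helper standing in for your sum/sumIter (the file and function names here are illustrative, not from your post); run it with go test -bench=.:

    //sum_test.go
    package main

    import "testing"

    // sumSlice is a hypothetical stand-in for the original functions,
    // with the log calls removed so they don't distort the measurement.
    func sumSlice(a []int) int {
        total := 0
        for _, n := range a {
            total += n
        }
        return total
    }

    func benchData() []int {
        a := make([]int, 10000000)
        for i := range a {
            a[i] = i * 2
        }
        return a
    }

    func BenchmarkIterSum(b *testing.B) {
        a := benchData()
        b.ResetTimer() // exclude the setup above from the measurement
        for i := 0; i < b.N; i++ {
            _ = sumSlice(a[:len(a)/2]) + sumSlice(a[len(a)/2:])
        }
    }

    func BenchmarkParallelSum(b *testing.B) {
        a := benchData()
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            c := make(chan int, 2)
            go func() { c <- sumSlice(a[:len(a)/2]) }()
            go func() { c <- sumSlice(a[len(a)/2:]) }()
            _ = <-c + <-c
        }
    }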

    Oh, and you should get out of the habit of using reference arguments to functions as a way to "return" values. Go supports multiple returns, so the C style of modifying arguments in-place is generally considered an anti-pattern unless there's a really compelling reason to do it.
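
    For instance, here is a sketch of the pointer-free style (the signature is my own, not from your post):

    // sumIter returns its result instead of writing through a pointer;
    // Go functions can also return several values (e.g. a result and an
    // error) if more than one output is needed.
    func sumIter(a []int) int {
        total := 0
        for _, n := range a {
            total += n
        }
        return total
    }

    // Call site:
    //   x := sumIter(test_arr[:len(test_arr)/2])
    //   y := sumIter(test_arr[len(test_arr)/2:])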

