dptsivmg82908 2018-05-04 16:18
Views: 20
Accepted

Vectorizing a function using concurrency

For a simple neural network I want to apply a function to all the values of a gonum VecDense.

Gonum has an Apply method for Dense matrices, but not for vectors, so I am doing this by hand:

func sigmoid(z float64) float64 {                                           
    return 1.0 / (1.0 + math.Exp(-z))
}

func vSigmoid(zs *mat.VecDense) {
    for i := 0; i < zs.Len(); i++ {
        zs.SetVec(i, sigmoid(zs.AtVec(i)))
    }
}

This seems to be an obvious target for concurrent execution, so I tried

var wg sync.WaitGroup

func sigmoid(z float64) float64 {                                           
    wg.Done()
    return 1.0 / (1.0 + math.Exp(-z))
}

func vSigmoid(zs *mat.VecDense) {
    for i := 0; i < zs.Len(); i++ {
        wg.Add(1)
        go zs.SetVec(i, sigmoid(zs.AtVec(i)))
    }
    wg.Wait()
}

This doesn't work, perhaps not unexpectedly: sigmoid() calls wg.Done() before its return statement (which does all the work), so the wait group is signalled before the work has finished.

My question is: How can I use concurrency to apply a function to each element of a gonum vector?

1 answer

  • duanjiaoxi4928 2018-05-04 16:40

    First, note that this attempt to do the computation concurrently assumes that the SetVec() and AtVec() methods are safe for concurrent use with distinct indices. If this is not the case, the attempted solution is inherently unsafe and may result in data races and undefined behavior.


    wg.Done() should be called to signal that the "worker" goroutine has finished its work, and only once it has actually finished.

    In your case it is not (only) the sigmoid() function that is run in the worker goroutine, but rather zs.SetVec(). So you should call wg.Done() when zs.SetVec() has returned, not sooner.

    One way would be to add a wg.Done() to the end of the SetVec() method (it could also be a defer wg.Done() at its beginning), but it wouldn't be feasible to introduce this dependency (SetVec() should not know about any wait groups and goroutines, this would seriously limit its usability).

    The easiest and cleanest way in this case would be to launch an anonymous function (a function literal) as the worker goroutine, in which you call zs.SetVec(), and in which you call wg.Done() once zs.SetVec() has returned.

    Something like this:

    for i := 0; i < zs.Len(); i++ {
        wg.Add(1)
        go func() {
            zs.SetVec(i, sigmoid(zs.AtVec(i)))
            wg.Done()
        }()
    }
    wg.Wait()
    

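    But in Go versions before 1.22 this alone won't work, as the function literal (closure) refers to the loop variable, which the loop modifies concurrently. (Since Go 1.22 each iteration has its own copy of the loop variable.) To be safe in every version, the function literal should work with its own copy, e.g.: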

    for i := 0; i < zs.Len(); i++ {
        wg.Add(1)
        go func(i int) {
            zs.SetVec(i, sigmoid(zs.AtVec(i)))
            wg.Done()
        }(i)
    }
    wg.Wait()
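
    Putting the pieces above together, here is a minimal, self-contained sketch. To keep it runnable without gonum, it uses a hypothetical vector interface and a slice-backed sliceVec type (neither is part of gonum); as far as I know *mat.VecDense has Len(), AtVec() and SetVec() with these signatures, but treat that as an assumption and check it. The wait group is a local variable here, and wg.Done() is deferred inside the goroutine.

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// vector is a hypothetical stand-in for the three *mat.VecDense methods
// used in this thread, so the sketch compiles without gonum.
type vector interface {
	Len() int
	AtVec(i int) float64
	SetVec(i int, v float64)
}

// sliceVec is a slice-backed vector. Writes to distinct indices touch
// distinct memory locations, so they do not race with each other.
type sliceVec []float64

func (s sliceVec) Len() int                { return len(s) }
func (s sliceVec) AtVec(i int) float64     { return s[i] }
func (s sliceVec) SetVec(i int, v float64) { s[i] = v }

func sigmoid(z float64) float64 {
	return 1.0 / (1.0 + math.Exp(-z))
}

// vSigmoid applies sigmoid to every element, one goroutine per element.
func vSigmoid(zs vector) {
	var wg sync.WaitGroup // local, so concurrent callers do not share it
	for i := 0; i < zs.Len(); i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done() // runs only after SetVec has returned
			zs.SetVec(i, sigmoid(zs.AtVec(i)))
		}(i)
	}
	wg.Wait()
}

func main() {
	zs := sliceVec{-1, 0, 1}
	vSigmoid(zs)
	fmt.Println(zs)
}
```

    Running a variant of this against the real vector type with the race detector enabled (go run -race) is one way to check the concurrency-safety assumption mentioned at the top of this answer.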
    

    Also note that goroutines, although lightweight, do have overhead. If the work they do is "small", the overhead may outweigh the performance gain of utilizing multiple cores / threads, and overall you might not gain performance by executing such small tasks concurrently (hell, you may even do worse than without using goroutines). Measure.
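
    One way to measure, sketched with testing.Benchmark from the standard library so it runs as an ordinary program. It works on a plain []float64 rather than a gonum vector, and the names sequential, perElement and bench as well as the vector length are made up for illustration. It prints no fixed result; the numbers depend on your machine.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"testing"
)

func sigmoid(z float64) float64 {
	return 1.0 / (1.0 + math.Exp(-z))
}

// sequential applies sigmoid in a plain loop.
func sequential(xs []float64) {
	for i, x := range xs {
		xs[i] = sigmoid(x)
	}
}

// perElement starts one goroutine per element.
func perElement(xs []float64) {
	var wg sync.WaitGroup
	for i := range xs {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			xs[i] = sigmoid(xs[i])
		}(i)
	}
	wg.Wait()
}

// bench times repeated calls of f on an n-element slice.
func bench(n int, f func([]float64)) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		xs := make([]float64, n)
		for i := 0; i < b.N; i++ {
			f(xs)
		}
	})
}

func main() {
	const n = 1000 // hypothetical vector length; use your real sizes
	fmt.Println("sequential: ", bench(n, sequential))
	fmt.Println("per-element:", bench(n, perElement))
}
```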

    Also, since you are using goroutines to do minimal work, you may improve performance by not "throwing away" goroutines once they're done with their "tiny" work, but "reusing" them instead. See the related question: Is this an idiomatic worker thread pool in Go?
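
    A rough sketch of that reuse idea: a fixed number of worker goroutines receive index ranges over a channel, so each goroutine processes many elements during its lifetime. This is not taken from the linked question. It works on a plain []float64 instead of a gonum vector, and applyPooled, span and the chunk size are hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
	"runtime"
	"sync"
)

func sigmoid(z float64) float64 {
	return 1.0 / (1.0 + math.Exp(-z))
}

// span is a half-open index range [lo, hi).
type span struct{ lo, hi int }

// applyPooled applies f to every element of xs using a fixed number of
// worker goroutines; each worker handles many chunks before it exits.
func applyPooled(xs []float64, f func(float64) float64, workers, chunk int) {
	if workers < 1 {
		workers = 1
	}
	if chunk < 1 {
		chunk = 1
	}
	jobs := make(chan span)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for s := range jobs { // loop ends when jobs is closed
				for i := s.lo; i < s.hi; i++ {
					xs[i] = f(xs[i])
				}
			}
		}()
	}
	// Hand out non-overlapping ranges, so no two workers touch the
	// same element.
	for lo := 0; lo < len(xs); lo += chunk {
		hi := lo + chunk
		if hi > len(xs) {
			hi = len(xs)
		}
		jobs <- span{lo, hi}
	}
	close(jobs)
	wg.Wait()
}

func main() {
	xs := make([]float64, 10)
	applyPooled(xs, sigmoid, runtime.NumCPU(), 4)
	fmt.Println(xs) // every element is sigmoid(0) = 0.5
}
```

    Whether this beats the plain loop still depends on the vector length and the chunk size, so it is worth benchmarking with realistic inputs.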

    This answer was selected by the asker as the best answer.
