Vectorizing a function using concurrency

For a simple neural network I want to apply a function to all the values of a gonum VecDense.

Gonum has an Apply method for Dense matrices, but not for vectors, so I am doing this by hand:

func sigmoid(z float64) float64 {                                           
    return 1.0 / (1.0 + math.Exp(-z))
}

func vSigmoid(zs *mat.VecDense) {
    for i := 0; i < zs.Len(); i++ {
        zs.SetVec(i, sigmoid(zs.AtVec(i)))
    }
}

This seems to be an obvious target for concurrent execution, so I tried

var wg sync.WaitGroup

func sigmoid(z float64) float64 {                                           
    wg.Done()
    return 1.0 / (1.0 + math.Exp(-z))
}

func vSigmoid(zs *mat.VecDense) {
    for i := 0; i < zs.Len(); i++ {
        wg.Add(1)
        go zs.SetVec(i, sigmoid(zs.AtVec(i)))
    }
    wg.Wait()
}

This doesn't work, perhaps unsurprisingly: sigmoid() calls wg.Done() before its return statement (which does all the work), so completion is signalled too early.

My question is: How can I use concurrency to apply a function to each element of a gonum vector?

1 answer

duanjiaoxi4928 2018-05-04 16:40
First, note that this attempt to do the computation concurrently assumes that the SetVec() and AtVec() methods are safe for concurrent use with distinct indices. If this is not the case, the attempted solution is inherently unsafe and may result in data races and undefined behavior.

wg.Done() should be called to signal that the "worker" goroutine has finished its work, and only once it has actually finished.

In your case it is not (only) the sigmoid() function that is run in the worker goroutine, but rather zs.SetVec(). So you should call wg.Done() when zs.SetVec() has returned, not sooner.

One way would be to add a wg.Done() to the end of the SetVec() method (it could also be a defer wg.Done() at its beginning), but it wouldn't be feasible to introduce this dependency (SetVec() should not know about any wait groups and goroutines, this would seriously limit its usability).

The easiest and cleanest way in this case would be to launch an anonymous function (a function literal) as the worker goroutine, in which you call zs.SetVec(), and then call wg.Done() once zs.SetVec() has returned.

Something like this:

for i := 0; i < zs.Len(); i++ {
    wg.Add(1)
    go func() {
        zs.SetVec(i, sigmoid(zs.AtVec(i)))
        wg.Done()
    }()
}
wg.Wait()

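But this alone won't work reliably: the function literal (closure) refers to the loop variable, which the loop modifies concurrently (this applies to Go versions before 1.22, where all iterations share a single loop variable). The function literal should therefore work with its own copy, e.g.: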

for i := 0; i < zs.Len(); i++ {
    wg.Add(1)
    go func(i int) {
        zs.SetVec(i, sigmoid(zs.AtVec(i)))
        wg.Done()
    }(i)
}
wg.Wait()
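
Putting the pieces together, a minimal runnable sketch might look like the following. To stay free of external dependencies it uses a hypothetical vec type backed by a []float64 as a stand-in for *mat.VecDense; the name vec and its three methods are assumptions made for illustration only, and the caveat about concurrent safety of SetVec() and AtVec() still applies to the real type.

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// vec is a hypothetical stand-in for *mat.VecDense, backed by a plain slice.
type vec struct{ data []float64 }

func (v *vec) Len() int { return len(v.data) }

func (v *vec) AtVec(i int) float64 { return v.data[i] }

func (v *vec) SetVec(i int, x float64) { v.data[i] = x }

func sigmoid(z float64) float64 {
	return 1.0 / (1.0 + math.Exp(-z))
}

// vSigmoid applies sigmoid to every element, one goroutine per element.
func vSigmoid(zs *vec) {
	var wg sync.WaitGroup // local, so sigmoid needs no knowledge of it
	for i := 0; i < zs.Len(); i++ {
		wg.Add(1)
		go func(i int) {
			// Done runs only after SetVec has returned.
			defer wg.Done()
			zs.SetVec(i, sigmoid(zs.AtVec(i)))
		}(i) // pass i so each goroutine has its own copy
	}
	wg.Wait()
}

func main() {
	zs := &vec{data: []float64{-1, 0, 1}}
	vSigmoid(zs)
	fmt.Println(zs.data)
}
```

Keeping the WaitGroup local to vSigmoid() (rather than a package-level variable) means sigmoid() stays a pure function that can be reused elsewhere.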

Also note that goroutines, although lightweight, do have overhead. If the work they do is "small", the overhead may outweigh the performance gain of utilizing multiple cores / threads, and overall you might not gain anything by executing such small tasks concurrently (you may even do worse than without goroutines). Measure.
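
As a rough way to measure, the sketch below times a plain loop against the one-goroutine-per-element version on the same input. It works on a plain []float64 rather than a gonum vector, and the names applySeq, applyPerElem and timeIt are hypothetical helpers introduced for this example. A single wall-clock timing is noisy; benchmarks written with the standard testing package and run via go test -bench are the more robust tool.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"time"
)

func sigmoid(z float64) float64 {
	return 1.0 / (1.0 + math.Exp(-z))
}

// applySeq applies sigmoid in place using a plain loop.
func applySeq(xs []float64) {
	for i := range xs {
		xs[i] = sigmoid(xs[i])
	}
}

// applyPerElem applies sigmoid in place with one goroutine per element.
func applyPerElem(xs []float64) {
	var wg sync.WaitGroup
	for i := range xs {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			xs[i] = sigmoid(xs[i])
		}(i)
	}
	wg.Wait()
}

// timeIt runs f once on a fresh copy of in and returns the elapsed time.
func timeIt(f func([]float64), in []float64) time.Duration {
	xs := append([]float64(nil), in...)
	start := time.Now()
	f(xs)
	return time.Since(start)
}

func main() {
	in := make([]float64, 100000)
	for i := range in {
		in[i] = float64(i%20) - 10
	}
	fmt.Println("sequential: ", timeIt(applySeq, in))
	fmt.Println("per element:", timeIt(applyPerElem, in))
}
```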

Also, since each goroutine does only minimal work, you may improve performance by reusing goroutines instead of throwing them away once they have finished their tiny task. See related question: Is this an idiomatic worker thread pool in Go?
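
One possible shape for such reuse is a fixed set of workers that each process whole index ranges received over a channel. This is only a sketch under assumptions: it operates on a plain []float64 instead of a gonum vector, and applyPooled, span, and the workers and chunk parameters are hypothetical names chosen for this example, not part of gonum.

```go
package main

import (
	"fmt"
	"math"
	"runtime"
	"sync"
)

func sigmoid(z float64) float64 {
	return 1.0 / (1.0 + math.Exp(-z))
}

// span is a half-open index range [lo, hi).
type span struct{ lo, hi int }

// applyPooled applies f to xs in place using a fixed number of workers,
// each of which handles many chunks of up to chunk elements.
func applyPooled(xs []float64, f func(float64) float64, workers, chunk int) {
	if workers < 1 {
		workers = 1
	}
	if chunk < 1 {
		chunk = 1
	}
	jobs := make(chan span)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker is reused for every span it receives.
			for s := range jobs {
				for i := s.lo; i < s.hi; i++ {
					xs[i] = f(xs[i])
				}
			}
		}()
	}
	// Spans never overlap, so no two workers touch the same index.
	for lo := 0; lo < len(xs); lo += chunk {
		hi := lo + chunk
		if hi > len(xs) {
			hi = len(xs)
		}
		jobs <- span{lo, hi}
	}
	close(jobs) // lets the workers' range loops terminate
	wg.Wait()
}

func main() {
	xs := []float64{-2, -1, 0, 1, 2}
	applyPooled(xs, sigmoid, runtime.NumCPU(), 2)
	fmt.Println(xs)
}
```

The chunk size trades scheduling overhead against load balance; a suitable value depends on the vector length and the cost of the function, so it is worth measuring.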
This answer was accepted by the asker as the best answer.