Creating a Bayesian network using Google Go's goroutines

I have a large dataset of philosophical arguments, each of which connects to other arguments as proof or disproof of a given statement. A root statement can have many proofs and disproofs, each of which may also have proofs and disproofs. Statements can also be used in multiple graphs, and graphs can be analyzed under a "given context" or assumption.

I need to construct a Bayesian network of related arguments, so that each node propagates influence fairly and accurately to its connected arguments. I need to be able to calculate the probability of chains of connected nodes concurrently, with each node requiring datastore lookups that must block to get results. The process is mostly I/O bound, and my datastore connection can run asynchronously in Java, Go and Python (Google App Engine). Once each lookup completes, it propagates the effects to all other connected nodes until the probability delta drops below a threshold of irrelevance (currently 0.1%). Each node of the process must calculate chains of connections, then sum up all the results across all queries to adjust validity results, with results chained outward to any connected arguments.

To avoid recursing infinitely, I was thinking of using an A*-like process in goroutines to propagate updates to the argument maps, with a heuristic based on compounding influence that ignores nodes once the probability of influence dips below, say, 0.1%. I tried to set up the calculations with SQL triggers, but it got complex and messy way too fast. Then I moved to Google App Engine to take advantage of asynchronous NoSQL, and it was better, but still too slow. I need to run the updates fast enough to get a snappy UI, so that when a user creates or votes for or against a proof or disproof, they can see the results reflected in the UI immediately.
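To make the cutoff idea concrete, here is a minimal sketch of threshold-pruned propagation. It is a plain breadth-first worklist, not A*, and it is simple damped influence spreading rather than real Bayesian inference. The `Edge`, `Graph` and `Propagate` names are hypothetical, the graph lives in memory instead of a datastore, and every edge weight is assumed to have magnitude below 1 so that each hop shrinks the delta.

```go
package main

import (
	"fmt"
	"math"
)

// Edge is a hypothetical weighted link: a positive Weight models a proof,
// a negative Weight a disproof. |Weight| is assumed to be < 1.
type Edge struct {
	To     string
	Weight float64
}

// Graph maps a statement ID to its outgoing edges (hypothetical layout).
type Graph map[string][]Edge

// update is one pending change in belief for a node.
type update struct {
	node  string
	delta float64
}

// Propagate pushes a change in belief at start through g and returns the
// accumulated adjustment per node. Updates whose magnitude is below
// threshold are dropped, so the loop terminates even when g has cycles.
func Propagate(g Graph, start string, delta, threshold float64) map[string]float64 {
	adjust := make(map[string]float64)
	queue := []update{{start, delta}}
	for len(queue) > 0 {
		u := queue[0]
		queue = queue[1:]
		if math.Abs(u.delta) < threshold {
			continue // influence is now irrelevant; stop following this chain
		}
		adjust[u.node] += u.delta
		for _, e := range g[u.node] {
			queue = append(queue, update{e.To, u.delta * e.Weight})
		}
	}
	return adjust
}

func main() {
	g := Graph{
		"root":  {{To: "proof", Weight: 0.5}},
		"proof": {{To: "root", Weight: 0.5}}, // cycle back to root
	}
	for node, adj := range Propagate(g, "root", 0.1, 0.001) {
		fmt.Printf("%s: %+.4f\n", node, adj)
	}
}
```

With the 0.1% threshold passed as `0.001`, the two-node cycle above stops after a handful of hops because each pass halves the delta.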

I think Go is the language of choice to support the concurrency I need, but I'm open to suggestions. The client is a monolithic JavaScript app that just uses XHR and websockets to push and pull argument maps (and their updates) in real time. I have a Java prototype that can compute large chains in 10 to 15 seconds, but performance monitoring shows that most of my runtime is wasted in synchronization and overhead from ConcurrentHashMap.

If there are other highly concurrent languages worth trying out, please let me know. I know Java, Python, Go, Ruby and Scala, but will learn any language if it suits my needs.

Similarly, if there are open source implementations of huge Bayesian networks, please leave a suggestion.

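duandaodao6951 Well, in particular, I'd like to know whether there is any precedent or industry standard for computing huge Bayesian networks, and whether goroutines are as well suited to this job as they seem.
about 8 years ago
dpbvpgvrhwxen3222 An interesting application, but what exactly are you asking?
about 8 years ago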

1 answer



I think it's a bit difficult to tell what you are asking about. Maybe you can elaborate on your question.

Goroutines are quite cheap, and are a perfect match for modern web applications which use XHR or websockets heavily (and other I/O-bound applications which have to wait for database responses and the like). Additionally, the Go runtime is able to execute those goroutines in parallel, so Go is also a good fit for CPU-bound tasks that should take advantage of multiple cores and the speed of a natively compiled language.

But you should also keep in mind that goroutines and channels aren't free. They still require some amount of memory, and each synchronization point (e.g. a channel send or receive) comes with a cost. That's normally not a problem, since the synchronization is extremely cheap compared to, say, a database query, but it might not be suited for building efficient Bayesian networks, especially if the actual work of each goroutine / node is negligible in comparison to the synchronization overhead.

Your primary goal for every concurrent program should be to avoid shared mutability as far as possible. So a Bayesian network modeled with goroutines and channels might be a good educational example and a great way to measure the performance of Go's channel implementation, but it's probably not the best fit for your problem.
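As a rough illustration of avoiding shared mutability, the sketch below runs one goroutine per blocking lookup but lets only a single goroutine touch the results map, so no lock or concurrent map is involved. Everything here is an assumption for the sake of the example: `lookup` merely sleeps to imitate datastore latency, and `SumScores` and `result` are hypothetical names, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// result carries one lookup's contribution back to the owner goroutine.
type result struct {
	node  string
	score float64
}

// lookup stands in for a blocking datastore call (hypothetical; a real one
// would query the datastore and could fail).
func lookup(node string) float64 {
	time.Sleep(10 * time.Millisecond) // simulated I/O latency
	return float64(len(node))
}

// SumScores starts one goroutine per lookup. Only the calling goroutine
// reads or writes the totals map, so the map needs no synchronization.
func SumScores(nodes []string, fetch func(string) float64) map[string]float64 {
	results := make(chan result)
	var wg sync.WaitGroup
	for _, n := range nodes {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			results <- result{n, fetch(n)}
		}(n)
	}
	// Close the channel once every worker has sent its result.
	go func() {
		wg.Wait()
		close(results)
	}()

	totals := make(map[string]float64)
	for r := range results {
		totals[r.node] += r.score
	}
	return totals
}

func main() {
	start := time.Now()
	totals := SumScores([]string{"a", "bb", "ccc", "a"}, lookup)
	fmt.Println(totals, time.Since(start))
}
```

Because the lookups overlap, the elapsed time should be close to one lookup's latency rather than the sum of all four. Each result still costs one channel send, which is the synchronization overhead mentioned above; it only pays off when the lookup is far slower than the send.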

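drdr123456 I'll update the question to reflect the fact that the process is mostly I/O bound; I'll go ahead with this and report any findings and performance benchmarks.
about 8 years ago
doucheng5209 The actual work of each node of the Bayesian network requires a datastore lookup, followed by a calculation, and possibly more datastore lookups, until the propagated probability drops below the threshold of irrelevance (currently 0.1%). Each datastore lookup needs to block, so the calculation itself is fairly cheap, but the concurrency and synchronization are expensive. I have an asynchronous Java prototype that can do it in about 10 seconds, which I can't seem to cut down, even with multiple threads running multiple queries at once (Java threads are too heavyweight).
about 8 years ago
duanci1939 ...but I think it should do better than SQL triggers.
about 8 years ago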