How do shared memory and message passing handle large data structures?

Looking at Go's and Erlang's approaches to concurrency, I noticed that both rely on message passing.

This approach removes the need for complex locks because there is no shared state.

However, consider the case of many clients wanting parallel read-only access to a single large data structure in memory -- like a suffix array.

My questions:

1. Will using shared state be faster and use less memory than message passing, as locks will mostly be unnecessary because the data is read-only, and only needs to exist in a single location?
2. How would this problem be approached in a message passing context? Would there be a single process with access to the data structure and clients would simply need to sequentially request data from it? Or, if possible, would the data be chunked to create several processes that hold chunks?
3. Given the architecture of modern CPUs & memory, is there much difference between the two solutions -- i.e., can shared memory be read in parallel by multiple cores -- meaning there is no hardware bottleneck that would otherwise make both implementations roughly perform the same?

doukuangxiu1621 2009-11-25 17:26
Yes, shared state could be faster in this case, but only if you can forgo the locks, and that is only possible if the data is strictly read-only. If it is only 'mostly read-only', you need a lock (unless you manage to write lock-free structures; be warned that they are even trickier than locks), and then you would be hard-pressed to make it perform as fast as a good message-passing architecture.
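
To make the strictly read-only case concrete, here is a minimal Go sketch, assuming a toy suffix array over a short string. The names `buildSuffixArray` and `contains` are made up for illustration, and the construction is deliberately naive. The structure is built once and never modified, so any number of goroutines can search it with no lock.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// buildSuffixArray returns the start offsets of all suffixes of text,
// sorted lexicographically. Naive construction, fine for a demo only.
func buildSuffixArray(text string) []int {
	sa := make([]int, len(text))
	for i := range sa {
		sa[i] = i
	}
	sort.Slice(sa, func(a, b int) bool { return text[sa[a]:] < text[sa[b]:] })
	return sa
}

// contains reports whether pattern occurs in text, by binary search over sa.
// It only reads text and sa, so concurrent calls need no synchronization.
func contains(text string, sa []int, pattern string) bool {
	i := sort.Search(len(sa), func(i int) bool { return text[sa[i]:] >= pattern })
	return i < len(sa) && strings.HasPrefix(text[sa[i]:], pattern)
}

func main() {
	text := "banana"
	sa := buildSuffixArray(text) // built once, never written to again

	patterns := []string{"ana", "nan", "xyz", "ban"}
	results := make([]bool, len(patterns))

	var wg sync.WaitGroup
	for i, p := range patterns {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			// Shared reads of text and sa; each goroutine writes only its own slot.
			results[i] = contains(text, sa, p)
		}(i, p)
	}
	wg.Wait()
	fmt.Println(results)
}
```

The moment any goroutine needs to modify `sa` while others read it, this stops being safe and a lock (or a different design) is required.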

Yes, you could write a 'server process' to share it. With really lightweight processes, it is no heavier than writing a small API to access the data. Think of it as an object (in the OOP sense) that 'owns' the data. Splitting the data into chunks to increase parallelism (called 'sharding' in DB circles) helps in big cases (or if the data is on slow storage).
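
A rough sketch of the server-process idea in Go, with sharding, might look like the following. Everything here (`request`, `shard`, `store`, `get`, the chunk size) is a hypothetical illustration rather than an established API. Each shard is a goroutine that owns a private copy of one chunk and answers lookups sent over its channel, and a small router picks the shard by index.

```go
package main

import "fmt"

// request asks a shard for the element at a global index;
// the answer is sent back on reply.
type request struct {
	index int
	reply chan int
}

// shard represents one goroutine-owned chunk of the data.
type shard struct {
	requests chan request
}

// startShard launches the goroutine that owns chunk. Only that goroutine
// ever touches chunk, and it serves requests one at a time.
func startShard(offset int, chunk []int) *shard {
	s := &shard{requests: make(chan request)}
	go func() {
		for req := range s.requests {
			req.reply <- chunk[req.index-offset]
		}
	}()
	return s
}

// store routes each lookup to the shard that holds the index.
type store struct {
	chunkSize int
	shards    []*shard
}

// newStore splits data into chunks of chunkSize (assumed > 0),
// giving each shard its own copy so nothing is shared.
func newStore(data []int, chunkSize int) *store {
	st := &store{chunkSize: chunkSize}
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunk := append([]int(nil), data[off:end]...)
		st.shards = append(st.shards, startShard(off, chunk))
	}
	return st
}

// get sends a message to the owning shard and waits for the reply.
// The index is assumed to be in range; a real server would validate it.
func (st *store) get(index int) int {
	reply := make(chan int, 1)
	st.shards[index/st.chunkSize].requests <- request{index: index, reply: reply}
	return <-reply
}

func main() {
	data := []int{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}
	st := newStore(data, 4)
	fmt.Println(st.get(7), st.get(2))
}
```

With a single shard this is the "one process holds the structure" design from the question: requests are served sequentially. More shards let lookups that hit different chunks proceed in parallel.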

Even as NUMA becomes mainstream, you still get more and more cores per NUMA cell. A big difference is that a message can be passed between just two cores, while a lock has to be flushed from cache on ALL cores, which limits it to the inter-cell bus latency (even slower than RAM access). If anything, shared state with locks is becoming less and less feasible.

In short: get used to message passing and server processes; it's all the rage.

Edit: revisiting this answer, I want to add a note about a phrase found in Go's documentation:

Do not communicate by sharing memory; instead, share memory by communicating.

The idea is: when a block of memory is shared between threads, the typical way to avoid concurrent access is to use a lock to arbitrate. The Go style is to pass a message containing the reference; a thread only accesses the memory after receiving the message. It relies on some measure of programmer discipline, but results in very clean-looking code that can be easily proofread, so it's relatively easy to debug.
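
As a small illustration of that style, here is a sketch in which only a pointer travels over the channel. The `index` type and the `build` and `consume` functions are invented for the example. Nothing in the language enforces the handoff; the sender simply does not touch the structure after sending it, which is the discipline mentioned above.

```go
package main

import "fmt"

// index stands in for a large structure; only a pointer to it is sent.
type index struct {
	words []string
}

// build fills the structure and hands it off. By convention it never
// touches idx again after the send.
func build(out chan<- *index) {
	idx := &index{}
	for _, w := range []string{"alpha", "beta", "gamma"} {
		idx.words = append(idx.words, w)
	}
	out <- idx // ownership passes to the receiver; no data is copied
}

// consume receives the pointer and becomes the only goroutine that
// accesses the structure, so it can even modify it without a lock.
func consume(in <-chan *index, done chan<- int) {
	idx := <-in
	idx.words = append(idx.words, "delta")
	done <- len(idx.words)
}

func main() {
	handoff := make(chan *index)
	done := make(chan int)
	go build(handoff)
	go consume(handoff, done)
	fmt.Println(<-done)
}
```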

The advantage is that you don't have to copy big blocks of data on every message, and you don't have to flush caches as some lock implementations effectively do. It's still somewhat early to say whether the style leads to higher-performance designs. (Especially since the current Go runtime is somewhat naive about thread scheduling.)
The asker selected this answer as the best answer.
