Broadcasting data between distributed server instances

I'm trying to get some feedback on recommendations for a service "roster" in my specific application. I have a server app that maintains persistent socket connections with clients. I want to develop the server further to support distributed instances. Server "A" would need to be able to broadcast data to the other online server instances, and the same goes for all other active instances.

Options I am trying to research:

  1. Redis / ZooKeeper / Doozer - Each server instance would register itself with the configuration service, and all connected servers would receive configuration updates as they change. What then?
    1. Maintain end-to-end connections with each server instance and iterate over the list for each outgoing message (see the sketch after this list)?
    2. Some custom UDP multicast, but I would need to roll my own added reliability on top of it.
  2. Custom message broker - A service that runs and maintains a registry as each server connects and informs it. Maintains a connection with each server to accept data and re-broadcast it to the other servers.
  3. Some reliable UDP multicast transport where each server instance just broadcasts directly and no roster is maintained.
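
To make option 1.1 concrete, here is a minimal Go sketch of a peer roster of persistent TCP connections with a broadcast fan-out, as referenced above. The `Roster` type, the addresses, and the discovery hookup are invented for illustration; only the iterate-and-write pattern is the point.

```go
package roster

import (
	"log"
	"net"
	"sync"
)

// Roster holds one persistent TCP connection per discovered peer instance.
// In option 1.1 this map would be kept in sync with the configuration
// service (Redis / ZooKeeper / Doozer) as instances come and go.
type Roster struct {
	mu    sync.RWMutex
	peers map[string]net.Conn // keyed by peer address, e.g. "10.0.0.2:9000"
}

func New() *Roster {
	return &Roster{peers: make(map[string]net.Conn)}
}

// Add dials a newly discovered peer and keeps the connection open.
func (r *Roster) Add(addr string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	r.mu.Lock()
	r.peers[addr] = conn
	r.mu.Unlock()
	return nil
}

// Broadcast writes the message to every open peer connection; this is the
// per-message fan-out cost the option implies.
func (r *Roster) Broadcast(msg []byte) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for addr, conn := range r.peers {
		if _, err := conn.Write(msg); err != nil {
			log.Printf("write to peer %s failed: %v", addr, err)
		}
	}
}
```

This also makes the scaling concern explicit: every published message costs one write per peer, so the fan-out work grows linearly with the number of instances in the roster.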

Here are my concerns:

  • I would love to avoid relying on external apps like ZooKeeper or Doozer, but I would obviously use them if that's the best solution.
  • With a custom message broker, I wouldn't want it to become a throughput bottleneck, which might mean I'd also have to run multiple message brokers and use a load balancer when scaling.
  • Multicast doesn't require any external processes if I manage to roll my own, but otherwise I might need to use something like ZeroMQ, which again puts me back into external dependencies.

I realize that I am also talking about message delivery, but it goes hand in hand with whichever solution I go with. By the way, my server is written in Go. Any recommendations on the best way to maintain scalability?

* EDIT of goal *

What I am really asking is what is the best way to implement broadcasting data between instances of a distributed server given the following:

  1. Each server instance maintains persistent TCP socket connections with its remote clients and passes messages between them.
  2. Messages need to be broadcast to the other running instances so they can be delivered to the relevant client connections.
  3. Low latency is important because the messaging can be high speed.
  4. Sequencing and reliability are important.

* Updated Question Summary *

If you have multiple servers / multiple endpoints that need to pub/sub between each other, what is a recommended mode of communication between them? One or more message brokers that re-publish messages to a roster of the discovered servers? Reliable multicast directly from each server? How do you connect multiple endpoints in a distributed system while keeping latency low, speed high, and delivery reliable?

udp
doucuo4413: Let's continue this discussion in chat. (nearly 9 years ago)
duanji1482: Even though Go can select over channels and goroutines using epoll under the hood, I think the question still isn't specific to that. Clients connect over persistent TCP sockets which, if I'm correct, still consume file descriptors, and also consume ports. If a site like YouTube can easily have hundreds of thousands of people on it at once, and I want to let all of them stay connected to the server, then I need to be able to scale out instances and still cross-broadcast between them. (nearly 9 years ago)
dounao1856: It sounds like you're asking a very specific architecture question - maybe rephrase or clarify your exact end goal so we can answer it better? (nearly 9 years ago)
dpd7195: Which file descriptor limit are you worried about? On Linux you're bounded first by ulimit (which can be raised). Go uses epoll under the hood, so it doesn't inherit select's 1024-fd limit. If you're hitting the port limit with ~30,000 socket pairs, you could consider splitting across multiple IPs. (Do you really have 30,000 separate clients?) (nearly 9 years ago)
doujiu8918: I can see this question is really heading toward low-latency messaging rather than service discovery. The discovery part is more settled, since services don't come and go all the time - e.g. you start 4 instances, maybe one crashes and needs to be restarted, maybe at some point you have to start another one to scale up. (nearly 9 years ago)
duanlu0559: It's a high-speed messaging server, so latency should stay low. Messages come in and are broadcast to channel subscribers, but with persistent socket servers I will eventually hit both the file descriptor limit and the client port range limit. So I'll have to run multiple instances and still let messages hop over to the queues of the other instances, to be distributed to anyone subscribed to the same channels on those instances. (nearly 9 years ago)
douxiongye5779: What are your latency requirements? (nearly 9 years ago)
doukunsan5553: Worth mentioning now that Redis will inevitably be part of the system anyway, as persistent storage for message history. So I think it may be the obvious route for registering services and notifying them via its pub/sub features. (nearly 9 years ago)

1 answer

Assuming all of your client-facing endpoints are on the same LAN (which they can be for the first reasonable step in scaling), reliable UDP multicast would allow you to send published messages directly from the publishing endpoint to any of the endpoints that have clients subscribed to the channel. This also satisfies the low-latency requirement much better than proxying data through a persistent storage layer.

Multicast groups

  • A central database (say, Redis) could track a map of multicast groups (IP:PORT) <--> channels.
  • When an endpoint receives a new client with a new channel to subscribe to, it can ask the database for the channel's multicast address and join the multicast group (a minimal joining sketch follows this list).
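
As a rough sketch of the joining step (assuming the channel-to-group mapping has already been read from the database; the group address is passed in directly, and `joinChannelGroup`/`readLoop` are placeholder names), the Go standard library can join a multicast group like this:

```go
package subscriber

import (
	"log"
	"net"
)

// joinChannelGroup joins the multicast group that the central database maps
// to a channel. groupAddr would normally be looked up in Redis (for example
// under a key like "channel:<name>:group"); here it is passed in directly.
func joinChannelGroup(groupAddr string) (*net.UDPConn, error) {
	addr, err := net.ResolveUDPAddr("udp4", groupAddr) // e.g. "239.0.0.17:5300"
	if err != nil {
		return nil, err
	}
	// ListenMulticastUDP joins the group on a system-chosen interface and
	// returns a socket that receives packets sent to that group.
	return net.ListenMulticastUDP("udp4", nil, addr)
}

// readLoop hands each datagram received from the group to a delivery callback.
func readLoop(conn *net.UDPConn, deliver func([]byte)) {
	buf := make([]byte, 64*1024)
	for {
		n, _, err := conn.ReadFromUDP(buf)
		if err != nil {
			log.Printf("multicast read error: %v", err)
			return
		}
		msg := make([]byte, n)
		copy(msg, buf[:n])
		deliver(msg)
	}
}
```

Publishing to that channel is then just an ordinary UDP write addressed to the same group.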

Reliable UDP multicast

  • When an endpoint receives a published message for a channel, it sends the message to that channel's multicast socket.
  • Message packets will contain ordered identifiers per server per multicast group. If an endpoint receives a message without receiving the previous message from a server, it will send a "not acknowledged" message for any messages it missed back to the publishing server.
  • The publishing server tracks a list of recent messages, and resends NAK'd messages.
  • To handle the edge case of a server sending only one message and having it fail to reach another server, servers can send a packet count to the multicast group over the lifetime of their NAK queue: "I've sent 24 messages", giving other servers a chance to NAK previous messages. (A small sketch of this bookkeeping follows the list.)
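
A minimal sketch of that per-sender sequencing and NAK bookkeeping might look like the following; the `Packet` framing and the type names are invented for illustration, and a real implementation (or PGM, mentioned below) also has to bound the retransmit window, carry NAKs over some channel, and cope with senders restarting.

```go
package relmcast

import "sync"

// Packet is a hypothetical wire format: every sender stamps its packets with
// a monotonically increasing sequence number per multicast group.
type Packet struct {
	Sender string // sender identifier, e.g. instance address
	Seq    uint64 // per-sender sequence number within this group
	Data   []byte
}

// SendWindow remembers recently sent packets so NAK'd sequence numbers can be
// retransmitted. A real implementation would bound and expire this window.
type SendWindow struct {
	mu      sync.Mutex
	nextSeq uint64
	recent  map[uint64]Packet
}

func NewSendWindow() *SendWindow {
	return &SendWindow{recent: make(map[uint64]Packet)}
}

// Stamp assigns the next sequence number and records the packet for resends.
func (w *SendWindow) Stamp(sender string, data []byte) Packet {
	w.mu.Lock()
	defer w.mu.Unlock()
	p := Packet{Sender: sender, Seq: w.nextSeq, Data: data}
	w.recent[p.Seq] = p
	w.nextSeq++
	return p
}

// Resend returns the packets a receiver reported missing in a NAK.
func (w *SendWindow) Resend(missing []uint64) []Packet {
	w.mu.Lock()
	defer w.mu.Unlock()
	out := make([]Packet, 0, len(missing))
	for _, seq := range missing {
		if p, ok := w.recent[seq]; ok {
			out = append(out, p)
		}
	}
	return out
}

// RecvState tracks the highest sequence seen from each sender; a gap means
// one or more packets were lost and should be NAK'd back to that sender.
type RecvState struct {
	lastSeq map[string]uint64
}

func NewRecvState() *RecvState {
	return &RecvState{lastSeq: make(map[string]uint64)}
}

// Missing returns the sequence numbers to NAK for an incoming packet. The
// first packet from a sender is taken as the baseline; the packet-count
// announcement described above covers anything missed before it.
func (r *RecvState) Missing(p Packet) []uint64 {
	last, seen := r.lastSeq[p.Sender]
	if !seen || p.Seq > last {
		r.lastSeq[p.Sender] = p.Seq
	}
	if !seen || p.Seq <= last+1 {
		return nil // in order, a duplicate, or the first packet from this sender
	}
	var gaps []uint64
	for seq := last + 1; seq < p.Seq; seq++ {
		gaps = append(gaps, seq)
	}
	return gaps
}
```

The size of the retransmit window directly trades memory on the publisher against how far a receiver can fall behind and still recover.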

You might want to just implement PGM (Pragmatic General Multicast).

Persistent storage

If you do end up storing data long-term, storage services can join the multicast groups just like endpoints... but store the messages in a database instead of sending them to clients.
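
Such a storage service could reuse the same kind of multicast read loop sketched above and simply persist each message; the `Store` interface below is a hypothetical stand-in for whatever database ends up being used:

```go
package archiver

// Store is a placeholder for whatever persistence layer is chosen
// (Redis, an SQL database, etc.).
type Store interface {
	Save(channel string, msg []byte) error
}

// PersistingDeliver returns a delivery callback that can be plugged into the
// same multicast read loop an endpoint uses, but writes each message to the
// store instead of pushing it out to client sockets.
func PersistingDeliver(channel string, store Store) func([]byte) {
	return func(msg []byte) {
		// Errors are ignored here for brevity; a real service would retry
		// or at least log them.
		_ = store.Save(channel, msg)
	}
}
```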
