dongluxin2452 2013-12-27 13:24
93 views
Accepted

Handling a very high number of requests per second in Go or Node.js

I'm developing a web app that needs to handle bursts of very high load: once per minute I get a burst of requests packed into just a few seconds (~1M-3M req/sec), and then for the rest of the minute I get nothing.

What's my best strategy to handle as many req/sec as possible at each front server: just send a reply and store the request in memory somehow, to be processed in the background by the DB-writer worker later?

The aim is to do as little as possible during the burst, and to write the requests to the DB as soon as possible after the burst.

Edit: the order of transactions is not important, and we can lose some transactions, but 99% need to be recorded. The latency of getting all requests into the DB can be a few seconds after the last request has been received; let's say no more than 15 seconds.
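
The pattern being asked about here can be sketched very compactly in Go: the handler only acknowledges and parks the request in a big in-memory buffer, and a background writer drains it into the DB in batches. This is only a sketch of that shape; the `id` query parameter, buffer sizes, flush interval, and the elided `flush` helper are all illustrative assumptions, not a tuned design.

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// pending is a large in-memory buffer sized to absorb one burst.
var pending = make(chan string, 3000000)

func handler(w http.ResponseWriter, r *http.Request) {
	select {
	case pending <- r.URL.Query().Get("id"):
		w.WriteHeader(http.StatusOK) // ack immediately, do nothing else
	default:
		w.WriteHeader(http.StatusServiceUnavailable) // buffer full: shed load
	}
}

// dbWriter drains the buffer in batches, during and after the burst.
func dbWriter() {
	batch := make([]string, 0, 10000)
	flushTick := time.NewTicker(time.Second)
	for {
		select {
		case id := <-pending:
			batch = append(batch, id)
			if len(batch) == cap(batch) {
				flush(batch)
				batch = batch[:0]
			}
		case <-flushTick.C:
			if len(batch) > 0 {
				flush(batch)
				batch = batch[:0]
			}
		}
	}
}

// flush would do one bulk INSERT per batch; elided here.
func flush(batch []string) {}

func main() {
	go dbWriter()
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```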


2 answers

  • dpd7122 2013-12-27 21:51

    This question is kind of vague. But I'll take a stab at it.

    1) You need limits. A naive implementation will open millions of connections to the DB, which will obviously perform badly. At the very least, each connection eats megabytes of RAM on the DB. Even with connection pooling, each 'thread' could take a lot of RAM to record its (incoming) state.
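
    For the DB side of those limits, Go's database/sql pool can enforce a hard cap itself. A minimal sketch follows; the Postgres driver and DSN are placeholder assumptions, the point is only the fixed connection cap.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // any database/sql driver works; Postgres is an assumption
)

func main() {
	// DSN is a placeholder; swap in your own.
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/app?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// No matter how many requests are in flight, the DB only ever sees a
	// small, fixed number of connections; everything else waits its turn.
	db.SetMaxOpenConns(32)
	db.SetMaxIdleConns(32)
}
```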

    If your app server has a limited number of processing threads, you can use HAProxy to "pick up the phone" and buffer the request in a queue for a few seconds until there is a free thread on your app server to handle it.
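
    Not HAProxy itself, but the same "pick up the phone, then wait for a free worker" idea can be sketched inside a Go server with a semaphore channel; the limit of 64 concurrent requests is an illustrative assumption.

```go
package main

import (
	"log"
	"net/http"
)

// slots plays the role of the limited number of processing threads:
// at most 64 requests are worked on at once, the rest wait briefly.
var slots = make(chan struct{}, 64)

func handler(w http.ResponseWriter, r *http.Request) {
	slots <- struct{}{}        // wait for a free slot
	defer func() { <-slots }() // release it when done
	// ... real per-request work goes here ...
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```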

    In fact, you could just use a web server like nginx to take the request and say "200 OK". Then, later, a simple app reads the web log and inserts the entries into the DB. This will scale pretty well, although you probably want one thread reading the log and several threads doing the inserts.
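
    A rough Go sketch of that log-draining app, assuming nginx has been configured to log the one piece of data you need as the last field of each line; the log path, field position, worker count, and the insertDB helper are all placeholder assumptions.

```go
package main

import (
	"bufio"
	"log"
	"os"
	"strings"
	"sync"
)

func main() {
	f, err := os.Open("/var/log/nginx/access.log") // assumed log location
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	ids := make(chan string, 100000)
	var wg sync.WaitGroup

	// Several inserters: one reader, many writers.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				insertDB(id) // placeholder for a (batched) INSERT
			}
		}()
	}

	// A single reader walks the log and extracts only the ID field.
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) > 0 {
			ids <- fields[len(fields)-1] // assumes the ID is the last field
		}
	}
	close(ids)
	wg.Wait()
}

func insertDB(id string) { /* DB write elided */ }
```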

    2) If your language has coroutines, it may be better to handle the buffering yourself. You should measure the overhead of relying on your language runtime for scheduling.

    For example, if each HTTP request is 1K of headers + data, you want to parse it and throw away everything but the one or two pieces of data that you actually need (i.e. the DB ID). If you rely on your language's coroutines as an 'implicit' queue, each coroutine will hold its 1K buffer while the request is being parsed. In some cases, it's more efficient/faster to have a finite number of workers and manage the queue explicitly. When you have a million things to do, small overheads add up quickly, and the language runtime won't always be optimized for your app.
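
    A sketch of that explicit-queue shape in Go: raw request bytes go into one bounded channel, a small fixed pool of workers parses them, and only the extracted ID survives. The parse logic, queue sizes, and worker count are illustrative assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

const numWorkers = 8

// extractID pretends to parse a ~1K request and keep only the ID field.
func extractID(raw []byte) string {
	i := bytes.Index(raw, []byte("id="))
	if i < 0 {
		return ""
	}
	rest := raw[i+3:]
	end := bytes.IndexAny(rest, "& \r\n")
	if end < 0 {
		return string(rest)
	}
	return string(rest[:end])
}

func main() {
	rawQueue := make(chan []byte, 4096) // explicit, bounded backlog
	ids := make(chan string, 4096)      // only tiny items survive parsing

	var wg sync.WaitGroup
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for raw := range rawQueue {
				if id := extractID(raw); id != "" {
					ids <- id
				}
				// raw is now garbage-collectable; only the small ID is kept
			}
		}()
	}
	go func() { wg.Wait(); close(ids) }()

	rawQueue <- []byte("POST /track?id=42 HTTP/1.1\r\nHost: example\r\n\r\n")
	close(rawQueue)

	for id := range ids {
		fmt.Println("queued for DB:", id)
	}
}
```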

    Also, Go will give you far better control over your memory than Node.js. (Structs are much smaller than objects: the 'overhead' of the keys of your struct is a compile-time thing in Go, but a run-time thing in Node.js.)

    3) How do you know it's working? You want to be able to know exactly how you are doing. When you rely on language coroutines, it's not easy to ask "how many threads of execution do I have, and what's the oldest one?" If you build an explicit queue, those questions are much easier to answer. (Imagine a handful of workers putting stuff into the queue, and a handful of workers pulling stuff out. There is a little uncertainty around the edges, but the queue in the middle very explicitly captures your backlog. You can easily calculate things like "drain rate" and "max memory usage", which are very important to knowing how overloaded you are.)
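
    With an explicit queue, those questions become one-liners. A sketch of that kind of instrumentation; the item shape, queue size, and one-second reporting interval are illustrative assumptions, not a prescribed design.

```go
package main

import (
	"log"
	"sync/atomic"
	"time"
)

type item struct {
	id       string
	enqueued time.Time
}

var (
	queue   = make(chan item, 100000)
	drained atomic.Int64 // items written to the DB so far
)

// reporter periodically prints backlog size and drain rate.
func reporter() {
	prev := int64(0)
	for range time.Tick(time.Second) {
		now := drained.Load()
		log.Printf("backlog=%d drain_rate=%d/s", len(queue), now-prev)
		prev = now
	}
}

func worker() {
	for it := range queue {
		// age of the oldest work currently being handled (FIFO order)
		_ = time.Since(it.enqueued)
		// ... write it.id to the DB ...
		drained.Add(1)
	}
}

func main() {
	go reporter()
	go worker()
	queue <- item{id: "42", enqueued: time.Now()}
	time.Sleep(3 * time.Second)
}
```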

    My advice: Go with Go. Long term, Go will be a much better choice. The Go runtime is a bit immature right now, but every release is getting better. Node.js is probably slightly ahead in a few areas (maturity, size of community, libraries, etc.)

    This answer was accepted by the asker.
