weixin_39619635
2020-12-09 00:43

Activities performance analysis

Model Loading

Here's a summary of things to take into account with this particular benchmark:

- Data-load creates 20,000 total content items in 2 concurrent processes. Total data-load time is 48min in Test #1 and 40min in Test #2.
- All activities measured are either content-create or content-comment, both of which route to all content members.
- All content is created with its full membership set. This is assumed to differ from a real-world scenario, where content is created with some members and more are added over time.
- Activity routing and activity collection are competing with each other for resources, and activity routing is dominating because of a bug in our MQ rate limiting (fix is in PP branch). This should not take away from the fact that, regardless of who is competing for resources, the system needs to route about 2200 routed activities per second to keep up with the usage.
- In Tests #1 and #2, Cassandra is close to full capacity. In Test #3 we let the collection routine apply more pressure on Cassandra (basically doubling it by doubling concurrency), and Cassandra becomes unresponsive.
- To control the size of aggregates, idle expiry was decreased to 5min (from the default 3hrs), and max expiry was decreased to 15min (from the default 1 day).
- In Test #1, the last activities delivered to user routes were up to an hour and a half old, which is not acceptable.
- Despite this scenario being unrealistic, we anticipate collection performance will need to be improved. The improvement expected to give the best result is to offload at least the bucketing into Redis (potentially a dedicated server, but we'll start with the same central one). This offloads ~2000 column-writes-per-second from Cassandra, as well as the very large (500 at a time) batch reads from the buckets. It should also significantly improve collection performance, as there is no thrift serialization involved. Memory control will need to be considered, and a TTL may need to be put in place on bucketed items to put a hard limit on how much the buckets could grow under unanticipated load.
- After buckets are stored in Redis, it would also be possible to store aggregate status and aggregated entities in Redis. This would remove a very large number of reads and writes from Cassandra and further improve collection performance (again, no thrift serialization). However, memory becomes even more of a concern, and we'll need to be careful with the "aggregateMaxExpiry" property to ensure it is low enough to sufficiently control memory growth. If this is completed as well as the previous suggestion, then the only part of activities stored in Cassandra is the streams themselves.
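
To make the bucketing proposal concrete, here is a minimal sketch in Python of buckets with a TTL on bucketed items and batched collection. It is an in-memory stand-in, not Hilary's code and not a Redis client: the `BucketStore` class, the CRC32-based route hashing, the route string format and the 600 second TTL are all assumptions made for illustration. Only the batch size of 500 comes from the description above.

```python
import time
import zlib
from collections import defaultdict


class BucketStore:
    """Hypothetical in-memory stand-in for Redis-backed collection buckets."""

    def __init__(self, num_buckets, ttl_seconds, clock=time.monotonic):
        self.num_buckets = num_buckets
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self.buckets = defaultdict(list)  # bucket number -> [(expires_at, item)]

    def bucket_for(self, route):
        # Assumption: a stable hash of the route picks the bucket, so all
        # activities for one route are collected together.
        return zlib.crc32(route.encode("utf-8")) % self.num_buckets

    def push(self, route, activity_id):
        """Queue a routed activity and return the bucket it landed in."""
        bucket = self.bucket_for(route)
        expires_at = self.clock() + self.ttl_seconds
        self.buckets[bucket].append((expires_at, (route, activity_id)))
        return bucket

    def collect(self, bucket, batch_size=500):
        """Remove and return up to batch_size unexpired items, oldest first."""
        now = self.clock()
        # Expired items are dropped here, which is what bounds bucket growth.
        live = [entry for entry in self.buckets[bucket] if entry[0] > now]
        batch, rest = live[:batch_size], live[batch_size:]
        self.buckets[bucket] = rest
        return [item for _, item in batch]
```

One caveat for a real implementation: Redis sets expiry per key rather than per list element, so a per-item TTL like the one sketched here would need a scheme of its own (for example time-windowed keys), which is left open.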

Here are the results that led to this:


Test #1:
===========

Setup:

3 cassandra nodes
3 activity nodes
6 collection buckets
2 concurrent collections allowed per node

When activities are running at:                           18.74 activities per second   (all content-create, fully populated groups and content created with full members)
Routed activities are placed into buckets at a rate of:   1891 per second               (translates to approx. 100 routes per activity)
Routed activity collection throughput:                    135 per second


Test #2:
===========

Setup:

6 cassandra nodes
6 activity nodes
12 buckets
2 concurrent collections allowed per node

When activities are running at:                           21.4 activities per second    (all content-create, fully populated groups and content created with full members)
Routed activities are placed into buckets at a rate of:   2239 per second               (translates to approx. 100 routes per activity)
Routed activity collection throughput:                    312 per second


Test #3:
===========

(Cassandra becomes unresponsive)

Setup:

6 cassandra nodes
6 activity nodes
24 buckets
4 concurrent collections allowed per node

When activities are running at:                           20.88 activities per second   (all content-create, fully populated groups and content created with full members)
Routed activities are placed into buckets at a rate of:   2228 per second               (translates to approx. 100 routes per activity)
Routed activity collection throughput:                    297 per second
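
The ratios behind these figures can be checked with a few lines of Python. This is only a sketch over the numbers reported above; the dictionary layout and the `summarize` helper are illustrative names, and dividing collection throughput by the number of activity nodes assumes collection runs on the activity nodes.

```python
# Rates copied from the three test reports above (all per second).
tests = {
    "Test #1": {"activities": 18.74, "bucketed": 1891, "collected": 135, "activity_nodes": 3},
    "Test #2": {"activities": 21.4, "bucketed": 2239, "collected": 312, "activity_nodes": 6},
    "Test #3": {"activities": 20.88, "bucketed": 2228, "collected": 297, "activity_nodes": 6},
}


def summarize(t):
    """Derive fan-out and collection ratios from one test's reported rates."""
    return {
        # How many routes each activity fans out to.
        "routes_per_activity": t["bucketed"] / t["activities"],
        # Collection throughput as a fraction of the bucketing rate.
        "collected_share": t["collected"] / t["bucketed"],
        # Collection throughput per activity node (assumes nodes do the collecting).
        "collected_per_node": t["collected"] / t["activity_nodes"],
    }


for name, t in tests.items():
    s = summarize(t)
    print(
        f"{name}: {s['routes_per_activity']:.1f} routes/activity, "
        f"{s['collected_share']:.1%} of bucketed rate collected, "
        f"{s['collected_per_node']:.1f} collected/s per activity node"
    )
```

The fan-out comes out at roughly 101, 105 and 107 routes per activity, consistent with the "approx. 100" noted in each test. Collection throughput is roughly 7%, 14% and 13% of the bucketing rate, so doubling the concurrency in Test #3 did not raise collection throughput over Test #2.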

This question comes from the open-source project: oaeproject/Hilary
