Confused about how to build a scalable system

I need some guidance on how to properly build out a system that will be able to scale. I will give you some information about what I am trying to do and then ask my specific question.

I have a site where I want visitors to send some data to be processed. They input the data into a textarea or upload it in a file. Simple. The data is somewhat preprocessed on the client side before a POST request is made to a REST endpoint.

What I am stuck on is finding a good way to take this posted data, store it, and associate an ID with it that references the user, since I cannot process the data fast enough to return it to the user in a reasonable amount of time.

This question is a bit vague and open to opinion, I admit. I just need a push in the right direction to keep moving. What I have been considering is putting the data into a message queue and having some workers process it elsewhere; when the data is processed, the user would be alerted as to where to find it, with some sort of link to an S3 bucket or just a URL to a file. The other idea was to have the client loop over the items and send one request per item to another endpoint that already processes individual records. The problem with this second idea is as follows:

Processing the data may take anywhere from 30 minutes to 2 hours, depending on how many records they want processed. It's not ideal for them to just sit there and wait for that to finish, so I have mostly ruled this out.

Any guidance would be very much appreciated as I don't have any coworkers to bounce things off of, nor do I know many people with the domain knowledge that I could freely ask. If this isn't the right place to ask this, could you point me in the right direction as to where it should be asked?

Chris

douhui5529 2017-09-15 06:19
If I've got you right, your pipeline is:

1. Accept item from user
2. Possibly preprocess/validate it (?)
3. Put into some queue
4. Process data
5. Return result
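A minimal in-process sketch of these stages, assuming Python and using the standard-library `queue.Queue` as a stand-in for a real message broker. The names `accept`, `process_next`, `status`, and the `jobs` table are hypothetical and only illustrate how a job ID can tie a submission to a user while the work happens later:

```python
import queue
import uuid

jobs = {}                # job_id -> {"user_id", "status", "result"} (stand-in for a database table)
pending = queue.Queue()  # stand-in for a real message broker


def accept(user_id, data):
    """Validate the submission, record it, enqueue it, and return its ID at once."""
    if not data.strip():
        raise ValueError("empty submission")
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"user_id": user_id, "status": "queued", "result": None}
    pending.put((job_id, data))
    return job_id


def process_next():
    """Take one job off the queue and process it."""
    job_id, data = pending.get()
    jobs[job_id]["status"] = "processing"
    # Placeholder for the real work: count the submitted records (lines).
    jobs[job_id]["result"] = len(data.splitlines())
    jobs[job_id]["status"] = "done"
    pending.task_done()
    return job_id


def status(job_id):
    """What the user would poll, or what a notification would carry."""
    job = jobs[job_id]
    return job["status"], job["result"]
```

The point of the sketch is that `accept` returns immediately with an ID, so the HTTP request never has to wait for the long-running processing.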

You may use one or several queues at stage (3). An entity from the user gets added to one of the queues. If it's big enough, it could be stored in S3 or similar storage, with only information about it put into the queue: link, date added, user ID (or email or the like). Processors can pull items from the queue and give feedback to users.
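A rough sketch of that idea, assuming Python. A temporary local directory stands in for S3 here, and the `INLINE_LIMIT` threshold, the message field names, and the helper functions are all hypothetical choices made for illustration, not part of any real API:

```python
import json
import queue
import tempfile
import uuid
from datetime import datetime, timezone
from pathlib import Path

INLINE_LIMIT = 1024                     # bytes; hypothetical cutoff for "big enough"
STORAGE_DIR = Path(tempfile.mkdtemp())  # local stand-in for an S3 bucket
work_queue = queue.Queue()              # stand-in for a real message broker


def enqueue_submission(user_id, text):
    """Queue small payloads inline; store large ones and queue only a link."""
    message = {
        "user_id": user_id,
        "added": datetime.now(timezone.utc).isoformat(),
    }
    if len(text.encode("utf-8")) > INLINE_LIMIT:
        path = STORAGE_DIR / f"{uuid.uuid4().hex}.txt"
        path.write_text(text, encoding="utf-8")
        message["link"] = str(path)
    else:
        message["data"] = text
    work_queue.put(json.dumps(message))
    return message


def load_payload(raw_message):
    """Processor side: resolve a queue message back to the submitted text."""
    message = json.loads(raw_message)
    if "link" in message:
        return Path(message["link"]).read_text(encoding="utf-8")
    return message["data"]
```

Keeping queue messages small this way means the broker only carries metadata, while the bulk data sits in storage that the processors read on demand.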

If you have no strict requirements on ordering, things get much simpler: you don't need any synchronization between the queues or the processors. Treat all the components (upload acceptors, queues, storage, and processors) as independent pools of processes. Monitor each pool separately. If there is a bottleneck, add machines to that pool.
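As a small illustration of a processor pool with no ordering guarantees, here is a sketch assuming Python, with threads standing in for separate machines or processes. `process` and `run_pool` are hypothetical names, and the doubling is only a placeholder for the real work:

```python
import queue
import threading


def process(item):
    """Placeholder for the real, slow processing step."""
    return item * 2


def run_pool(items, worker_count):
    """Drain a shared queue with `worker_count` independent workers."""
    tasks = queue.Queue()
    results = queue.Queue()
    for item in items:
        tasks.put(item)

    def worker():
        # Each worker pulls whatever is next; no coordination with its peers.
        while True:
            try:
                item = tasks.get_nowait()
            except queue.Empty:
                return
            results.put(process(item))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Results arrive in completion order, not submission order.
    return [results.get() for _ in range(len(items))]
```

Scaling the pool is then just a matter of changing `worker_count`; in a real deployment the equivalent knob would be the number of worker machines consuming from the broker.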
This answer was selected by the asker as the best answer.

Question tagged rabbitmq, posted 2017-09-15 05:55.
