I need some guidance on how to properly build out a system that will be able to scale. I will give you some information about what I am trying to do and then ask my specific question.
I have a site where I want visitors to send some data to be processed. They input the data into a textarea or upload it in a file. Simple. The data is somewhat preprocessed on the client side before a POST request is made to a REST endpoint.
What I am stuck on is what is a good way to take this posted data store it and then associate an id with it that references the user since I cannot process the data fast enough for it to be returned to the user in a reasonable amount of time?
This question is a bit vague and open to opinion, I admit it. I just need a push in the right direction to keep moving. What I have been considering is throwing the data into a message queue and then having some workers process the data elsewhere and when the data is processed alert the user as to where to find it with some sort of link to an S3 bucket or just a URL to a file. The other idea was to just run the request for each item to be processed against another end-point that already processes individual records in some sort of loop client side. The problem is as follows with this idea:
To process the data it may take somewhere from 30 minutes to 2 hours depending upon the amount that they want processed. It's not ideal for them to just sit there and wait for that to finish depending on the amount of records they need processed, so I have ruled this out mostly.
Any guidance would be very much appreciated as I don't have any coworkers to bounce things off of, nor do I know many people with the domain knowledge that I could freely ask. If this isn't the right place to ask this, could you point me in the right direction as to where it should be asked?
Chris