Thank you for helping.
My Spec:
- I have a very large CSV file
- I need to process the file and output another large CSV file
My environment:
- PHP 7.0
- Symfony 3.4 framework
My current solution:
I am using the Symfony Serializer component to read the file, then I process the data and write the output file, all in memory. The operation takes around 20 minutes to complete.
I run the script from a Symfony command. I have a service to read the file and a service to export the file.
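Simplified, the read/process/write part currently looks roughly like this (file names and processRow() are placeholders for my real paths and business logic):

```php
<?php
// Simplified sketch of the current in-memory approach.
use Symfony\Component\Serializer\Encoder\CsvEncoder;
use Symfony\Component\Serializer\Serializer;

$serializer = new Serializer([], [new CsvEncoder()]);

// The whole input file is decoded into one big array: this is where the memory goes.
$rows = $serializer->decode(file_get_contents('input.csv'), 'csv');

$processed = [];
foreach ($rows as $row) {
    $processed[] = processRow($row); // processRow() stands in for the real processing
}

// The whole result is encoded and written in one go as well.
file_put_contents('output.csv', $serializer->encode($processed, 'csv'));
```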
Improved solution 1:
- I can load the CSV file into a database table, as explained here, with a "LOAD DATA LOCAL INFILE" SQL query. This is a very fast operation (see the sketch after this list).
- I can then process my data and save it to another table
- Then I would use "SELECT ... INTO OUTFILE 'file_name'" to output the file
Advantages: SQL-centered; no memory problem.
Disadvantage: DB operations during processing might be expensive.
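Roughly, solution 1 could look like this with PDO (table names, paths, credentials and the UPPER() call are just placeholders; local_infile must be enabled, and INTO OUTFILE writes the file on the database server host):

```php
<?php
// Rough sketch of solution 1 with PDO. Table/column names, file paths and credentials
// are placeholders; MySQL must allow local_infile, and SELECT ... INTO OUTFILE writes
// the file on the database server host (subject to secure_file_priv).
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass', [
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,
]);

// 1. Bulk-load the source CSV into a staging table -- very fast.
$pdo->exec("LOAD DATA LOCAL INFILE '/path/to/input.csv'
    INTO TABLE staging
    FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES");

// 2. Process with set-based SQL into a result table (UPPER() stands in for the real logic).
$pdo->exec("INSERT INTO result (col_a, col_b)
    SELECT col_a, UPPER(col_b) FROM staging");

// 3. Export the processed rows to a CSV file on the server.
$pdo->exec("SELECT col_a, col_b FROM result
    INTO OUTFILE '/var/lib/mysql-files/output.csv'
    FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'");
```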
Improved solution 2:
- I can read the CSV file line by line
- I process the line
- I write the line to the output file
Advantage: no memory issue. Disadvantage: it could take a LOT of time. (A sketch of this approach is below.)
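Roughly, solution 2 would look like this (processRow() stands in for my real transformation):

```php
<?php
// Minimal sketch of solution 2: stream the CSV row by row so memory usage stays flat.
$in  = fopen('input.csv', 'r');
$out = fopen('output.csv', 'w');

// Copy (or rewrite) the header line.
$header = fgetcsv($in);
fputcsv($out, $header);

// Read, transform and write one row at a time.
while (($row = fgetcsv($in)) !== false) {
    fputcsv($out, processRow($row)); // processRow() is a placeholder for the real logic
}

fclose($in);
fclose($out);
```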
Improved solution 3:
- I can load the CSV file into a database table, as explained here, with a "LOAD DATA LOCAL INFILE" SQL query. This is a very fast operation.
- I can then process my data in chunks of 1000 rows and write them to the output file (see the sketch below).
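Roughly, the chunked part of solution 3 could look like this, assuming the data was already bulk-loaded into a staging table as in solution 1 and the table has an auto-increment id column to paginate on (table and column names are placeholders):

```php
<?php
// Rough sketch of solution 3: page through the staging table in chunks of 1000 rows
// using keyset pagination on an id column (an assumption) and stream to the output file.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$out = fopen('output.csv', 'w');

$stmt = $pdo->prepare(
    'SELECT id, col_a, col_b FROM staging WHERE id > :lastId ORDER BY id LIMIT 1000'
);

$lastId = 0;
do {
    $stmt->execute([':lastId' => $lastId]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        $lastId = (int) $row['id'];
        unset($row['id']);
        fputcsv($out, processRow($row)); // processRow() stands in for the real transformation
    }
} while (count($rows) === 1000); // a short (or empty) chunk means we reached the end

fclose($out);
```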
Which solution would you use? Or do you have a better one?