Every week I need to load 50K~200K rows from a raw CSV file into my system.
My current solution is to load the CSV into a temp table (emptied after the process), then run a stored procedure that distributes the data into the relevant tables in my system. If a record already exists it runs an UPDATE (about 80% of the CSV records are already in my tables); if not, it runs an INSERT.
The problem I am facing now is that the tables have grown to a few million records, approximately 5~6 million each. The "SELECT ... EXISTS" check seems very slow, and after that I changed to a batched LEFT JOIN, which was also slow. Even when I load just 5K records, the stored procedure can take a few hours to finish.
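To make the current logic concrete, here is a simplified sketch of what my stored procedure does, using SQLite in Python so it is self-contained (the real tables are bigger; `staging` and `target` and their columns are placeholder names, not my actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Placeholder schema: "target" stands in for one of my system tables,
# "staging" for the temp table the CSV is loaded into.
cur.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
cur.execute("CREATE TABLE staging (id INTEGER, val TEXT)")
cur.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "old")])
cur.executemany("INSERT INTO staging VALUES (?, ?)", [(1, "new"), (3, "new")])

# Step 1: update rows that already exist in target (the ~80% case),
# using a correlated EXISTS check like my stored procedure does.
cur.execute("""
    UPDATE target
    SET val = (SELECT s.val FROM staging s WHERE s.id = target.id)
    WHERE EXISTS (SELECT 1 FROM staging s WHERE s.id = target.id)
""")

# Step 2: insert rows that are new, found via the LEFT JOIN variant
# I switched to (anti-join on NULL).
cur.execute("""
    INSERT INTO target (id, val)
    SELECT s.id, s.val
    FROM staging s
    LEFT JOIN target t ON t.id = s.id
    WHERE t.id IS NULL
""")
conn.commit()

print(sorted(cur.execute("SELECT id, val FROM target").fetchall()))
# → [(1, 'new'), (2, 'old'), (3, 'new')]
```

Both steps are set-based rather than row-by-row, yet on my real tables they still crawl, which is why I suspect the join/EXISTS comparison itself is the bottleneck.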
Are there any good, faster solutions for comparing tables to decide whether to insert or update records at this scale?
Thanks!!
Jack