I have a rather large MySQL database (5,000,000+ records in some tables) with 7 tables that are queried using several joins. The only problem is the data I am importing: some datasets have strings in the column used to join the tables, and because of that the queries are incredibly slow.
What I am trying to do is replace all of these strings with random integers (making sure each integer hasn't been used already). For each affected column (e.g. xxx_id), that includes keeping a record of old_xxx_id and new_xxx_id so that linked tables stay linked correctly when I update them.
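To make the mapping idea concrete, here is a minimal sketch of it done with set-based statements rather than one row at a time. It is only an illustration under several assumptions: it uses Python with SQLite as a stand-in for MySQL, the names `remap_ids` and `id_map` are hypothetical, every linked table is assumed to use the same column name, and it hands out sequential integers from an auto-increment key instead of random ones (uniqueness is all the joins need).

```python
import sqlite3


def remap_ids(conn, tables, column="xxx_id"):
    """Replace string keys in `column` with integers across all `tables`.

    Hypothetical helper: `id_map` plays the role of the
    old_xxx_id / new_xxx_id record described above.
    """
    cur = conn.cursor()

    # One row per distinct old string; new_id is assigned automatically
    # and is unique by construction.
    cur.execute(
        "CREATE TABLE id_map ("
        " new_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " old_id TEXT NOT NULL UNIQUE)"
    )

    # Collect every distinct old key from every linked table.
    # OR IGNORE skips keys that another table already contributed.
    for table in tables:
        cur.execute(
            f"INSERT OR IGNORE INTO id_map (old_id) "
            f"SELECT DISTINCT {column} FROM {table} "
            f"WHERE {column} IS NOT NULL"
        )

    # Rewrite each table with a single set-based UPDATE, so the same
    # old string maps to the same integer everywhere.
    for table in tables:
        cur.execute(
            f"UPDATE {table} SET {column} = "
            f"(SELECT new_id FROM id_map WHERE old_id = {table}.{column}) "
            f"WHERE {column} IS NOT NULL"
        )

    conn.commit()
```

Calling `remap_ids(conn, ["parent", "child"])` on two tables that share an `xxx_id` column would leave both tables joined on the same integers, with `id_map` holding the old-to-new record. In MySQL the equivalent would presumably be an `AUTO_INCREMENT` mapping table and a multi-table `UPDATE ... JOIN`, followed by changing the column type to an integer and indexing it, but I have not measured that on data of this size.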
My question is: what is the fastest way to do this? I would prefer PHP and MySQL, but I am open to other languages and SQL databases. I have a script that works, but it replaces values line by line and can take hours, sometimes a day or two, to insert the new data into the database.
Thanks!