I have a set of files I am trying to import into MySQL.
Each CSV file looks like this:
Header1;Header2;Header3;Header4;Header5
Data1;Data2;Data3;Data4;Data5;
Data1;Data2;Data3;Data4;Data5;
Data1;Data2;Data3;Data4;Data5;
Data1;Data2;Data3;Data4;Data5;
Data may contain spaces, periods, or colons. It will never contain a semicolon, so the semicolon is a safe delimiter. It also will not contain any newline characters.
Example Data
2010.08.30 18:34:59
0.7508
String of characters with spaces in them
Each file has a unique name. The names all conform to the following pattern:
Token1_Token2_Token3.csv
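To make the naming pattern concrete, here is a rough Python sketch of how the first token could be pulled out of a file name. The helper name `first_token` is just something I made up for illustration, and it assumes the first token never contains an underscore itself.

```python
import os


def first_token(path):
    """Return the part of the file name before the first underscore."""
    name = os.path.basename(path)        # drop any directory part
    stem, _ext = os.path.splitext(name)  # drop the ".csv" extension
    return stem.split("_", 1)[0]         # keep only the text before the first "_"


print(first_token("Token1_Token2_Token3.csv"))  # Token1
```

If a name has no underscore at all, this simply returns the name without its extension.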
I want to combine a lot of these CSV files (several hundred) into one CSV file. Individual files range from 10 KB to 400 MB. Ultimately, I want to load the combined file into MySQL. Don't worry about getting rid of the individual header rows; I can do that in MySQL easily.
I would like the final CSV file to look like this:
Header1,Header2,Header3,Header4,Header5,FileName
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
I don't care about any of the other tokens. I can also live with a solution that just dumps each full CSV file name into the FileName field because, again, I can parse that in MySQL easily.
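To show the transformation I am after, here is a minimal Python sketch, not a tested solution for my real data. It makes several assumptions: the function name `merge_csv_files` is made up, the files are UTF-8 text, every input ends in `.csv` and sits in one directory, and the trailing semicolon on each data row should be dropped rather than kept as an empty sixth column. It reads one row at a time, so the 400 MB files should not need to fit in memory.

```python
import csv
import glob
import os


def strip_trailing_empty(row):
    """Drop the empty last field produced by a trailing ';' on the line."""
    return row[:-1] if row and row[-1] == "" else row


def merge_csv_files(input_dir, output_path):
    """Combine every *.csv in input_dir into one comma-delimited file.

    Writes a single header row (taken from the first non-empty file) and
    appends the first token of each source file name to every data row.
    """
    out_abs = os.path.abspath(output_path)
    wrote_header = False
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)  # comma-delimited, quotes only when needed
        for path in sorted(glob.glob(os.path.join(input_dir, "*.csv"))):
            if os.path.abspath(path) == out_abs:
                continue  # never read the output file back in
            # Assumption: the first token is the text before the first "_".
            token1 = os.path.basename(path).split("_", 1)[0]
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src, delimiter=";", quoting=csv.QUOTE_NONE)
                header = next(reader, None)
                if header is None:
                    continue  # skip completely empty files
                if not wrote_header:
                    writer.writerow(strip_trailing_empty(header) + ["FileName"])
                    wrote_header = True
                for row in reader:
                    row = strip_trailing_empty(row)
                    if row:  # ignore blank lines
                        writer.writerow(row + [token1])
```

It would be called as something like `merge_csv_files("path/to/csvs", "combined.csv")`, where both paths are placeholders. Unlike what I said about headers, this sketch keeps only the first file's header, since that turned out to be the easier thing to write.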
Please help me! I've spent over 10 hours on what should be a relatively easy problem.
Technologies available:
awk
Windows batch
Linux bash
PowerShell
Perl
Python
PHP
mysqlimport
This is a server box, so I won't be able to compile anything, but if you give me a Java solution I will definitely try to run it on the box.