dougaoshang0237 2011-02-21 06:34

Combining multiple CSV files and adding a column during concatenation [closed]

I have a set of files I am trying to import into MySQL.

Each CSV file looks like this:

Header1;Header2;Header3;Header4;Header5
Data1;Data2;Data3;Data4;Data5;
Data1;Data2;Data3;Data4;Data5;
Data1;Data2;Data3;Data4;Data5;
Data1;Data2;Data3;Data4;Data5;

Data may contain spaces, periods, or colons. It will never contain a semicolon, so the semicolon is a safe delimiter. It also will not contain any newline characters.

Example Data

2010.08.30 18:34:59
0.7508
String of characters with spaces in them
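
As a minimal sketch of how such a row could be split, assuming the rows really do end with a trailing semicolon as in the sample above (the helper name `parse_row` is hypothetical):

```python
def parse_row(line):
    """Split one semicolon-delimited row, ignoring a single trailing semicolon."""
    line = line.rstrip("\r\n")
    # Assumption: a trailing ";" is a terminator, not an empty last column.
    if line.endswith(";"):
        line = line[:-1]
    return line.split(";")


row = parse_row("2010.08.30 18:34:59;0.7508;String of characters with spaces in them;")
print(row)
```

Without the trailing-semicolon check, a plain split on `;` would return an extra empty string as the last field.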

Each file has a unique name. The names all conform to the following pattern:
    Token1_Token2_Token3.csv
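
For illustration, the first token can be taken as everything before the first underscore in the file name. This is only a sketch; `first_token` is a hypothetical helper, and it assumes Token1 itself never contains an underscore:

```python
from pathlib import PurePath


def first_token(filename):
    """Return the part of the file name before the first underscore."""
    # .name drops any directory prefix; a name without "_" is returned whole.
    return PurePath(filename).name.split("_", 1)[0]


print(first_token("Token1_Token2_Token3.csv"))
```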

I am interested in combining a lot of these CSV files (on the order of several hundred) into one CSV file. Files range from 10 KB to 400 MB. Ultimately, I want to load the result into MySQL. Don't worry about getting rid of the individual header rows; I can do that in MySQL easily.

I would like the final CSV file to look like this:

Header1,Header2,Header3,Header4,Header5,FileName
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1
Data1,Data2,Data3,Data4,Data5,Token1

I don't care about any of the other tokens. I can also live with a solution that just dumps each CSV file's full name into the Token1 field because, again, I can parse that in MySQL easily.

Please help me! I've spent over 10 hours on what should be a relatively easy problem.

Technologies available:

    awk
    windows batch
    linux bash
    powershell
    perl
    python
    php
    mysql-import

This is a server box so I won't be able to compile anything but if you give me a Java solution I will definitely try to run it on the box.

5 answers

  • dongshen7561 2011-02-21 08:32

    Believe it or not, it may be as simple as:

    awk 'BEGIN{OFS = FS = ";"} {print $0, FILENAME}' *.csv > newfile.csv
    

    If you want to change the field separator from semicolons to commas:

    awk 'BEGIN{OFS = ","; FS = ";"} {$1 = $1; print $0, FILENAME}' *.csv > newfile.csv
    

    To include only the first token:

    awk 'BEGIN{OFS = ","; FS = ";"} {$1 = $1; split(FILENAME, a, "_"); print $0, a[1]}' *.csv > newfile.csv
    
    Accepted as the best answer by the asker.
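
Since Python is also on the asker's list of available tools, the same merge could be sketched there. This is not part of the accepted answer; it is a minimal sketch under a few assumptions: `merge_csv` is a hypothetical name, a single trailing semicolon on a row is treated as a terminator rather than an empty column, and the header row of every file is kept, as the question allows. Files are read line by line, so a 400 MB input is never held in memory at once.

```python
import csv
import glob
import os


def merge_csv(pattern, out_path):
    """Append the first file-name token to every row of every matching file."""
    out_abs = os.path.abspath(out_path)
    # Build the file list first and skip the output file, so a rerun
    # never reads its own previous output back in.
    paths = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != out_abs)
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, lineterminator="\n")
        for path in paths:
            token = os.path.basename(path).split("_", 1)[0]
            with open(path, newline="") as src:
                for line in src:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines
                    # Assumption: a trailing ";" ends the row, it is not a column.
                    if line.endswith(";"):
                        line = line[:-1]
                    writer.writerow(line.split(";") + [token])
    return len(paths)


if __name__ == "__main__":
    merge_csv("*.csv", "newfile.csv")
```

Writing through `csv.writer` means a field that happens to contain a comma is quoted in the output instead of silently shifting the columns.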