douxun4173 2013-05-27 19:31 采纳率: 0%
浏览 57
已采纳

大数据:逐行或按CSV处理SQL插入/更新或合并最佳?

So basically I have a bunch of 1 Gig data files (compressed) with just text files containing JSON data with timestamps and other stuff.

I will be using PHP code to insert this data into MYSQL database.

I will not be able to store these text files in memory! Therefor I have to process each data-file line by line. To do this I am using stream_get_line().

  • Some of the data contained will be updates, some will be inserts.

Question Would it be faster to use Insert / Select / Update statements, or create a CSV file and import it that way?

Create a file thats a bulk operation and then execute it from sql?

I need to basically insert data with a primary key that doesnt exist, and update fields on data if the primary key does exist. But I will be doing this in LARGE Quantities.

Performance is always and issue.

Update The table has 22,000 Columns, and only say 10-20 of them do not contain 0.

  • 写回答

1条回答 默认 最新

  • duanchifo2866 2013-05-27 20:01
    关注

    I would load all of the data to a temporary table and let mysql do the heavy lifting.

    1. create the temporary table by doing create table temp_table as select * from live_table where 1=0;

    2. Read the file and create a data product that is compatible for loading with load data infile.

    3. Load the data into the temporary table and add an index for your primary key

    4. Next Isolate you updates by doing a inner query between the live table and the temporary table. walk through and do your updates.

    5. remove all of your updates from the temporary (again using an inner join between it and the live table).

    6. process all of the inserts with a simple insert into live_table as select * from temp_table.

    7. drop the temporary table, go home and have a frosty beverage.

    This may be over simplified for your use case but with a little tweaking it should work a treat.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 乌班图ip地址配置及远程SSH
  • ¥15 怎么让点阵屏显示静态爱心,用keiluVision5写出让点阵屏显示静态爱心的代码,越快越好
  • ¥15 PSPICE制作一个加法器
  • ¥15 javaweb项目无法正常跳转
  • ¥15 VMBox虚拟机无法访问
  • ¥15 skd显示找不到头文件
  • ¥15 机器视觉中图片中长度与真实长度的关系
  • ¥15 fastreport table 怎么只让每页的最下面和最顶部有横线
  • ¥15 java 的protected权限 ,问题在注释里
  • ¥15 这个是哪里有问题啊?