doubi8512 2014-02-12 21:05 · acceptance rate: 0%
Viewed 37 times
Accepted

How can I optimize database insert time when processing a large amount of data?

I have a series of txt files with info for around 200 people. This info is generated and exported 5 or 6 times a day, and each txt file averages about 800 lines.

I set up a cron job that calls (from the PHP command line) a CodeIgniter controller that performs the following process:

  • the constructor loads the model
  • a method gets the txt files from a folder, strips blanks and special chars from each filename, and renames the file
  • the files' paths are returned in an array
  • another method loops through the files array and calls $this->process($file)
  • process() reads each line of the file
  • it ignores blank lines and builds one array per line read: array_filter(preg_split('/\s+/', $line));
  • finally it calls model->insert_line($line) (a sketch of this flow follows below)
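
For reference, a minimal sketch of that per-line flow (the model name is a placeholder; only process() is named in the actual code):

    // Sketch of the current approach: one INSERT round trip per line.
    public function process($file)
    {
        foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
            $values = array_values(array_filter(preg_split('/\s+/', $line)));
            if (!empty($values)) {
                $this->my_model->insert_line($values); // hypothetical model name
            }
        }
    }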

How could I:

1- optimize the code so I can lower the 2-minute (average) execution time of each cron call? Each execution processes 5-6 txt files with ~800 lines each.

2- set up the MySQL table so it can hold a very large quantity of records without trouble? Only two fields are stored: "code" int(2) and "fecha" timestamp, both combined in a unique index (code, fecha); a schema sketch follows below.

I have a fast PC, and the table uses the InnoDB engine.
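
A minimal DDL sketch of that table, for reference (the table name 'registros' is invented for illustration):

    -- Hypothetical table; columns and unique index as described above.
    CREATE TABLE registros (
        code  INT(2)    NOT NULL,   -- (2) is only a display width; storage is a regular INT
        fecha TIMESTAMP NOT NULL,
        UNIQUE KEY uq_code_fecha (code, fecha)
    ) ENGINE=InnoDB;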


2 answers

  • douhan8610 2014-02-13 21:18

    First approach

    Have you tried:

    $this->db->insert_batch('table', $data);
    

    Where $data is an array with the rows/objects you want to insert. I do not know the internals of that method (although looking at the code should not be hard), but I'm almost sure it performs the whole insertion in a single transaction.

    The way you are doing it right now, calling an insert for each line, means opening a socket/connection, running checks, and paying everything else each statement costs, once per row. A bulk insert is the way to go in these cases, and that CI function does exactly that: it generates a single multi-row INSERT command that is executed in one transaction.

    You even get the option of rolling it back if one of the inserts fails, so the people who generate those files can massage or fix the data.
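
    As a minimal sketch, process() could collect the parsed lines and insert them once per file, wrapped in a transaction (the table name 'registros' and the column order within each line are assumptions):

    // Rewritten process(): parse everything first, then one batched insert per file.
    private function process($file)
    {
        $rows = array();
        foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
            $values = array_values(array_filter(preg_split('/\s+/', $line)));
            if (count($values) >= 2) {
                $rows[] = array('code' => (int) $values[0], 'fecha' => $values[1]);
            }
        }

        if (!empty($rows)) {
            $this->db->trans_start();                    // one transaction per file
            $this->db->insert_batch('registros', $rows); // multi-row INSERT, chunked internally
            $this->db->trans_complete();                 // rolls back automatically on failure
        }
    }

    Note that insert_batch() already splits very large arrays into smaller chunks internally, so 800 rows per file is well within its comfort zone.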

    Second approach

    If you know that those files have a specific format, you could easily use MySQL's LOAD DATA INFILE statement, which will give better performance than any tool you could write yourself.

    The beauty of it is that you might be able to call it with:

    $this->db->query($bulk_insert_command);
    

    Where $bulk_insert_command is actually a string with something like:

    LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
        [REPLACE | IGNORE]
        INTO TABLE tbl_name
        [CHARACTER SET charset_name]
        [{FIELDS | COLUMNS}
            [TERMINATED BY 'string']
            [[OPTIONALLY] ENCLOSED BY 'char']
            [ESCAPED BY 'char']
        ]
        [LINES
            [STARTING BY 'string']
            [TERMINATED BY 'string']
        ]
        [IGNORE number {LINES | ROWS}]
        [(col_name_or_user_var,...)]
        [SET col_name = expr,...]
    

    This is the syntax shown in the MySQL reference documentation for LOAD DATA. Of course you'd have a function to sanitize this string and fill in the filename, the options, and whatever else you need.
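
    For instance, a concrete version for the two-column table discussed in the question might look like this (the table name 'registros' and the single-space delimiter are assumptions; adjust them to the real file format):

    // Hypothetical bulk load: one whitespace-delimited record per line.
    $bulk_insert_command = sprintf(
        "LOAD DATA LOCAL INFILE %s
         IGNORE INTO TABLE registros
         FIELDS TERMINATED BY ' '
         LINES TERMINATED BY '\\n'
         (code, fecha)",
        $this->db->escape($file_path) // escape() quotes and escapes the path
    );
    $this->db->query($bulk_insert_command);

    Here IGNORE quietly skips rows that would violate the unique (code, fecha) index, and LOCAL requires the local_infile option to be enabled on both the client and the server.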

    And finally, make sure that whatever user you set up in database.php in your CI app has the FILE privilege:

    GRANT FILE ON *.* TO 'user'@'localhost' IDENTIFIED BY 'password';
    

    That way the CI app will not generate an error when running such a query.

    Accepted by the asker as the best answer.