doushai7225 2014-07-10 06:41
55 views
Accepted

Efficiently creating a large CSV file in PHP

I have to create a big CSV export file of more than 400 MB with PHP. First drafts of the export file and the PHP code already allow some guesses about the performance.

To avoid extremely long processing times, I should focus on creating the export file efficiently and avoid PHP array operations, which are too slow in this case. "Create the file efficiently" means: append big blocks of text to other big blocks in the file, with each big block created quickly.

Unfortunately, the "big blocks" are rectangles rather than lines. Building my export file will start with lots of line beginnings, like this:

Title a, Title b, Title c 

"2014",  "07",    "01" 

"2014",  "07",    "02" 

...

Then I would have to add a "rectangle" of text to the right of the line beginnings:

Title a, Title b, Title c, extension 1, extension 2, extension 3 

"2014",  "07",    "01",    "23",        "1",         "null" 

"2014",  "07",    "02",    "23",        "1",         "null" 

...

If I have to do this line by line, it will slow me down again. So I'm hoping for a way to add "rectangles" to a file, just as you can in some text editors. A concrete experience report about handling huge text buffers in PHP would also help.
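For the "append big blocks" part, here is a minimal sketch of the buffered-write idea: collect rows in a string buffer and flush it to the file in large chunks instead of one `fwrite()` per line. The file name, buffer size, and sample data are illustrative assumptions, not part of the original question.

```php
<?php
// Build rows in a string buffer and flush it to the file in large
// blocks, instead of calling fwrite() once per line.
$bufferLimit = 1 << 20;            // flush roughly every 1 MB (assumption)
$fh = fopen('export.csv', 'wb');
fwrite($fh, "Title a,Title b,Title c\n");

$buffer = '';
for ($day = 1; $day <= 31; $day++) {
    $buffer .= sprintf("\"2014\",\"07\",\"%02d\"\n", $day);
    if (strlen($buffer) >= $bufferLimit) {
        fwrite($fh, $buffer);      // one big block write
        $buffer = '';
    }
}
fwrite($fh, $buffer);              // flush the remainder
fclose($fh);
```

With a real 400 MB export, the loop body would build whole row batches, but the flush pattern stays the same.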

As it is not my hosting, I'm not sure if I have permission to invoke sed/awk.

So the question is: can somebody advise, from experience, how to handle big CSV files in PHP efficiently (file block operations, file "rectangle" operations), or just how to handle big string buffers in PHP efficiently? There seems to be no framework for string buffers.

Thank you for your attention :-)

Note: This is not a duplicate of this: https://stackoverflow.com/questions/19725129/creating-big-csv-file-in-windows-apache2-php


2 answers

  • dongnan1899 2014-07-12 16:20

    Encouraged by the answers/comments to my question, I've written a short benchmark test.

Section a) creates 2 files, each with 1 million lines of 100 characters. It then merges them into a third file, line by line, like a zipper:

    line1_1 line2_1 
    line1_2 line2_2 
    line1_3 line2_3 
    

    That's what Raphael Müller suggested.
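The zipper merge from section a) can be sketched like this. The file names and the three sample lines are placeholders (the actual benchmark used 1 million lines of 100 characters); reading and writing sequentially keeps memory usage constant regardless of file size.

```php
<?php
// Create two small sample input files (the benchmark used 1M lines each).
file_put_contents('part1.txt', "line1_1\nline1_2\nline1_3\n");
file_put_contents('part2.txt', "line2_1\nline2_2\nline2_3\n");

// Merge the two files line by line into a third one ("zipper").
$a   = fopen('part1.txt', 'rb');
$b   = fopen('part2.txt', 'rb');
$out = fopen('merged.txt', 'wb');

while (($lineA = fgets($a)) !== false && ($lineB = fgets($b)) !== false) {
    // Strip the newline from the left line, keep the one from the right.
    fwrite($out, rtrim($lineA, "\r\n") . ' ' . $lineB);
}

fclose($a);
fclose($b);
fclose($out);
```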

Section b) fills 1 million rows (same size as in section a) into a MySQL table with two columns. It fills the first column first, with 1 million insert statements. Then, with a single update statement, it fills the second column. This way, one command handles several rows in one step (the "rectangular" operation described in the question). The table would then contain the merged data, ready to read out and download.

    That's what Florin Asavoaie suggested.
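A sketch of section b)'s insert-then-update approach. To keep it self-contained, this uses an in-memory SQLite database via PDO instead of MySQL (an assumption of this sketch; the statements have the same shape, except MySQL would use `CONCAT()` rather than `||`). Table and column names are placeholders.

```php
<?php
// Fill column 1 with individual INSERTs, then extend every row to the
// right with one UPDATE (the "rectangular" step from the question).
// SQLite via PDO stands in for MySQL so this runs without a server.
$db = new PDO('sqlite::memory:');
$db->exec('CREATE TABLE export (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)');

$insert = $db->prepare('INSERT INTO export (col1) VALUES (?)');
for ($i = 1; $i <= 3; $i++) {          // the benchmark inserted 1 million rows
    $insert->execute(["left_$i"]);
}

// One statement fills the second column for all rows at once.
// (SQLite string concatenation; MySQL: CONCAT('right_', id).)
$db->exec("UPDATE export SET col2 = 'right_' || id");
```

The per-row INSERT loop is exactly what made this variant slow in the benchmark below; batching inserts or using a bulk-load mechanism would change that picture.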

• Filling one file with 1 million lines of 100 characters each takes 4.2 seconds. Merging both files into a third file takes 10 seconds.

• Filling a MySQL table with 1 million rows of 100 characters each, using single insert statements, takes 440 seconds. So I didn't bother measuring the second step.

This is not a final, general conclusion about the performance of databases or file systems. The database could probably be optimized, given some freedom at the hosting provider (which I don't have).

    I think for now it is somewhat safe to assume this performance order:

    1. RAM
    2. File System
    3. Database

Which means: if your RAM is bursting at the seams because you are creating an export file, don't hesitate to write it in parts to files and merge them afterwards, rather than going to great effort to manage memory blocks.

PHP is not a language that offers sophisticated low-level memory block handling. But in the end, you won't need it.

This answer was accepted by the asker.