dongyinzheng6572 2019-02-05 03:05

Fastest way to process a CSV: bash vs PHP vs C/C++ processing speed [closed]

I have a CSV file with 5M rows. One option is to import them into a MySQL database and then loop over the table with PHP.

$db_class=new MysqlDb;
$db_class->ConnectDB();
$query="SELECT * FROM mails WHERE .....";
$result=mysqli_query(MysqlDb::$db, $query);
while($arr=mysqli_fetch_array($result))
{
    //db row here 
}

So I loop over all the mails in the table and process them. If they contain some bad string, I delete them, etc.

This works, but importing 5M rows is very slow, and so is looping over them one by one and editing the rows (deleting those that contain a bad string).

I am thinking of a better solution that skips PHP/MySQL altogether: process the .csv file line by line and check whether the current row contains a specific bad string. I can do that in pure PHP, like:

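$file = fopen('file.csv', 'r'); // fgetcsv() needs a file handle, not the array returned by file()
while (($data = fgetcsv($file)) !== false) {
    // process line; $data[0] is the first column
    $data[0];
}
fclose($file);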

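This is the bash script I use to loop over all lines of a file:

while IFS= read -r line; do
    # keep only the lines that do not contain the bad string
    [[ $line == *badstring* ]] || printf '%s\n' "$line"
done < bac.csv > clean.csv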

In Python I do:

with open("file.csv", "r") as ins:
    array = []
    for line in ins:
        # process line here
        pass

A bad line would look like:

name@baddomain.com
name@domain (without extension)

etc. I have several criteria for what counts as a bad line, which is why I didn't bother posting them all here.
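The two examples above can be turned into a single check, sketched here in Python. This is only an illustration: the full list of criteria is not given in the question, so `BAD_DOMAINS` and the function name `is_bad_line` are assumptions, and real criteria would replace or extend them.

```python
# Hypothetical blocklist; the question names only "baddomain.com" as an example.
BAD_DOMAINS = {"baddomain.com"}


def is_bad_line(line: str) -> bool:
    """Return True if the line looks like one of the bad examples above."""
    address = line.strip().lower()
    # Split on the last "@": everything after it is treated as the domain.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return True  # no "@", or an empty part on either side of it
    if domain in BAD_DOMAINS:
        return True  # e.g. name@baddomain.com
    # A domain without a dot has no extension, e.g. name@domain.
    name, dot, ext = domain.rpartition(".")
    return not (dot and name and ext)
```

Keeping the criteria in one function means the same test can be reused whichever loop ends up reading the file.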

However, for very big files I need to find a better solution. What do you recommend? Should I learn how to do it in C/C++ or in bash? I already know a little bash, so I could get it working sooner. Is C/C++ much faster than bash for this situation, or should I stick with bash?

Thank you

1 answer

  • duan6301 2019-02-05 03:11

    As for a PHP solution, you are looking for fgetcsv. The manual includes an example of iterating over a CSV file.

    Or, if you want to be fancy, you can use the league/csv library.
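For comparison, the same row-by-row pattern that fgetcsv gives in PHP can be sketched with Python's standard `csv` module. This is a minimal sketch, not the answer's own method: the function name `filter_csv`, its parameters, and the choice to test a single column for a substring are all assumptions made for illustration.

```python
import csv


def filter_csv(src_path, dst_path, bad_string, column=0):
    """Copy rows from src_path to dst_path, skipping rows whose chosen
    column contains bad_string. Returns the number of rows kept.

    Rows are streamed one at a time, so the whole file is never held
    in memory. Assumes every non-empty row has at least column+1 fields.
    """
    kept = 0
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Skip blank lines and rows that contain the bad string.
            if not row or bad_string in row[column]:
                continue
            writer.writerow(row)
            kept += 1
    return kept
```

A call such as `filter_csv("bac.csv", "clean.csv", "badstring")` would write the surviving rows to a new file and leave the input untouched, which avoids editing a 5M-row file in place.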

    Accepted as the best answer by the question's author.