drwj4061 2015-12-13 15:47

Laravel - Scheduled long-running tasks with high memory usage

I'm currently writing an application in Laravel and am setting up some scheduled tasks that will be run periodically (artisan commands) -- maybe once or twice a month. These tasks grab some CSV files using CURL, parse them, and import them into my local MySQL database.

However, a couple of these CSV files are quite large -- around 500,000 lines each -- so I have run into some memory issues.

Obviously I can increase the memory in my VM, but I'm wondering what other practices I can implement to ensure that I don't run out of memory. If I try to parse the files in chunks, rather than all at once, will each chunk be treated as a separate system process? Or will it still be treated as one long execution by the server?
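To the chunking question: a plain loop inside one artisan command is still a single PHP process, so chunks only help if nothing retains the previous chunk. A minimal sketch of that pattern, using plain `fgetcsv` and a generator rather than League CSV so it is self-contained (file path and chunk size are illustrative):

```php
<?php

// Yield the CSV in chunks of $chunkSize rows. Only one chunk is ever
// held in memory, because the caller drops each chunk before the
// generator builds the next one.
function csvChunks(string $path, int $chunkSize): Generator
{
    $handle = fopen($path, 'r');
    $chunk = [];
    while (($row = fgetcsv($handle)) !== false) {
        $chunk[] = $row;
        if (count($chunk) === $chunkSize) {
            yield $chunk;
            $chunk = []; // release the chunk before reading more rows
        }
    }
    if ($chunk !== []) {
        yield $chunk; // trailing partial chunk
    }
    fclose($handle);
}

// Usage with a small generated file:
$path = tempnam(sys_get_temp_dir(), 'csv');
$out = fopen($path, 'w');
for ($i = 0; $i < 25; $i++) {
    fputcsv($out, [$i, "name$i"]);
}
fclose($out);

$total = 0;
foreach (csvChunks($path, 10) as $chunk) {
    $total += count($chunk); // import/process each chunk here
}
echo $total, "\n"; // 25
unlink($path);
```

The whole run is still one process, so peak memory is roughly one chunk plus whatever the import step retains.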

EDIT: My current approach. I'm using League CSV for the parsing.

protected $offset = 1;
protected $parse_chunk = 10000;

public function parse()
{
    $this->data = [];
    $this->reader->setOffset($this->offset)->setLimit($this->parse_chunk)->fetchAll(function ($row, $offset) {
        $this->data[] = array_combine($this->keys, $row);
        // Advance past the last row read; without the +1 the next
        // call re-parses the final row of this chunk.
        $this->offset = $offset + 1;
    });
}

So currently, the above fetches the first 10,000 rows and does the processing I need, storing them in a new $data array. That data then gets passed to an Import class, but my scripts are currently timing out during the parsing.

So the above works, but if I wrap it in a while loop to walk all 500,000 records, it uses too much memory.
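Memory grows with the file whenever something keeps a reference to every chunk already processed (a `$data` array that is never cleared, an import step that caches rows, etc.). A hedged illustration of the difference, with made-up row sizes and counts; the first loop drops each chunk after handing it off, the second accumulates everything the way a growing buffer does:

```php
<?php

// Build a chunk of $rows distinct strings, so memory is really allocated.
function makeChunk(int $base, int $rows): array
{
    $chunk = [];
    for ($j = 0; $j < $rows; $j++) {
        $chunk[] = str_pad((string) ($base + $j), 50, 'x');
    }
    return $chunk;
}

// Releasing pattern: process a chunk, then drop it. Peak memory stays
// near the size of a single chunk.
for ($i = 0; $i < 50; $i++) {
    $chunk = makeChunk($i * 1000, 1000);
    // ... import $chunk into the database here ...
    unset($chunk); // nothing retains the rows
}
$releasingPeak = memory_get_peak_usage();

// Accumulating pattern: keep every row, as a $data array that is never
// cleared does. Peak memory grows with the whole file.
$all = [];
for ($i = 0; $i < 50; $i++) {
    foreach (makeChunk($i * 1000, 1000) as $row) {
        $all[] = $row;
    }
}
$accumulatingPeak = memory_get_peak_usage();

echo $accumulatingPeak > $releasingPeak ? "accumulating peaks higher\n" : "similar\n";
```

The practical takeaway: between chunks, make sure the previous chunk's rows are actually unreachable (reassign or `unset` the buffer, and don't let the import layer hold them).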

Should I be dispatching each chunk to a queue which gets processed in the background?
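Queueing each chunk is a common way to bound memory, since each job runs inside a worker process and the command itself only needs to enumerate offsets. A hypothetical sketch of such a job -- the class name, properties, and the commented-out reader calls are illustrative, not from the original code:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

// Hypothetical job: imports one CSV slice identified by its offset.
// Passing offsets instead of row data keeps the queue payload tiny.
class ImportCsvChunk implements ShouldQueue
{
    use InteractsWithQueue, Queueable, SerializesModels;

    protected $path;
    protected $offset;
    protected $limit;

    public function __construct($path, $offset, $limit)
    {
        $this->path = $path;
        $this->offset = $offset;
        $this->limit = $limit;
    }

    public function handle()
    {
        // Re-open the reader inside the job, e.g. with League CSV:
        // $rows = Reader::createFromPath($this->path)
        //     ->setOffset($this->offset)
        //     ->setLimit($this->limit)
        //     ->fetchAll();
        // ... insert $rows into MySQL ...
    }
}

// In the artisan command: one job per 10,000-row slice.
// for ($offset = 1; $offset < $totalRows; $offset += 10000) {
//     dispatch(new ImportCsvChunk($path, $offset, 10000));
// }
```

With a queue driver other than `sync`, the scheduled command returns quickly and the workers absorb the memory cost chunk by chunk.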

EDIT: Added Benchmarks

I spent some time testing how long different processes take. The parsing had interesting results. Keep in mind, this is JUST parsing: nothing is stored in the DB, but the rows are stored in an array in memory.

Parsing 1000 records: .017 seconds
Parsing 10,000 records: .188 seconds
Parsing 100,000 records: 2.273 seconds
Parsing 500,000 records: Never completes.

I'm using the code above to execute this. What might cause the parsing to take so long (or perhaps fail) at 500,000 records?
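One thing worth measuring before anything else: `array_combine()` turns every row into an associative array, which costs far more memory per row than the raw CSV bytes, and once the process nears `memory_limit` (or starts swapping) it can appear to hang. A quick extrapolation sketch -- the column names and values are made up for illustration:

```php
<?php

// Estimate the per-row cost of the combined associative rows, then
// extrapolate to 500,000 rows. Column names are illustrative only.
$keys = ['id', 'name', 'email', 'created_at'];

$before = memory_get_usage();
$data = [];
for ($i = 0; $i < 10000; $i++) {
    $data[] = array_combine($keys, [$i, "name$i", "user$i@example.com", '2015-12-13']);
}
$perRow = (memory_get_usage() - $before) / 10000;

printf("~%d bytes per row, ~%.0f MB for 500,000 rows\n",
    $perRow, $perRow * 500000 / 1048576);
```

If the extrapolated figure exceeds the PHP `memory_limit` for the CLI, the 500,000-row run can never finish in one process, regardless of how fast each chunk parses.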
