douqi5079 2015-04-21 06:01
Viewed 77 times

Importing a large iTunes EPF database doesn't work

I'm trying to import the EPF relational database. The iTunes EPF relational database consists of details of all the store content (applications, music, TV shows, games, etc.). You can find more on this here: iTunes EPF Relational Database. I am able to import every database except one, which fails to process. That file is around 6 GB.

    $field_separator  = chr(1);         // EPF field separator (\x01)
    $record_separator = chr(2) . "\n";  // EPF record separator (\x02 followed by a newline)

    // Read the whole file into memory and split it into records.
    $data_appdt = explode($record_separator, file_get_contents('file_path', true));
    foreach ($data_appdt as $key => $value)
    {
        // Skip EPF comment/header rows, which start with '#'.
        if (substr($value, 0, 1) != '#')
        {
            if (!empty($value))
            {
                $data_itu_app_dt = explode($field_separator, $value);
                $result = $this->admin_model->itunes_app_dt($data_itu_app_dt);
            }
        }
    }

The above code is in CodeIgniter; it is the controller for the import process. It works for files up to around 2 GB, but for anything larger it fails, presumably because file_get_contents() reads the whole file and it doesn't fit in memory. So I used the code below for processing larger files.

    $handle = fopen('file_path', 'r') or die("Couldn't get handle");
    if ($handle) {
        while (!feof($handle)) {
            // Read up to 4096 bytes, stopping at the next newline.
            $buffer = fgets($handle, 4096);
            $data_appp = explode($record_separator, $buffer);
            foreach ($data_appp as $key => $value)
            {
                // Skip EPF comment/header rows, which start with '#'.
                if (substr($value, 0, 1) != '#')
                {
                    if (!empty($value))
                    {
                        $data_itu_appp = explode($field_separator, $value);
                        //print_r($data_itu_appp);
                        $result = $this->admin_model->itunes_appp($data_itu_appp);
                    }
                }
            }
        }
        fclose($handle);
    }

It works even for 8 GB files, and the import completes successfully. But for one 6 GB file the import does not go through. This is sample data from that table:

1426669253786|329704232|EN|iParrot Phrase Vietnamese-Italian|Translate Vietnamese Phrase to Italian Taking-with Translator for iPad/iPhone/iPod Touch|

iParrot Phrase sets a new standard for instant multi-language translation software. Designed exclusively for the iPad/iPhone/iPod Touch, it’s stocked with over 20 kinds of perfectly pronounced oral language for instant use. iParrot Phrase is organized into categories such as: Greetings, Transportation, Shopping, and Asking for helping etc. So it is enough for you to find the sentences you need instantly. Organized for instant access and ease, it is especially useful while traveling abroad. Virtual fluency available in Chinese, English, Japanese, Russian, French, German, Spanish, Italian, Korean, Portuguese, Arabic, Thai and Vietnamese.

That is sample data from the application-detail database (in the sample above I replaced the ASCII SOH field-separator characters with |). Notice that a single record can contain newlines, as in the description text above. So while the import is running with the second code, fgets() stops at every newline, a record gets split across reads, and the import breaks. Is there any way to get around this, or another method to process such a large file (6 GB) for the database import? Maybe the above is a little confusing; if any clarification is needed, I will make things clearer. Looking for a good solution. Thanks all.
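For reference, this is the kind of record-based reading I am thinking of as a workaround, as an untested sketch: it uses PHP's stream_get_line() to read up to each \x02 + newline record terminator instead of line by line. The 1 MB read limit is my own assumption about the maximum record size, and 'file_path' is a placeholder as in the code above.

    $record_separator = chr(2) . "\n";  // EPF record terminator: \x02 followed by a newline
    $handle = fopen('file_path', 'r') or die("Couldn't get handle");
    while (!feof($handle)) {
        // Read until the record terminator (or 1 MB, assumed to exceed any record),
        // so newlines embedded inside a record no longer split it apart.
        $record = stream_get_line($handle, 1048576, $record_separator);
        if ($record === false || $record === '') {
            continue;
        }
        if (substr($record, 0, 1) == '#') {
            continue; // skip EPF comment/header rows
        }
        $fields = explode(chr(1), $record); // \x01 is the EPF field separator
        $result = $this->admin_model->itunes_appp($fields);
    }
    fclose($handle);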


1 answer

  • duanshan1977 2015-06-11 17:55

I don't have a direct answer for you in PHP, but the problem is very likely that you are loading the whole file into memory. The trick is to stream the file down and write it out in chunks.

In Python you could use the requests library, for example, which handles auth nicely (and you could perhaps write your download logic in an easier fashion). It would look something like this:

    import requests

    username = 'yourusernamehere'
    password = 'yourpasswordhere'
    # stream=True keeps requests from pulling the whole response into memory
    response = requests.get('https://feeds.itunes.apple.com/feeds/', auth=(username, password), stream=True)
    

Note that I used the stream=True mechanism because you will be downloading huge files that probably will not fit in memory. You should then consume the response in chunks, like so:

    with open(local_filename, 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                f.flush()
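(A larger chunk_size, say 1024 * 1024, will usually make the download faster; the flush() just pushes each chunk out of Python's write buffer, so memory use stays flat regardless of file size.)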
    
