douluanzhao6689 2013-09-27 08:01

Reading many files in a directory with PHP

I have seen that this question already exists on Stack Overflow, but none of the answers solves my problem.

I have a directory with many files downloaded from another server. I don't know how many files there are or how big they are; the total could be 100 MB or 1 GB, depending on the external server.

This is what I have done so far:

    ini_set("memory_limit","10000M");
    $directory = "xml_uploads/hotel/";
    $xml_files = glob($directory . "*.xml");       
    foreach($xml_files as $file)
    {
        $content = file_get_contents($file, true);
        $xml = new DOMDocument();
        $xml->loadXML($content);
        if($xml){
           //parse xml and save inside database
        } 
     }

I don't know whether setting memory_limit to such a large number is a good idea: I don't know the real size of the files, and if they are too big I don't want to bring my server down. Is there another way to parse all the XML files in a directory?

Thanks


3 answers

  • dst2007 2013-09-27 08:05

    XML Parsing

    Currently you are combining DOMDocument with file_get_contents, which means each XML file is loaded into memory in full (once as a string and again as a DOM tree). You will hit the limit whenever a file is bigger than the available memory, however high you set memory_limit. With the approach below, this is not an issue.

    If you are concerned about the memory usage of the XML parsing code, use a pull parser: a type of XML parser that does not load the whole document into memory but lets you operate on one node at a time, so memory usage stays minimal. In PHP, you can use XMLReader:

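    $xml = new XMLReader();
    $xml->open("config.xml");

    while ($xml->read()) {
        // read() also stops on closing tags and text, so act on opening tags only
        if ($xml->nodeType !== XMLReader::ELEMENT) {
            continue;
        }
        switch ($xml->name) {
            case "myelem":
                // ... handle the element here
                break;
        }
    }
    $xml->close();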
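
As a rough sketch of how the pull parser can be applied to record-style files, the function below hops from one record element to the next and hands each one to a callback as a small SimpleXML object. The function name `streamRecords`, the record element name passed in, and the assumption that all records are siblings under one parent are illustrative choices, not something the question specifies.

```php
<?php
// Sketch: process one record element at a time instead of loading the whole file.
// streamRecords() is a hypothetical helper; it assumes the records are siblings.

function streamRecords($path, $recordName, callable $handle)
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }
    $doc = new DOMDocument();
    $count = 0;

    // Advance to the first record element.
    while ($reader->read() && $reader->name !== $recordName) {
        // nothing to do, just keep reading
    }

    // Hop from record to record; next() skips the subtree that was just handled.
    while ($reader->name === $recordName) {
        // expand() builds a DOM node for this record only.
        $record = simplexml_import_dom($doc->importNode($reader->expand(), true));
        $handle($record);
        $count++;
        $reader->next($recordName);
    }

    $reader->close();
    return $count;
}
```

Called as `streamRecords($file, 'hotel', $callback)`, where `hotel` is only a guess at the element name, this should keep roughly one record in memory at a time, so the database insert can happen inside the callback.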
    

    Huge directory tree traversal

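    For the directory itself, PHP provides DirectoryIterator and RecursiveDirectoryIterator. Unlike glob, they hand you one entry at a time instead of building the whole list up front.

    Usage is very similar to your glob loop: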

    foreach(new DirectoryIterator($directory) as $fileInfo)
    {
        if($fileInfo->getExtension() !== 'xml') continue;
        $content = file_get_contents($fileInfo->getPathname(), true);
        // ... parse $content as before (or pass the path to XMLReader::open() to avoid loading the file)
    }
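
The two halves of this answer can be combined, so that neither the file list nor any file body is held in memory as a whole. The sketch below walks one directory and streams each .xml file through XMLReader, counting occurrences of a given element. `countElements` and its parameters are hypothetical names used only for illustration; real code would do its database work where the counter is incremented.

```php
<?php
// Sketch: iterate a directory lazily and stream every .xml file with XMLReader.
// countElements() is a hypothetical helper, not part of PHP.

function countElements($directory, $elementName)
{
    $counts = array();
    foreach (new DirectoryIterator($directory) as $fileInfo) {
        // Skip ".", "..", subdirectories and non-XML files.
        if (!$fileInfo->isFile() || $fileInfo->getExtension() !== 'xml') {
            continue;
        }
        $reader = new XMLReader();
        if (!$reader->open($fileInfo->getPathname())) {
            continue; // unreadable file: skip it rather than abort the whole run
        }
        $n = 0;
        while ($reader->read()) {
            if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $elementName) {
                $n++; // a real importer would handle the element here
            }
        }
        $reader->close();
        $counts[$fileInfo->getFilename()] = $n;
    }
    ksort($counts); // directory order is not guaranteed, so sort by file name
    return $counts;
}
```

Because each file is opened by path, its size no longer matters much for memory use, which is the reason to prefer this over file_get_contents for files of unknown size.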
    

    Also, if you have a nested directory structure, you can use the recursive variant:

    foreach(new RecursiveIteratorIterator(new RecursiveDirectoryIterator($directory)) as $fileInfo)
    {
        if($fileInfo->getExtension() !== 'xml') continue;
        $content = file_get_contents($fileInfo->getPathname(), true);
        // ... parse $content as before (or pass the path to XMLReader::open() to avoid loading the file)
    }
    

    Note that because this iterator is recursive, it has to be wrapped in RecursiveIteratorIterator.

    Both iterators are available since PHP 5 (and you really should not use anything older); getExtension() needs PHP 5.3.6 or later.
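
For nested directories, one possible refinement is to wrap the recursive iterator in a generator so that paths are produced lazily and the dot entries are skipped up front. This is only a sketch: `xmlFilesUnder` is a made-up name, the case-insensitive extension match is an assumption about what is wanted, and generators need PHP 5.5 or later.

```php
<?php
// Sketch: lazily yield the path of every .xml file below $root.
// xmlFilesUnder() is a hypothetical helper; requires PHP 5.5+ for yield.

function xmlFilesUnder($root)
{
    $files = new RecursiveIteratorIterator(
        // SKIP_DOTS leaves out the "." and ".." entries of every directory.
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $fileInfo) {
        // Assumption: "XML" and "xml" extensions should both be accepted.
        if ($fileInfo->isFile() && strtolower($fileInfo->getExtension()) === 'xml') {
            yield $fileInfo->getPathname();
        }
    }
}
```

A loop such as `foreach (xmlFilesUnder($directory) as $path) { ... }` can then open each path with XMLReader, one file at a time.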

    This answer was accepted by the asker as the best answer.