dongmu3457
2009-08-10 14:19
Acceptance rate: 0%
Views: 38
Accepted

Parsing large text files with PHP without killing the server

I'm trying to read some large text files (between 50 MB and 200 MB) and do some simple text replacement (essentially, the XML I have hasn't been properly escaped in a few regular cases). Here's a simplified version of the function:

<?php
function cleanFile($file1, $file2) {
    $input_file  = fopen($file1, "r");
    $output_file = fopen($file2, "w");

    while (!feof($input_file)) {
        // Read one line (up to 4 KB) and strip leading/trailing whitespace.
        $buffer = trim(fgets($input_file, 4096));

        // Wrap the contents of <text> elements in CDATA, unless already wrapped.
        if (substr($buffer, 0, 6) == '<text>' && substr($buffer, 0, 15) != '<text><![CDATA[') {
            $buffer = str_replace('<text>', '<text><![CDATA[', $buffer);
            $buffer = str_replace('</text>', ']]></text>', $buffer);
        }

        fputs($output_file, $buffer . "\n");
    }

    fclose($input_file);
    fclose($output_file);
}
?>
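For reference, a call site would look something like this (the file paths are placeholders for illustration, not from the original post):

<?php
// Hypothetical paths, for illustration only.
cleanFile('articles-dump.xml', 'articles-dump-clean.xml');
?>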

What I don't get is that for the largest of the files, around 150 MB, PHP's memory usage goes off the chart (around 2 GB) before failing. I thought this was the most memory-efficient way to read large files. Is there some method I am missing that would be more efficient with memory? Perhaps some setting is keeping things in memory when they should be collected?
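One way to narrow that down (a minimal diagnostic sketch, not from the original post; it assumes the script runs from the CLI so the STDERR constant is available) is to log PHP's built-in memory counters every so many lines and watch where usage climbs:

<?php
// Same loop as above, instrumented with memory counters.
// Logs current and peak memory every 100,000 lines so you can see
// whether usage grows steadily or jumps at a specific point in the file.
function cleanFileInstrumented($file1, $file2) {
    $input_file  = fopen($file1, "r");
    $output_file = fopen($file2, "w");
    $line = 0;

    while (!feof($input_file)) {
        $buffer = trim(fgets($input_file, 4096));
        if (substr($buffer, 0, 6) == '<text>' && substr($buffer, 0, 15) != '<text><![CDATA[') {
            $buffer = str_replace('<text>', '<text><![CDATA[', $buffer);
            $buffer = str_replace('</text>', ']]></text>', $buffer);
        }
        fputs($output_file, $buffer . "\n");

        if (++$line % 100000 == 0) {
            fprintf(STDERR, "line %d: current %d bytes, peak %d bytes\n",
                $line, memory_get_usage(true), memory_get_peak_usage(true));
        }
    }

    fclose($input_file);
    fclose($output_file);
}
?>

If the current figure stays flat but the peak jumps at a particular line count, that points at a specific region of the input file rather than at the loop itself.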

In other words, it's not working and I don't know why, and as far as I can tell I'm not doing anything incorrectly. Any direction for me to go in? Thanks for any input.

