Parsing large text files with PHP without killing the server

I'm trying to read some large text files (between 50 MB and 200 MB) and do a simple text replacement: the XML I have hasn't been properly escaped in a few regular cases. Here's a simplified version of the function:

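<?php
function cleanFile($file1, $file2) {
  $input_file  = fopen($file1, "r");
  $output_file = fopen($file2, "w");
  // Stop early if either file could not be opened: feof() on an invalid
  // handle can leave the read loop running forever.
  if ($input_file === false || $output_file === false) {
    if ($input_file)  fclose($input_file);
    if ($output_file) fclose($output_file);
    return false;
  }
  // fgets() returns false at end of file, which ends the loop. No length
  // limit is passed, so a long line is not split and trimmed mid-line.
  while (($buffer = fgets($input_file)) !== false) {
    $buffer = trim($buffer);
    // Wrap the content of unescaped <text> elements in a CDATA section.
    if (substr($buffer, 0, 6) === '<text>' && substr($buffer, 0, 15) !== '<text><![CDATA[') {
      $buffer = str_replace('<text>', '<text><![CDATA[', $buffer);
      $buffer = str_replace('</text>', ']]></text>', $buffer);
    }
    fputs($output_file, $buffer . "\n");
  }
  fclose($input_file);
  fclose($output_file);
  return true;
}
?>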

What I don't get is that for the largest files, around 150 MB, PHP memory usage goes off the chart (around 2 GB) before failing. I thought this was the most memory-efficient way to read large files. Is there a more memory-efficient method that I am missing? Or perhaps some setting that keeps data in memory when it should be garbage-collected?

In other words, it's not working and I don't know why; as far as I know, I am not doing anything incorrectly. Where should I look next? Thanks for any input.
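
One way to narrow this down is to log PHP's peak memory while the file is streamed, to see whether usage really grows with the number of lines read. The following is only a sketch under assumptions: the helper name `copyWithMemoryLog` and the `$every` reporting interval are hypothetical and not part of the original function, and it copies lines unchanged rather than doing the replacement.

```php
<?php
// Hypothetical diagnostic helper (not from the original function): streams
// $src to $dst line by line and returns PHP's peak memory usage in bytes.
function copyWithMemoryLog(string $src, string $dst, int $every = 100000): int {
    $in = fopen($src, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $src for reading");
    }
    $out = fopen($dst, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $dst for writing");
    }
    $count = 0;
    // fgets() returns false at end of file, so the loop always terminates.
    while (($line = fgets($in)) !== false) {
        fwrite($out, $line);
        // Report peak memory every $every lines (assumed interval).
        if (++$count % $every === 0) {
            error_log(sprintf('%d lines, peak %d bytes', $count, memory_get_peak_usage(true)));
        }
    }
    fclose($in);
    fclose($out);
    return memory_get_peak_usage(true);
}
```

If the logged peak stays roughly flat while the line count climbs, the read loop itself is probably not what is consuming memory, and the failed-open case or the logging of repeated warnings would be worth checking instead.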

3 answers

dougui1977 2009-08-10 14:21
PHP isn't really designed for this. Offload the work to a different process and call it or start it from PHP. I suggest using Python or Perl.
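
Starting the external program from PHP could look roughly like the sketch below. It assumes `exec()` is enabled on the server; the wrapper name `runExternalCleaner` and the script name `clean_text.pl` in the comment are hypothetical placeholders, not something from the question.

```php
<?php
// Hypothetical wrapper: runs an external program with the input and output
// paths appended as arguments and returns the program's exit code.
// $program might be e.g. ['perl', 'clean_text.pl'] (assumed script name).
function runExternalCleaner(array $program, string $inputPath, string $outputPath): int {
    $parts = array_merge($program, [$inputPath, $outputPath]);
    // Quote every part so that paths with spaces or shell metacharacters
    // are passed through as single, literal arguments.
    $command = implode(' ', array_map('escapeshellarg', $parts));
    exec($command, $outputLines, $exitCode);
    return $exitCode;
}
```

The PHP process then only waits for the exit code, so the large file never passes through PHP's own memory.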

The asker selected this answer as the best answer.
