dsfs21312 2018-10-15 18:09

How do I manage PHP memory?

I wrote a one-off script that I use to parse PDFs stored in the database. It was working fine until I ran out of memory after parsing 2,700+ documents.

The basic flow of the script is as follows:

  1. Get a list of all the document IDs to be parsed and save it as an array in the session (~155k documents).
  2. Display a page that has a button to start parsing
  3. When that button is clicked, make an AJAX request that parses the first 50 documents in the session array

// Resume the session before reading the work queue from it
if(session_status() == PHP_SESSION_NONE) {
  session_start();
}
$files = $_SESSION['files'];

$slice = array_slice($files, 0, 50);    // the 50 documents for this request
$files = array_slice($files, 50, null); // remove the 50 we are parsing on this request

// Write the shrunken queue back and release the session lock early
$_SESSION['files'] = $files;
session_write_close();

// Build one named placeholder per document ID: :id_0, :id_1, ...
$ids = array();
for($i = 0; $i < count($slice); $i++) {
  $ids[] = ":id_{$i}";
}
$ids = implode(", ", $ids);

$sql = "SELECT d.id, d.filename, d.doc_content
  FROM proj_docs d
  WHERE d.id IN ({$ids})";

$stmt = oci_parse($objConn, $sql);
for($i = 0; $i < count($slice); $i++) {
  oci_bind_by_name($stmt, ":id_{$i}", $slice[$i]);
}
oci_execute($stmt, OCI_DEFAULT);
$cnt = oci_fetch_all($stmt, $data);
oci_free_statement($stmt);

# Do the parsing..
# Output a table row..

  4. The response to the AJAX request includes a status indicating whether the script has finished parsing all ~155k documents; if it hasn't, another AJAX request is made to parse the next 50, with a 5-second delay between requests (a sketch of such a response follows).
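
For reference, a minimal sketch of what such a batch response might look like; the field names and the $html_rows variable are illustrative assumptions, not from the original script:

// Hypothetical end-of-batch response telling the client whether to request another batch
header('Content-Type: application/json');
echo json_encode(array(
  'done'      => count($files) === 0, // no documents left in the session queue
  'remaining' => count($files),       // how many documents are still unparsed
  'rows'      => $html_rows,          // table rows rendered for this batch (hypothetical)
));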

Questions

  • Why am I running out of memory? I expected peak memory usage to occur at step #1, when the session array holds the full list of ~155k document IDs, not a few minutes later when it holds 2,700 fewer elements.
  • Similar questions I found suggested either setting the memory limit to unlimited, which I don't want to do at all, or setting variables to null when appropriate. I did the latter (below), but I still ran out of memory after parsing ~2,700 documents. So what other approaches should I try?

# Freeing some memory space
$batch_size = null;
$with_xfa = null;
$non_xfa = null;
$total = null;
$files = null;
$ids = null;
$slice = null;
$sql = null;
$stmt = null;
$objConn = null;
$i = null;
$data = null;
$cnt = null;
$display_class = null;
$display = null;
$even = null;
$tr_class = null;
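
One diagnostic worth adding (not in the original script): log PHP's own memory counters at the end of each batch to see which batch approaches the limit. memory_get_usage(), memory_get_peak_usage(), and ini_get() are standard PHP functions:

# Log per-batch memory usage to pinpoint where the limit is being hit
error_log(sprintf(
  "batch done: current=%.1f MB, peak=%.1f MB, limit=%s",
  memory_get_usage(true) / 1048576,      // memory currently allocated to PHP
  memory_get_peak_usage(true) / 1048576, // high-water mark for this request
  ini_get('memory_limit')
));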

1 Answer

  • doufan3958 2018-10-15 20:13

    So I'm not really sure why, but reducing the number of documents parsed per batch from 50 down to 10 seems to fix the issue. I've gone past 5,000 documents now and the script is still running. My only guess is that with batches of 50 I must have hit a run of large files that used up all of the allotted memory.

    Update #1

    I got another out-of-memory error at 8,500+ documents. I've reduced the batch size further to 5 documents each and will see tomorrow whether it gets all the way through. If that fails, I'll just increase the allocated memory temporarily.

    Update #2

    So it turns out that the only reason I'm running out of memory is that we apparently have multiple PDF files over 300MB uploaded into the database. I increased PHP's memory limit to 512MB and that seems to have allowed me to finish parsing everything.
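
    A follow-up sketch (my assumption, not what was actually deployed): because OCI8 returns an OCI-Lob descriptor for LOB columns when OCI_RETURN_LOBS is not used, the script could switch from oci_fetch_all() to row-by-row fetching, check each document's size, and skip or defer the oversized PDFs instead of raising memory_limit:

    oci_execute($stmt, OCI_DEFAULT);
    while($row = oci_fetch_array($stmt, OCI_ASSOC)) {
      $lob = $row['DOC_CONTENT'];        // LOB descriptor, not the content itself
      if($lob->size() > 50 * 1048576) {  // 50MB threshold is arbitrary
        error_log("skipping oversized PDF: " . $row['FILENAME']);
        $lob->free();
        continue;
      }
      $pdf = $lob->load();               // now small enough to pull into memory
      $lob->free();
      // ... parse $pdf ...
    }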

    This answer was accepted by the asker.
