duanjiao6730 2017-02-21 18:16

PHP XML DOM: Why is my large HTML file being truncated?

I am trying to process a large HTML file using DOM. I read it in and immediately write it out to another file without making any changes, but the output file is much smaller (and shorter) than the input.

This is particularly puzzling, because I could swear I did this previously while learning to use DOM and the output looked okay.

Here is my code:

<?php
    // ini_set("memory_limit", -1);
    require_once("inc/common.inc");

    $acad = "../inprogress/academy/";
    $htmFName = "$acad/mf/humanacad.htm";
    $sz = filesize($htmFName);
    echo "fname: $htmFName, $sz bytes\n";

    $dom = new DOMDocument();
    $dom->loadHTML($htmFName);
    $dom->save("z");
    $sz = filesize("z");
    echo "fname: z: $sz bytes\n";

And the output:

fname: ../inprogress/academy//mf/humanacad.htm, 2621622 bytes
fname: z: 219 bytes

Here is the beginning of the input file:

<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=Generator content="Microsoft Word 11 (filtered)">
<title> The Hanging Academy</title>
<style>
<!--
...
 -->
</style>
</head>
<body lang=EN-US link=blue vlink=blue>
<div class=Section1>
<p class=SectionHd>THE HANGING ACADEMY -- Part 1: Miranda</p>

And here is the entirety of the output file:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>../inprogress/academy//mf/humanacad.htm</p></body></html>

1 answer

  • doutandusegang2961 2017-02-21 18:22

    I think it is because you meant to use loadHTMLFile($filename), not loadHTML($html). loadHTML($html) expects the string passed to it to be HTML content, not the name of a file to read the content from. That is why your output file contains only the path itself, wrapped in a <p> element.
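
The difference between the two calls can be seen in a minimal, self-contained sketch. The temporary file and its contents are hypothetical stand-ins for humanacad.htm, not the asker's actual data, and libxml warnings are collected silently so they do not clutter the output.

```php
<?php
// Hypothetical sample file standing in for the real input document.
$path = tempnam(sys_get_temp_dir(), 'dom');
$html = '<html><head><title>The Hanging Academy</title></head>'
      . '<body><p class="SectionHd">Part 1: Miranda</p></body></html>';
file_put_contents($path, $html);

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

// Wrong: loadHTML() treats its argument as markup, so the path string
// itself becomes the only text in the parsed document.
$wrong = new DOMDocument();
$wrong->loadHTML($path);
$wrongText = trim($wrong->documentElement->textContent);

// Right: loadHTMLFile() opens the named file and parses its contents.
$right = new DOMDocument();
$right->loadHTMLFile($path);
$title = $right->getElementsByTagName('title')->item(0)->textContent;

libxml_clear_errors();
unlink($path);

echo "loadHTML() text:      $wrongText\n";
echo "loadHTMLFile() title: $title\n";
```

Separately from the truncation, note that the script in the question writes the result with save(), which serializes the document as XML (hence the <?xml ...?> declaration in the output). saveHTMLFile() is the counterpart that writes HTML.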

    (Selected by the asker as the best answer.)
