dou426098 2011-09-04 14:46

XML DomDocument optimization

I have a 5 MB XML file.

I'm using the following code to read the nodeValue of every game element:

$dom = new DOMDocument('1.0', 'UTF-8');
if (!$dom->load($url)) {
    return;
}

$games = $dom->getElementsByTagName("game");
foreach ($games as $game) {
    // read $game->nodeValue here
}

This takes 76 seconds, and there are around 2,000 game tags. Is there any optimization or alternative approach for getting the data?

3 answers

  • dongzha2525 2011-09-04 16:08

    You shouldn't use the Document Object Model on large XML files. It builds the whole document tree in memory and is better suited to small, human-readable documents than to big datasets.

    If you want fast access, you should use XMLReader or SimpleXML.

    XMLReader is a streaming parser that is ideal for walking through whole documents node by node, and SimpleXML has a convenient xpath() method for retrieving data quickly.

    For XMLReader you can use the following code:

    <?php

    // Parse a large document with XMLReader, expanding each <game> into a small DOM for DOMXPath
    $reader = new XMLReader();

    if (!$reader->open("tooBig.xml")) {
        exit("Could not open tooBig.xml");
    }

    while ($reader->read()) {
        // Only react to opening <game> elements
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === "game") {
            $node = $reader->expand();          // current element and its subtree as a DOMNode
            $dom = new DOMDocument();
            $n = $dom->importNode($node, true); // deep import into a standalone document
            $dom->appendChild($n);
            $xp = new DOMXPath($dom);
            $res = $xp->query("/game/title");   // this is an example
            if ($res->length > 0) {
                echo $res->item(0)->nodeValue, PHP_EOL;
            }
        }
    }

    $reader->close();
    ?>
    

    The above will output all game titles (assuming each game element has a title child, so that /game/title matches inside the expanded node).
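If building a DOMDocument and a DOMXPath object for every element feels heavy, one possible variation is to hand each element's markup to SimpleXML instead. The following is only a minimal sketch: the function name gameTitles, the inline sample document and the games/game/title layout are assumptions made for illustration and are not taken from the question's real file.

```php
<?php
// Sketch: stream with XMLReader and parse one <game> at a time with SimpleXML.
// gameTitles() is a hypothetical helper; the game/title layout is assumed.
function gameTitles(XMLReader $reader): array
{
    $titles = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'game') {
            // readOuterXml() returns the markup of the current element only,
            // so each SimpleXMLElement holds a single game, not the whole file.
            $game = new SimpleXMLElement($reader->readOuterXml());
            $titles[] = (string) $game->title;
        }
    }
    return $titles;
}

// Tiny in-memory sample standing in for the real 5 MB file.
$sample = '<games>'
        . '<game id="1"><title>Chess</title></game>'
        . '<game id="2"><title>Go</title></game>'
        . '</games>';

$reader = new XMLReader();
$reader->XML($sample);   // for a real file, use $reader->open($path) instead
echo implode(', ', gameTitles($reader)), PHP_EOL;   // prints: Chess, Go
$reader->close();
```

Only one game element is materialised at a time, so memory use should stay roughly flat as the file grows. Whether it is actually faster than the expand() version for a given file is something to measure rather than assume.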

    For SimpleXML you can use:

    $xml = file_get_contents($url);
    $sxml = new SimpleXMLElement($xml);
    $games = $sxml->xpath('//game'); // returns an array of SimpleXMLElement nodes
    foreach ($games as $game)
    {
       print $game->title . PHP_EOL; // assumes each game element has a title child
    }
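To make the SimpleXML approach concrete, here is a small self-contained sketch that runs against an inline sample instead of a URL. The sample document, the id attribute and the helper name listGames are all hypothetical. Note that SimpleXML, like DOM, loads the whole document into memory, so this suits files that fit comfortably in RAM.

```php
<?php
// Sketch: query every <game> with SimpleXML's xpath() and read an attribute
// and a child element. listGames() and the sample layout are assumptions.
function listGames(string $xml): array
{
    $sxml = new SimpleXMLElement($xml);
    $rows = [];
    // '//game' matches game elements at any depth below the root
    foreach ($sxml->xpath('//game') as $game) {
        // Attributes use array syntax, children use property syntax;
        // cast to string to get the text rather than an object.
        $rows[] = (string) $game['id'] . ': ' . (string) $game->title;
    }
    return $rows;
}

$sample = '<games>'
        . '<game id="1"><title>Chess</title></game>'
        . '<game id="2"><title>Go</title></game>'
        . '</games>';

echo implode(PHP_EOL, listGames($sample)), PHP_EOL;
```

For a file on disk, simplexml_load_file($path) can replace the file_get_contents() plus constructor pair.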
    
    Accepted answer: the asker selected this reply as the best answer.