dqcj32855 2015-07-29 13:46

PHP XML parsing - can it be faster?

I have a large XML file (400 MB) that I need to update daily. For every main element I run a SELECT followed by an INSERT or UPDATE query against the database. When I run the script, it processes 26 main elements per minute at first, but it slows down: after 500 main elements it handles only about 10 elements per minute.

    $xml_reader = new XMLReader;
    $xml_reader->open("feed.xml");

    // move the pointer to the first product
    while ($xml_reader->read() && $xml_reader->name != 'SHOPITEM');

    // loop through the products
    while ($xml_reader->name == 'SHOPITEM') {
        // load the current XML element into SimpleXML
        $feed = simplexml_load_string($xml_reader->readOuterXML());

        // now you can use the SimpleXML object ($feed),
        // e.g. $feed->PRODUCTNO

        // SELECT, UPDATE/INSERT HERE

        // move the pointer to the next product
        $xml_reader->next('SHOPITEM');
    }

    // don't forget to close the file
    $xml_reader->close();
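
To make the "SELECT, UPDATE/INSERT HERE" step concrete, here is a minimal sketch of how such a loop might look with PDO prepared statements and batched transactions. It is not the actual script: the function name `import_feed`, the table `products`, its columns `item_id` and `payload`, and the batch size are all assumptions for illustration, and whether this changes the slowdown described above has not been measured.

```php
<?php
// Hypothetical helper: stream SHOPITEM elements and write each one with
// SELECT + INSERT/UPDATE. Table and column names are assumptions.
function import_feed(string $path, PDO $pdo, int $batchSize = 500): int
{
    // Prepare each statement once and reuse it for every item.
    $select = $pdo->prepare('SELECT 1 FROM products WHERE item_id = ?');
    $insert = $pdo->prepare('INSERT INTO products (item_id, payload) VALUES (?, ?)');
    $update = $pdo->prepare('UPDATE products SET payload = ? WHERE item_id = ?');

    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    // Move the pointer to the first product.
    while ($reader->read() && $reader->name !== 'SHOPITEM');

    $count = 0;
    $pdo->beginTransaction();
    while ($reader->name === 'SHOPITEM') {
        $item = simplexml_load_string($reader->readOuterXml());
        $id = (string) $item->ITEM_ID;
        $payload = $item->asXML(); // placeholder for a real column mapping

        $select->execute([$id]);
        $exists = $select->fetchColumn() !== false;
        $select->closeCursor();

        if ($exists) {
            $update->execute([$payload, $id]);
        } else {
            $insert->execute([$id, $payload]);
        }

        // Commit every $batchSize items instead of once per query.
        if (++$count % $batchSize === 0) {
            $pdo->commit();
            $pdo->beginTransaction();
        }

        // Move the pointer to the next product.
        $reader->next('SHOPITEM');
    }
    $pdo->commit();
    $reader->close();

    return $count;
}
```

It could be called as `import_feed('feed.xml', $pdo)` with an existing PDO connection. The sketch keeps the SELECT-then-write pattern from the description; only one element is held in memory at a time, as in the loop above.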

This is the XML:

<?xml version="1.0" encoding="utf-8"?>
<SHOP>
    <SHOPITEM> 
        <ITEM_ID>2600000394161</ITEM_ID> 
        (+ 15 more elements like this) 
        <PARAM>
            <PARAM_NAME><![CDATA[some data here]]></PARAM_NAME> 
            <VAL><![CDATA[some data here]]></VAL> 
        </PARAM> 
        (+ 10 more elements like this) 
    </SHOPITEM> 
    (lot of shopitems here) 
</SHOP>

I can't load the whole file with SimpleXML because of my RAM. Is there a faster PHP XML parser, or how do big sites (e.g. price comparison sites) handle this? Better hardware? My CPU is at 10% and RAM at 80% during XML processing.


  • duanshangying5102 2015-07-29 21:48

    Consider using an XML database (e.g. eXist or BaseX). At this sort of size, it will be much more efficient.

    This answer was selected by the asker as the best answer.