duancuisan2503
2017-03-28 11:07
Views: 80

Speeding up parsing of XML documents with the DOMDocument class and namespaces in PHP

I have 6 XML documents that I need to parse with PHP. Each file has 50,000 elements, so I need a fast parser, and I chose the DOMDocument class. An example XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:PinsCountryCodeIds xmlns:ns2="http://apis-it.hr/umu/2015/types/kp">
    <ns2:PinCountryCodeId>
        <ns2:CountryCodeId>HR</ns2:CountryCodeId>
        <ns2:PinPrimatelja>000000000</ns2:PinPrimatelja>
    </ns2:PinCountryCodeId>
    <ns2:PinCountryCodeId>
        <ns2:CountryCodeId>HR</ns2:CountryCodeId>
        <ns2:PinPrimatelja>000000001</ns2:PinPrimatelja>
    </ns2:PinCountryCodeId>
    <ns2:PinCountryCodeId>
        <ns2:CountryCodeId>HR</ns2:CountryCodeId>
        <ns2:PinPrimatelja>000000002</ns2:PinPrimatelja>
    </ns2:PinCountryCodeId>
</ns2:PinsCountryCodeIds>

The best I have come up with is this code:

$input_file=scandir($OIB_path);//Scanning directory for files
foreach ($input_file as $input_name){
    if($input_name=="." || $input_name=="..")
        continue;
    $OIB_file=$OIB_path . $input_name;

    $doc = new DOMDocument();
    $doc->load( $OIB_file );

    $doc->saveXML();
    foreach ($doc->getElementsByTagNameNS('http://apis-it.hr/umu/2015/types/kp', 'PinPrimatelja') as $element) {
        echo  $element->nodeValue, ', <br> ';
    }           

}

But it is too slow: it takes more than 20 minutes to parse the 6 files.

What can I do to improve it?

1 answer

  • duanlipeng4136 2017-06-15 01:45
    Accepted answer

    XPath queries are much faster than normal traversal using the DOM.

    Try the code below and let me know if it improves performance.

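    <?php
    // $OIB_path is the directory path from the question; it must end with a directory separator.
    $input_file = scandir($OIB_path); // Scan the directory for files

    foreach ($input_file as $input_name) {
        // Skip the current and parent directory entries
        if ($input_name == "." || $input_name == "..")
            continue;
        $OIB_file = $OIB_path . $input_name;

        $doc = new DOMDocument();
        $doc->load($OIB_file);

        // Bind the prefix 'x' to the document's namespace URI for use in XPath
        $xpath = new DOMXPath($doc);
        $xpath->registerNamespace('x', 'http://apis-it.hr/umu/2015/types/kp');

        // Select every PinPrimatelja that is a child of a PinCountryCodeId
        $elements = $xpath->query('//x:PinCountryCodeId/x:PinPrimatelja');

        foreach ($elements as $element) {
            echo $element->nodeValue . '<br>';
        }
    }
    ?>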
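
To try the XPath approach without the directory of files, the query can be run against an inline copy of the sample XML from the question. This is only a minimal sketch: the helper name `extractPins` and the `$sample` variable are made up for illustration and do not come from the thread. It collects the values into an array and prints them once, rather than echoing per element. It also leaves out the `$doc->saveXML()` call from the question's code, which serializes the whole document and discards the result.

```php
<?php
// Hypothetical helper (not from the original thread): returns the text of every
// PinPrimatelja element in one XML string, using a namespace-aware XPath query.
function extractPins(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    // The prefix 'x' is arbitrary; only the namespace URI has to match the document.
    $xpath->registerNamespace('x', 'http://apis-it.hr/umu/2015/types/kp');

    $pins = [];
    foreach ($xpath->query('//x:PinCountryCodeId/x:PinPrimatelja') as $node) {
        $pins[] = $node->nodeValue;
    }
    return $pins;
}

// Shortened copy of the sample document from the question.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:PinsCountryCodeIds xmlns:ns2="http://apis-it.hr/umu/2015/types/kp">
    <ns2:PinCountryCodeId>
        <ns2:CountryCodeId>HR</ns2:CountryCodeId>
        <ns2:PinPrimatelja>000000000</ns2:PinPrimatelja>
    </ns2:PinCountryCodeId>
    <ns2:PinCountryCodeId>
        <ns2:CountryCodeId>HR</ns2:CountryCodeId>
        <ns2:PinPrimatelja>000000001</ns2:PinPrimatelja>
    </ns2:PinCountryCodeId>
</ns2:PinsCountryCodeIds>
XML;

// Build the output once instead of echoing per element.
echo implode('<br>', extractPins($sample)), "\n";
// prints: 000000000<br>000000001
```

For the real files, the same helper could be fed `file_get_contents($OIB_file)` inside the directory loop. Whether this is fast enough for 50,000 elements per file is something to measure rather than assume.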
    