doutao4938 2018-02-07 14:56

PHP XMLReader - reading a malformed XML file

This question is related to another post I've posted here (for reference)

I'm downloading log files over FTP from an open data rail project in the UK. The log files are about 3 MB each and are laid out like this:

<?xml version="1.0" encoding="UTF-8"?><Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" ts="2018-02-05T21:33:59.8558288Z" version="12.0"><uR updateOrigin="Darwin"><deactivated rid="201802058015464"/></uR></Pport>
<?xml version="1.0" encoding="UTF-8"?><Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-02-05T21:33:59.8558288Z" version="12.0"><uR updateOrigin="Darwin"><TS rid="201802058709918" ssd="2018-02-05" uid="W09918"><ns3:Location tpl="DARTFD" wta="07:36"><ns3:arr delayed="true" et="21:34" src="Darwin"/><ns3:plat cisPlatsup="true" platsup="true">2</ns3:plat></ns3:Location></TS></uR></Pport>
<?xml version="1.0" encoding="UTF-8"?><Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-02-05T21:33:59.8558288Z" version="12.0"><uR updateOrigin="Darwin"><TS rid="201802058771469" ssd="2018-02-05" uid="W71469"><ns3:Location tpl="WLWYCSD" wtd="13:16"><ns3:dep delayed="true" et="21:34" src="Darwin"/></ns3:Location><ns3:Location tpl="WLWYNGC" wtp="13:18"><ns3:pass delayed="true" et="21:36" src="Darwin"/><ns3:plat cisPlatsup="true" platsup="true">3</ns3:plat></ns3:Location><ns3:Location tpl="HATFILD" wtp="13:21:30"><ns3:pass delayed="true" et="21:39" src="Darwin"/><ns3:plat cisPlatsup="true" platsrc="A" platsup="true">1</ns3:plat></ns3:Location><ns3:Location tpl="POTRSBR" wtp="13:26"><ns3:pass delayed="true" et="21:44" src="Darwin"/><ns3:plat cisPlatsup="true" platsup="true">1</ns3:plat></ns3:Location><ns3:Location tpl="ALEXNDP" wtp="13:36:30"><ns3:pass delayed="true" et="21:51" src="Darwin"/><ns3:plat cisPlatsup="true" platsup="true">2</ns3:plat></ns3:Location><ns3:Location tpl="HRGYURV" wta="13:43" wtd="13:48"><ns3:arr delayed="true" et="21:57" src="Darwin"/><ns3:dep delayed="true" et="21:58" src="Darwin"/></ns3:Location><ns3:Location tpl="HRNSYMD" wta="13:50"><ns3:arr delayed="true" et="22:00" src="Darwin"/></ns3:Location></TS></uR></Pport>

In addition, the last entry is sometimes truncated, like this:

<?xml version="1.0" encoding="UTF-8"?><Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-02-05T21:34:52.2569006Z" version="12.0"><uR updateOrigin="Trust"><TS rid="201802056757064" ssd="2018-02-05" uid="C57064"><ns3:Location pta="21:34" ptd="21:34" tpl="DEVNPRT" wta="21:34" wtd="21:34:30"><ns3:arr at

I have used the advice given here and tried to implement a PHP solution using XMLReader. However, because of the way the XML log file is set up (each line is a complete XML document with its own declaration), XMLReader throws errors.

This is the base code I'm using:

$xmlReader = new XMLReader();
$xmlReader->open($filename);

// While there is something to read, continue reading
while ($xmlReader->read()) {

    // Check that the node is an element, not an attribute or #text
    if ($xmlReader->nodeType == XMLReader::ELEMENT) {

        if ($xmlReader->hasAttributes) {
            // Do something here
        }
    }
}

Since each entry in the log file is on a single line, I thought I could open the file, read it line by line, and load each line into XMLReader, but I have not been able to do that. This is what I tried:

if ($filename = fopen("./pPortData.log", "r")) {
       while (!feof($filename)) {
            $xmlstr = fgets($filename);
            # do same stuff with the $line
            $address = new SimpleXMLElement($xmlstr) or die("Error: Cannot create object");
            echo $address->getName(), PHP_EOL;
            foreach($address as $name => $part) {
                echo "$name: $part" . "/n/r", PHP_EOL;
            }
        }    
        fclose($xmlstr);
    }

But no joy. So ...

1) Do you know a way of achieving this, please?

2) Or do you know how to load line by line from a file into XMLReader?

3) How do I fix the XML file?

Thank you

Lucio

1 answer

  • douzhi3586 2018-02-07 15:11

    You were close with your last effort. Loading one line at a time and processing each with SimpleXML should work.

    I've made a few changes. I've added some error trapping, which catches the case where the last record is incomplete and just displays a message. The rest depends on how you want to process the XML data, so for now I simply output the data from each loaded XML document.

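    // Collect libxml parse errors internally instead of emitting PHP warnings.
    libxml_use_internal_errors(true);

    if ($file = fopen("./pPortData.log", "r")) {
        // fgets() returns false at end of file, which ends the loop.
        while (($xmlstr = fgets($file)) !== false) {
            // Skip blank lines, such as a trailing newline at the end of the log.
            if (trim($xmlstr) === '') {
                continue;
            }
            try {
                $xml = new SimpleXMLElement($xmlstr);
                echo $xml->getName(), PHP_EOL;
                foreach ($xml->children() as $part) {
                    echo $part->asXML() . PHP_EOL;
                }
            }
            catch (Exception $e) {
                // A truncated record (typically the last line) fails to parse.
                echo "Last part unreadable." . PHP_EOL;
                libxml_clear_errors();
            }
        }
        fclose($file);
    }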
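
The question also asks how to load the file line by line into XMLReader. The following is a minimal sketch of one way that might be done, assuming every record sits on its own line as in the samples. The helper name `readRecord` is hypothetical and not part of the original answer. It uses `XMLReader::XML()` to parse a string, and checks the libxml error list afterwards because `read()` returns false both at the end of a document and on a parse error.

```php
<?php
// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: parse one log line with XMLReader and return the
// local names of the elements found, or null if the line is blank or malformed.
function readRecord(string $line): ?array
{
    if (trim($line) === '') {
        return null;
    }

    libxml_clear_errors();
    $reader = new XMLReader();
    // XML() loads a document from a string rather than from a file.
    if (!$reader->XML($line)) {
        return null;
    }

    $elements = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $elements[] = $reader->localName;
        }
    }
    $reader->close();

    // read() cannot distinguish "finished" from "failed", so inspect the
    // errors libxml recorded while parsing this line.
    $ok = count(libxml_get_errors()) === 0;
    libxml_clear_errors();

    return $ok ? $elements : null;
}

// Usage: walk the log one line at a time (path taken from the question).
$path = "./pPortData.log";
if (is_readable($path) && ($file = fopen($path, "r"))) {
    while (($line = fgets($file)) !== false) {
        $elements = readRecord($line);
        if ($elements === null) {
            echo "Skipped a blank or unreadable line.", PHP_EOL;
            continue;
        }
        echo implode(", ", $elements), PHP_EOL;
    }
    fclose($file);
}
```

Creating a new XMLReader for each line keeps a truncated final record from affecting the records before it. Inside the `read()` loop, `$reader->getAttribute()` can be used to pull out values such as `rid` or `tpl`.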
    
    Accepted by the asker as the best answer.
