dougou6213 2018-05-27 16:15

Why can't SimpleXMLElement find the contents of my XML file?

I need to parse an XML document that I receive from a third party using PHP. I am not able to ask the maintainers of the document to fix its structure. When I parse the document using simplexml_load_file, the resulting XML document appears to be empty.

Here is a stripped-down example of what I am seeing.

my-file.xml:

<?xml version="1.0" encoding="utf-8"?>
<DataSet>
  <diffgr:diffgram xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    aaa
  </diffgr:diffgram>
</DataSet>

And I process it like this (from the command line):

php > $xml = simplexml_load_file('my-file.xml');
php > print_r($xml);
SimpleXMLElement Object
(
)

I expected print_r to display the XML structure.

Indeed, when I remove the namespace declaration, things seem to work (despite some expected XML parse warnings):

my-file-nonamespace.xml:

<?xml version="1.0" encoding="utf-8"?>
<DataSet>
  <diffgr:diffgram>
    aaa
  </diffgr:diffgram>
</DataSet>

Processing it the same way on the command line (with warnings removed):

php > $xml = simplexml_load_file('my-file-nonamespace.xml');

// a bunch of xml parse warnings
php > print_r($xml);
SimpleXMLElement Object
(
    [diffgr:diffgram] =>
    aaa

)

So the problem seems to be tied to the namespace declaration. I could probably strip the declaration from the file with a regular expression before parsing, but that is not a direction I want to go.

What is the best way to properly parse the first document in PHP?

1 answer

  • drbii0359 2018-05-27 17:03

    The data is loaded; the issue is that the child elements are in a different namespace, so print_r() does not show them.

    $xml = simplexml_load_file('my-file.xml');
    var_dump($xml->children("diffgr", true));
    

    This selects the children from a specific namespace from the current element.

    Note that in real code you should select by the namespace URI rather than the prefix, because the prefix may change. The prefix is used here only to show that the data is there.
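
    As a minimal sketch of selecting by URI instead of by prefix: the XML from the question is inlined with simplexml_load_string() so the snippet runs on its own, and the variable names ($source, $ns, $diffgram, $text) are illustrative only.

```php
<?php
// Inline copy of the question's my-file.xml so the sketch runs without a file on disk.
$source = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<DataSet>
  <diffgr:diffgram xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    aaa
  </diffgr:diffgram>
</DataSet>
XML;

$xml = simplexml_load_string($source);

// With the second argument omitted, the first argument to children()
// is treated as a namespace URI, not as a prefix.
$ns = 'urn:schemas-microsoft-com:xml-diffgram-v1';
$diffgram = $xml->children($ns)->diffgram;

// Cast to string to get the element's text, then drop the surrounding whitespace.
$text = trim((string) $diffgram);
echo $text, PHP_EOL;
```

    Matching on the URI keeps working even if the third party renames the diffgr prefix, as long as the URI itself stays the same.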

    Edit: If the XML itself has problems, the first step is to suppress the parse errors and then check what was actually loaded:

    libxml_use_internal_errors(true);
    $xml = simplexml_load_file('my-file.xml');
    echo $xml->asXML();
    

    This shows what state the parsed document is in, and whether it loaded at all. A quick example:

    libxml_use_internal_errors(true);
    $xml = simplexml_load_file('my-file.xml');
    echo $xml->asXML();
    var_dump($xml->children());
    

    With this input:

    <?xml version="1.0" encoding="utf-8"?>
    <DataSet>
      <diffgr:diffgram>
        aaa
      </diffgr:diffgram>
    </DataSet>
    

    Notice that the diffgr prefix is used, but the namespace is never declared. The output is:

    <?xml version="1.0" encoding="utf-8"?>
    <DataSet>
      <diffgr:diffgram>
        aaa
      </diffgr:diffgram>
    </DataSet>
    /home/nigel/workspace2/Test/t1.php:22:
    class SimpleXMLElement#2 (1) {
      public $diffgr:diffgram =>
      string(11) "
        aaa
      "
    }
    

    Because the prefix is not bound to any namespace, children() returns the element without a namespace argument.
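
    The suppress-and-inspect approach can be sketched as follows, assuming you also want to see what the parser complained about. libxml_get_errors() and libxml_clear_errors() are standard PHP libxml functions; the inlined XML and the variable names are illustrative.

```php
<?php
// Inline copy of the undeclared-prefix variant from the question.
$source = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<DataSet>
  <diffgr:diffgram>
    aaa
  </diffgr:diffgram>
</DataSet>
XML;

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$xml = simplexml_load_string($source);

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}
libxml_clear_errors();

if ($xml === false) {
    // Guard before calling any method: a failed load returns false.
    echo "Document could not be loaded", PHP_EOL;
} else {
    // children() with no arguments returns the elements that are in no namespace.
    foreach ($xml->children() as $child) {
        echo $child->getName(), ' => ', trim((string) $child), PHP_EOL;
    }
}
```

    Checking for false before touching $xml avoids a fatal error on documents that are too broken to recover.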

    Accepted by the asker as the best answer.
