dounang1974 2019-06-26 12:04
Views: 104
Accepted

How to handle a default namespace when parsing an XML file

My PHP page must parse input XML files (XLIFF, to be precise), but it doesn't work when a default namespace is present in the root element of the XML file.

My code assumes that a default namespace is required and that it must be urn:oasis:names:tc:xliff:document:1.2. If found in the XLIFF root element, it is fetched from there, otherwise it is added by my PHP code. I thought this was working but it seems it's not, and at the moment the only way I have to make it work is to remove the default namespace from the input XLIFF file. Of course, the PHP script should work regardless of whether the default namespace is present in the XLIFF file or not.

On the assumption that a default namespace is necessary, my PHP script contains:

$xml_file = file_get_contents($pathToInputFile);
if($xml_file === FALSE) {
    die("there is a problem to get contents from XLIFF file");
} 

$xliffObj = new DOMDocument();
$xliffObj->preserveWhiteSpace = true;
$xliffObj->loadXML($xml_file);

$context = $xliffObj->documentElement;
$xpath = new DOMXPath($xliffObj);

if (isSet($context->getAttributeNode('xmlns')->nodeValue)) {
    $ns = $context->getAttributeNode('xmlns')->nodeValue; 
    echo "The ns is: " . $ns;                          // line 198
}
else {
    $ns = "urn:oasis:names:tc:xliff:document:1.2";
    // this works when no default namespaces is defined in the XLIFF file
    echo "I have defined the ns as: " . $ns; 
}

$xpath->registerNamespace('ns', $ns);                 // line 208

$tus = $xpath->query('//trans-unit');
var_dump_pre($tus);die;

The parsing works fine if my input XLIFF file has:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xliff PUBLIC "-//XLIFF//DTD XLIFF//EN" "http://www.oasis-open.org/committees/xliff/documents/xliff.dtd">
<xliff xmlns:pisa="http://www.ets.org/pisa" version="1.2">

In that case, the output is

I have defined the ns as: urn:oasis:names:tc:xliff:document:1.2

object(DOMNodeList)#12 (1) { ["length"]=> int(2) }

The $tus node list contains the two trans-unit nodes in the XLIFF file.

However, when the file has

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xliff PUBLIC "-//XLIFF//DTD XLIFF//EN" "http://www.oasis-open.org/committees/xliff/documents/xliff.dtd">
<xliff xmlns:pisa="http://www.ets.org/pisa" version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">

then nothing is extracted and the node list where I save the contents of the file is empty (length 0). The output is:

The ns is: urn:oasis:names:tc:xliff:document:1.2

object(DOMNodeList)#10 (1) { ["length"]=> int(0) }

As you can see, the $tus node list is empty.

A potential solution could be to simply remove the namespace declaration before adding it again, but I would like to understand what the problem is. Thanks.

1 answer

  • doufeixuan8882 2019-06-26 12:10

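    It seems the namespace prefix is needed in the XPath expression only when the default namespace is present in the XML file. XPath 1.0 treats an unprefixed name such as trans-unit as an element in no namespace, so once the root element declares xmlns="urn:oasis:names:tc:xliff:document:1.2", the elements match only through a registered prefix: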

    $xpath->registerNamespace('ns', $ns);
    $tus = $xpath->query('//ns:trans-unit');
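
A minimal sketch of this behaviour, assuming a made-up two-unit XLIFF sample (the sample document and the variable names $unprefixed and $prefixed are illustrative, not taken from the original file). It runs both queries against a document that declares the default namespace:

```php
<?php
// Hypothetical minimal XLIFF sample with a default namespace on the root.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo" source-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Hello</source></trans-unit>
      <trans-unit id="2"><source>World</source></trans-unit>
    </body>
  </file>
</xliff>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('ns', 'urn:oasis:names:tc:xliff:document:1.2');

// Unprefixed name: matches only elements in no namespace, so nothing here.
$unprefixed = $xpath->query('//trans-unit')->length;

// Prefixed name: matches elements in the registered namespace.
$prefixed = $xpath->query('//ns:trans-unit')->length;

echo "unprefixed: $unprefixed, prefixed: $prefixed\n";
// prints "unprefixed: 0, prefixed: 2"
```

The prefix ns is arbitrary. What matters is that the URI registered on the DOMXPath object equals the namespace URI of the elements in the document.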
    

    However, I'm not sure whether registering the prefix could backfire in other situations...

    When the default namespace is not present, the elements are in no namespace, so the prefix should be left out of the XPath expression:

    #$xpath->registerNamespace('ns', $ns);
    $tus = $xpath->query('//trans-unit');
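
The two cases can be combined so that the script works for both kinds of input file. The sketch below is one possible approach, not the original poster's code: findTransUnits is a hypothetical helper, and it assumes the namespace can be read from the root element's namespaceURI property rather than from the xmlns attribute.

```php
<?php
/**
 * Hypothetical helper (not from the original post): returns all trans-unit
 * elements whether or not the root element is in a namespace.
 */
function findTransUnits(DOMDocument $doc): DOMNodeList
{
    $xpath = new DOMXPath($doc);

    // namespaceURI is null (or empty) when the root element is in no namespace.
    $ns = $doc->documentElement->namespaceURI;

    if ($ns === null || $ns === '') {
        // No namespace: unprefixed names match.
        return $xpath->query('//trans-unit');
    }

    // Namespace present: register a prefix for it and use it in the query.
    $xpath->registerNamespace('ns', $ns);
    return $xpath->query('//ns:trans-unit');
}

// Two made-up inputs: one with the default namespace, one without.
$withNs = new DOMDocument();
$withNs->loadXML('<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2"><file><body><trans-unit id="1"/><trans-unit id="2"/></body></file></xliff>');

$withoutNs = new DOMDocument();
$withoutNs->loadXML('<xliff version="1.2"><file><body><trans-unit id="1"/><trans-unit id="2"/></body></file></xliff>');

$countWithNs = findTransUnits($withNs)->length;
$countWithoutNs = findTransUnits($withoutNs)->length;

echo "with ns: $countWithNs, without ns: $countWithoutNs\n";
// prints "with ns: 2, without ns: 2"
```

An alternative is the query //*[local-name()="trans-unit"], which ignores namespaces altogether. It is shorter, but it would also match a trans-unit element from any other namespace, so the explicit check above is the more conservative choice.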
    
    This answer was selected by the asker as the best answer.
