dounang1974 2019-06-26 04:04

How to handle a default namespace when parsing an XML file

My PHP page must parse input XML files (XLIFF, to be precise), but it doesn't work when a default namespace is declared on the root element of the XML file.

My code assumes that a default namespace is required and that it must be urn:oasis:names:tc:xliff:document:1.2. If found in the XLIFF root element, it is fetched from there, otherwise it is added by my PHP code. I thought this was working but it seems it's not, and at the moment the only way I have to make it work is to remove the default namespace from the input XLIFF file. Of course, the PHP script should work regardless of whether the default namespace is present in the XLIFF file or not.

On the assumption that a default namespace is necessary, my PHP script has:

$xml_file = file_get_contents($pathToInputFile);
if($xml_file === FALSE) {
    die("there is a problem to get contents from XLIFF file");
} 

$xliffObj = new DOMDocument();
$xliffObj->preserveWhiteSpace = true;
$xliffObj->loadXML($xml_file);

$context = $xliffObj->documentElement;
$xpath = new DOMXPath($xliffObj);

if (isSet($context->getAttributeNode('xmlns')->nodeValue)) {
    $ns = $context->getAttributeNode('xmlns')->nodeValue; 
    echo "The ns is: " . $ns;                          // line 198
}
else {
    $ns = "urn:oasis:names:tc:xliff:document:1.2";
    // this works when no default namespaces is defined in the XLIFF file
    echo "I have defined the ns as: " . $ns; 
}

$xpath->registerNamespace('ns', $ns);                 // line 208

$tus = $xpath->query('//trans-unit');
var_dump_pre($tus);die;

The parsing works fine if my input XLIFF file has:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xliff PUBLIC "-//XLIFF//DTD XLIFF//EN" "http://www.oasis-open.org/committees/xliff/documents/xliff.dtd">
<xliff xmlns:pisa="http://www.ets.org/pisa" version="1.2">

In that case, the output is

I have defined the ns as: urn:oasis:names:tc:xliff:document:1.2

object(DOMNodeList)#12 (1) { ["length"]=> int(2) }

The $tus node list contains the two trans-unit nodes in the XLIFF file.

However, when the file has

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xliff PUBLIC "-//XLIFF//DTD XLIFF//EN" "http://www.oasis-open.org/committees/xliff/documents/xliff.dtd">
<xliff xmlns:pisa="http://www.ets.org/pisa" version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">

then nothing is extracted and the node list that should hold the trans-unit elements is empty (length 0). The output is:

The ns is: urn:oasis:names:tc:xliff:document:1.2

object(DOMNodeList)#10 (1) { ["length"]=> int(0) }

As you can see, the $tus node list is empty.

A potential solution could be to simply remove the namespace declaration before adding it again, but I would like to understand what the problem is. Thanks.


1 answer

  • doufeixuan8882 2019-06-26 04:10

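    XPath 1.0 has no notion of a default namespace: an unprefixed name such as trans-unit in an expression only matches elements that are in no namespace. When the XLIFF root declares xmlns="urn:oasis:names:tc:xliff:document:1.2", every unprefixed element in the file belongs to that namespace, so you have to register a prefix for it and use that prefix in the query: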

    $xpath->registerNamespace('ns', $ns);
    $tus = $xpath->query('//ns:trans-unit');
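
To see the difference between the two query forms, here is a minimal sketch. The inline XLIFF string is an invented stand-in for the real input file (two trans-unit elements under a root with the default namespace); it is not taken from the question's data.

```php
<?php
// Invented sample: root declares the XLIFF 1.2 default namespace.
$xml = '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">'
    . '<file><body><trans-unit id="1"/><trans-unit id="2"/></body></file></xliff>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('ns', 'urn:oasis:names:tc:xliff:document:1.2');

// Unprefixed name: only matches elements that are in no namespace.
$unprefixed = $xpath->query('//trans-unit')->length;
// Prefixed name: matches elements in the namespace bound to "ns".
$prefixed = $xpath->query('//ns:trans-unit')->length;

echo "unprefixed: $unprefixed, prefixed: $prefixed\n"; // unprefixed: 0, prefixed: 2
```

Registering the namespace alone changes nothing; the prefix also has to appear in the expression.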
    

    The prefix (ns here) is arbitrary; only the namespace URI has to match. That said, I'm not sure whether this could backfire in other situations.
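
If the concern is input that may or may not declare the namespace, one possible alternative is a query that ignores namespaces altogether by testing local-name(). This is only a sketch on an invented sample string, and it has a trade-off: it would also match a trans-unit element from any unrelated namespace.

```php
<?php
// Invented sample; the same query also works when the xmlns attribute is removed.
$xml = '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">'
    . '<file><body><trans-unit id="1"/><trans-unit id="2"/></body></file></xliff>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Match any element whose local name is trans-unit, whatever its namespace.
$tus = $xpath->query('//*[local-name() = "trans-unit"]');

echo $tus->length, "\n"; // 2
```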

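    When the file declares no default namespace, its elements are in no namespace, so a prefixed query would find nothing. In that case leave the prefix out of the XPath expression; registering the namespace is then unnecessary: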

    #$xpath->registerNamespace('ns', $ns);
    $tus = $xpath->query('//trans-unit');
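
The two cases can be folded into one code path. The sketch below assumes that reading namespaceURI from the root element is an acceptable replacement for inspecting the xmlns attribute; the function name findTransUnits, the prefix x, and the sample strings are hypothetical and not from the original post.

```php
<?php
// Hypothetical helper: returns the <trans-unit> elements whether or not
// the root element declares a default namespace.
function findTransUnits(DOMDocument $doc): DOMNodeList
{
    $xpath = new DOMXPath($doc);
    // namespaceURI is null when the root element is in no namespace.
    $ns = $doc->documentElement->namespaceURI;

    if ($ns !== null && $ns !== '') {
        // Bind a prefix of our choosing to whatever namespace the root uses.
        $xpath->registerNamespace('x', $ns);
        return $xpath->query('//x:trans-unit');
    }
    // No namespace in play: the unprefixed name matches directly.
    return $xpath->query('//trans-unit');
}

// Invented samples: same content, with and without the default namespace.
$body = '<file><body><trans-unit id="1"/><trans-unit id="2"/></body></file>';
$withNs = '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">'
    . $body . '</xliff>';
$withoutNs = '<xliff version="1.2">' . $body . '</xliff>';

foreach ([$withNs, $withoutNs] as $xml) {
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    echo findTransUnits($doc)->length, "\n"; // 2 in both cases
}
```

This keeps the query strict about the namespace when one is present, instead of assuming the XLIFF 1.2 URI for files that do not declare it.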
    
    This answer was selected as the best answer by the question's author.