dounang1974 2019-06-26 12:04

How to handle the default namespace when parsing an XML file

My PHP page must parse input XML files (XLIFF, to be precise), but it doesn't work when a default namespace is declared in the root element of the XML file.

My code assumes that a default namespace is required and that it must be urn:oasis:names:tc:xliff:document:1.2. If found in the XLIFF root element, it is fetched from there, otherwise it is added by my PHP code. I thought this was working but it seems it's not, and at the moment the only way I have to make it work is to remove the default namespace from the input XLIFF file. Of course, the PHP script should work regardless of whether the default namespace is present in the XLIFF file or not.

On the assumption that a default namespace is necessary, my PHP script has:

$xml_file = file_get_contents($pathToInputFile);
if($xml_file === FALSE) {
    die("there is a problem to get contents from XLIFF file");
} 

$xliffObj = new DOMDocument();
$xliffObj->preserveWhiteSpace = true;
$xliffObj->loadXML($xml_file);

$context = $xliffObj->documentElement;
$xpath = new DOMXPath($xliffObj);

if (isSet($context->getAttributeNode('xmlns')->nodeValue)) {
    $ns = $context->getAttributeNode('xmlns')->nodeValue; 
    echo "The ns is: " . $ns;
}
else {
    $ns = "urn:oasis:names:tc:xliff:document:1.2";
    // this works when no default namespaces is defined in the XLIFF file
    echo "I have defined the ns as: " . $ns; 
}

$xpath->registerNamespace('ns', $ns);

$tus = $xpath->query('//trans-unit');
var_dump_pre($tus);die;

The parsing works fine if my input XLIFF file has:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xliff PUBLIC "-//XLIFF//DTD XLIFF//EN" "http://www.oasis-open.org/committees/xliff/documents/xliff.dtd">
<xliff xmlns:pisa="http://www.ets.org/pisa" version="1.2">

In that case, the output is

I have defined the ns as: urn:oasis:names:tc:xliff:document:1.2

object(DOMNodeList)#12 (1) { ["length"]=> int(2) }

The $tus node list contains the two trans-unit nodes in the XLIFF file.

However, when the file has

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xliff PUBLIC "-//XLIFF//DTD XLIFF//EN" "http://www.oasis-open.org/committees/xliff/documents/xliff.dtd">
<xliff xmlns:pisa="http://www.ets.org/pisa" version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">

then nothing is extracted, and the node list where I save the contents of the file is empty (its length is 0). The output is:

The ns is: urn:oasis:names:tc:xliff:document:1.2

object(DOMNodeList)#10 (1) { ["length"]=> int(0) }

As you can see, the $tus node list is empty.

A potential solution could be to simply remove the namespace declaration before adding it again, but I would like to understand what the problem is. Thanks.


1 answer

  • doufeixuan8882 2019-06-26 12:10

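    It seems the namespace must be added to the XPath expression only when it is declared in the XML file. XPath 1.0 always treats an unprefixed name such as trans-unit as belonging to no namespace, so once the root element declares a default namespace, its elements can only be matched through a registered prefix: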

    $xpath->registerNamespace('ns', $ns);
    $tus = $xpath->query('//ns:trans-unit');
    

    However, I'm not sure whether this could backfire in other situations...

    When the default namespace is not declared, it seems it is not necessary to register it or to use a prefix in the XPath expression:

    #$xpath->registerNamespace('ns', $ns);
    $tus = $xpath->query('//trans-unit');
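
The two cases above can be folded into one code path by asking the root element which namespace it is actually in, rather than inspecting the xmlns attribute. The following is only a sketch under stated assumptions: the function name findTransUnitIds, the 'ns' prefix, and the two tiny sample documents are made up for illustration and are not part of the original script; it also assumes every trans-unit element shares the namespace of the root element.

```php
<?php
// Hypothetical helper (not from the original post): returns the id attribute
// of every <trans-unit>, whether or not a default namespace is declared.
function findTransUnitIds(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // namespaceURI is null when the root element is in no namespace.
    $ns = $doc->documentElement->namespaceURI;

    if ($ns === null || $ns === '') {
        // No namespace: unprefixed names match directly.
        $nodes = $xpath->query('//trans-unit');
    } else {
        // Namespaced document: XPath 1.0 needs a registered prefix.
        $xpath->registerNamespace('ns', $ns);
        $nodes = $xpath->query('//ns:trans-unit');
    }

    $ids = [];
    foreach ($nodes as $node) {
        $ids[] = $node->getAttribute('id');
    }
    return $ids;
}

// Assumed sample input: the same body with and without the default namespace.
$body = '<file><body><trans-unit id="1"/><trans-unit id="2"/></body></file>';
$plain = '<xliff version="1.2">' . $body . '</xliff>';
$namespaced = '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">'
    . $body . '</xliff>';

$plainIds = findTransUnitIds($plain);
$namespacedIds = findTransUnitIds($namespaced);

echo implode(',', $plainIds), ' | ', implode(',', $namespacedIds), PHP_EOL;
```

Reading namespaceURI instead of the xmlns attribute also covers files where the root element itself carries a prefix; a document that mixes namespaced and non-namespaced trans-unit elements would still need separate handling.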
    
    This answer was selected by the asker as the best answer.
