doushouxie7064 2015-02-08 21:37
Views: 45
Accepted

PHP: how to use &quot; XML entities with DOMDocument

I am working on modifying the contents of an XML file generated by some other library. I'm making some DOM modifications with PHP (5.3.10) and reinserting a replacement node.

The XML data I'm working with contains &quot; entities before I do the manipulation, and I want to keep those entities, as permitted by http://www.w3.org/TR/REC-xml/, when I'm done with the modifications.

However, PHP is changing the &quot; entities into literal double quotes. See my example.

$temp = 'Hello &quot;XML&quot;.';
$doc = new DOMDocument('1.0', 'utf-8');
$newelement = $doc->createElement('description', $temp);
$doc->appendChild($newelement);
echo $doc->saveXML() . PHP_EOL; // shows " instead of the &quot; entity
$node = $doc->getElementsByTagName('description')->item(0);
echo $node->nodeValue . PHP_EOL; // also shows "

Output

<?xml version="1.0" encoding="utf-8"?> 
<description>Hello "XML".</description>

Hello "XML".

Is this a PHP error or am I doing something wrong? I hope it isn't necessary to use createEntityReference in every char location.

Similar Question: PHP XML Entity Encoding issue


EDIT: Here is an example of why I think saveXML() should not be converting the &quot; entities: &amp; is left alone and behaves properly. This $temp string should be output by saveXML() exactly as it was entered, entities included.

$temp = 'Hello &quot;XML&quot; &amp;.';
$doc = new DOMDocument('1.0', 'utf-8');
$newelement = $doc->createElement('description', $temp);
$doc->appendChild($newelement);
echo $doc->saveXML() . PHP_EOL; // shows " instead of the &quot; entity, while &amp; is kept
$node = $doc->getElementsByTagName('description')->item(0);
echo $node->nodeValue . PHP_EOL; // also shows " &

Output

<?xml version="1.0" encoding="utf-8"?>
<description>Hello "XML" &amp;.</description>

Hello "XML" &.
1 answer

  • douke1942 2015-02-09 04:39

    The answer is that a double quote inside element content doesn't actually need any escaping according to the spec (skipping the mentions of CDATA):

    The ampersand character (&) and the left angle bracket (<) must not appear in their literal form (...) If they are needed elsewhere, they must be escaped using either numeric character references or the strings " &amp; " and " &lt; " respectively. The right angle bracket (>) may be represented using the string " &gt; " (...)

    To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as " &apos; ", and the double-quote character (") as " &quot; ".

    You can verify this easily by using createTextNode() to perform the correct escaping:

    $dom = new DOMDocument;
    $e = $dom->createElement('description');
    $content = 'single quote: \', double quote: ", opening tag: <, ampersand: &, closing tag: >';
    $t = $dom->createTextNode($content);
    $e->appendChild($t);
    $dom->appendChild($e);
    
    echo $dom->saveXML();
    

    Output:

    <?xml version="1.0"?>
    <description>single quote: ', double quote: ", opening tag: &lt;, ampersand: &amp;, closing tag: &gt;</description>
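As a further check, here is a minimal sketch (not part of the original answer) of the two points above. It first parses the same text written with &quot; and with literal quotes, then places one string in both element content and an attribute value. The element name `description` comes from the question; the `title` attribute and the variable names are made up for illustration.

```php
<?php
// 1. Two spellings of the same element content parse to the same text.
$a = new DOMDocument();
$a->loadXML('<description>Hello &quot;XML&quot;.</description>');
$b = new DOMDocument();
$b->loadXML('<description>Hello "XML".</description>');

$textA = $a->documentElement->nodeValue;
$textB = $b->documentElement->nodeValue;
var_dump($textA === $textB); // the parser sees no difference between them

// 2. The same string as element content and as an attribute value.
//    'title' is a hypothetical attribute used only for this demonstration.
$doc = new DOMDocument('1.0', 'utf-8');
$el = $doc->createElement('description');
$el->appendChild($doc->createTextNode('Hello "XML".'));
$el->setAttribute('title', 'Hello "XML".');
$doc->appendChild($el);

// Serialize only the element so the XML declaration is left out.
$xml = $doc->saveXML($el);
echo $xml . PHP_EOL;
```

With libxml-backed DOM, the serialized element should have &quot; inside the double-quoted attribute value, where escaping is required, and literal quotes in the element content. Any conforming parser reads both forms back as the same characters, so downstream consumers of the file should not be affected.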
    