doushouxie7064 2015-02-08 21:37
Accepted

PHP: how to use the &quot; XML entity with DOMDocument

I am working on modifying the contents of an XML file generated by some other library. I'm making some DOM modifications with PHP (5.3.10) and reinserting a replacement node.

The XML data I'm working with contains &quot; entities before I do the manipulation, and I want to keep those entities, as per http://www.w3.org/TR/REC-xml/, when I'm done with the modifications.

However, I'm having problems with PHP changing the &quot; entities into literal " characters. See my example.

$temp = 'Hello &quot;XML&quot;.';
$doc = new DOMDocument('1.0', 'utf-8');
$newelement = $doc->createElement('description', $temp);
$doc->appendChild($newelement);
echo $doc->saveXML() . PHP_EOL; // shows " instead of &quot;
$node = $doc->getElementsByTagName('description')->item(0);
echo $node->nodeValue . PHP_EOL; // also shows "

Output

<?xml version="1.0" encoding="utf-8"?> 
<description>Hello "XML".</description>

Hello "XML".

Is this a PHP error or am I doing something wrong? I hope it isn't necessary to use createEntityReference in every char location.

Similar Question: PHP XML Entity Encoding issue


EDIT: Here is an example showing why I expect saveXML() not to convert the &quot; entities: &amp; in the same string is preserved properly. This $temp string should really be output exactly as it was entered, entities included, by saveXML().

$temp = 'Hello &quot;XML&quot; &amp;.';
$doc = new DOMDocument('1.0', 'utf-8');
$newelement = $doc->createElement('description', $temp);
$doc->appendChild($newelement);
echo $doc->saveXML() . PHP_EOL; // shows " instead of &quot;, while &amp; is kept
$node = $doc->getElementsByTagName('description')->item(0);
echo $node->nodeValue . PHP_EOL; // also shows " &

Output

<?xml version="1.0" encoding="utf-8"?>
<description>Hello "XML" &amp;.</description>

Hello "XML" &.

1 answer

  • douke1942 2015-02-09 04:39

    The answer is that it doesn't actually need any escaping according to the spec (skipping the mentions of CDATA):

    The ampersand character (&) and the left angle bracket (<) must not appear in their literal form (...) If they are needed elsewhere, they must be escaped using either numeric character references or the strings " &amp; " and " &lt; " respectively. The right angle bracket (>) may be represented using the string " &gt; " (...)

    To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as " &apos; ", and the double-quote character (") as " &quot; ".
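The distinction the spec draws between character data and attribute values can be seen in a small sketch. It assumes PHP's DOM extension (libxml2), and the attribute name `title` is made up for illustration. The same string is stored once as element text and once as an attribute value, and only the attribute copy is serialized with &quot;, because only there could the quote be mistaken for the closing delimiter.

```php
<?php
// Sketch: store the same string as element text and as an attribute value.
// The attribute name "title" is hypothetical, chosen only for this example.
$dom = new DOMDocument('1.0', 'utf-8');
$e = $dom->createElement('description');
$e->appendChild($dom->createTextNode('Hello "XML".'));
$e->setAttribute('title', 'Hello "XML".');
$dom->appendChild($e);

// Serialize just the element, without the XML declaration.
$xml = $dom->saveXML($e);
echo $xml, PHP_EOL;
```

In the serialized element the attribute value carries &quot; while the text content keeps the literal double quote. Both forms are valid XML and describe the same characters.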

    You can verify this easily by using createTextNode(), which treats its argument as literal text and leaves the escaping to the serializer:

    $dom = new DOMDocument;
    $e = $dom->createElement('description');
    $content = 'single quote: \', double quote: ", opening tag: <, ampersand: &, closing tag: >';
    $t = $dom->createTextNode($content);
    $e->appendChild($t);
    $dom->appendChild($e);
    
    echo $dom->saveXML();
    

    Output:

    <?xml version="1.0"?>
    <description>single quote: ', double quote: ", opening tag: &lt;, ampersand: &amp;, closing tag: &gt;</description>
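As a further check, here is a minimal sketch (again assuming PHP's DOM extension) that parses two documents differing only in whether the quotes are written as &quot; or as literal characters. A conforming parser reports the same text for both, so a consumer of the file should not be able to tell which spelling was used.

```php
<?php
// Document A spells the quotes as entities, document B as literal characters.
$a = new DOMDocument();
$a->loadXML('<description>Hello &quot;XML&quot; &amp;.</description>');

$b = new DOMDocument();
$b->loadXML('<description>Hello "XML" &amp;.</description>');

// Both parse to the same character data.
$textA = $a->documentElement->textContent;
$textB = $b->documentElement->textContent;
var_dump($textA === $textB);

// Re-serializing A keeps &amp; (required) but writes the quotes literally.
$out = $a->saveXML($a->documentElement);
echo $out, PHP_EOL;
```

If a downstream tool really does require the &quot; spelling in element text, that tool is stricter than the XML specification, and working around it would need something like createEntityReference() or post-processing of the serialized string.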
    