duanlun1955 2011-03-01 10:38

How can I use a regular expression to remove an extra </html> tag from a string?

I am using PHP's DOMDocument to replace a node and then rewrite the page. The HTML that is written back comes out as plain text (not HTML), so I had to convert it like so:

$content = files::readFile($data['page_path']);
$content = str_replace('&lt;', '<', $content);
$content = str_replace('&gt;', '>', $content);

if (!@fwrite($handle, $content))
{
    print 'Failed to replace entities';
    return FALSE;
}

This produces proper HTML. However, for some odd reason, an extra </html> tag ends up at the bottom of the document, followed by some additional data after the offending </html> tag. I am at a total loss as to why.

Anyway, I thought about using:

$content = preg_replace('#\<\/head\>*(:alphanum:)#', '</html>', $content);

to remove it but this doesn't match the way I thought it would.

Help please!

Testing example:

$html = '
   <div id="footer">
       <div class="wrap">
           <strong class="logo"><a href="#">College</a></strong>
           <ul><li><a href="#">Emergencies</a></li>
               <li><a href="#">Contact</a></li>
               <li><a href="#">Copyright</a></li>
               <li><a href="#">Terms of Use</a></li>
               <li><a href="#">Member of The Colleges</a></li>
           </ul><p>© 2010 College</p>
       </div>
   </div>
</body></html>
li>
               <li><a href="#">Contact</a></li>
               <li><a href="#">Copyright</a></li>
               <li><a href="#">Terms of Use</a></li>
               <li><a href="#">Member of The Colleges</a></li>
           </ul><p>© 2010 College</p>
       </div>
   </div>
</body></html>';

preg_match("#</head>.*#si", $html, $matches);
var_dump($matches);


  • douyun8901 2011-03-04 15:12

    The problem has been solved: I tracked down the strange bug I had been seeing in the reusable content. The issue was that I opened the file with fopen() in mode 'r+' before calling fwrite(). According to the documentation at php.net/fopen, 'r+' does the following: "Open for reading and writing; place the file pointer at the beginning of the file." I naively assumed that, since the pointer was at the beginning, writing would replace the entire file contents. It does not: the write only overwrites as many bytes as are written, and whatever lies beyond that point in the old file stays in place, which is where the extra </html> tag and the trailing data came from. To replace the whole file, use mode 'w', which does the following: "Open for writing only; place the file pointer at the beginning of the file and truncate the file to zero length. If the file does not exist, attempt to create it."
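
To make the difference concrete, here is a minimal sketch that writes a shorter document over a longer one, once with 'r+' and once with 'w'. The helper name writeWithMode(), the temp file, and the sample strings are made up for illustration; they are not the code from the question.

```php
<?php
// Hypothetical helper (not from the original code): opens $path with the
// given fopen() mode, writes $new, and returns the resulting file contents.
function writeWithMode(string $path, string $new, string $mode): string
{
    $handle = fopen($path, $mode);
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }
    fwrite($handle, $new);
    fclose($handle);

    return file_get_contents($path);
}

// Assumed sample data: the new content is shorter than the old content.
$path = tempnam(sys_get_temp_dir(), 'rplus');
$old  = "<html><body><p>long original content</p></body></html>\n";
$new  = "<html><body></body></html>\n";

// 'r+': the pointer starts at 0, but the file is NOT truncated,
// so the tail of the old content survives after the new content.
file_put_contents($path, $old);
$afterRPlus = writeWithMode($path, $new, 'r+');

// 'w': the file is truncated to zero length before writing.
file_put_contents($path, $old);
$afterW = writeWithMode($path, $new, 'w');

var_dump($afterRPlus === $new); // bool(false): leftover bytes follow the new </html>
var_dump($afterW === $new);     // bool(true)

unlink($path);
```

If the file has to stay open in 'r+' (for example because it is read first), calling ftruncate($handle, 0) while the pointer is still at the start, before writing, should give the same result as 'w'.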

    This answer was selected by the asker as the best answer.