duanchi1230 2011-03-29 06:17

Regular expression to minify only certain parts of a page

I have a function that strips unneeded whitespace from the output of my PHP page before the page is saved to an HTML file for caching.

However, some sections of my page contain source code in pre tags, and the whitespace there affects how the code is displayed. My regular expression skills are poor, so I am looking for a way to stop this function from touching anything inside:

 <pre></pre>

This is the PHP function:

function sanitize_output($buffer)
{
    $search = array(
        '/\>[^\S ]+/s',  // strip whitespace after tags, except spaces
        '/[^\S ]+\</s',  // strip whitespace before tags, except spaces
        '/(\s)+/s',      // shorten multiple whitespace sequences
    );
    $replace = array(
        '>',
        '<',
        '\\1',
    );
    $buffer = preg_replace($search, $replace, $buffer);
    return $buffer;
}

Thanks for your help.

Here's what I found to be working:

Solution:

function stripBufferSkipPreTags($buffer){
    $poz_current = 0;
    $poz_end = strlen($buffer);
    $result = "";

    while ($poz_current < $poz_end){
        // find the next <pre ...> block, case-insensitively
        $t_poz_start = stripos($buffer, "<pre", $poz_current);
        if ($t_poz_start === false){
            // no more <pre> blocks: strip the rest of the buffer
            $result .= stripBuffer(substr($buffer, $poz_current));
            break;
        }
        // strip everything up to the <pre> tag
        $result .= stripBuffer(substr($buffer, $poz_current, $t_poz_start - $poz_current));
        $t_poz_end = stripos($buffer, "</pre>", $t_poz_start);
        if ($t_poz_end === false){
            // unclosed <pre>: copy the rest untouched instead of looping forever
            $result .= substr($buffer, $t_poz_start);
            break;
        }
        // copy the <pre> block, including its closing tag, untouched
        $t_poz_end += strlen("</pre>");
        $result .= substr($buffer, $t_poz_start, $t_poz_end - $t_poz_start);
        $poz_current = $t_poz_end;
    }
    return $result;
}

function stripBuffer($buffer){
    // change new lines and tabs to single spaces
    $buffer = str_replace(array("\r\n", "\r", "\n", "\t"), ' ', $buffer);
    // multispaces to single...
    $buffer = preg_replace('/ {2,}/', ' ', $buffer);
    // remove single spaces between tags
    $buffer = str_replace("> <", "><", $buffer);
    // remove single spaces around &nbsp;
    $buffer = str_replace(" &nbsp;", "&nbsp;", $buffer);
    $buffer = str_replace("&nbsp; ", "&nbsp;", $buffer);
    return $buffer;
}
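
The same split-and-skip idea can probably be written more compactly with preg_split. The following is only a sketch, not a tested drop-in replacement: the function name minifyOutsidePre is made up for illustration, and it assumes pre blocks are not nested and are always closed.

```php
<?php
// Hypothetical helper: minify everything except <pre>...</pre> blocks.
function minifyOutsidePre(string $html): string
{
    // The capture group makes preg_split return the <pre> blocks as well,
    // so odd-indexed chunks are <pre> blocks and even-indexed ones are not.
    $chunks = preg_split(
        '#(<pre\b[^>]*>.*?</pre>)#is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($chunks === false) {
        return $html; // regex error: fall back to the unmodified page
    }

    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 1) {
            continue; // a <pre> block: leave it untouched
        }
        $chunk = preg_replace('/\s+/', ' ', $chunk);     // collapse whitespace runs
        $chunks[$i] = str_replace('> <', '><', $chunk);  // drop single spaces between tags
    }
    return implode('', $chunks);
}

echo minifyOutsidePre("<div>\n  <p>Hi</p>\n</div>\n<pre>a\n  b</pre>\n");
```

Because each chunk is processed separately, a single space next to a pre block is kept, which should be harmless for display.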

2 answers

  • doufu9947 2011-03-29 06:45

    Regular expressions are known to be evil (see this and this) when it comes to parsing HTML.

    That said, try to do what you need in another way, like using a DOM parser and customizing its HTML output functions.
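
As a rough illustration of the DOM route, here is a minimal sketch using PHP's bundled DOMDocument and DOMXPath. The function name minifyWithDom is hypothetical. It assumes the input is a complete HTML document that libxml can parse, and it only collapses whitespace in text nodes outside pre; a real version would likely also need to skip textarea, script, and style, and to handle character encoding.

```php
<?php
// Hypothetical helper: collapse whitespace in text nodes that are not inside <pre>.
function minifyWithDom(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Select every text node that has no <pre> ancestor.
    foreach ($xpath->query('//text()[not(ancestor::pre)]') as $node) {
        $node->data = preg_replace('/\s+/', ' ', $node->data);
    }

    // saveHTML() serializes the whole document, including the doctype libxml adds.
    return $doc->saveHTML();
}

echo minifyWithDom("<html><body><p>a   b</p><pre>x\n  y</pre></body></html>");
```

Since the parser decides what is inside a pre element, this avoids matching tags with regular expressions at all, at the cost of re-serializing the page.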
