drl47263 2009-09-26 06:44
Accepted

php - Why does this regular expression truncate my string to zero length?

Yesterday I tracked down a strange bug that caused a website to display only a white page: no content and no visible error message.

I found that a regular expression used in preg_replace was the problem.

I used the regex to replace the title HTML tag in the accumulated content just before echoing the HTML. The HTML was fairly large on the page where the bug occurred (60 KB, which is not that large), and it seemed that preg_replace, or the regex I used, can only handle a string up to a certain length. It is also possible that my regex is simply broken.

Look at this sample program which reproduces the problem (tested on PHP 5.2.9):


function replaceTitleTagInHtmlSource($content, $replaceWith) {
  return preg_replace('#(<title>)([\s\S]+)(<\/title>)#i', '$1'.$replaceWith.'$3', $content);
}


$dummyStr = str_repeat('A', 6000);

$totalStr = '<title>foo</title>';

for($i = 0; $i < 10; $i++) {
  $totalStr .= $dummyStr;
}

print 'orignal: ' . strlen($totalStr);
print '<hr />';

$replaced = replaceTitleTagInHtmlSource($totalStr, 'bar');

print 'replaced: ' . strlen($replaced);
print '<hr />';

Output:

orignal: 60018
replaced: 0

So the function gets a string of length 60018 and returns a string of length 0. That is not what I wanted my regex to do.


Changing

for($i = 0; $i < 10; $i++) {

to

for($i = 0; $i < 1; $i++) {

in order to decrease the total string length gives this output:

orignal: 6018
replaced: 6018


When I removed the preg_replace call, the content of the page was displayed without any problems.


4 answers

  • douwen4401 2009-09-26 07:01

    It seems like you're running into the backtracking limit.

    This is confirmed if you print preg_last_error(): it returns PREG_BACKTRACK_LIMIT_ERROR.
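
    As a rough sketch of that check: the snippet below rebuilds the input from the question and inspects both the return value and preg_last_error(). The two ini_set() calls are assumptions added here so the failure reproduces on any configuration. The limit of 1000 is deliberately low and is not the value from the question's server, and pcre.jit only exists on PHP 7 and later (on older versions that call simply has no effect).

```php
<?php
// Assumed setup: switch off the PCRE JIT and pin a deliberately low limit,
// so the error does not depend on the local php.ini.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// Same shape of input as in the question: a short title plus 60000 filler bytes.
$html = '<title>foo</title>' . str_repeat('A', 60000);

// The greedy pattern from the question, with 'bar' as the new title.
$result = preg_replace('#(<title>)([\s\S]+)(</title>)#i', '$1bar$3', $html);

// On error preg_replace() returns null, which is why the length came out as 0.
$failed = ($result === null);
$hitBacktrackLimit = (preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR);

var_dump($failed, $hitBacktrackLimit);
```

    Checking the result against null before echoing it would have turned the blank page into a visible error.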

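    You can either increase the limit (the pcre.backtrack_limit setting) in your ini file or with ini_set(), or change the quantifier in your regular expression from the greedy ([\s\S]+) to the lazy (.*?), which stops at the first </title> instead of consuming the whole string and backtracking from the end. Add the s modifier if the title may contain line breaks, because . does not match newlines by default.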
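
    Both options might look roughly like this. It is a sketch under assumptions, not the only way to do it: replaceTitleLazy is a hypothetical name, the limits 1000 and 1000000 are arbitrary example values, and the pcre.jit line only matters on PHP 7 and later. The low starting limit is there to show that the lazy pattern gets by with very little backtracking.

```php
<?php
// Assumed setup: JIT off and a deliberately low limit, so the difference
// between the two patterns shows up on any configuration.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

$html = '<title>foo</title>' . str_repeat('A', 60000);

// Option 1: lazy quantifier. The "s" modifier lets "." match newlines,
// so multi-line titles still match as they did with [\s\S].
// ${1} and ${3} keep a replacement that starts with a digit from being
// read as part of the group number.
function replaceTitleLazy($content, $replaceWith) {
    return preg_replace(
        '#(<title>)(.*?)(</title>)#is',
        '${1}' . $replaceWith . '${3}',
        $content
    );
}

$lazy = replaceTitleLazy($html, 'bar');

// Option 2: keep the greedy pattern from the question and raise the limit
// (1000000 is an arbitrary example value).
ini_set('pcre.backtrack_limit', '1000000');
$greedy = preg_replace('#(<title>)([\s\S]+)(</title>)#i', '${1}bar${3}', $html);

print 'lazy: ' . strlen($lazy) . "\n";
print 'greedy: ' . strlen($greedy) . "\n";
```

    The lazy pattern is usually the more robust choice, since raising the limit only moves the point at which a larger page fails again.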

    The asker selected this answer as the best answer.
