dqunzip3183 2013-01-13 14:14

How can I optimize preg_match_all, or is there an alternative?

I have this code:

function toDataUri( $html )
{
  # convert css URLs to data URIs
  $html = preg_replace_callback( "#(url\([\'\"]?)([^\"\'\)]+)([\"\']?\))#", 'create_data_uri', $html );
  return $html;
}

// callback function
function create_data_uri( $matches )
{
  $filetype = explode( '.', $matches[ 2 ] );
  $filetype = trim(strtolower( $filetype[ count( $filetype ) - 1 ] ));

  // replace ?whatever=value from extensions
  $filetype = preg_replace('#\?.*#', '', $filetype);

  $datauri = $matches[ 2 ];
  $data = file_get_contents( $datauri );

  if (! $data) return $matches[ 0 ];

  $data = base64_encode( $data );

  //compile and return a data: URI with the encoded image data
  return $matches[ 1 ] . "data:image/$filetype;base64,$data" . $matches[ 3 ];
}

It searches an HTML file for URLs of the form url(path) and replaces them with base64 data URIs.

The problem is that even when the input HTML is only a few kilobytes, say 10 KB, it takes ages to return the final response. Is there any optimization that would help here, or another way to take some HTML, find the url(path) matches and convert them to data URIs?


1 answer

  • doulangdang9986 2013-01-13 14:16

    The expression is already cheap: it starts with a fixed string and doesn't need backtracking.

    PCRE has an S modifier that enables some extra pattern optimisation, but it only matters for patterns without a fixed prefix.

    It shouldn't be slow: 10 KB isn't much for a simple regex like this. Perhaps the bottleneck is somewhere else, for example in the callback, which reads a file for every match?
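
    One way to check where the time goes is to time the regex scan on its own and then time the work done inside the callback. The following is a rough sketch, not a benchmark: the sample input is made up, and the callback body is a stand-in for the real create_data_uri() (it encodes the path itself instead of reading a file).

```php
<?php
// Same pattern as in the question.
$pattern = "#(url\([\'\"]?)([^\"\'\)]+)([\"\']?\))#";

// Hypothetical input: roughly 10 KB of CSS-like text with many url(...) references.
$html = str_repeat("a { background: url('img/bg.png'); }\n", 300);

// 1. Time the regex scan alone, with no callback work.
$start = microtime(true);
$count = preg_match_all($pattern, $html, $matches);
$regexSeconds = microtime(true) - $start;

// 2. Time the full replacement, accumulating the time spent inside the callback.
$callbackSeconds = 0.0;
$start = microtime(true);
$result = preg_replace_callback($pattern, function ($m) use (&$callbackSeconds) {
    $t = microtime(true);
    // Stand-in for the real work (file read + base64 encoding).
    // Swap in create_data_uri($m) to measure the actual callback.
    $replacement = $m[1] . 'data:image/png;base64,' . base64_encode($m[2]) . $m[3];
    $callbackSeconds += microtime(true) - $t;
    return $replacement;
}, $html);
$totalSeconds = microtime(true) - $start;

printf("matches:    %d\n", $count);
printf("regex only: %.6f s\n", $regexSeconds);
printf("callback:   %.6f s of %.6f s total\n", $callbackSeconds, $totalSeconds);
```

    If the callback time dominates once the real function is plugged in, the regex is not the problem and the file reads are the place to look.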

    • If the parsed file contains an unclosed url( with no ) before the end of the file, the regex will scan a bit further than necessary. [^\"\'\)]{0,1000} would limit that. But it's a minor optimisation that only makes a difference when the file has pathological syntax errors.
    • You can remove () around the whole expression. Match 0 always captures the entire matched string.
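
    The bounded quantifier from the first point could look like the sketch below. It is only an illustration: the limit of 1000 is arbitrary, the sample strings are made up, and {1,1000} is used instead of {0,1000} so that, like the original +, the path must contain at least one character.

```php
<?php
// Variant of the question's pattern with the path capped at 1000 characters.
$pattern = "#(url\([\'\"]?)([^\"\'\)]{1,1000})([\"\']?\))#";

// Well-formed input: the pattern matches as before.
$ok = "a { background: url('img/bg.png'); }";

// Pathological input: url( is never closed and is followed by a long run of text.
$broken = "a { background: url(" . str_repeat("x", 5000);

$okResult = preg_match($pattern, $ok, $m);
$brokenResult = preg_match($pattern, $broken);

// $m[0] is the whole match; $m[1]..$m[3] are the three groups.
echo "well-formed: ", $okResult, " match, whole match = ", $m[0], "\n";
echo "unclosed:    ", $brokenResult, " matches\n";
```

    With the cap in place, the engine gives up on the unclosed url( after at most 1000 characters instead of running to the end of the file.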
    Accepted by the asker as the best answer.
