dspym82000 2011-10-12 16:45 采纳率: 100%
浏览 39
已采纳

preg_replace与西里尔字符

I want to replace these chars [^a-zа-з0-9_] with null, but I can't do it when its multibyte string.

I tried with mb_*, iconv, PCRE, mb_eregi_replace and u modifier (for PCRE), but none of them worked well.

The mb_eregi_replace works, but it only outputs the correct utf8 string, but it doesn't replace the characters, when preg_replace works with the same regex..

Here is my code that works with unicode, but it doesn't replace text.

function _data($data)
{
  mb_regex_encoding('UTF-8');
  return mb_eregi_replace('/[^a-zа-з0-9_]+/', '', $data);
}

var_dump(namespace\_data('Текст Removethis- and this _#$)( and also this $*@&$'));

and the result is with the special chars (#_$..) when it should replace them, if I change the function to preg_replace (and no unicode) it should replace them.

  • 写回答

1条回答 默认 最新

  • dsa1230000 2011-10-12 17:21
    关注

    As long as your input string is UTF-8 encoded (if not, re-encode it to UTF-8), you can safely use preg_replace if you use the correct regular expression.

    function _data($data)
    { 
      return preg_replace('/[^\w_]+/u', '', $data);
    }
    
    var_dump(namespace\_data('Текст Removethis- and this _#$)( and also this $*@&$'));
    

    Demo

    • \w = any word character
    • u (at then end) = enable UTF-8 for the regex.
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥100 set_link_state
  • ¥15 虚幻5 UE美术毛发渲染
  • ¥15 CVRP 图论 物流运输优化
  • ¥15 Tableau online 嵌入ppt失败
  • ¥100 支付宝网页转账系统不识别账号
  • ¥15 基于单片机的靶位控制系统
  • ¥15 真我手机蓝牙传输进度消息被关闭了,怎么打开?(关键词-消息通知)
  • ¥15 装 pytorch 的时候出了好多问题,遇到这种情况怎么处理?
  • ¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
  • ¥15 手机接入宽带网线,如何释放宽带全部速度