dspym82000 2011-10-12 16:45 采纳率: 100%
浏览 39
已采纳

preg_replace与西里尔字符

I want to replace these chars [^a-zа-з0-9_] with null, but I can't do it when its multibyte string.

I tried with mb_*, iconv, PCRE, mb_eregi_replace and u modifier (for PCRE), but none of them worked well.

The mb_eregi_replace works, but it only outputs the correct utf8 string, but it doesn't replace the characters, when preg_replace works with the same regex..

Here is my code that works with unicode, but it doesn't replace text.

function _data($data)
{
  mb_regex_encoding('UTF-8');
  return mb_eregi_replace('/[^a-zа-з0-9_]+/', '', $data);
}

var_dump(namespace\_data('Текст Removethis- and this _#$)( and also this $*@&$'));

and the result is with the special chars (#_$..) when it should replace them, if I change the function to preg_replace (and no unicode) it should replace them.

  • 写回答

1条回答 默认 最新

  • dsa1230000 2011-10-12 17:21
    关注

    As long as your input string is UTF-8 encoded (if not, re-encode it to UTF-8), you can safely use preg_replace if you use the correct regular expression.

    function _data($data)
    { 
      return preg_replace('/[^\w_]+/u', '', $data);
    }
    
    var_dump(namespace\_data('Текст Removethis- and this _#$)( and also this $*@&$'));
    

    Demo

    • \w = any word character
    • u (at then end) = enable UTF-8 for the regex.
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥20 RL+GNN解决人员排班问题时梯度消失
  • ¥15 统计大规模图中的完全子图问题
  • ¥15 使用LM2596制作降压电路,一个能运行,一个不能
  • ¥60 要数控稳压电源测试数据
  • ¥15 能帮我写下这个编程吗
  • ¥15 ikuai客户端l2tp协议链接报终止15信号和无法将p.p.p6转换为我的l2tp线路
  • ¥15 phython读取excel表格报错 ^7个 SyntaxError: invalid syntax 语句报错
  • ¥20 @microsoft/fetch-event-source 流式响应问题
  • ¥15 ogg dd trandata 报错
  • ¥15 高缺失率数据如何选择填充方式