dshmkgq558192365 2018-07-18 14:07

Remove emojis / Unicode characters

My website and database are set to UTF-8 and utf8mb4.

In textareas, it's perfectly fine for users to enter UTF-8 symbols/emojis.

But in certain input fields (name, address, etc.) I want to rule out those "funny symbols" and only deal with basic text and numbers, including the Danish characters æøå, accents, and symbols like -_'@()?=,.:;!"#&<> etc.

How would I go about this?

Is there a native PHP function to strip Unicode symbols/characters, or do I have to find or write a specific regex function for it?

1 answer

  • doutanghuan9595 2018-07-18 14:29

    There are functions for checking the encoding, such as mb_check_encoding (http://php.net/manual/en/function.mb-check-encoding.php), but to strip out characters I think you would need to use a regex:

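    function StripNonUTF($str){
      // The /u modifier enables UTF-8 mode so \pL and \pM match whole code points.
      // PHP's PCRE has no /g modifier; preg_replace already replaces every match.
      return preg_replace('/[^\pL\pM[:ascii:]]+/u', '', $str);
    }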
    
    • \pL matches any kind of letter from any language
    • \pM matches a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)
    • [:ascii:] matches a character with ASCII value 0 through 127
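
To see how the three classes above combine, here is a minimal usage sketch. The helper name `stripSymbols` and the sample strings are made up for illustration; the pattern is the one from the answer, written with the `/u` modifier so that PCRE treats the input as UTF-8.

```php
<?php
// Hypothetical helper for illustration; not part of PHP itself.
function stripSymbols(string $str): string
{
    // Remove every run of characters that is neither a letter (\pL),
    // a combining mark (\pM), nor plain ASCII ([:ascii:]).
    $clean = preg_replace('/[^\pL\pM[:ascii:]]+/u', '', $str);

    // preg_replace() returns null on failure, for example when the
    // subject is not valid UTF-8; fall back to an empty string here.
    return $clean ?? '';
}

// Optional sanity check on the input encoding before filtering.
var_dump(mb_check_encoding("Søren", 'UTF-8'));

// Danish letters and ASCII punctuation survive; the emoji and the star do not.
echo stripSymbols("Søren Ærø 😀, Åvej 12 ★"), "\n";
```

Because `[:ascii:]` covers the whole range 0 through 127, control characters such as tabs and newlines also pass through; a name or address field that should reject them would need a narrower class, which is a design choice left open here.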
    This answer was selected by the asker as the best answer.
