douzoudang1511 2015-04-18 19:32

Remove Unicode characters but keep all special and English characters using preg_replace

I want to use preg_replace to remove all non-ASCII (Unicode) characters, including Persian characters, from a string while keeping English letters and all special characters. The only way I know to do it is:

// The literal / inside the class must be escaped because / is also the pattern delimiter
preg_replace('/[^<>()\/\* a-zA-Z0-9_.-]/u', '', $string);

But I don't really want to list every special character inside []. Is there a shorter way?

1 answer

  • dro44817 2015-04-18 19:59

    To remove everything except characters in the basic ASCII range, you can match that range by its hex codes and negate it, using a pattern similar to this:

    // Given a string with characters in and outside ASCII:
    $s = "abcde啅cde衸xtzሴbb()*&bԴ";
    
    // Match HEX 00-7F and remove characters outside that
    // by inverting with ^
    echo preg_replace('/[^\x00-\x7f]/', '', $s);
    // Prints:
    // abcdecdextzbb()*&b
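
    As a minimal sketch of why the pattern above works without the /u modifier, and of a slightly shorter spelling: the function names strip_non_ascii and strip_non_ascii_posix are hypothetical, introduced only for illustration, and the [:ascii:] form is an alternative that is not part of the answer itself.

    ```php
    <?php
    // Hypothetical helper (the name is an assumption, not from the answer):
    // drop every byte outside the ASCII range 0x00-0x7F.
    function strip_non_ascii(string $s): string
    {
        // No /u modifier, so PCRE matches byte by byte. Every byte of a
        // multi-byte UTF-8 sequence is >= 0x80, so each non-ASCII character
        // is removed as a whole and no partial sequence is left behind.
        return preg_replace('/[^\x00-\x7f]/', '', $s) ?? '';
    }

    // Same idea with the POSIX class [:ascii:] (character codes 0-127 in PCRE).
    function strip_non_ascii_posix(string $s): string
    {
        return preg_replace('/[^[:ascii:]]/', '', $s) ?? '';
    }

    $s = "abcde啅cde衸xtzሴbb()*&bԴ";
    echo strip_non_ascii($s), "\n";        // abcdecdextzbb()*&b
    echo strip_non_ascii_posix($s), "\n";  // abcdecdextzbb()*&b
    ```

    Either form avoids listing the special characters one by one, since punctuation such as <>()/\* already falls inside the ASCII range.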
    

    The range \x00-\x7f starts at the very beginning of ASCII, so it also keeps control characters such as NUL, the terminal bell, and backspace. If you don't want those non-printable characters in your output, start the range at SPACE (ASCII 32, hex 20) and end it at ~ (hex 7e), because hex 7f is DEL, which is itself a control character. Note that this also strips tabs and newlines.

    echo preg_replace('/[^\x20-\x7e]/', '', $s);
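
    The printable-only variant can be sketched as a small helper. This is an illustration under stated assumptions, not the answerer's code: the function name keep_printable_ascii and the $keepWhitespace flag are hypothetical, and the option to keep tab, CR, and LF is an addition for text where line structure matters.

    ```php
    <?php
    // Hypothetical helper; the name and the $keepWhitespace flag are assumptions.
    function keep_printable_ascii(string $s, bool $keepWhitespace = false): string
    {
        // \x20-\x7e runs from SPACE through "~"; \x7f (DEL) is a control character.
        $pattern = $keepWhitespace
            ? '/[^\x20-\x7e\t\r\n]/'   // additionally keep tab, CR and LF
            : '/[^\x20-\x7e]/';
        return preg_replace($pattern, '', $s) ?? '';
    }

    // Tab, Persian text, newline, BEL (\x07) and DEL (\x7f) mixed with ASCII.
    $s = "col1\tcol2 سلام\nline2\x07\x7f";
    var_dump(keep_printable_ascii($s));        // "col1col2 line2"
    var_dump(keep_printable_ascii($s, true));  // "col1\tcol2 \nline2"
    ```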
    
    This answer was selected by the asker as the best answer.
