dselp3944 2018-02-02 18:21

How do I "remove diacritics" from UTF-8 characters in PHP?

I need to replicate the behavior of the MySQL utf8_general_ci collation in PHP. Strictly speaking, I need to detect which strings would be considered different and which would be considered the same. The case-insensitive part is easy. The problem is that utf8_general_ci considers characters with diacritics and characters without diacritics to be equal: e = è = é, etc. To replicate that comparison, I need a way to replace è -> e and é -> e.

The method that comes to mind is:

echo iconv("utf-8", "ascii//TRANSLIT", "é");

One problem is that iconv behaves differently depending on the current locale, which is asking for trouble.

The other problem is that the input may also contain Cyrillic letters, which should neither be stripped nor trigger a PHP Notice:

echo iconv("utf-8", "ascii//TRANSLIT", "дом");

Is there a solution, or do I have to manually create a mapping from each character with a diacritic to its counterpart without one?

2 answers

  • doulu4413 2018-02-02 19:07

    The intl extension's Transliterator lets you define far more in-depth transliteration rules. The full documentation on transliteration rules can be found on icu-project.org.

    $tests = [ "é", "дом" ];
    
    $tl = Transliterator::create('Latin-ASCII;');
    foreach($tests as $str) {
        var_dump(
            $tl->transliterate($str)
        );
    }
    

    Output:

    string(1) "e"
    string(6) "дом"
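
To connect this back to the comparison the question asks about, here is a minimal sketch of how the transliterator could be combined with case folding. The function names foldForComparison and sameUnderFold are hypothetical, not part of intl or of the original answer, and the sketch assumes the intl and mbstring extensions are loaded. It only approximates utf8_general_ci; the ICU Latin-ASCII mapping is not guaranteed to match MySQL's collation tables for every character.

```php
<?php
// Hypothetical helper: reduce a string to a form suitable for
// accent-insensitive, case-insensitive comparison.
function foldForComparison(string $s): string
{
    static $tl = null;
    if ($tl === null) {
        // Same transliterator ID as in the answer above.
        $tl = Transliterator::create('Latin-ASCII;');
        if ($tl === null) {
            throw new RuntimeException('Could not create the Latin-ASCII transliterator');
        }
    }

    // Strip diacritics from Latin letters; other scripts are left as they are.
    $folded = $tl->transliterate($s);
    if ($folded === false) {
        throw new RuntimeException('Transliteration failed');
    }

    // Case folding is done separately so it also covers non-Latin scripts.
    return mb_strtolower($folded, 'UTF-8');
}

// Hypothetical helper: true when both strings fold to the same form.
function sameUnderFold(string $a, string $b): bool
{
    return foldForComparison($a) === foldForComparison($b);
}

var_dump(sameUnderFold("é", "E"));     // bool(true)
var_dump(sameUnderFold("дом", "ДОМ")); // bool(true)
var_dump(sameUnderFold("дом", "dom")); // bool(false)
```

Keeping the lowercasing in mb_strtolower rather than in the transliterator ID is a design choice made here for clarity, so that each step can be checked on its own.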
    
    Accepted by the asker as the best answer.