doufeiqiong3515 2018-06-12 14:23
Views: 92
Answer accepted

UTF-8 problem in PHP after a MySQL database migration

I'm migrating my existing database to another server. To do that, I exported and imported the database using phpMyAdmin SQL queries. Everything works fine, except that some UTF-8 characters appear broken on the website. I fetch them using the same PHP code (on a different server, but with the same PHP extensions and version).

Example of a string as I see it on the new website and in the databases (both old and new, using phpMyAdmin): pÃ©ri-prothÃ©tique

Example of a string as I see it on the old website: péri-prothétique

As you can see, PHP used to encode the characters correctly even though they are mangled in the database, but it no longer does (not even if I explicitly call utf8_encode or utf8_decode on the result). I even tried forcing $mysqli->set_charset("UTF8") on every connection, to no avail.

The web server, the database server, the server connection, PHP, and the tables all use a UTF-8 or utf8mb4 charset and collation, and are set up the same way as the old ones.

The only difference I see is that the new database server is MariaDB instead of MySQL and its webserver is nginx instead of Apache.

New database specs picture from phpMyAdmin:

IMAGE

Old database specs picture:

IMAGE

New webserver specs on which the website and PHP run (same specs as the old one but a different server): Apache 2.4, PHP 7.0

How can I get back that old correct encoding? Why doesn't PHP automatically decode them right anymore?

UPDATE: Using mb_detect_encoding, I see that PHP in both the new and old versions detects ASCII or UTF-8 on the query results, depending on whether there is at least one UTF-8 symbol. The issue is that in the new version PHP doesn't display the UTF-8 symbols correctly even though it detects the string encoding as UTF-8.

UPDATE 2: thanks to this question I figured out why my entries were mangled: the double encoding arose because the database collation was latin1_swedish_ci while the table collation was utf8_general_ci. This doesn't answer the question, though, since the old website was automatically "translating" those mangled characters and rendering them correctly in the HTML, and I want to replicate that behavior on the new website, which is a different site but with the same code and php.ini settings.

3 answers

  • duange051858 2018-06-13 17:34

    To check for double encoding, use SELECT HEX(col)... on an affected row. The character é should come back as C3A9 (proper UTF-8); if it comes back as C383C2A9 instead, the data is double-encoded.
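The two hex values above can be reproduced outside the database. The following is a minimal Python sketch, not part of the original answer; the helper names `utf8_hex` and `double_encode` are made up for illustration. It also shows why the mb_detect_encoding check mentioned in the question cannot flag the problem: the double-encoded bytes are still valid UTF-8.

```python
# Illustration only: reproduces the hex values quoted above in plain Python.
# utf8_hex and double_encode are hypothetical helper names.

def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as uppercase hex, similar to HEX(col)."""
    return text.encode("utf-8").hex().upper()


def double_encode(text: str) -> str:
    """Simulate double encoding: UTF-8 bytes misread as latin1 characters."""
    return text.encode("utf-8").decode("latin-1")


proper = "é"
mangled = double_encode(proper)  # two characters: U+00C3 and U+00A9

print(utf8_hex(proper))   # C3A9
print(utf8_hex(mangled))  # C383C2A9

# The mangled form is still well-formed UTF-8, so this round trip does not
# raise and an encoding detector has nothing to complain about.
mangled.encode("utf-8").decode("utf-8")
```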

    See: Trouble with UTF-8 characters; what I see is not what I stored

    If you have determined that the data really is double-encoded, the fix is:

    UPDATE tbl SET col = CONVERT(BINARY(CONVERT(col USING latin1)) USING utf8mb4);
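To make the nested CONVERT calls easier to follow, here is a rough Python equivalent of what the statement does to a single value. It is a sketch under assumptions: `repair_double_encoding` is a hypothetical name, and Python's latin-1 codec is ISO-8859-1, which may not match MySQL's latin1 for every character, so treat it as an approximation of the server's behavior rather than a replacement for the UPDATE.

```python
def repair_double_encoding(text: str) -> str:
    """Approximate CONVERT(BINARY(CONVERT(col USING latin1)) USING utf8mb4).

    Returns the input unchanged when it does not round-trip, which is a
    choice made for this sketch and not something the SQL statement does.
    """
    try:
        # Inner CONVERT + BINARY: turn the characters back into latin1 bytes.
        raw = text.encode("latin-1")
        # Outer CONVERT: reinterpret those same bytes as UTF-8 text.
        return raw.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in latin1, or not valid UTF-8 afterwards:
        # the value was probably not double-encoded, so leave it alone.
        return text


print(repair_double_encoding("pÃ©ri-prothÃ©tique"))
print(repair_double_encoding("péri-prothétique"))
```

The second call shows why checking first matters: a value that is already correct does not survive the same byte reinterpretation, so the sketch falls back to returning it untouched.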
    

    See http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases

    Yes, "double encoding" is a silent bug -- two wrongs make a right (sort of).
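One way to picture "two wrongs make a right" is to model what a client receives under different connection charsets. This is a simplified Python sketch with hypothetical function names, and it rests on an assumption the thread does not confirm: that the old site read the data over a latin1 connection while serving pages as UTF-8.

```python
def bytes_sent_to_client(stored_text: str, connection_charset: str) -> bytes:
    """Model the server converting column text into the connection charset."""
    return stored_text.encode(connection_charset)


def shown_by_browser(payload: bytes) -> str:
    """Model a page served as UTF-8: the browser decodes the bytes as UTF-8."""
    return payload.decode("utf-8")


# What a double-encoded column holds when read as text.
stored = "pÃ©ri-prothÃ©tique"

# Assumed old setup, latin1 connection: the second wrong cancels the first.
print(shown_by_browser(bytes_sent_to_client(stored, "latin-1")))

# UTF-8 connection: the mangled text is passed through faithfully.
print(shown_by_browser(bytes_sent_to_client(stored, "utf-8")))
```

Under that assumption the first line prints the readable string and the second prints the mangled one, which matches the difference described between the old and new sites.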

    Accepted by the asker as the best answer.