doufeiqiong3515 2018-06-12 14:23

UTF-8 problem in PHP after a MySQL database migration

I'm migrating my existing database to another server. To do that, I exported and imported the database using phpMyAdmin SQL queries. Everything works fine, except that some UTF-8 characters appear broken on the website. I fetch them using the same PHP code (on a different server, but with the same PHP extensions and version).

Example of a string as I see it on the new website and in the databases (both old and new) (using phpMyAdmin): pÃ©ri-prothÃ©tique

Example of a string as I see it on the old website: péri-prothétique

As you can see, PHP used to automatically encode the characters the right way even though they are mangled in the database, but it no longer does so (not even if I explicitly utf8_encode or utf8_decode the result). I even tried forcing $mysqli->set_charset("UTF8") on every connection, to no avail.

The web server, the database server, the server connection, PHP, and the tables all use a UTF-8 or utf8mb4 charset and collation, and are set up the same way as the old ones.

The only difference I see is that the new database server is MariaDB instead of MySQL and its webserver is nginx instead of Apache.

New database specs picture from phpMyAdmin:

IMAGE

Old database specs picture:

IMAGE

New webserver specs on which the website and PHP run (same specs as the old one, but a different server): Apache 2.4, PHP 7.0

How can I get back that old correct encoding? Why doesn't PHP automatically decode them right anymore?

UPDATE: Using mb_detect_encoding, I see that PHP on both the new and the old server detects ASCII or UTF-8 on the query results, depending on whether the string contains at least one UTF-8 symbol. The issue is that on the new server PHP doesn't display the UTF-8 symbols correctly even though it detects the string encoding as UTF-8.

UPDATE 2: Thanks to this question I figured out why my entries were mangled: double encoding arose from the fact that the database collation was latin1_swedish_ci while the tables' collation was utf8_general_ci. This doesn't answer the question, though, since the old website was automatically "translating" those mangled characters and rendering them correctly in the HTML. I want to replicate that behavior on the new website, which runs on a different server but with the same code and php.ini settings.


3 answers

  • duange051858 2018-06-13 17:34

    To check for double encoding, run SELECT HEX(col)... on an affected row. An é should come back as C3A9 (proper UTF-8); if it shows C383C2A9 instead, the data is double-encoded.
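
To see where those two hex patterns come from, here is a minimal Python sketch (Python is used only to illustrate the byte-level effect; it is not part of the original answer). It assumes the second encoding step treats the UTF-8 bytes as latin1 characters, which matches the latin1/utf8 mismatch described in the question.

```python
# Illustration only: reproduce the two hex patterns for "é".
correct = "é".encode("utf-8")  # proper UTF-8 bytes

# Double encoding: the UTF-8 bytes are misread as latin1 characters ("Ã©")
# and then encoded to UTF-8 a second time.
double = correct.decode("latin-1").encode("utf-8")

print(correct.hex().upper())   # C3A9
print(double.hex().upper())    # C383C2A9
print(double.decode("utf-8"))  # Ã©
```

Note that the double-encoded bytes are still well-formed UTF-8, which is consistent with mb_detect_encoding reporting UTF-8 for the mangled strings.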

    See: Trouble with UTF-8 characters; what I see is not what I stored

    If you have actually determined that you have double encoding, then the fix involves

    UPDATE tbl SET col = CONVERT(BINARY(CONVERT(col USING latin1)) USING utf8mb4);
    

    See http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases
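
The same round trip can be sketched outside the database, for example to check a few values before running the UPDATE. The helper name `undo_double_encoding` is hypothetical, and the sketch assumes Python's latin-1 codec is a close enough stand-in for MySQL's latin1 (which is closer to cp1252), so bytes in the 0x80-0x9F range may behave differently.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one round of latin1-as-UTF-8 double encoding (hypothetical helper)."""
    try:
        # Mirror CONVERT(BINARY(CONVERT(col USING latin1)) USING utf8mb4):
        # turn the characters back into raw bytes, then read them as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not (cleanly) double-encoded: leave the value unchanged.
        return text


# A double-encoded form of the sample string from the question.
print(undo_double_encoding("pÃ©ri-prothÃ©tique"))  # péri-prothétique
# A correctly stored string passes through untouched.
print(undo_double_encoding("péri-prothétique"))    # péri-prothétique
```

The fallback branch matters: applying the conversion to rows that are already correct would otherwise fail or corrupt them, so it is worth confirming with HEX() which rows are actually double-encoded first.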

    Yes, "double encoding" is a silent bug -- two wrongs make a right (sort of).

    (Selected by the asker as the best answer.)