如何转换一个完整的 MySQL 数据库字符集和排序至 UTF-8?

How can I convert entire MySQL database character-set to UTF-8 and collation to UTF-8?

转载于:https://stackoverflow.com/questions/6115612/how-to-convert-an-entire-mysql-database-characterset-and-collation-to-utf-8

csdnceshi78
程序go If you use utf8_unicode_ci with utf8mb4 it gives a error so use utf8mb4_unicode_ci with utf8mb4
接近 3 年之前 回复
csdnceshi67
bug^君 Update to update... For MySQL 8.0, this is probably the preferred collation: utf8mb4_0900_ai_ci, which is based on Unicode standard version 9.0.
3 年多之前 回复
csdnceshi53
Lotus@ I believe that's the collation by default with that character set.
大约 4 年之前 回复
csdnceshi67
bug^君 Even better, collation utf8mb4_unicode_520_ci, or whatever is the latest available version.
大约 4 年之前 回复
csdnceshi71
Memor.の I forgot to mention in my comment above, if you switch to utf8mb4 you'll also need to switch collation to utf8mb4_unicode_ci
5 年多之前 回复
csdnceshi71
Memor.の If you want full UTF-8 support you'll probably also want to use a character set of utf8mb4 rather than utf8 as utf8 only supports the basic multilingual plane as opposed to the full range. It requires MySQL 5.5.3 or above.
接近 6 年之前 回复
csdnceshi56
lrony* To later visitors: Note the related questions in the sidebar and use utf8_unicode_ci, not utf8_general_ci.
7 年多之前 回复

17个回答

Use the ALTER DATABASE and ALTER TABLE commands.

ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Or if you're still on MySQL 5.5.2 or older which didn't support 4-byte UTF-8, use utf8 instead of utf8mb4:

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
csdnceshi66
必承其重 | 欲带皇冠 anything newer than 5.5.2, as mentioned in the answer.
大约 2 年之前 回复
csdnceshi62
csdnceshi62 Since which MySQL version was support for utf8mb4 added? Is it supported in 5.5.6?
大约 2 年之前 回复
weixin_41568196
撒拉嘿哟木头 changing charcter set to utf8 changes my datatype from text to mediumtext. Is it expected?
2 年多之前 回复
csdnceshi66
必承其重 | 欲带皇冠 Large production systems usually have a mirrored DB for maintenance.
接近 3 年之前 回复
weixin_41568110
七度&光 This rebuilds the table making it infeasible on large production systems. If it's certain that only ASCII chars are stored in the latin1 columns, is it possible to change the table charset/collation without rebuilding?
接近 3 年之前 回复
weixin_41568126
乱世@小熊 The CONVERT TO technique assumes that the text was correctly stored in some other charset (eg, latin1), and not mangled (such as UTF-8 bytes crammed into latin1 column without conversion to latin1).
3 年多之前 回复
weixin_41568131
10.24 Upvote for the utf8mb4 thing.
3 年多之前 回复
  1. Make a backup!

  2. Then you need to set the default char sets on the database. This does not convert existing tables, it only sets the default for newly created tables.

    ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;
    
  3. Then, you will need to convert the char set on all existing tables and their columns. This assumes that your current data is actually in the current char set. If your columns are set to one char set but your data is really stored in another then you will need to check the MySQL manual on how to handle this.

    ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    
weixin_41568208
北城已荒凉 Note that utf8_general_ci is no longer recommended best practice. Since MySQL 5.5.3 you should use utf8mb4 rather than utf8. They both refer to the UTF-8 encoding, but the older utf8 had a MySQL-specific limitation preventing use of characters numbered above 0xFFFD.
3 年多之前 回复
csdnceshi60
℡Wang Yan What broke?
大约 5 年之前 回复
csdnceshi70
笑故挽风 I should have read the backup backup backup first... but my luck was that it was on development environment. so my upvote goes to you!
接近 6 年之前 回复
csdnceshi72
谁还没个明天 Note: ALTER TABLE tablename CHARACTER SET utf8 only sets the default char set on a table which is used for newly created columns. It does not convert existing columns that already have a char set set.
9 年多之前 回复

You can create the sql to update all tables with:

SELECT CONCAT("ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CHARACTER SET utf8 COLLATE utf8_general_ci;   ",
    "ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  ") 
    AS alter_sql
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = your_database_name;

Capture the output and run it.

Arnold Daniels' answer above is more elegant.

csdnceshi54
hurriedly% I have this error #1054 - Unknown column 'webdb_playground' in 'where clause' but I'm sure my db is the correct one
2 年多之前 回复
csdnceshi78
程序go FYI: according to dev.mysql.com/doc/refman/5.5/en/alter-table.html MySQL documentation , the "CONVERT TO CHARACTER SET" version of the ALTER statement does both in one step: "To change the table default character set and all character columns (CHAR, VARCHAR, TEXT) to a new character set ...
5 年多之前 回复
weixin_41568134
MAO-EYE good question. The first alter-table query sets the default for new columns, and the second alter-table query converts the existing columns.
接近 6 年之前 回复
weixin_41568196
撒拉嘿哟木头 why you added two alter table query? one isn't sufficient?
6 年多之前 回复
mysqldump -uusername -ppassword -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B dbname > dump.sql
cp dump.sql dump-fixed.sql
vim dump-fixed.sql


:%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/
:%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
:wq

mysql -uusername -ppassword < dump-fixed.sql

In case the data is not in the same character set you might consider this snippet from http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html

If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.

Here is an example:

 ALTER TABLE t1 CHANGE c1 c1 BLOB;
 ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;

Make sure to choose the right collation, or you might get unique key conflicts. e.g. Éleanore and Eleanore might be considered the same in some collations.

Aside:

I had a situation where certain characters "broke" in emails even though they were stored as UTF-8 in the database. If you are sending emails using utf8 data, you might want to also convert your emails to send in UTF8.

In PHPMailer, just update this line: public $CharSet = 'utf-8';

Before proceeding, ensure that you: Have completed a full database backup!

Step 1: Database Level Changes

  • Identifying the Collation and Character set of your database

    SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM 
    information_schema.SCHEMATA S
    WHERE schema_name = 'your_database_name'
    AND
    (DEFAULT_CHARACTER_SET_NAME != 'utf8'
        OR
     DEFAULT_COLLATION_NAME not like 'utf8%');
    
  • Fixing the collation for the database

    ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
    

Step 2: Table Level Changes

  • Identifying Database Tables with the incorrect character set or collation

    SELECT CONCAT(
    'ALTER TABLE ',  table_name, ' CHARACTER SET utf8 COLLATE utf8_general_ci;  ', 
    'ALTER TABLE ',  table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  ')
    FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
    WHERE C.collation_name = T.table_collation
    AND T.table_schema = 'your_database_name'
    AND
    (C.CHARACTER_SET_NAME != 'utf8'
        OR
     C.COLLATION_NAME not like 'utf8%')
    
  • Adjusting table columns' collation and character set

Capture upper sql output and run it. (like following)

ALTER TABLE rma CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_products CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_products CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_report_period CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_report_period CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_reservation CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_reservation CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_supplier_return CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_supplier_return_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_supplier_return_product CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_product CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; 

refer to: https://confluence.atlassian.com/display/CONFKB/How+to+Fix+the+Collation+and+Character+Set+of+a+MySQL+Database

weixin_41568134
MAO-EYE stackoverflow.com/questions/10957238/… have left a fuller answer here
3 年多之前 回复
weixin_41568184
叼花硬汉 Currently this script uses 'utf8_unicode_ci' for the db but 'utf8_general_ci' for the tables - was that intentional? (I think both should use the same charset)
大约 4 年之前 回复

Inspired by @sdfor comment, here is a bash script that does the job

#!/bin/bash

printf "### Converting MySQL character set ###\n\n"

printf "Enter the encoding you want to set: "
read -r CHARSET

# Get the MySQL username
printf "Enter mysql username: "
read -r USERNAME

# Get the MySQL password
printf "Enter mysql password for user %s:" "$USERNAME"
read -rs PASSWORD

DBLIST=( mydatabase1 mydatabase2 )

printf "\n"


for DB in "${DBLIST[@]}"
do
(
    echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
    mysql "$DB" -u"$USERNAME" -p"$PASSWORD" -e "SHOW TABLES" --batch --skip-column-names \
    | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
) \
| mysql "$DB" -u"$USERNAME" -p"$PASSWORD"

echo "$DB database done..."
done

echo "### DONE ###"
exit

On the commandline shell

If you're one the commandline shell, you can do this very quickly. Just fill in "dbname" :D

DB="dbname"
(
    echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
    mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names \
    | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
) \
| mysql "$DB"

One-liner for simple copy/paste

DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"
csdnceshi68
local-host If the values of that column only contain hex characters (0-9a-f), you can safely do so. That said, I recommend using VARCHAR BINARY rather than UTF-8 for UUID columns.
接近 2 年之前 回复
csdnceshi76
斗士狗 I got the ERROR 1833 (HY000): Cannot change column 'uuid': used in a foreign key constraint 'da6c45e2594e23048055efc029ad4f25' of table 'blablabla.foo_bar_user_profiles' and first I tried LOCK TABLES foo_bar_user_profiles WRITE because SET FOREIGN_KEY_CHECKS = 0; just sounds too dangerous. If I turn that off, can that screw up the DB?
接近 2 年之前 回复
csdnceshi69
YaoRaoLov Didn't work for me until I used "show full tables where Table_Type = 'BASE TABLE'" instead of "SHOW TABLES"
2 年多之前 回复
csdnceshi74
7*4 You'll probably want to disable foreign key checks on a real system: DB="db_name"; ( echo 'ALTER DATABASE '"$DB"' CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql --uuser -ppassword -hhost "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'SET foreign_key_checks = 0; ALTER TABLE '{}' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql -uuser -ppassword -hhost "$DB"
4 年多之前 回复
weixin_41568126
乱世@小熊 This code works great, just remember to add -h [hostname] -u [username] -p[password] after mysql as necessary.
6 年多之前 回复
csdnceshi68
local-host You need to run this on the command line shell. If you only have the MySQL client connection available, use the code of sdfor below.
6 年多之前 回复
csdnceshi56
lrony* Can you put more detail in this I get ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DB="dbname"
6 年多之前 回复

For databases that have a high number of tables you can use a simple php script to update the charset of the database and all of the tables using the following:

$conn = mysqli_connect($host, $username, $password, $database);

if ($conn->connect_error) {
  die("Connection failed: " . $conn->connect_error);
}

$alter_database_charset_sql = "ALTER DATABASE ".$database." CHARACTER SET utf8 COLLATE utf8_unicode_ci";
mysqli_query($conn, $alter_database_charset_sql);

$show_tables_result = mysqli_query($conn, "SHOW TABLES");
$tables  = mysqli_fetch_all($show_tables_result);

foreach ($tables as $index => $table) {
  $alter_table_sql = "ALTER TABLE ".$table[0]." CONVERT TO CHARACTER SET utf8  COLLATE utf8_unicode_ci";
  $alter_table_result = mysqli_query($conn, $alter_table_sql);
  echo "<pre>";
  var_dump($alter_table_result);
  echo "</pre>";
}
csdnceshi68
local-host Good, and how do we run it? Do we have to upload it on server and then run enter the path to its position?
2 年多之前 回复
csdnceshi50
三生石@ It shouldn't matter, as long as the IP that your executing it from has access to the database. Make sure you take a backup first!
2 年多之前 回复
csdnceshi68
local-host Where do we run this script from?
2 年多之前 回复

Use HeidiSQL. Its free and a very good db tool.

From tools menu, enter Bulk table editor

Select the complete database or pick tables to convert,

  • tick Change default collation: utf8mb4_general_ci
  • tick Convert to charset: utf8

Execute

This converts complete database from latin to utf8 in just a few seconds.

Works like a charm :)

HeidiSQL connects by default as utf8 so any special characters should now be seen as the character (æ ø å) and not as encoded when inspecting the table data.

The real pitfall when moving from latin to utf8 is to make sure pdo connects with utf8 charset. If not you will get rubbish data inserted to the utf8 table and question marks all over the place on your web page, making you think the table data is not utf8...

csdnceshi67
bug^君 I get and error when trying to convert the CHARSET: SQL Error (1025): Error on rename... but this is an amazing SQL manager tool!
大约 3 年之前 回复
csdnceshi78
程序go like this way. especially you have plenty of table.
大约 4 年之前 回复
csdnceshi55
~Onlooker Could you elaborate please? I have exactly this issue - special characters and spaces appear as question marks. I am trying to convert the database in MAMP using PHPMyAdmin. Having developed offline I now discover that my host does not support utf8mb4. I don't have Windows so cannot use HeidiSQL. Is there a way I can achieve this with PHPMyAdmin?
4 年多之前 回复
共17条数据 1 尾页
Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
立即提问
相关内容推荐