To store Chinese characters in MySQL, is it recommended to store them as UTF8 or UCS2? (I am using char and varchar)
Also, I have seen that UTF8 uses 4 bytes of data to store values. How many does UCS2 use?
I have seen that UTF8 uses 4 bytes of data to store values. How many does UCS2 use?
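UTF-8 is a variable-length encoding. MySQL's `utf8` character set stores 1 to 3 bytes per character, which covers the Basic Multilingual Plane only; full UTF-8 (`utf8mb4` in MySQL) goes up to 4 bytes. UCS-2 is a fixed 2 bytes per character. Note that UCS-2 is not the same as UTF-16: UTF-16 uses 2 bytes for the Basic Multilingual Plane and 4 bytes (a surrogate pair) for everything beyond it.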
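To make the byte counts concrete, here is a small Python sketch. It is only an illustration, not MySQL itself: Python has no UCS-2 codec, so `utf-16-be` stands in for it (the two are identical for characters inside the Basic Multilingual Plane), and the helper name `byte_widths` is made up for this example.

```python
def byte_widths(ch):
    """Return the encoded size of one character in UTF-8 and in UTF-16.

    utf-16-be is used (no byte order mark) as a stand-in for UCS-2;
    for BMP characters the two produce the same bytes.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
    }


# ASCII letter, accented Latin letter, common Chinese character,
# and a rare CJK character outside the BMP (U+20000).
samples = ["a", "\u00e9", "\u4e2d", "\U00020000"]

for ch in samples:
    widths = byte_widths(ch)
    print(f"U+{ord(ch):05X}: utf-8={widths['utf-8']} utf-16={widths['utf-16']}")
```

The common Chinese character takes 3 bytes in UTF-8 and 2 in UTF-16/UCS-2. The last sample needs 4 bytes in either encoding and cannot be represented in UCS-2 (or in MySQL's 3-byte `utf8`) at all.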
To store Chinese characters in MySQL, is it recommended to store them as UTF8 or UCS2?
I have no experience with Chinese characters, but the top answer to this SO question answers the basic question quite nicely: Difference between UTF-8 and UTF-16?
From there:
Most reasonable characters, like Latin, Cyrillic, Chinese, Japanese can be represented with 2 bytes. Unless really exotic characters are needed, this means that the 16-bit subset of UTF-16 can be used as a fixed-length encoding, which speeds indexing.
So for Chinese characters, UCS-2 tends to save storage space: 2 bytes per character instead of the 3 that UTF-8 needs for most of them. If this is for a web project, however, I would tend to use UTF-8 because it is the more widespread encoding and a standard in the web world. Additional arguments for UTF-8 here: Should UTF-16 be considered harmful?
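The trade-off depends on how much ASCII is mixed in with the Chinese text. A rough Python sketch of that, with made-up sample strings and a hypothetical helper `encoded_sizes`; again `utf-16-be` is used as a stand-in for UCS-2, which is only valid as long as every character is in the Basic Multilingual Plane:

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for a string.

    utf-16-be (no byte order mark) approximates UCS-2 for BMP-only text.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-16-be"))


# Pure Chinese text: three BMP characters.
chinese = "\u6570\u636e\u5e93"

# The same text wrapped in a bit of HTML, as it might appear in a web project.
markup = '<p class="x">' + chinese + "</p>"

for label, text in [("chinese", chinese), ("markup", markup)]:
    utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{label}: utf-8={utf8_bytes} bytes, utf-16={utf16_bytes} bytes")
```

For the pure Chinese string the 2-byte encoding is smaller, but once ASCII markup dominates, UTF-8 comes out ahead because each ASCII character costs only 1 byte.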
MySQL Reference: 9.1.10. Unicode Support