yuemuhan999 asked on 2016.01.23 22:12

A question about character encoding in Java

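About Unicode character encodings: in GBK an English letter takes one byte and a Chinese character takes two bytes; in UTF-8 an English letter takes one byte and a Chinese character takes three bytes. But in Java a char is always 2 bytes, whether it holds a Chinese or an English character. Which encoding does the char type correspond to? And could someone briefly introduce a few other character encodings?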
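For what it's worth, the byte counts in the question can be checked with a short Java sketch. This is only an illustration: the class name `EncodingSizes` and the helper `byteLength` are made up for this example, and it assumes the running JVM ships the GBK charset. It also shows that a Java `char` is one 16-bit UTF-16 code unit, so a character outside the Basic Multilingual Plane needs two `char`s.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Number of bytes the text occupies once encoded with the given charset.
    // (Hypothetical helper, named for this example only.)
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // GBK is looked up by name; this assumes the JVM provides it.
        Charset gbk = Charset.forName("GBK");

        // "\u4E2D" is the Chinese character 中, written as an escape so the
        // source file's own encoding does not matter.
        for (String s : new String[] {"A", "\u4E2D"}) {
            System.out.println(s
                + " GBK=" + byteLength(s, gbk)
                + " UTF-8=" + byteLength(s, StandardCharsets.UTF_8)
                + " UTF-16BE=" + byteLength(s, StandardCharsets.UTF_16BE)
                + " chars=" + s.length());
        }

        // U+1F600 lies outside the Basic Multilingual Plane, so it is stored
        // as a surrogate pair: two chars for a single code point.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println("U+1F600 chars=" + emoji.length()
            + " codePoints=" + emoji.codePointCount(0, emoji.length()));
    }
}
```

The UTF-16BE column is the one that matches the "always 2 bytes" observation for `A` and `中`; the last line shows where that rule stops holding.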

3 answers

caozhy   2016.01.23 22:25

There are a lot of encodings. These are all the ones installed on my computer:

IBM EBCDIC (US-Canada)
OEM United States
IBM EBCDIC (International)
Arabic (ASMO 708)
Arabic (DOS)
Greek (DOS)
Baltic (DOS)
Western European (DOS)
Central European (DOS)
OEM Cyrillic
Turkish (DOS)
OEM Multilingual Latin I
Portuguese (DOS)
Icelandic (DOS)
Hebrew (DOS)
French Canadian (DOS)
Arabic (864)
Nordic (DOS)
Cyrillic (DOS)
Greek, Modern (DOS)
IBM EBCDIC (Multilingual Latin-2)
Thai (Windows)
IBM EBCDIC (Greek Modern)
Japanese (Shift-JIS)
Chinese Simplified (GB2312)
Korean
Chinese Traditional (Big5)
IBM EBCDIC (Turkish Latin-5)
IBM Latin-1
IBM EBCDIC (US-Canada-Euro)
IBM EBCDIC (Germany-Euro)
IBM EBCDIC (Denmark-Norway-Euro)
IBM EBCDIC (Finland-Sweden-Euro)
IBM EBCDIC (Italy-Euro)
IBM EBCDIC (Spain-Euro)
IBM EBCDIC (UK-Euro)
IBM EBCDIC (France-Euro)
IBM EBCDIC (International-Euro)
IBM EBCDIC (Icelandic-Euro)
Unicode
Unicode (Big-Endian)
Central European (Windows)
Cyrillic (Windows)
Western European (Windows)
Greek (Windows)
Turkish (Windows)
Hebrew (Windows)
Arabic (Windows)
Baltic (Windows)
Vietnamese (Windows)
Korean (Johab)
Western European (Mac)
Japanese (Mac)
Chinese Traditional (Mac)
Korean (Mac)
Arabic (Mac)
Hebrew (Mac)
Greek (Mac)
Cyrillic (Mac)
Chinese Simplified (Mac)
Romanian (Mac)
Ukrainian (Mac)
Thai (Mac)
Central European (Mac)
Icelandic (Mac)
Turkish (Mac)
Croatian (Mac)
Unicode (UTF-32)
Unicode (UTF-32 Big-Endian)
Chinese Traditional (CNS)
TCA Taiwan
Chinese Traditional (Eten)
IBM5550 Taiwan
TeleText Taiwan
Wang Taiwan
Western European (IA5)
German (IA5)
Swedish (IA5)
Norwegian (IA5)
US-ASCII
T.61
ISO-6937
IBM EBCDIC (Germany)
IBM EBCDIC (Denmark-Norway)
IBM EBCDIC (Finland-Sweden)
IBM EBCDIC (Italy)
IBM EBCDIC (Spain)
IBM EBCDIC (UK)
IBM EBCDIC (Japanese katakana)
IBM EBCDIC (France)
IBM EBCDIC (Arabic)
IBM EBCDIC (Greek)
IBM EBCDIC (Hebrew)
IBM EBCDIC (Korean Extended)
IBM EBCDIC (Thai)
Cyrillic (KOI8-R)
IBM EBCDIC (Icelandic)
IBM EBCDIC (Cyrillic Russian)
IBM EBCDIC (Turkish)
IBM Latin-1
Japanese (JIS 0208-1990 and 0212-1990)
Chinese Simplified (GB2312-80)
Korean Wansung
IBM EBCDIC (Cyrillic Serbian-Bulgarian)
Cyrillic (KOI8-U)
Western European (ISO)
Central European (ISO)
Latin 3 (ISO)
Baltic (ISO)
Cyrillic (ISO)
Arabic (ISO)
Greek (ISO)
Hebrew (ISO-Visual)
Turkish (ISO)
Estonian (ISO)
Latin 9 (ISO)
Europa
Hebrew (ISO-Logical)
Japanese (JIS)
Japanese (JIS-Allow 1 byte Kana)
Japanese (JIS-Allow 1 byte Kana - SO/SI)
Korean (ISO)
Chinese Simplified (ISO-2022)
Japanese (EUC)
Chinese Simplified (EUC)
Korean (EUC)
Chinese Simplified (HZ)
Chinese Simplified (GB18030)
ISCII Devanagari
ISCII Bengali
ISCII Tamil
ISCII Telugu
ISCII Assamese
ISCII Oriya
ISCII Kannada
ISCII Malayalam
ISCII Gujarati
ISCII Punjabi
Unicode (UTF-7)
Unicode (UTF-8)
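
The answer does not say how this list was produced. For comparison, a Java program can ask its own runtime which charsets it supports. The sketch below uses the standard `Charset.availableCharsets()` call; the class name `ListCharsets` and the helper `names` are invented for this example, and the names and count printed will differ from the list above and from one JVM to another.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class ListCharsets {
    // Canonical names of every charset the running JVM supports, in sorted order.
    // (Hypothetical helper, named for this example only.)
    static List<String> names() {
        return new ArrayList<>(Charset.availableCharsets().keySet());
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
        // The charset the JVM falls back to when none is given explicitly.
        System.out.println("default: " + Charset.defaultCharset());
    }
}
```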

enpterexpress   2016.01.23 22:25

(image) Try searching for "character set".

caozhy   2016.01.23 22:26

IBM EBCDIC (US-Canada) 37
OEM United States 437
IBM EBCDIC (International) 500
Arabic (ASMO 708) 708
Arabic (DOS) 720
Greek (DOS) 737
Baltic (DOS) 775
Western European (DOS) 850
Central European (DOS) 852
OEM Cyrillic 855
Turkish (DOS) 857
OEM Multilingual Latin I 858
Portuguese (DOS) 860
Icelandic (DOS) 861
Hebrew (DOS) 862
French Canadian (DOS) 863
Arabic (864) 864
Nordic (DOS) 865
Cyrillic (DOS) 866
Greek, Modern (DOS) 869
IBM EBCDIC (Multilingual Latin-2) 870
Thai (Windows) 874
IBM EBCDIC (Greek Modern) 875
Japanese (Shift-JIS) 932
Chinese Simplified (GB2312) 936
Korean 949
Chinese Traditional (Big5) 950
IBM EBCDIC (Turkish Latin-5) 1026
IBM Latin-1 1047
IBM EBCDIC (US-Canada-Euro) 1140
IBM EBCDIC (Germany-Euro) 1141
IBM EBCDIC (Denmark-Norway-Euro) 1142
IBM EBCDIC (Finland-Sweden-Euro) 1143
IBM EBCDIC (Italy-Euro) 1144
IBM EBCDIC (Spain-Euro) 1145
IBM EBCDIC (UK-Euro) 1146
IBM EBCDIC (France-Euro) 1147
IBM EBCDIC (International-Euro) 1148
IBM EBCDIC (Icelandic-Euro) 1149
Unicode 1200
Unicode (Big-Endian) 1201
Central European (Windows) 1250
Cyrillic (Windows) 1251
Western European (Windows) 1252
Greek (Windows) 1253
Turkish (Windows) 1254
Hebrew (Windows) 1255
Arabic (Windows) 1256
Baltic (Windows) 1257
Vietnamese (Windows) 1258
Korean (Johab) 1361
Western European (Mac) 10000
Japanese (Mac) 10001
Chinese Traditional (Mac) 10002
Korean (Mac) 10003
Arabic (Mac) 10004
Hebrew (Mac) 10005
Greek (Mac) 10006
Cyrillic (Mac) 10007
Chinese Simplified (Mac) 10008
Romanian (Mac) 10010
Ukrainian (Mac) 10017
Thai (Mac) 10021
Central European (Mac) 10029
Icelandic (Mac) 10079
Turkish (Mac) 10081
Croatian (Mac) 10082
Unicode (UTF-32) 12000
Unicode (UTF-32 Big-Endian) 12001
Chinese Traditional (CNS) 20000
TCA Taiwan 20001
Chinese Traditional (Eten) 20002
IBM5550 Taiwan 20003
TeleText Taiwan 20004
Wang Taiwan 20005
Western European (IA5) 20105
German (IA5) 20106
Swedish (IA5) 20107
Norwegian (IA5) 20108
US-ASCII 20127
T.61 20261
ISO-6937 20269
IBM EBCDIC (Germany) 20273
IBM EBCDIC (Denmark-Norway) 20277
IBM EBCDIC (Finland-Sweden) 20278
IBM EBCDIC (Italy) 20280
IBM EBCDIC (Spain) 20284
IBM EBCDIC (UK) 20285
IBM EBCDIC (Japanese katakana) 20290
IBM EBCDIC (France) 20297
IBM EBCDIC (Arabic) 20420
IBM EBCDIC (Greek) 20423
IBM EBCDIC (Hebrew) 20424
IBM EBCDIC (Korean Extended) 20833
IBM EBCDIC (Thai) 20838
Cyrillic (KOI8-R) 20866
IBM EBCDIC (Icelandic) 20871
IBM EBCDIC (Cyrillic Russian) 20880
IBM EBCDIC (Turkish) 20905
IBM Latin-1 20924
Japanese (JIS 0208-1990 and 0212-1990) 20932
Chinese Simplified (GB2312-80) 20936
Korean Wansung 20949
IBM EBCDIC (Cyrillic Serbian-Bulgarian) 21025
Cyrillic (KOI8-U) 21866
Western European (ISO) 28591
Central European (ISO) 28592
Latin 3 (ISO) 28593
Baltic (ISO) 28594
Cyrillic (ISO) 28595
Arabic (ISO) 28596
Greek (ISO) 28597
Hebrew (ISO-Visual) 28598
Turkish (ISO) 28599
Estonian (ISO) 28603
Latin 9 (ISO) 28605
Europa 29001
Hebrew (ISO-Logical) 38598
Japanese (JIS) 50220
Japanese (JIS-Allow 1 byte Kana) 50221
Japanese (JIS-Allow 1 byte Kana - SO/SI) 50222
Korean (ISO) 50225
Chinese Simplified (ISO-2022) 50227
Japanese (EUC) 51932
Chinese Simplified (EUC) 51936
Korean (EUC) 51949
Chinese Simplified (HZ) 52936
Chinese Simplified (GB18030) 54936
ISCII Devanagari 57002
ISCII Bengali 57003
ISCII Tamil 57004
ISCII Telugu 57005
ISCII Assamese 57006
ISCII Oriya 57007
ISCII Kannada 57008
ISCII Malayalam 57009
ISCII Gujarati 57010
ISCII Punjabi 57011
Unicode (UTF-7) 65000
Unicode (UTF-8) 65001

Above are the code page numbers for all of these encodings.

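yuemuhan999 replying to caozhy: How do they split up characters?
almost 2 years ago
yuemuhan999 replying to caozhy: I'm a beginner and can't follow this. I'd like to know how a few of the common character encodings split characters into bytes.
almost 2 years ago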
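
One way to see how an encoding splits a character into bytes is to print the encoded bytes in hex. The following is a minimal sketch, not an answer from the thread: the class name `ByteDump` and the helper `hex` are hypothetical, and it again assumes the JVM provides the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteDump {
    // Encodes the text with the given charset and renders each resulting
    // byte as two uppercase hex digits, separated by spaces.
    // (Hypothetical helper, named for this example only.)
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative byte values print as two digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumes GBK is available
        String zhong = "\u4E2D"; // the character 中, code point U+4E2D

        System.out.println("GBK      " + hex(zhong, gbk));                        // D6 D0
        System.out.println("UTF-8    " + hex(zhong, StandardCharsets.UTF_8));     // E4 B8 AD
        System.out.println("UTF-16BE " + hex(zhong, StandardCharsets.UTF_16BE));  // 4E 2D
        System.out.println("UTF-8    " + hex("A", StandardCharsets.UTF_8));       // 41
        System.out.println("UTF-16BE " + hex("A", StandardCharsets.UTF_16BE));    // 00 41
    }
}
```

The UTF-16BE bytes `4E 2D` are simply the code point U+4E2D written out as two bytes, which is what a Java `char` holds in memory. UTF-8 spreads the same 16 bits across three bytes, and GBK uses its own two-byte table entry that is unrelated to the Unicode value.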