donglianjiang9321 2011-12-16 16:37

Character encoding issue: UTF-8 and ISO-8859-1

I have a web application in which I can't get Japanese/Chinese characters to display properly. The odd part is that these characters display correctly when I hard-code them into an HTML document.

Characters such as:

アイヌの工芸 : ペンシルバニア大学考古学人類学博物館ヒラーコレクション

But when I pull them out of a proprietary database, they come out as junk:

ã¢ã¤ãã®å·¥è¸ : ãã³ã·ã«ããã¢å¤§å­¦èå¤å­¦äººé¡å­¦åç©é¤¨ãã©ã¼ã³ã¬ã¯ã·ã§ã³

The HTML document is declared as UTF-8:

<meta http-equiv="content-type" content="text/html; charset=utf-8"/>

The HTML file itself is saved as UTF-8, not ISO-8859-1, Western Latin, etc.

The weird thing is that when I use iconv to convert the junk string from UTF-8 to ISO-8859-1, it displays correctly:

iconv("UTF-8", "ISO-8859-1//TRANSLIT", $junk_string)

It seems that the junk string is UTF-8, and converting it to ISO-8859-1 makes the characters display correctly. This doesn't make sense to me at all.

So I sort of have an answer to my problem, but I don't know why it works. I thought that encoding everything in UTF-8 was supposed to fix this kind of thing. I am using Verdana and have tried a couple of other fonts with no success. The odd part is that I can hard-code the characters into the HTML page and they display fine, but the same data fetched from the database displays as junk unless I convert it to ISO-8859-1.

Does anyone have any insight? Instead of converting every piece of data I get from the database, is there a way to change this at the individual page level? I also tried changing the encoding to:

<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"/>

With that setting, the characters from the database still do not display correctly.

  • douque2016 2011-12-16 16:41

    The likely answer is that the data in the database is wrong. What probably happened is that an ISO-8859-1 -> UTF-8 conversion was applied to data that was already UTF-8, so it is now encoded twice. Doing a UTF-8 -> ISO-8859-1 conversion therefore undoes the extra layer and gives you the original UTF-8 data back.
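To make the round trip concrete, here is a minimal sketch of the suspected mistake and its reversal. It assumes the script file is saved as UTF-8 and that the iconv extension is available; the variable names are only illustrative, and it uses `iconv('ISO-8859-1', 'UTF-8', ...)` in place of `utf8_encode`, which performs the same conversion.

```php
<?php
// Original text as UTF-8 bytes (assumes this source file is saved as UTF-8).
$original = 'アイヌの工芸';

// The suspected mistake: treat the UTF-8 bytes as ISO-8859-1 and convert
// them to UTF-8 again. Every byte >= 0x80 becomes a two-byte sequence.
$doubleEncoded = iconv('ISO-8859-1', 'UTF-8', $original);

// The reversal from the question: UTF-8 -> ISO-8859-1 strips that extra layer.
$restored = iconv('UTF-8', 'ISO-8859-1', $doubleEncoded);

var_dump(strlen($original));        // int(18): 6 characters x 3 bytes
var_dump(strlen($doubleEncoded));   // int(36): each of the 18 bytes doubled
var_dump($restored === $original);  // bool(true)
```

Printing `$doubleEncoded` on a UTF-8 page should show junk of the same kind as in the question, while `$restored` shows the Japanese text.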

    Make sure you're not calling utf8_encode (which does an ISO-8859-1 -> UTF-8 conversion) on UTF-8 data!

    Since every UTF-8 byte sequence is also a valid ISO-8859-1 string (well, not quite, but ISO-8859-1 is commonly extended so that every byte value maps to a character), the ISO-8859-1 -> UTF-8 conversion raises no errors when it is run on data that is already UTF-8.
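If the stored data cannot be fixed at the source right away, one option is to undo the extra layer in a single place rather than at every call site. The helper below is a hypothetical sketch, not part of any library: `undoDoubleEncoding` is a made-up name, and it assumes the iconv and mbstring extensions are loaded. It is a heuristic, so text that legitimately contains sequences such as "Ã©" would be altered by it.

```php
<?php
// Hypothetical helper: undo one layer of ISO-8859-1 -> UTF-8 double encoding.
function undoDoubleEncoding(string $s): string
{
    // Pure ASCII is identical in both encodings, so there is nothing to undo.
    if (!preg_match('/[\x80-\xFF]/', $s)) {
        return $s;
    }

    // Try to reverse the conversion. iconv() returns false (and raises a
    // notice, suppressed here) if a character does not exist in ISO-8859-1,
    // which is what happens for text that is encoded correctly only once.
    $candidate = @iconv('UTF-8', 'ISO-8859-1', $s);

    // Accept the result only if it is itself valid UTF-8.
    if ($candidate !== false && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }

    return $s;
}

// Usage: simulate the broken database value, then repair it.
$original = 'アイヌの工芸';
$broken   = iconv('ISO-8859-1', 'UTF-8', $original);

var_dump(undoDoubleEncoding($broken) === $original);   // bool(true)
var_dump(undoDoubleEncoding($original) === $original); // bool(true), left alone
```

The cleaner long-term fix is still to find where the extra conversion happens and remove it, so that the stored data is plain UTF-8.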
