donglianjiang9321 2011-12-16 16:37 Acceptance rate: 0%
Views: 306

Character encoding problem: UTF-8 and ISO-8859-1

I have a web application in which I can't get Japanese/Chinese characters to display properly. The odd part is that these characters display correctly when I hard-code them into an HTML document.

Characters such as:

アイヌの工芸 : ペンシルバニア大学考古学人類学博物館ヒラーコレクション

But when I pull them out of this proprietary database, they come out as junk:

ã¢ã¤ãã®å·¥è¸ : ãã³ã·ã«ããã¢å¤§å­¦èå¤å­¦äººé¡å­¦åç©é¤¨ãã©ã¼ã³ã¬ã¯ã·ã§ã³

The HTML document is declared as UTF-8:

<meta http-equiv="content-type" content="text/html; charset=utf-8"/>

The HTML file itself is also saved as UTF-8, not as ISO-8859-1, Western Latin, etc.

The weird thing is that when I use iconv to convert the junk string from UTF-8 to ISO-8859-1, it displays correctly:

iconv("UTF-8", "ISO-8859-1//TRANSLIT", $junk_string)

It seems like the junk string is UTF-8 and when I convert the string to ISO-8859-1 it then displays the characters correctly. This doesn't make sense to me at all.

So I sort of have an answer to my problem, but I do not know why it works. I thought that using UTF-8 everywhere was supposed to prevent this kind of thing. I am using Verdana, and I have tried a couple of other fonts with no success. I can hard-code the characters into the HTML page and they display fine, but when I get the same data from the database it displays as junk unless I convert it to ISO-8859-1.

Does anyone have any insight here? And instead of doing this to every piece of data retrieved from the database, is there a way I can change this at the individual page level? I also tried changing the encoding to

<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"/>

And the characters from the database still do not display correctly.


3 answers

  • douque2016 2011-12-16 16:41

    The likely answer is that the data stored in the database is wrong. What probably happened is that an ISO-8859-1 -> UTF-8 conversion was applied to data that was already UTF-8. Doing a UTF-8 -> ISO-8859-1 conversion therefore gives you the original UTF-8 data back.
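The round trip described above can be reproduced in a few lines. This is a minimal sketch in Python rather than PHP, assuming the database content went through exactly one extra ISO-8859-1 -> UTF-8 pass; the sample string is the start of the title from the question, and the variable names are made up for illustration.

```python
# Illustrative sample: the first few characters of the title in the question.
original = "アイヌの工芸"

# Correct storage: the text encoded once as UTF-8.
utf8_bytes = original.encode("utf-8")

# The suspected mistake: treat those UTF-8 bytes as ISO-8859-1 text and
# encode them to UTF-8 again (what PHP's utf8_encode() does to its input).
double_encoded = utf8_bytes.decode("iso-8859-1").encode("utf-8")

# The workaround from the question: convert UTF-8 -> ISO-8859-1.
# Every two-byte sequence collapses back into the single original byte.
repaired_bytes = double_encoded.decode("utf-8").encode("iso-8859-1")

print(len(utf8_bytes), len(double_encoded))       # the second is larger
print(repaired_bytes == utf8_bytes)               # original bytes recovered
print(repaired_bytes.decode("utf-8") == original) # and they are valid UTF-8
```

Under that assumption, the iconv call in the question is not really producing ISO-8859-1 text; it is undoing the extra encoding pass and handing back the original UTF-8 bytes, which the UTF-8 page then renders correctly.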

    Make sure you're not calling utf8_encode (which does an ISO-8859-1 -> UTF-8 conversion) on UTF-8 data!
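One way to avoid encoding twice is to check whether the bytes already decode as UTF-8 before converting them. The sketch below is in Python, and `to_utf8_once` is a hypothetical helper name, not part of any library. It is a heuristic: short ISO-8859-1 strings can occasionally happen to be valid UTF-8, so knowing the real source encoding is still the better fix.

```python
def to_utf8_once(data: bytes) -> bytes:
    """Return UTF-8 bytes, converting from ISO-8859-1 only when needed.

    Hypothetical helper: assumes the input is either UTF-8 or ISO-8859-1.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on invalid UTF-8.
        data.decode("utf-8")
        return data  # already UTF-8, leave it untouched
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat it as ISO-8859-1 and convert once.
        return data.decode("iso-8859-1").encode("utf-8")


already_utf8 = "工芸".encode("utf-8")
latin1_only = b"caf\xe9"  # "café" in ISO-8859-1, invalid as UTF-8

print(to_utf8_once(already_utf8) == already_utf8)
print(to_utf8_once(latin1_only) == "café".encode("utf-8"))
```

In PHP the same check could plausibly be built on `mb_check_encoding($s, 'UTF-8')` before calling `utf8_encode`.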

    Since every UTF-8 byte sequence is also a valid ISO-8859-1 string (not quite, because ISO-8859-1 proper leaves some byte values undefined, but it is commonly extended so that every byte maps to a character), the ISO-8859-1 -> UTF-8 conversion raises no errors when it is run over UTF-8 data.
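The "no errors" point can be checked directly. This is a small Python sketch; Python's `iso-8859-1` codec is the extended form mentioned above, mapping every byte value 0x00 to 0xFF to the code point with the same number, so it stands in here for what a typical converter does.

```python
# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Decoding as ISO-8859-1 never fails: each byte becomes one character.
as_text = all_bytes.decode("iso-8859-1")

# Converting that text to UTF-8 also succeeds; bytes >= 0x80 grow to two bytes.
converted = as_text.encode("utf-8")

# By contrast, UTF-8 decoding is strict and rejects many byte sequences.
try:
    b"\xff".decode("utf-8")
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True

print(len(as_text), len(converted), utf8_rejected)
```

This is why the mistaken conversion goes unnoticed: the converter has no way to tell that its input was already UTF-8, so it silently produces double-encoded output.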
