dpkk8687 2010-05-21 15:42
浏览 38
已采纳

php中的utf-8编码问题

Another utf-8 related problem I believe...

I am using php to update data in a mysql db then display that data elsewhere in the site. Previously I have run into utf-8 problems before where special characters are displayed as question marks when viewed in a browser but this one seems slightly different.

I have a number of records to enter that contain the è character. If I enter this directly in the db then it appears correctly on the page so I take this to mean that utf-8 content is being output correctly.

However when I try and update the values in the db through php, then the è character is replaced. What appears instead is & Atilde ; & uml ; (without the spaces) which appears in the browser as è

I have the tables in the database set to use UTF-8. I believe this is correct cos, as mentioned, if I update the db through phpMyAdmin, its all ok. Similarly I have set the character encoding for the page which seems to be correct. I am also running the sql statement "SET NAMES 'utf8';" before trying to update the db.

Anyone have any other ideas as to where the problem may lie?

Many thanks

  • 写回答

4条回答 默认 最新

  • doudiza9154 2010-05-21 16:00
    关注

    Yup.

    The character you have is LATIN SMALL LETTER E WITH GRAVE. As you can see, in UTF-8 that character is encoded into two bytes 0xC3 and 0xA8.

    But in many default, western encodings (such as ISO-8859-1) which are single-byte only, this multi-byte character is decoded as two separate characters, LATIN CAPITAL LETTER A WITH TILDE and DIAERESIS. Notice how they are both encoded as C3 and A8 in ISO-8859-1?

    Furthermore, it looks like PHP is processing these characters through htmlentities() which result in the à and ¨ respectively.

    So, where exactly is the problem in your code? Well, htmlentities() could be doing it all by itself since its 3rd argument is a encoding name - which you may not have properly set to 'UTF-8'. But it could be some other string processing function as well. (Note: As a general rule, it's a bad idea to store HTML entities in the database - this step should be reserved for time of display)

    There are a bunch of other ways to trip yourself up with UTF-8 in php - I suggest hitting up the cheatsheet and make sure you're in good shape.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(3条)

报告相同问题?