duanju8431 2016-09-12 11:10

How do I correctly handle UTF-8 results from a MySQL database in PHP? [duplicate]


For a few days now I've been looking for a way to display UTF-8 correctly on my webpage. The character currently causing trouble is į (Unicode: \u012f, decimal: 303). However, there are over 10,000 records in my database and I cannot guarantee that all the others display correctly, so I'm looking for a solution that covers all characters.

The į is displaying as a ? in the HTML.

My setup is an HTML page that uses AJAX to send a request to a PHP file. The PHP then queries a MySQL database to find a specific entry, takes a Lithuanian word from that entry, and echoes it as the response to the AJAX request. Back in the JavaScript, the response is set as the innerHTML of an HTML element. This setup does not use jQuery.

Below is my progress on attempting to fix the issue.

First, I verified that all the files I was working with are encoded as UTF-8 without a BOM.

Then I opened the MySQL database in phpMyAdmin to view the entries. Seeing characters replaced with ? in the entries, I did some research and found that the database had the wrong collation. Changing the collation to utf8_general_ci for the database and table changed nothing, so I looked further and found that changing it for the individual columns of a table was another solution. This worked, and my database now displays the characters correctly.

Next, the character š (Unicode: \u0161, decimal: 353) would not display on my webpage. I fixed this with the following PHP code, which I found on Stack Overflow.

function encode_string($string){ 
    $encoded = ""; 
    for ($n=0;$n<strlen($string);$n++){ 
        $check = htmlentities($string[$n],ENT_QUOTES); 
        $string[$n] == $check ? $encoded .= "&#".ord($string[$n]).";" : $encoded .= $check; 
    } 
    return $encoded; 
} 

I can't say I completely understand this code, but it caused the character š to display correctly once it reached my HTML. However, it did not work for the character į.
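
One plausible explanation for š surviving while į does not, assuming the connection charset is latin1 (as the EDIT at the end of the question reports) and that MySQL's latin1 matches Windows-1252 for these characters: š has a single-byte slot in that charset, į has none. The sketch below uses mb_convert_encoding() only as a stand-in for the server-side conversion, and it needs the mbstring extension.

```php
<?php
// Stand-in for what a latin1 connection does to UTF-8 column data.
// Assumption: MySQL's latin1 maps these characters like Windows-1252.
mb_substitute_character(0x3F); // use "?" for characters with no mapping

$s_caron  = "\u{0161}"; // š
$i_ogonek = "\u{012F}"; // į

$as_latin1_s = mb_convert_encoding($s_caron, 'Windows-1252', 'UTF-8');
$as_latin1_i = mb_convert_encoding($i_ogonek, 'Windows-1252', 'UTF-8');

echo bin2hex($as_latin1_s), "\n"; // 9a  (š exists as one byte)
echo $as_latin1_i, "\n";          // ?   (į has no slot, so it is lost)

// encode_string() turns such a byte into a numeric reference, and HTML
// parsers map &#154; back to š, which would explain why š appeared fixed.
$entity = "&#" . ord($as_latin1_s) . ";";
echo $entity, "\n"; // &#154;
```

If this is what happens, the ? is produced before PHP ever sees the value, so no amount of escaping in PHP can bring į back.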

I have also tried $conn->set_charset('utf8'); to make the connection use UTF-8. However, this resulted in į being displayed as Ä¯ instead. $conn->query("SET NAMES UTF8;"); gave the same result.
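
Two-character garbage after switching the connection to UTF-8 is consistent with the string being encoded a second time. In UTF-8, į is two bytes; code that walks the string byte by byte (as encode_string() does) or that treats it as ISO-8859-1 (as utf8_encode() does) turns each byte into its own character. A minimal sketch, with mb_convert_encoding() standing in for utf8_encode(), which is deprecated as of PHP 8.2:

```php
<?php
$i_ogonek = "\u{012F}"; // į as it arrives over a UTF-8 connection

echo strlen($i_ogonek), "\n";             // 2  (bytes, not characters)
echo mb_strlen($i_ogonek, 'UTF-8'), "\n"; // 1  (characters)
echo bin2hex($i_ogonek), "\n";            // c4af

// Misreading the two bytes as ISO-8859-1 and re-encoding them as UTF-8
// produces two separate characters.
$misread = mb_convert_encoding($i_ogonek, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n"; // Ä¯
```

Under that reading, set_charset('utf8') is doing its job, and the damage comes from the byte-wise post-processing applied afterwards.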

I have found that hardcoding the į into the JavaScript or PHP allows it to be sent back and displayed correctly; for example, echo "į"; works. So I believe the issue is related to the database or to the PHP that runs before the echo. However, I don't have the knowledge to identify the problem.

Here is my PHP code:

<?php
header('Content-Type: text/html charset=utf-8');
//Connection to database is made. Referred to as $conn

$sql = "SELECT * FROM Words";
$result = $conn->query($sql);

if ($result->num_rows > 0) {

    //Loop through the results to find a word with the status of 1
    while($row = $result->fetch_assoc()) {

        $status = $row["status"];

        if($status == 1){
            //respond to AJAX with the word

            $ltword = trim($row["lt"]);


            echo utf8_encode(encode_string($ltword));
            //Has also been tested as 
            //echo encode_string($ltword);
            //with no noticeable difference.


            break;
        }
    }

}


function encode_string($string){ 
    $encoded = ""; 
    for ($n=0;$n<strlen($string);$n++){ 
        $check = htmlentities($string[$n],ENT_QUOTES); 
        $string[$n] == $check ? $encoded .= "&#".ord($string[$n]).";" : $encoded .= $check; 
    } 
    return $encoded; 
}

?>

At its core, my question is: given my current setup, how do I get a UTF-8 character from my database to display correctly on my webpage?

EDIT: PHP's mb_check_encoding() function confirms that the data received from the database is valid UTF-8.

php.ini uses UTF-8 as its default charset.

$conn->character_set_name(); returns latin1. After calling $conn->set_charset("utf8"); it returns utf8, but į is then displayed as Ä¯, which is still incorrect.
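
Given a latin1 default connection and valid UTF-8 in the columns, one way the script could be restructured is sketched below. The function names fetch_active_word() and html_safe_word() are hypothetical; the table and column names come from the code above. It assumes the server supports utf8mb4 (the 'utf8' charset used in the question should also cover these characters), and it replaces encode_string() and utf8_encode() with htmlspecialchars(), which escapes only HTML-special characters.

```php
<?php
/** Hypothetical helper: first word with status 1, or null if none. */
function fetch_active_word(mysqli $conn): ?string
{
    // Ask MySQL to send UTF-8 instead of the latin1 connection default.
    $conn->set_charset('utf8mb4');

    // Let the database do the filtering rather than looping in PHP.
    $result = $conn->query('SELECT lt FROM Words WHERE status = 1 LIMIT 1');
    if ($result === false) {
        return null;
    }
    $row = $result->fetch_assoc();
    return $row ? trim($row['lt']) : null;
}

/** Hypothetical helper: escape for HTML, leave multibyte UTF-8 untouched. */
function html_safe_word(string $word): string
{
    return htmlspecialchars($word, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Usage sketch (needs a live connection, so it is left commented out):
// header('Content-Type: text/html; charset=utf-8'); // semicolon before charset
// $word = fetch_active_word($conn);
// if ($word !== null) {
//     echo html_safe_word($word);
// }
```

The header in the usage sketch differs from the one in the question by a single semicolon after text/html, which is what separates the media type from its charset parameter.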


3 answers

  • doulu7258 2016-09-12 11:17

    In your case the problem was the collation, which you changed only after the data had been stored. As a good practice, give the table and its columns the same collation, e.g. utf8_unicode_ci (utf8_general_ci is faster, but utf8_unicode_ci is much better for sorting and comparison).
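
A minimal sketch of how the table and all of its text columns could be aligned in one statement, using MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET. The helper name build_convert_sql() is hypothetical, and it only builds the SQL string. Because this statement also converts the stored column data, it is safest to try it on a backup first.

```php
<?php
/**
 * Hypothetical helper: build an ALTER TABLE statement that gives a table
 * and its character columns the same charset and collation.
 */
function build_convert_sql(string $table, string $collation = 'utf8_unicode_ci'): string
{
    // Identifiers cannot be bound as parameters, so allow only safe characters.
    foreach ([$table, $collation] as $name) {
        if (preg_match('/^[A-Za-z0-9_]+$/', $name) !== 1) {
            throw new InvalidArgumentException("Unsafe identifier: $name");
        }
    }

    // A collation name starts with its charset, e.g. utf8_unicode_ci -> utf8.
    $charset = strstr($collation, '_', true);
    if ($charset === false) {
        throw new InvalidArgumentException("Unexpected collation: $collation");
    }

    return "ALTER TABLE `$table` CONVERT TO CHARACTER SET $charset COLLATE $collation";
}

echo build_convert_sql('Words'), "\n";
// ALTER TABLE `Words` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci
```

The resulting string could then be passed to $conn->query() by someone with ALTER privileges on the table.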

    Coming back to the problem: it lies with the data that was already added, which was stored incorrectly because of the improper collation. You need a way to find and repair those rows, as you can't be sure they were stored properly.
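
One way to look for damaged rows is a heuristic check on each stored value, sketched below. The function name looks_double_encoded() is hypothetical, the check assumes the damage came from UTF-8 being misread as Windows-1252 and encoded again, and it requires mbstring. It is a heuristic, so flagged rows should be reviewed by hand before anything is rewritten.

```php
<?php
/**
 * Hypothetical heuristic: true when $value looks like UTF-8 that was
 * misread as Windows-1252 and then encoded to UTF-8 a second time.
 */
function looks_double_encoded(string $value): bool
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false; // not UTF-8 at all, which is a different problem
    }

    // Undo one layer of encoding.
    $undone = mb_convert_encoding($value, 'Windows-1252', 'UTF-8');

    // If what remains is still valid UTF-8 and still contains multibyte
    // sequences, the value was most likely encoded twice.
    return mb_check_encoding($undone, 'UTF-8')
        && preg_match('/[\x80-\xFF]/', $undone) === 1;
}

var_dump(looks_double_encoded("\u{00C4}\u{00AF}")); // bool(true)  "Ä¯"
var_dump(looks_double_encoded("\u{012F}"));         // bool(false) "į"
var_dump(looks_double_encoded("kaip"));             // bool(false)
```

A literal ? that was stored in place of a character cannot be found or repaired this way, because the original character is no longer in the data; those rows would have to be re-entered from the source.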
