dongsuo0517
2015-07-15 12:56

PHP UTF-8 mb_convert_encoding and Internet Explorer

I have been reading about character encoding for a few days, and I want to serve all my pages as UTF-8 for compatibility. But I am stuck trying to convert user input to UTF-8: it works in all browsers except Internet Explorer (as always).

I don't know what's wrong with my code; it seems fine to me.

  • I set the header with char encoding
  • I saved the file in UTF-8 (No BOM)

This only happens when the page is accessed with a $_GET parameter in Internet Explorer, e.g. myscript.php?c=äüöß. When I write special characters directly in my page, they are displayed correctly.

This is my code:
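When testing, one way to take the address bar's encoding out of the picture is to generate the link with the query value already percent-encoded as UTF-8. This is a minimal sketch, not part of the original code: `buildTestUrl` is a hypothetical helper, and the script name is taken from the URL above.

```php
<?php
// Hypothetical helper: build a test URL whose query value is
// percent-encoded, so the browser does not choose the byte encoding.
function buildTestUrl(string $script, string $value): string
{
    // http_build_query() percent-encodes every non-ASCII byte of the value.
    return $script . '?' . http_build_query(['c' => $value]);
}

// "äüöß" written as explicit UTF-8 bytes, so the result does not depend on
// how this source file was saved.
$value = "\xC3\xA4\xC3\xBC\xC3\xB6\xC3\x9F";

echo buildTestUrl('myscript.php', $value), "\n";
// prints: myscript.php?c=%C3%A4%C3%BC%C3%B6%C3%9F
```

If the percent-encoded link works in every browser while the hand-typed URL does not, the difference lies in how the browser encodes what is typed into the address bar, not in the PHP script.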

// User Input
$_GET['c'] = "äüöß"; // Access URL ?c=äüöß
//--------
header("Content-Type: text/html; charset=utf-8");
mb_internal_encoding('UTF-8');

$_GET = userToUtf8($_GET);

function userToUtf8($string) {
    if(is_array($string)) {
        $tmp = array();
        foreach($string as $key => $value) {
            $tmp[$key] = userToUtf8($value);
        }
        return $tmp;
    }

    return userDataUtf8($string);
}

function userDataUtf8($string) {
    print("1: " . mb_detect_encoding($string) . "<br>"); // Shows: 1: UTF-8
    $string = mb_convert_encoding($string, 'UTF-8', mb_detect_encoding($string)); // Convert non UTF-8 String to UTF-8
    print("2: " . mb_detect_encoding($string) . "<br>"); // Shows: 2: ASCII
    $string = preg_replace('/[\xF0-\xF7].../s', '', $string);
    print("3: " . mb_detect_encoding($string) . "<br>"); // Shows: 3: ASCII

    return $string;
}
echo $_GET['c']; // Shows nothing
echo mb_detect_encoding($_GET['c']); // ASCII
echo "äöü+#"; // Shows "äöü+#"

The most confusing part is that it shows the string being converted from UTF-8 to ASCII. Can someone tell me why the special characters are not displayed correctly and what is wrong here? Or is this a bug in Internet Explorer?

Edit: If I disable the conversion, it reports everything as UTF-8, but the characters are still not shown; they are displayed as "????".

Note: This happens ONLY in Internet Explorer!
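Because mb_detect_encoding() only guesses, a more direct check is to look at the raw bytes the server actually received. One possible explanation, which this check can confirm or rule out, is that the browser sends the query string in a legacy single-byte encoding rather than UTF-8. The sketch below assumes the mbstring extension is loaded; `describeBytes` is a hypothetical helper, not part of the original code.

```php
<?php
// Hypothetical helper: report the raw bytes of a string as hex and whether
// those bytes form valid UTF-8.
function describeBytes(string $value): array
{
    return [
        'hex'        => bin2hex($value),
        'valid_utf8' => mb_check_encoding($value, 'UTF-8'),
    ];
}

// "ä" encoded as UTF-8 is the two bytes C3 A4.
$utf8 = describeBytes("\xC3\xA4");
// "ä" encoded as ISO-8859-1 / Windows-1252 is the single byte E4.
$latin1 = describeBytes("\xE4");

echo $utf8['hex'], ' ', var_export($utf8['valid_utf8'], true), "\n";     // c3a4 true
echo $latin1['hex'], ' ', var_export($latin1['valid_utf8'], true), "\n"; // e4 false
```

Calling `describeBytes($_GET['c'])` in each browser would show whether the bytes differ between browsers before any conversion is attempted.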


2 answers

  • doujiling4377 2015-07-15 14:15
    Accepted

    I prefer using URL-encoded strings in the address bar, but in your case you can try encoding $_GET['c'] to UTF-8, e.g.:

    $_GET['c'] = utf8_encode($_GET['c']);
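Note that utf8_encode() always treats its input as ISO-8859-1, so applying it to input that is already UTF-8 (as sent by the other browsers) would encode it twice; the function is also deprecated as of PHP 8.2. A more defensive variant converts only when the bytes are not valid UTF-8. This is a sketch under stated assumptions: `ensureUtf8` and `ensureUtf8Deep` are hypothetical helpers, and Windows-1252 as the fallback encoding is a guess about what the browser sends.

```php
<?php
// Hypothetical helper: return $value as UTF-8, converting only when the
// bytes are not already valid UTF-8.
// Assumption: non-UTF-8 input is Windows-1252; adjust $fallback if not.
function ensureUtf8(string $value, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8 (plain ASCII included)
    }
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}

// Apply ensureUtf8() to every string in a possibly nested array such as $_GET.
function ensureUtf8Deep($input, string $fallback = 'Windows-1252')
{
    if (is_array($input)) {
        $out = [];
        foreach ($input as $key => $item) {
            $out[$key] = ensureUtf8Deep($item, $fallback);
        }
        return $out;
    }
    return is_string($input) ? ensureUtf8($input, $fallback) : $input;
}

$legacy = "\xE4\xFC\xF6\xDF";                  // "äüöß" as Windows-1252 bytes
$utf8   = "\xC3\xA4\xC3\xBC\xC3\xB6\xC3\x9F";  // "äüöß" as UTF-8 bytes

var_dump(ensureUtf8($legacy) === $utf8); // bool(true): converted
var_dump(ensureUtf8($utf8) === $utf8);   // bool(true): left untouched
```

In the question's script this would be used as `$_GET = ensureUtf8Deep($_GET);`. The validity check is a heuristic: a few legacy byte sequences also happen to be valid UTF-8, so sending percent-encoded UTF-8 in the first place remains the more reliable option.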
    
  • douling1936 2015-07-15 13:33

    An approach that worked for displaying the characters in IE 11.0.18:

    • Retrieve the Unicode code point of your character, for example 'ü' = 'U+00FC'

    • According to this post, convert it to a UTF-8 entity

    • Decode it using utf8_decode before dumping

    The line of code illustrating the example with the 'ü' character is:

    var_dump(utf8_decode(html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", 'U+00FC'), ENT_NOQUOTES, 'UTF-8')));
    

    To summarize: For displaying purposes, go from Unicode to UTF8 then decode it before displaying it.
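The same two steps can also be sketched without the HTML-entity detour. This is an illustration under stated assumptions, not the answer's own code: `codePointsToUtf8` is a hypothetical helper, mb_chr() requires PHP 7.2 or later, and mb_convert_encoding() to ISO-8859-1 stands in for utf8_decode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: replace "U+XXXX" notations in $text with the UTF-8
// encoding of the corresponding code point.
function codePointsToUtf8(string $text): string
{
    $result = preg_replace_callback(
        '/U\+([0-9A-Fa-f]{4,6})/',
        function (array $m): string {
            $char = mb_chr(hexdec($m[1]), 'UTF-8');
            // Keep the original notation if the code point is invalid.
            return $char === false ? $m[0] : $char;
        },
        $text
    );
    // preg_replace_callback() returns null on a regex error; fall back to input.
    return $result ?? $text;
}

// Step 1: code point notation to UTF-8.
$utf8 = codePointsToUtf8('U+00FC');
echo bin2hex($utf8), "\n"; // prints: c3bc

// Step 2: UTF-8 to ISO-8859-1, the conversion utf8_decode() performs.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
echo bin2hex($latin1), "\n"; // prints: fc
```

The second step only makes sense when the output is going to be read as ISO-8859-1; on a page declared as UTF-8, the result of the first step is the one to print.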

    Other resources: a post on retrieving characters' Unicode code points
