dongraa1986 2012-01-19 18:10
Accepted

Check whether a string is UTF-8 or UCS-2

If I have a list of data:

$a = "hello";

$b = "4f60";

$c = "hi";

$d = "00480065006C006C006F";

$b and $d are UCS-2 strings. I want to display all of this data in a table, so how can I tell which values are UCS-2 so that I can convert them before displaying them? Is that possible? I tried mb_detect_encoding and a preg_match pattern for Unicode that I found on php.net, but even an unknown symbol is still reported as Unicode.

Thank you.


1 answer

  • dongman5539 2012-01-19 18:14

    First of all, the strings you show are hexadecimal representations, not the actual UCS-2 or UTF-8 encodings.

    That said, UCS-2 and UTF-8 differ enough that you could write code that detects the encoding with a high success rate: UCS-2 always uses two bytes per character, so ASCII-range text in it is full of zero bytes, which ordinary UTF-8 text never contains. But before doing that, show us how you are calling mb_detect_encoding and how it fails. There is no sense in reinventing a worse wheel than the one that already exists.
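
    To illustrate the kind of difference meant here, the following is a rough sketch of a byte-level heuristic. It operates on raw bytes, not on hex text. The function name guess_encoding is hypothetical, and the zero-byte test is an assumption that only holds when the UCS-2 text contains at least one character below U+0100:

```php
<?php
// Hypothetical helper: guesses whether raw bytes look like UCS-2 or UTF-8.
function guess_encoding(string $bytes): string
{
    // UCS-2 uses exactly two bytes per character, so the length must be even.
    // Characters below U+0100 carry a zero byte, which ordinary UTF-8 text
    // does not contain.
    if ($bytes !== '' && strlen($bytes) % 2 === 0 && strpos($bytes, "\0") !== false) {
        return 'UCS-2';
    }
    // With the /u modifier, preg_match returns false on invalid UTF-8.
    if (preg_match('//u', $bytes) === 1) {
        return 'UTF-8';
    }
    return 'unknown';
}
```

    Note the limit of this approach: the two bytes 0x4F 0x60 are one CJK character in UCS-2 but also the valid ASCII text "O`", so the sketch reports UTF-8 for them. No byte-level test can separate such cases without extra context.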

    Update: Your input strings are not actually the encoded byte values; they are hex representations of the values. To undo this, you can use

    $proper_string = pack('H*', $hex_encoded_string);
    

    After this, mb_detect_encoding should work fine.
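
    For the data in the question, one possible way to combine these steps is sketched below. It assumes that UCS-2 values always arrive as hex digits in groups of four and are big-endian, and that everything else is plain text that needs no conversion. The function name to_display_utf8 is hypothetical:

```php
<?php
// Hypothetical helper: returns a value ready for display as UTF-8.
// Assumption: UCS-2 values are hex text in groups of four digits
// (one 16-bit code unit each), stored big-endian.
function to_display_utf8(string $value): string
{
    if (preg_match('/^(?:[0-9A-Fa-f]{4})+$/', $value) !== 1) {
        return $value; // plain text such as "hello" or "hi" is left alone
    }
    $bytes = pack('H*', $value); // hex text -> raw bytes
    return mb_convert_encoding($bytes, 'UTF-8', 'UCS-2BE'); // UCS-2 -> UTF-8
}

// Usage with the values from the question.
foreach (["hello", "4f60", "hi", "00480065006C006C006F"] as $item) {
    echo to_display_utf8($item), "\n";
}
```

    A plain word made up only of hex letters with a length that is a multiple of four, such as "cafe" or "beef", would be misread as UCS-2 by this check, so it is only safe when the source of the data rules that out.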

    This answer was selected by the asker as the best answer.
