dourong4031 2014-12-29 09:26

bin2hex returns a different result when used with POST data

I am working on a project that requires converting Chinese characters to their Unicode code points.

Currently, I am using the code below with no problems:

base_convert(bin2hex(iconv("utf-8", "ucs-4", '人')), 16, 16) // Return 4eba

However, when I added a form to convert a character entered by the user, the result was very different:

base_convert(bin2hex(iconv("utf-8", "ucs-4", $_POST["char"])), 16, 16) // Return 2600000023000000000000000000000000000000000000000000000000

Thanks in advance!


1 answer

  • douya2007 2014-12-30 15:21

    If you want to get UTF-8 in the $_POST array you need to tell the browser that the form is to be submitted in UTF-8.

    Generally the way to achieve this is to serve the page containing the form with an indicator that the page is encoded as UTF-8. Otherwise, the browser will arbitrarily guess which encoding is in use, and that guess probably won't be UTF-8. To indicate UTF-8 set the Content-Type header or include in the <head>:

    <meta charset="utf-8"/>
    
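A minimal server-side sketch of the same idea, assuming the form is rendered by a PHP script: it sends the Content-Type header and rejects input that is not well-formed UTF-8. The helper name `is_valid_utf8` and the 400 response are illustrative choices, not part of the original question.

```php
<?php
// Tell the browser the page is UTF-8 so the form is submitted as UTF-8.
// header() must be called before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Hypothetical helper: true when the string is well-formed UTF-8.
// With the /u modifier, preg_match() fails on an invalid UTF-8 subject.
function is_valid_utf8(string $value): bool
{
    return preg_match('//u', $value) === 1;
}

// "char" is the field name used in the question.
$char = $_POST['char'] ?? '';
if ($char !== '' && !is_valid_utf8($char)) {
    http_response_code(400);
    exit('Expected UTF-8 input');
}
```

The validity check only catches byte sequences that are not UTF-8 at all. It cannot detect a character reference such as &#20154;, which is plain ASCII and therefore valid UTF-8.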

    If you include the character in a form field and the browser thinks the page encoding is one (like cp1252 Western European) that does not include the character 人, it will panic and send an HTML-character-reference-encoded version instead: &#20154;. This is a non-useful data mangling, as you can't tell whether the original input was 人 or the literal text &#20154;, but it's an historical browser quirk we will now never get rid of.

    This is why you get 2600000023000000: the characters U+0026 and U+0023 are the leading &# part of that mangled version. The rest of that string is 00 rather than the subsequent characters because base_convert works with floating-point numbers internally, and 0x2600000023000000000000000000000000000000000000000000000000 is far too ludicrously large a number to retain precision.
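
As a rough illustration of the mangling described above, the sketch below assumes the browser submitted the character reference &#20154; in place of 人. It hex-dumps the UCS-4 bytes with bin2hex alone (skipping base_convert, so no digits are lost) and then decodes the reference with html_entity_decode. The variable names are illustrative, and UCS-4BE is used here only to pin the byte order.

```php
<?php
// What arrives in $_POST["char"] when the page encoding cannot represent 人.
$posted = '&#20154;';

// Hex-dump the UCS-4 (big-endian) bytes without base_convert.
$hex = bin2hex(iconv('UTF-8', 'UCS-4BE', $posted));

// The first two code units are U+0026 '&' and U+0023 '#'.
$first = substr($hex, 0, 16);

// Possible workaround: decode numeric character references back to UTF-8.
$decoded = html_entity_decode($posted, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// The decoded character now converts as expected.
$codepointHex = ltrim(bin2hex(iconv('UTF-8', 'UCS-4BE', $decoded)), '0');

echo $first, "\n";        // 0000002600000023
echo $codepointHex, "\n"; // 4eba
```

Decoding is lossy in the way the answer describes: a user who really typed the eight characters &#20154; would also be turned into 人, so serving the page as UTF-8 remains the actual fix.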

    If you are trying to convert UTF-8-encoded characters into numeric code points, try a uniord/unichr-style pair of helper functions (these are user-defined functions, not PHP built-ins).
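
One possible shape for such a helper, sketched with iconv and unpack rather than base_convert. The function name `utf8_to_codepoints` is hypothetical, and the sketch assumes the iconv extension is available, as it is in the question's code. On PHP 7.2 or later with the mbstring extension, the built-in mb_ord() covers the single-character case.

```php
<?php
/**
 * Hypothetical helper: return the Unicode code points of a UTF-8 string
 * as a list of integers.
 */
function utf8_to_codepoints(string $utf8): array
{
    if ($utf8 === '') {
        return [];
    }
    // UCS-4BE stores each code point as a 4-byte big-endian integer.
    $ucs4 = @iconv('UTF-8', 'UCS-4BE', $utf8);
    if ($ucs4 === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    // 'N*' reads every 4-byte group as an unsigned big-endian 32-bit integer.
    return array_values(unpack('N*', $ucs4));
}

$points = utf8_to_codepoints('人');
echo dechex($points[0]), "\n"; // 4eba
```

Because each code point is handled as an ordinary integer, the result does not depend on the length of the input string, unlike the base_convert approach.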
