douyu4535 2012-08-21 21:52

PHP and Informix: CLIENT_LOCALE and DB_LOCALE not working as expected (encoding-related)

I am using the PHP PDO_Informix driver v1.2.7 and the Informix client version is 3.70. I have some code in UTF-8 that makes queries to a Latin1 database (the Informix server is 9.21).

The problem is that the driver truncates some of the returned strings. It is as if special characters count double. If a column 'name' has type varchar(2) and its value is 'áa', the value returned by a query is 'á' instead of 'áa'. If I resize the column to varchar(3), the result is correct. Below is a short script that reproduces the bug. I included the DSN so you can see the encoding settings.

Test script:

// DSN: the client uses UTF-8, the database uses ISO 8859-1 (code set 819).
$dsn = "informix:database=base;server=ol_server;host=192.168.123.123;client_locale=en_us.utf8;db_locale=en_us.819;service=1526;protocol=olsoctcp;EnableScrollableCursors=1";
$db = new \PDO($dsn, 'user', 'pass');
$db->exec("CREATE TABLE ticket82 ( name VARCHAR(2) );");
$db->exec("INSERT INTO ticket82 VALUES ('aa');");

// Plain ASCII value: comes back intact.
$statement = $db->query("select name from ticket82;");
$value = $statement->fetchAll(\PDO::FETCH_ASSOC);
echo "expected 'aa' got '{$value[0]['NAME']}'\n";

// Accented value in VARCHAR(2): comes back truncated.
$db->exec("update ticket82 set name='áa';");
$statement = $db->query("select name from ticket82;");
$value = $statement->fetchAll(\PDO::FETCH_ASSOC);
echo "expected 'áa' got '{$value[0]['NAME']}'\n";

// After widening the column to VARCHAR(3), the same value comes back intact.
$db->exec("ALTER TABLE ticket82 MODIFY (name varchar(3));");
$statement = $db->query("select name from ticket82;");
$value = $statement->fetchAll(\PDO::FETCH_ASSOC);
echo "expected 'áa' got '{$value[0]['NAME']}'\n";

$db->exec("DROP TABLE ticket82;");

Expected result:

expected 'aa' got 'aa'
expected 'áa' got 'áa'
expected 'áa' got 'áa'

Actual result:

expected 'aa' got 'aa'
expected 'áa' got 'á'
expected 'áa' got 'áa'

Any ideas?


1 answer

  • duanke9540 2012-08-22 14:40

    In a slightly weird way, I think that is the 'expected' or 'working as designed' behaviour.

    The column size is specified in bytes rather than characters, but for the database code set (ISO 8859-1, aka Latin-1) there is no difference, because every character occupies exactly one byte. The client-side code (PDO Informix) assumes that the variable holding the value needs the same number of bytes of storage.

    However, the client-side code set is UTF-8 rather than 8859-1, and some 8859-1 characters require 2 bytes in UTF-8. To be precise, each character in the 'ASCII' range U+0000..U+007F requires 1 byte in UTF-8, but each character in the 'accented' range U+0080..U+00FF requires 2 bytes. Because the client side has limited its variables to 2 bytes (rather than 2 characters), you will only be able to select a single accented character from a VARCHAR(2) column.
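    The byte counts above can be checked directly in PHP. The following is a minimal sketch, not driver code: it writes 'áa' as explicit bytes in both encodings and uses substr() to imitate a 2-byte client buffer. The variable names are made up for illustration.

```php
<?php
// 'áa' written as explicit bytes so the example does not depend on this file's encoding.
$latin1 = "\xE1a";      // ISO 8859-1: á = 0xE1, a = 0x61
$utf8   = "\xC3\xA1a";  // UTF-8:      á = 0xC3 0xA1, a = 0x61

// strlen() counts bytes, not characters.
$latin1Bytes = strlen($latin1);
$utf8Bytes   = strlen($utf8);

// Imitate a client buffer sized from the column's byte length (2 for VARCHAR(2)):
// only the two bytes that encode 'á' fit, so the trailing 'a' is lost.
$truncated = substr($utf8, 0, 2);

echo "Latin-1 bytes: $latin1Bytes, UTF-8 bytes: $utf8Bytes\n";
echo "fits in 2 bytes: " . bin2hex($truncated) . "\n";
```

    The value needs 2 bytes in the database but 3 bytes on the client, which matches the truncation seen in the question.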

    The codeset conversion between UTF-8 and 8859-1 occurs in a library called GLS (Global Language Support) inside the Informix ClientSDK (CSDK) code that is used by PDO Informix.
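    Since the conversion happens inside GLS, one possible workaround is to keep GLS from converting at all and do the conversion in PHP instead. The sketch below is an assumption, not something tested against Informix 9.21: it sets client_locale to the same code set as db_locale in the DSN from the question, and uses a hypothetical helper, latin1ToUtf8(), to convert fetched values.

```php
<?php
/**
 * Hypothetical helper: convert an ISO 8859-1 string to UTF-8 in PHP.
 */
function latin1ToUtf8(string $latin1): string
{
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
    }
    // Fallback without mbstring: every Latin-1 byte >= 0x80 becomes two UTF-8 bytes.
    $utf8 = '';
    for ($i = 0, $n = strlen($latin1); $i < $n; $i++) {
        $byte = ord($latin1[$i]);
        $utf8 .= $byte < 0x80
            ? chr($byte)
            : chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
    }
    return $utf8;
}

// Assumed DSN: identical to the one in the question, except that client_locale
// now matches db_locale, so no code set conversion should be needed in the client.
$dsn = "informix:database=base;server=ol_server;host=192.168.123.123;client_locale=en_us.819;db_locale=en_us.819;service=1526;protocol=olsoctcp;EnableScrollableCursors=1";

// With a live connection, the fetched value would be converted after the fact:
// $db   = new \PDO($dsn, 'user', 'pass');
// $rows = $db->query("select name from ticket82")->fetchAll(\PDO::FETCH_ASSOC);
// $name = latin1ToUtf8($rows[0]['NAME']);

echo bin2hex(latin1ToUtf8("\xE1a")), "\n"; // prints c3a161
```

    With this approach, strings sent to the server from UTF-8 source code would need the reverse conversion before being placed in a query.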

    This is an interesting setup, with the client and database server using different code sets. There's room to think that the client could usefully use bigger variable sizes when there is a code set conversion going on. Since the database is storing Latin-1, all the characters fall in the Unicode range U+0000..U+00FF. (If it were 8859-15 (Latin-9), the Euro symbol € U+20AC would require 3 bytes in UTF-8, for instance; most of the other 8859-x series code sets require one or two bytes per character, I believe.) Handling that sensibly in the codeset conversion environment would require some care, but could be done if the code were aware of the issue. The fix probably belongs in PDO Informix, since it is the component that tells the CSDK how much space to use for storing the data, using the byte-count information provided by CSDK and the Informix server.
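    To illustrate what "bigger variable sizes" could mean, here is a rough sketch of the worst-case arithmetic. The function worstCaseUtf8Bytes() and its lookup table are hypothetical and are not part of PDO Informix or the CSDK; the table only covers the two code sets mentioned above.

```php
<?php
/**
 * Hypothetical helper: worst-case number of UTF-8 bytes needed to hold a
 * column of $columnBytes bytes from a single-byte database code set.
 */
function worstCaseUtf8Bytes(int $columnBytes, string $dbCodeSet): int
{
    // Maximum UTF-8 bytes per character for the code sets discussed above.
    $maxBytesPerChar = [
        '8859-1'  => 2, // every character lies in U+0000..U+00FF
        '8859-15' => 3, // includes the Euro sign U+20AC, which needs 3 bytes
    ];
    if (!isset($maxBytesPerChar[$dbCodeSet])) {
        throw new InvalidArgumentException("Unknown code set: $dbCodeSet");
    }
    // In a single-byte code set, a column of N bytes holds at most N characters.
    return $columnBytes * $maxBytesPerChar[$dbCodeSet];
}

echo worstCaseUtf8Bytes(2, '8859-1'), "\n";  // prints 4
echo worstCaseUtf8Bytes(2, '8859-15'), "\n"; // prints 6
```

    For the VARCHAR(2) column in the question, a 4-byte client buffer would be enough to hold any two Latin-1 characters after conversion to UTF-8.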


    FYI: Informix 9.21 has been out of support for a long time now (so have 9.30, 9.40 and 10.00; even 11.10 is out of support, though that is a relatively recent change). However, that is not a factor in this problem.
