drjmrg8766 2016-09-21 12:59

Is PHP's trim multibyte-safe?

I know that there is no mb_trim counterpart to trim, and I have links to a dozen articles on how to implement one using preg_replace.

My question is whether the usual trim, with its default characters, is multibyte-safe. That is, is there any multibyte character whose encoding ends with the byte value of a single-byte whitespace character?

2 answers

  • duan7264 2016-09-21 13:26

    It depends on the encoding you're talking about. In UTF-16LE and UTF-32LE, for example, a great many characters end in a null byte, and trim strips null bytes by default.

    The string "a" in UTF-16LE consists of the bytes 0x61 0x00, and trim will remove the null byte leaving just 0x61.

    Note that the problem goes the other way too, because trim strips bytes from the beginning of a string as well as from the end. If your string "a" is in UTF-16BE, it is encoded as 0x00 0x61, and trim again leaves you with just 0x61.


    Example:

    // Encode "a" as UTF-16 in both byte orders.
    $utf16le = iconv("ASCII", "UTF-16LE", "a");
    $utf16be = iconv("ASCII", "UTF-16BE", "a");

    // Dump the raw bytes before and after trim().
    var_dump(
      bin2hex($utf16le),
      bin2hex(trim($utf16le)),
      bin2hex($utf16be),
      bin2hex(trim($utf16be))
    );
    

    Output:

    string(4) "6100"
    string(2) "61"
    string(4) "0061"
    string(2) "61"
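
    One possible way around this, sketched here on the assumption that the iconv extension is available, is to convert the text to UTF-8, trim it there, and convert it back. The helper name trimUtf16le is hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper (not part of PHP): trims UTF-16LE text by
// round-tripping through UTF-8, where trim()'s default mask is safe.
function trimUtf16le(string $utf16le): string
{
    $utf8 = iconv("UTF-16LE", "UTF-8", $utf16le);
    if ($utf8 === false) {
        throw new InvalidArgumentException("Input is not valid UTF-16LE");
    }
    return iconv("UTF-8", "UTF-16LE", trim($utf8));
}

$padded = iconv("ASCII", "UTF-16LE", " a ");
var_dump(bin2hex($padded));               // "200061002000": space, "a", space
var_dump(bin2hex(trimUtf16le($padded)));  // "6100": the null byte of "a" survives
```

    Note that a genuine U+0000 character at either end of the text is still removed, since it is part of trim's default mask.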
    

    If you're only worried about UTF-8, then no, there are no conflicts. UTF-8 is ASCII-compatible: every single-byte character has the form 0xxx xxxx, while every byte of a multibyte character has its most significant bit set (1xxx xxxx), so there is no ambiguity. All of the characters in trim's default mask are single-byte ASCII, so trim with its default character mask is safe on UTF-8.
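
    The bit-pattern argument can be checked on a few sample characters. This is only a sketch: the helper hasAsciiByte and the sample characters are chosen for illustration, and the "\u{...}" escapes assume PHP 7.0 or later:

```php
<?php
// Hypothetical helper: true if any byte of $bytes is below 0x80,
// i.e. could be mistaken for a single-byte ASCII character.
function hasAsciiByte(string $bytes): bool
{
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        if (ord($bytes[$i]) < 0x80) {
            return true;
        }
    }
    return false;
}

// Two-, three- and four-byte UTF-8 characters (U+3000 is the ideographic space).
$samples = ["\u{00E9}", "\u{3000}", "\u{20AC}", "\u{1F600}"];
foreach ($samples as $char) {
    printf("%s: %s\n", bin2hex($char), hasAsciiByte($char) ? "has ASCII byte" : "no ASCII byte");
}

// trim() removes the ASCII whitespace but leaves every multibyte character whole.
$padded = " \t\u{00E9}t\u{00E9}\u{3000}\n";
var_dump(bin2hex(trim($padded))); // "c3a974c3a9e38080"
```

    The ideographic space U+3000 stays in place: it is not in the default mask, so trim neither removes nor damages it.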

    If you're concerned about other encodings, then it depends on what they are. If you use multibyte characters in trim's character mask, you will definitely run into problems, because each byte of the mask is treated individually.
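
    To illustrate the character-mask problem, here is a small sketch using two characters that happen to share a trailing byte in UTF-8. The helper utf8Trim is hypothetical (it is one of many possible preg_replace-based approaches, not a PHP built-in) and assumes a non-empty, valid UTF-8 mask:

```php
<?php
// "©" is C2 A9 and "é" is C3 A9 in UTF-8: they share the trailing byte A9.
$text = "Copyright \u{00A9}";
$mask = "\u{00E9}";

// trim() treats the mask as the two separate bytes C3 and A9,
// so it strips the A9 that belongs to "©" and leaves a lone C2.
$broken = trim($text, $mask);
var_dump(bin2hex(substr($broken, -1)));      // "c2"
var_dump(preg_match('//u', $broken) === 1);  // false: no longer valid UTF-8

// Hypothetical helper (not part of PHP): strips whole characters from both
// ends using a Unicode-aware regex. Assumes $chars is non-empty valid UTF-8.
function utf8Trim(string $text, string $chars): string
{
    $class = preg_quote($chars, '/');
    $result = preg_replace('/^[' . $class . ']+|[' . $class . ']+\z/u', '', $text);
    return $result ?? $text; // preg_replace() returns null on invalid input
}

var_dump(utf8Trim($text, $mask) === $text);              // true: "©" is left intact
var_dump(utf8Trim("\u{00E9}t\u{00E9}", $mask) === "t");  // true: whole "é" characters removed
```

    The /u modifier makes the pattern operate on code points rather than bytes, which is why the regex version leaves "©" alone.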
