donglu9898 2011-10-14 01:36
Answer accepted

Recommended way to use stripslashes on UTF-8 data?

I'm converting my site to UTF-8. The conversion is mostly done, except for some legacy code that needs to use stripslashes().

I've heard reports that stripslashes() can corrupt UTF-8 data, but I'm not sure I understand why. UTF-8 sets the high bit on every byte of a multi-byte character (to stay compatible with ASCII), so is it safe to run on UTF-8 data or not?

Are there potential security vulnerabilities if I run stripslashes() on UTF-8 data? I ran a few tests using invalid UTF-8 sequences containing slashes, but wasn't able to come up with any.


1 answer

  • douyi1963 2011-10-14 02:08

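    I don't see a problem with UTF-8. In fact, most byte-oriented ASCII functions are UTF-8-safe, because UTF-8 is ASCII-compatible: every byte of a multi-byte sequence is 0x80 or above, so a backslash byte (0x5C) can only ever be a real backslash. (You only have to worry about lengths and about inserting or deleting in the middle of a string.)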
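To check the ASCII-compatibility claim directly, here is a minimal Python sketch (Python is used only for illustration, and the function name is hypothetical). It scans every non-ASCII Unicode scalar value and looks for an encoded byte below 0x80, which is the only way a stray backslash or quote byte could appear inside a multi-byte UTF-8 sequence.

```python
def ascii_bytes_in_multibyte_sequences() -> list:
    """Return code points whose UTF-8 encoding contains a byte below 0x80."""
    offenders = []
    for cp in range(0x80, 0x110000):
        # Surrogates are not valid scalar values and cannot be encoded.
        if 0xD800 <= cp <= 0xDFFF:
            continue
        # Every lead and continuation byte should be >= 0x80.
        if min(chr(cp).encode("utf-8")) < 0x80:
            offenders.append(cp)
    return offenders


offenders = ascii_bytes_in_multibyte_sequences()
print(len(offenders))  # prints 0
```

An empty result means a byte-oriented function that only looks for 0x5C cannot cut a valid UTF-8 character in half. It says nothing about malformed input, which has to be validated separately.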

    UTF-16 and UTF-32, however, are a problem because they may use bytes with ASCII values (<0x80) to represent higher code points, and those bytes may be misinterpreted as ASCII backslashes or quotes.

    Example: "⁜!" (U+205C U+0021) in UTF-16BE is the byte sequence 20 5C 00 21, which may be interpreted as " \0!" (where 0 is the NUL byte). stripslashes() would then remove the second byte, corrupting the string.
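The example can be reproduced with a short Python sketch. The helper `strip_slashes_bytes` is a hypothetical, simplified model of how a byte-oriented stripslashes() behaves (drop each backslash, keep the byte after it, and turn backslash plus ASCII "0" into NUL). It is an assumption made for illustration, not PHP's actual implementation.

```python
def strip_slashes_bytes(data: bytes) -> bytes:
    """Simplified byte-level model of stripslashes() (illustrative only)."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x5C:  # backslash: drop it
            i += 1
            if i < len(data):
                # Backslash + ASCII "0" becomes NUL; any other byte is kept.
                out.append(0x00 if data[i] == 0x30 else data[i])
                i += 1
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


text = "\u205c!"                      # "⁜!"
utf16 = text.encode("utf-16-be")      # bytes 20 5C 00 21
utf8 = text.encode("utf-8")           # bytes E2 81 9C 21

stripped16 = strip_slashes_bytes(utf16)  # the 5C byte is removed
stripped8 = strip_slashes_bytes(utf8)    # no 5C byte, so nothing changes

print(stripped16.hex(" "))   # prints 20 00 21
print(stripped8 == utf8)     # prints True
```

The UTF-16BE result is three bytes long, so it is no longer a whole number of 16-bit units, while the UTF-8 bytes pass through untouched.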

    The asker selected this answer as the best answer.
