doudeng3008 2011-12-24 19:01

Are there other sequences that browsers interpret as HTML special characters?

In HTML, the special characters < > & ' " are significant to the HTML parser. These are the characters that popular functions such as PHP's htmlspecialchars convert to HTML entities so that they don't accidentally trigger something when parsed.

The translations performed are:

  • '&' (ampersand) becomes &amp;
  • " (double quote) becomes &quot; when ENT_NOQUOTES is not set.
  • ' (single quote) becomes &#039; only when ENT_QUOTES is set.
  • '<' (less than) becomes &lt;
  • '>' (greater than) becomes &gt;
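As a rough illustration of the five translations above, here is a minimal sketch in Python rather than PHP, chosen only because it runs standalone. It assumes that Python's html.escape is a close analogue of htmlspecialchars with ENT_QUOTES; the helper name escape_html is hypothetical. One visible difference: Python writes the single quote as &#x27; instead of &#039;, and it has no mode that escapes double quotes only.

```python
import html


def escape_html(text: str, quotes: bool = True) -> str:
    """Escape &, < and >; also escape both quote characters when quotes=True."""
    return html.escape(text, quote=quotes)


sample = """<a href="x" title='y'>Tom & Jerry</a>"""

# All five characters are replaced by character references.
escaped = escape_html(sample)
print(escaped)

# With quotes=False only &, < and > are touched (roughly ENT_NOQUOTES).
print(escape_html(sample, quotes=False))
```

The ampersand is replaced first, so references produced for the other characters are not escaped a second time.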

However, I remember that in older browsers like IE6, there were also other byte sequences that caused the browser's DOM parser to interpret content as HTML.

Is this still a problem today? Is escaping these five characters alone enough to prevent XSS?

For example, here are the known ways to write the character "<" in HTML and JavaScript (in UTF-8): URL encoding, named and numeric HTML character references, and JavaScript string escapes.

<
%3C
&lt
&lt;
&LT
&LT;
&#60
&#060
&#0060
&#00060
&#000060
&#0000060
&#60;
&#060;
&#0060;
&#00060;
&#000060;
&#0000060;
&#x3c
&#x03c
&#x003c
&#x0003c
&#x00003c
&#x000003c
&#x3c;
&#x03c;
&#x003c;
&#x0003c;
&#x00003c;
&#x000003c;
&#X3c
&#X03c
&#X003c
&#X0003c
&#X00003c
&#X000003c
&#X3c;
&#X03c;
&#X003c;
&#X0003c;
&#X00003c;
&#X000003c;
&#x3C
&#x03C
&#x003C
&#x0003C
&#x00003C
&#x000003C
&#x3C;
&#x03C;
&#x003C;
&#x0003C;
&#x00003C;
&#x000003C;
&#X3C
&#X03C
&#X003C
&#X0003C
&#X00003C
&#X000003C
&#X3C;
&#X03C;
&#X003C;
&#X0003C;
&#X00003C;
&#X000003C;
\x3c
\x3C
\u003c
\u003C
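To see what a decoder makes of the forms listed above, here is a small Python sketch. It assumes that html.unescape, which follows the HTML5 character reference rules, is a fair stand-in for how a browser decodes references in text; the variable names are illustrative only. The JavaScript escapes (\x3c, \u003c) only have meaning inside a script string, so they are left out.

```python
import html
from urllib.parse import unquote

# A sample of the HTML character reference spellings from the list above.
html_forms = ["&lt", "&lt;", "&LT", "&LT;", "&#60", "&#0000060;", "&#x3c", "&#X00003C;"]

# Each one decodes to the single character "<".
decoded = [html.unescape(form) for form in html_forms]
print(decoded)

# %3C is URL encoding, decoded by the URL layer and not by the HTML parser.
url_decoded = unquote("%3C")
print(url_decoded)

# If a user submits the text "&lt;" and it is escaped on output,
# the ampersand itself is escaped, so the page shows the literal text "&lt;".
escaped_again = html.escape("&lt;")
print(escaped_again)
```

In text content a decoded reference becomes a plain character rather than the start of a tag, which is why these alternate spellings do not open markup by themselves.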

3 answers

  • doutu1889 2011-12-24 19:15

    No, not in modern browsers. I looked into this while researching how to use CSS and attributes to assign styles automatically based on content (my question). Modern browsers do not allow 'byte sequences' to be used as HTML. I use the term 'byte sequences' loosely, because most of the at-risk code does not use byte-encoded values.

    The examples listed on the XSS site are about using attributes and getting the JavaScript interpreted as a string that then gets executed. Also listed are things like &{alert('XSS')}, which runs the code within the braces; that code does not work in modern browsers.

    To answer your second question: no, filtering those 5 characters is not enough to prevent an XSS attack. Always run user input through PHP's htmlspecialchars, but there are hundreds of byte codes that can be used, and you won't really be able to guarantee anything. Sending it through a PHP filter (especially htmlentities()) will reproduce the exact text entered when you output it to HTML (i.e. it emits &laquo; for «, which the browser displays as «). That said, in most cases htmlspecialchars is enough to cover most attacks. It depends on how you will be using the input, but for the most part it will be safe.
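The "depends on how you will be using the input" point can be made concrete with a short Python sketch. It again uses html.escape as an assumed stand-in for htmlspecialchars with ENT_QUOTES, and the payloads and variable names are illustrative. Two payloads contain none of the five characters, so escaping leaves them untouched; the third shows the case where escaping does help.

```python
import html

# Neither payload contains & < > ' or ", so escaping changes nothing.
payload_attr = "x onmouseover=alert(1)"
payload_url = "javascript:alert(1)"

# Unquoted attribute value: the space ends the value and starts a new attribute.
unquoted = "<input value=" + html.escape(payload_attr) + ">"
print(unquoted)

# URL attribute: the scheme is dangerous even though the markup is well formed.
link = '<a href="' + html.escape(payload_url) + '">profile</a>'
print(link)

# Quoted attribute value: here escaping stops the attempt to break out.
quoted = '<input value="' + html.escape('" onmouseover="alert(1)') + '">'
print(quoted)
```

Escaping the five characters covers element content and quoted attribute values; unquoted attributes, URL attributes, and script or style contexts need their own handling.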

    XSS is a tricky thing to account for. A good general rule is to always filter everything that a user enters, and to use whitelisting instead of blacklisting. What you're talking about here is blacklisting these values, when it is always safer to assume your users are malicious and allow only certain things.
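A minimal sketch of the whitelisting idea, in Python. The function names, the allowed schemes, and the username pattern are all assumptions made for illustration and do not come from the answer. The point is that the check lists what is permitted and rejects everything else, so it never has to enumerate dangerous values.

```python
import re
from urllib.parse import urlsplit

# Hypothetical policy: only absolute http(s) links are accepted.
ALLOWED_SCHEMES = {"http", "https"}

# Hypothetical policy: usernames are 3 to 20 ASCII letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_allowed_url(url: str) -> bool:
    """Accept a URL only if its scheme is on the allow-list and it names a host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)


def is_allowed_username(name: str) -> bool:
    """Accept a username only if the whole string matches the allowed pattern."""
    return USERNAME_PATTERN.fullmatch(name) is not None


print(is_allowed_url("https://example.com/a?b=1"))
print(is_allowed_url("JaVaScRiPt:alert(1)"))
print(is_allowed_username("alice_01"))
print(is_allowed_username("<script>"))
```

Values that pass such a check still need escaping on output; the allow-list narrows what can arrive and does not replace htmlspecialchars.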

    This answer was selected by the asker as the best answer.