doulang1945
2008-09-12 11:05, 30 views

How do I HTML-encode all output in a web application?

I want to prevent XSS attacks in my web application. I found that HTML-encoding the output can really prevent XSS attacks. Now the problem is: how do I HTML-encode every single output in my application? Is there a way to automate this?

I appreciate answers for JSP, ASP.NET and PHP.

11 answers

  • Accepted
    douan6815 2008-09-12 11:31

    You don't want to encode all HTML; you only want to HTML-encode any user input that you're outputting.

    For PHP: htmlentities and htmlspecialchars
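
As an illustration of escaping only the user-supplied values, here is a minimal sketch in Python rather than PHP. `html.escape` is the standard-library counterpart to `htmlspecialchars`; the function name `render_comment` and the sample values are assumptions made for the example.

```python
import html

def render_comment(author: str, comment: str) -> str:
    """Build an HTML fragment, escaping only the untrusted values."""
    # The surrounding markup is written by the developer and is trusted;
    # author and comment come from the user and are escaped.
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(comment))

print(render_comment("Eve", "<script>alert(1)</script>"))
# <p><b>Eve</b>: &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The page's own tags still work, while the injected script is shown as text.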

  • duangenshi9836 2008-09-12 11:18

    If you do actually HTML-encode every single output, the user will see the plain text <html> instead of a functioning web app.

    EDIT: If you HTML-encode every single input, you'll have problems accepting external passwords that contain < and similar characters.

  • dtf0925 2008-09-12 11:57

    A nice way I used to escape all user input is by writing a modifier for Smarty which escapes all variables passed to the template, except for the ones that have |unescape attached. That way you only give HTML access to the elements you explicitly choose.

    I don't have that modifier any more, but a similar version can be found here:

    http://www.madcat.nl/martijn/archives/16-Using-smarty-to-prevent-HTML-injection..html

    In the new Django 1.0 release this works exactly the same way, yay :)
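
The escape-by-default idea can be sketched in a few lines of Python. This is not the Smarty modifier or Django's implementation; `render` and the `Unescaped` marker class are hypothetical names, and the standard-library `string.Template` stands in for a real template engine.

```python
import html
from string import Template

class Unescaped(str):
    """Marks a value as trusted HTML to be emitted as-is (hypothetical name)."""

def render(template: str, **variables: object) -> str:
    """Substitute variables into the template, escaping everything by default."""
    prepared = {
        name: value if isinstance(value, Unescaped) else html.escape(str(value))
        for name, value in variables.items()
    }
    return Template(template).substitute(prepared)

page = render(
    "<h1>$title</h1><div>$body</div>",
    title="Tom & Jerry <3",                      # escaped automatically
    body=Unescaped("<em>trusted markup</em>"),   # explicit opt-out
)
print(page)
# <h1>Tom &amp; Jerry &lt;3</h1><div><em>trusted markup</em></div>
```

Forgetting the marker produces visible escaped text rather than a vulnerability, which is the safer failure mode.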

  • douji9816 2008-09-12 12:11

    You could wrap echo / print etc. in your own methods, which you can then use to escape output. For example, instead of

    echo "blah";

    use

    myecho('blah');

    You could even have a second param that turns off escaping if you need it.

    In one project we had a debug mode in our output functions which made all the output text going through our method invisible. Then we knew that anything left on the screen HADN'T been escaped! Was very useful tracking down those naughty unescaped bits :)
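
A rough Python sketch of such a wrapper follows. The `escape` parameter, the `out` parameter and the `DEBUG_HIDE_ESCAPED` flag are assumptions made for illustration; the original PHP helper is not shown in the answer.

```python
import html
import io
import sys
from typing import TextIO

# Hypothetical debug switch: when True, escaped output is suppressed,
# so anything still visible on the page was written without escaping.
DEBUG_HIDE_ESCAPED = False

def myecho(text: str, escape: bool = True, out: TextIO = sys.stdout) -> None:
    """Write text to the response, HTML-escaping it unless escape=False."""
    if escape:
        if DEBUG_HIDE_ESCAPED:
            return
        out.write(html.escape(text))
    else:
        out.write(text)

buffer = io.StringIO()
myecho("<p>", escape=False, out=buffer)   # trusted markup, written as-is
myecho("1 < 2 & 3 > 2", out=buffer)       # untrusted text, escaped
myecho("</p>", escape=False, out=buffer)
print(buffer.getvalue())
# <p>1 &lt; 2 &amp; 3 &gt; 2</p>
```

Making escaping the default means the unsafe path has to be requested explicitly at each call site.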

  • duancan1900 2008-09-12 13:23

    My personal preference is to diligently encode anything that's coming from the database, business layer or from the user.

    In ASP.NET this is done by using Server.HtmlEncode(string).

    The reason to encode everything is that even properties which you might assume to be boolean or numeric could contain malicious code. For example, checkbox values, if they're handled improperly, could come back as strings. If you're not encoding them before sending the output to the user, then you've got a vulnerability.
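
To make the checkbox case concrete, here is a small Python sketch (not ASP.NET). The `encode` helper and the form contents are invented for the example; the point is that every value goes through the same encoder regardless of the type it is assumed to have.

```python
import html

def encode(value: object) -> str:
    """Convert any value to text and HTML-escape it."""
    return html.escape(str(value))

# "subscribe" is expected to be a harmless checkbox value,
# but the client is free to send any string it likes.
form = {
    "subscribe": '"><script>alert(1)</script>',
    "age": 42,
    "active": True,
}

cells = [encode(value) for value in form.values()]
for cell in cells:
    print("<td>" + cell + "</td>")
```

Encoding the numeric and boolean values costs nothing, and it covers the case where the assumed type turns out to be wrong.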

  • dongzi8191 2008-09-12 14:19

    The only way to truly protect yourself against this sort of attack is to rigorously filter all of the input that you accept, specifically (although not exclusively) from the public areas of your application. I would recommend that you take a look at Daniel Morris's PHP Filtering Class (a complete solution) and also the Zend_Filter package (a collection of classes you can use to build your own filter).

    PHP is my language of choice when it comes to web development, so apologies for the bias in my answer.

    Kieran.

  • duanjia2772 2008-09-13 17:02

    One thing that you shouldn't do is filter the input data as it comes in. People often suggest this, since it's the easiest solution, but it leads to problems.

    Input data can be sent to multiple places, besides being output as HTML. It might be stored in a database, for example. The rules for filtering data sent to a database are very different from the rules for filtering HTML output. If you HTML-encode everything on input, you'll end up with HTML in your database. (This is also why PHP's "magic quotes" feature is a bad idea.)

    You can't anticipate all the places your input data will travel. The safe approach is to prepare the data just before it's sent somewhere. If you're sending it to a database, escape the single quotes. If you're outputting HTML, escape the HTML entities. And once it's sent somewhere, if you still need to work with the data, use the original un-escaped version.

    This is more work, but you can reduce it by using template engines or libraries.
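
The "prepare it just before it's sent" approach might look like this in Python. Note one substitution: instead of escaping single quotes by hand, the sketch uses a parameterized query with the standard-library `sqlite3` module, which leaves quoting to the driver. The table and variable names are made up.

```python
import html
import sqlite3

raw_name = "O'Reilly <b>& Sons</b>"  # raw, un-escaped user input

# Destination 1: the database. The parameterized query handles quoting,
# so the stored value stays exactly as the user typed it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES (?)", (raw_name,))

# Destination 2: an HTML page. Escape at output time, from the raw value.
(stored_name,) = conn.execute("SELECT name FROM customers").fetchone()
html_fragment = "<li>{}</li>".format(html.escape(stored_name))

print(stored_name)     # O'Reilly <b>& Sons</b>
print(html_fragment)   # <li>O&#x27;Reilly &lt;b&gt;&amp; Sons&lt;/b&gt;</li>
```

The database never contains HTML entities, so the same row can later be sent to a CSV export or a JSON API without undoing anything.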

  • dqitk20644 2008-09-13 17:54

    There was a good essay on Joel on Software ("Making Wrong Code Look Wrong", I think; I'm on my phone, otherwise I'd have a URL for you) that covered the correct use of Hungarian notation. The short version would be something like:

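    Var dsFirstName, uhsFirstName : String;
    Begin
      // Raw value, taken straight from the request
      uhsFirstName := request.queryfields.value['firstname'];
      // Explicit conversion; the prefix changes along with the encoding
      dsFirstName := dsHtmlToDB(uhsFirstName);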

    Basically, prefix your variables with something like "us" for unsafe string, "ds" for database-safe, and "hs" for HTML-safe. You only want to encode and decode where you actually need to, not everywhere. But with prefixes that carry a useful meaning, you'll see very quickly when reading your code if something isn't right. And you're going to need different encode/decode functions anyway.
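
The same naming convention can be sketched in Python. The function names `hs_from_us` and `write_html` are hypothetical; only the "us" and "hs" prefixes come from the answer.

```python
import html

def hs_from_us(us_text: str) -> str:
    """Convert an unsafe string (us) into an HTML-safe string (hs)."""
    return html.escape(us_text)

def write_html(hs_text: str) -> str:
    """Wrap a value in a table cell; by convention, pass hs_ values only."""
    return "<td>" + hs_text + "</td>"

us_first_name = "<img src=x onerror=alert(1)>"  # unsafe: straight from the request
hs_first_name = hs_from_us(us_first_name)       # prefixes match on both sides

cell = write_html(hs_first_name)    # looks right: hs_ going into HTML
# write_html(us_first_name)         # looks wrong: us_ going into HTML
print(cell)
# <td>&lt;img src=x onerror=alert(1)&gt;</td>
```

Nothing enforces the convention at runtime; its value is that a mismatched prefix stands out in code review.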

  • dongrong8972 2008-09-16 02:48

    For JSPs, you can have your cake and eat it too, with the c:out tag, which escapes XML by default. This means you can bind to your properties as raw elements:

    <input name="someName.someProperty" value="<c:out value='${someName.someProperty}' />" />
    

    When bound to a string, someName.someProperty will contain the XML input, but when being output to the page, it will be automatically escaped to provide the XML entities. This is particularly useful for links for page validation.

  • douxin2011 2011-02-01 00:06

    Output encoding is by far the best defense. Validating input is great for many reasons, but it is not a 100% defense. If a database becomes infected with XSS through an attack (e.g. ASPROX), a mistake, or malice, input validation does nothing. Output encoding will still work.

  • dongy44342 2012-01-16 11:51

    OWASP has a nice API to encode HTML output, either to use as HTML text (e.g. paragraph or <textarea> content) or as an attribute's value (e.g. for <input> tags after rejecting a form):

    encodeForHTML($input) // Encode data for use in HTML using HTML entity encoding
    encodeForHTMLAttribute($input) // Encode data for use in HTML attributes.
    

    The project (the PHP version) is hosted under http://code.google.com/p/owasp-esapi-php/ and is also available for some other languages, e.g. .NET.

    Remember that you should encode everything (not only user input), and as late as possible (not when storing in DB but when outputting the HTTP response).
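
The distinction between the two contexts can be illustrated with Python's standard library. The two functions below are stand-ins named after the ESAPI calls, not the ESAPI implementation, and they rely only on the `quote` parameter of `html.escape`.

```python
import html

def encode_for_html(value: str) -> str:
    """Escape for element content; quote characters are harmless here."""
    return html.escape(value, quote=False)

def encode_for_html_attribute(value: str) -> str:
    """Escape for a quoted attribute value, including both quote characters."""
    return html.escape(value, quote=True)

user_value = 'x" onfocus="alert(1)'

text_html = "<p>" + encode_for_html(user_value) + "</p>"
attr_html = '<input value="' + encode_for_html_attribute(user_value) + '">'

print(text_html)   # <p>x" onfocus="alert(1)</p>
print(attr_html)   # <input value="x&quot; onfocus=&quot;alert(1)">
```

Using the text encoder inside the attribute would let the quote close the attribute early, which is why the context decides which encoder to call.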
