How to protect an ASP.NET web app from XSS while preserving entered data?

Question

My colleagues and I have been debating how to best protect ourselves from XSS attacks but still preserve HTML characters that get entered into fields in our software.

To me, the ideal solution is to accept the data as the user enters it (turning off ASP.NET request validation) and store it in the database exactly as entered. Then, whenever you display the data on the web, HTML-encode it. The problem with this approach is that there's a high likelihood that a developer will someday forget to HTML-encode a value somewhere. Bam! XSS vulnerability.
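To make the store-raw, encode-on-output idea concrete, here is a minimal sketch. It is written in Python rather than ASP.NET purely to keep it short and runnable; `render_comment` and the sample value are made up for illustration, and the standard-library `html.escape` stands in for whatever HTML-encoding helper the view layer provides.

```python
import html

def render_comment(stored_value: str) -> str:
    """Encode at output time; the stored value stays exactly as entered."""
    # quote=True also encodes " and ', which matters inside attribute values.
    return "<p>" + html.escape(stored_value, quote=True) + "</p>"

# Hypothetical value as it would sit in the database, untouched.
stored = "Contact me at <bob@example.com> <script>alert(1)</script>"

print(render_comment(stored))
```

The angle brackets reach the browser as `&lt;` and `&gt;`, so the user sees exactly what was typed and nothing executes. The weakness described above is still there: this only works if every output path goes through the encoding step.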

Another proposed solution is to turn request validation off and use a regex to strip out any HTML that users enter before it is stored in the database. Devs will still have to HTML-encode things for display, but since the HTML tags have already been stripped, we think it would be safe even if a dev forgets. The drawback is that users can't enter HTML tags into descriptions and other fields, even if they explicitly want to. They may also accidentally paste in an email address surrounded by < >, and the regex mishandles it. It screws with the data, and it's not ideal.

The other issue we have to keep in mind is that the system was built without committing to any one strategy for this. At one point, some devs wrote pages that HTML-encode data before it is entered into the database. So some data in the database is already HTML-encoded and some is not. It's a mess. We can't really trust any data that comes from the database as safe for display in a browser.
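A small sketch of why the mixed state hurts, again in Python for brevity rather than ASP.NET. The two "rows" and the `display` helper are invented for illustration: one row was stored as entered, the other was HTML-encoded before saving.

```python
import html

entered = "Tom & Jerry <3"

# Row written by a page that stored the input untouched.
row_raw = entered
# Row written by a page that HTML-encoded the input before saving.
row_pre_encoded = html.escape(entered)

def display(value: str) -> str:
    """Uniform encode-on-output, applied to every row alike."""
    return html.escape(value)

print(display(row_raw))          # Tom &amp; Jerry &lt;3
print(display(row_pre_encoded))  # Tom &amp;amp; Jerry &amp;lt;3
```

The first line renders correctly in a browser. The second is encoded twice, so the user literally sees `Tom &amp; Jerry &lt;3`. Skipping the encoding step would fix the second row but leave the first one open to XSS, which is why no single display rule is safe until the stored data is brought into one consistent form.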

My question is: What would be the ideal solution if you were building an ASP.NET web app from the ground up, and what would be a good approach for us, given our situation?

Kirk Woll · Accepted Answer · 2011-03-09 00:30:03Z

Assuming you go ahead and store the HTML directly in the database: in ASP.NET MVC with Razor, HTML-encoding is done automatically, so your negligent developer would have to really go above and beyond the call of duty to introduce an XSS hole. With standard Web Forms (or the Web Forms view engine), you can require developers to use the <%: syntax, which accomplishes the same thing, albeit with more risk that a developer will be negligent and fall back to the unencoded <%= form.
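The "encoded unless you ask otherwise" behaviour can be sketched in a few lines. This is a Python analogy, not Razor or Web Forms code: `Raw` and `emit` are hypothetical names, with `emit` standing in for the view engine's default output and `Raw` for an explicit opt-out such as Razor's `Html.Raw`.

```python
import html

class Raw(str):
    """Marker type for markup that is deliberately emitted unencoded."""

def emit(value) -> str:
    """Default output path: everything is encoded unless wrapped in Raw."""
    if isinstance(value, Raw):
        return str(value)
    return html.escape(str(value))

user_input = "<script>alert(1)</script>"

print(emit(user_input))          # encoded by default
print(emit(Raw("<b>bold</b>")))  # passes through only because it was opted in
```

The point of the design is that the unsafe path needs an extra, greppable step, so forgetting something produces over-encoding rather than an XSS hole.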

Furthermore, you could consider disabling request validation only selectively. Do you really need to turn it off for every request? The vast majority of requests, presumably, would not need to preserve (or allow) HTML.

Paul Alexander · Accepted Answer · 2011-03-09 00:35:53Z

Using a regex to strip HTML is fairly easy to defeat and very difficult to get correct. If you want to clean HTML input, it's better to use an actual parser to enforce strict XML compliance.
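To illustrate how easily regex stripping goes wrong, here is a small Python sketch. The two patterns are deliberately naive examples of my own, not the regex from the question, which was never shown.

```python
import re

def strip_all_tags(text: str) -> str:
    """Naive approach: delete anything that looks like <...>."""
    return re.sub(r"<[^>]*>", "", text)

def strip_script_tags(text: str) -> str:
    """Naive blocklist: delete only <script ...> and </script> tags."""
    return re.sub(r"</?script[^>]*>", "", text, flags=re.IGNORECASE)

# 1. Legitimate data is destroyed: the email address disappears.
print(strip_all_tags("Mail <bob@example.com> today"))

# 2. A tag with no closing '>' is not matched at all and passes through.
print(strip_all_tags("<img src=x onerror=alert(1) "))

# 3. Removing the inner tags reassembles a working <script> element.
print(strip_script_tags("<scr<script>ipt>alert(1)</scr</script>ipt>"))
```

Case 2 matters because a browser may still treat the fragment as a tag once the surrounding page supplies a later `>`. Case 3 prints `<script>alert(1)</script>`, which is exactly what the filter was supposed to prevent.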

What I would do in this situation is store two fields in the database: clean and raw for the data. When the user wants to edit their content, you send them the raw data. When they submit changes, you sanitize it and store it in the clean field. Developers then only ever use the clean field when outputting the content to the page. I would even go so far as to name the raw field dangerousRawContent so it's obvious that care must be taken when referencing that field.

The added benefit of this technique is that you can re-sanitize the raw data with improved parsers at a later date without ever losing the originally intended content.
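A rough sketch of the raw/clean idea, in Python to keep it self-contained. Everything here is illustrative: `Post`, `save`, `resanitize` and the tag allowlist are invented names, and the sanitizer is a simple allowlist built on the standard-library `html.parser`, which is a simpler technique than the strict XML compliance mentioned above. A real application would more likely rely on a maintained sanitizer library.

```python
import html
from dataclasses import dataclass
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # hypothetical allowlist

class AllowlistSanitizer(HTMLParser):
    """Keeps allowlisted tags (without attributes) and escapes all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # every attribute is dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data))

def sanitize(raw: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(raw)
    parser.close()
    return "".join(parser.parts)

@dataclass
class Post:
    dangerous_raw_content: str  # only ever sent back to the edit form
    clean_content: str = ""     # the only field views should output

def save(raw: str) -> Post:
    """Store the input untouched and the sanitized copy side by side."""
    return Post(dangerous_raw_content=raw, clean_content=sanitize(raw))

def resanitize(posts) -> None:
    """Rebuild every clean field from raw, e.g. after improving sanitize()."""
    for post in posts:
        post.clean_content = sanitize(post.dangerous_raw_content)

post = save('<b onclick="evil()">hi</b><script>alert(1)</script>')
print(post.clean_content)
```

Because the raw field is never modified, `resanitize` can be run again whenever the sanitizer changes, and the user's original text is still there the next time they open the edit form.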
