Learning security these days :)
I need to let users enter text in a form with a few HTML tags allowed (bold, italic, lists, etc.) while preventing them from adding dangerous JavaScript code.
So I have used this whitelist implementation to sanitize the HTML.
But I am still confused about how to save and display it the right way.
So here is what I did:
Model:

public class Post
{
    [AllowHtml]
    public string Data { get; set; }
}

Controller:

[HttpPost, ActionName("Create")]
[ValidateAntiForgeryToken]
public ActionResult Create(Post model)
{
    // Decode model.Data, as it arrives encoded after the post
    string decodedString = HttpUtility.HtmlDecode(model.Data);
    // Clean the HTML against the whitelist
    string sanitizedHtmlText = HtmlUtility.SanitizeHtml(decodedString);
    // Encode the sanitized result
    string encoded = HttpUtility.HtmlEncode(sanitizedHtmlText);
    // ...
}

View:

@using (Html.BeginForm("Create", "Home", FormMethod.Post)) {    
    @Html.AntiForgeryToken()
    @Html.TextAreaFor(a=>a.Data)
    <input type="submit" value="submit" />
}

So when I post a form I see:

<p>Simple <em><strong>whitelist</strong> </em>test:</p>
<ul>
<li>t1</li>
<li>t2</li>
</ul>
<p>Image:</p>
<p>&lt;img src="http://metro-portal.hr/img/repository/2010/06/medium/hijena_shutter.jpg" /&gt;</p>

Because of <p>&lt; I think that I need to decode it first:

<p>Simple <em><strong>whitelist</strong> </em>test:</p>
<ul>
<li>t1</li>
<li>t2</li>
</ul>
<p>Image:</p>
<p><img src="http://metro-portal.hr/img/repository/2010/06/medium/hijena_shutter.jpg" /></p>

Then I sanitize it against whitelist and I get sanitized HTML:

<p>Simple <em><strong>whitelist</strong> </em>test:</p>
<ul>
<li>t1</li>
<li>t2</li>
</ul>
<p>Image:</p>
<p>

1) Should I save it like this in the database?
2) Or do I need to encode this result and then save it to the database (encoded below)?

&lt;p&gt;Simple &lt;em&gt;&lt;strong&gt;whitelist&lt;/strong&gt; &lt;/em&gt;test:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;t1&lt;/li&gt;
&lt;li&gt;t2&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Image:&lt;/p&gt;
&lt;p&gt;

Here I am confused. If I put it on the view like this:

@Model.Data

I get this on the view:

&lt;p&gt;Simple &lt;em&gt;&lt;strong&gt;whitelist&lt;/strong&gt; &lt;/em&gt;test:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;t1&lt;/li&gt; &lt;li&gt;t2&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Image:&lt;/p&gt; &lt;p&gt;

or

<p>Simple <em><strong>whitelist</strong> </em>test:</p> <ul> <li>t1</li> <li>t2</li> </ul> <p>Image:</p> <p>

So what should I do to display this HTML properly (bold, lists, etc.)?

3 Answers

The rule of thumb is the following:

  1. Store the raw HTML in your database without any encoding or sanitizing. SQL Server doesn't care whether a string you store contains XSS code.
  2. When displaying this output on your page, make sure that it is sanitized.

So:

[HttpPost, ActionName("Create")]
[ValidateAntiForgeryToken]
public ActionResult Create(Post model)
{
    // store model.Data directly in your database without any cleaning or sanitizing
}

and then when displaying:

@Html.Raw(HtmlUtility.SanitizeHtml(Model.Data))

Notice how I used the Html.Raw helper here to ensure that you don't get double HTML-encoded output. The HtmlUtility.SanitizeHtml function should already take care of sanitizing the value and return a safe string that you can display in your view without further encoding. If, on the other hand, you used @HtmlUtility.SanitizeHtml(Model.Data), Razor's @ expression would HTML-encode the result of the SanitizeHtml function, so the tags would show up as literal text, which is probably not what you are looking for.
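To make the difference concrete, here is a minimal console sketch outside of Razor. It assumes that WebUtility.HtmlEncode is a close enough stand-in for the encoding Razor's @ applies to a plain string; the EncodingDemo class and its two method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Net;

public static class EncodingDemo
{
    // Approximates what Razor's @ does to a plain string: HTML-encodes it.
    public static string RenderEncoded(string html) => WebUtility.HtmlEncode(html);

    // Approximates Html.Raw: the string is emitted unchanged.
    public static string RenderRaw(string html) => html;

    public static void Main()
    {
        // Stand-in for the already sanitized value returned by SanitizeHtml.
        string sanitized = "<p>Simple <strong>whitelist</strong> test</p>";

        // Tags become entities, so the browser shows them as literal text.
        Console.WriteLine(RenderEncoded(sanitized));

        // Tags stay intact, so the browser renders bold text.
        Console.WriteLine(RenderRaw(sanitized));
    }
}
```

The first line printed is the entity-encoded form (&lt;p&gt;...), which is what the asker saw in the view; the second is the markup itself.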

4 Comments

Aha... clear :) So after the post I can basically save this to the database: <p>This is <strong>safe</strong> text and <em>danger</em> follows &gt; &lt;script&gt;alert('attack');&lt;/script&gt;</p> and just use SanitizeHtml in the view. I thought that I must not save HTML tags in the database. The value I get is passed in the model from the view to the Create method (it adds &gt;, &lt;, etc.).
Yes, your understanding is correct. There's nothing wrong with storing raw HTML in your database.
@DarinDimitrov How do I get the HtmlUtility.SanitizeHtml() method? I do not know what to reference. I did some Google searching and found that HtmlUtility is only available for Windows and Windows Phone. How can I reference it in my MVC web project?
For sanitizing HTML (i.e. removing tags not in the whitelist), see "Is there a good solution for a C# html sanitizer?"

On .NET Framework 4.5 with MVC 5, use @Html.Raw(WebUtility.HtmlDecode(item.ADITIONAL_INFORMAtION))
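As a rough sketch of what the decode step in that expression does, assuming the stored value was HTML-encoded before saving: WebUtility.HtmlDecode only turns entities back into markup and does no sanitizing, so any script tag that was stored comes back intact. The DecodeDemo class and the sample string are hypothetical.

```csharp
using System;
using System.Net;

public static class DecodeDemo
{
    // Decodes HTML entities back into markup; performs no sanitizing.
    public static string Decode(string stored) => WebUtility.HtmlDecode(stored);

    public static void Main()
    {
        // Hypothetical value as it might sit in the database, HTML-encoded.
        string stored = "&lt;p&gt;ok&lt;/p&gt;&lt;script&gt;alert('attack');&lt;/script&gt;";

        // Prints the markup, including the script element, unchanged.
        Console.WriteLine(Decode(stored));
    }
}
```

If the stored value can contain user input, it would still need to go through a whitelist sanitizer before being passed to Html.Raw.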

You can save an HTML file in the database by using the VARBINARY(MAX) data type for the htmlcolumn column.

  1. Convert the HTML file to binary (code project link)

  2. Insert the data into the column, as in this sample code:

DECLARE @HTML VARBINARY(MAX) = 0x; -- set the HTML varbinary value here

INSERT INTO table_name (htmlcolumn)
VALUES (@HTML);

  3. When you need to load the file from the database, convert htmlcolumn to NVARCHAR(MAX) with this code:

SELECT CAST(htmlcolumn AS NVARCHAR(MAX)) AS HTMLCODE
FROM table_name;

If this solution has a problem, please leave me a comment.

I wish you the best.
