3

I'm using a textarea to get input from the user and display it on the screen. How can I make sure that if they put in something like

<h1>YAY, I hacked in</h1>

it is displayed exactly as typed instead of being rendered as an <h1>? There must be a function for this. Help? :D

6
  • 1
    Check the following question: stackoverflow.com/questions/129677/… Commented May 28, 2013 at 14:00
  • 1
    Use an XML parser on your server and strip / validate the input. You don't use RegEx, do you!? Commented May 28, 2013 at 14:02
  • 1
    Create a text node, set its value as the user's input, and then append it to the page Commented May 28, 2013 at 14:02
  • 1
    possible duplicate of What are the common defenses against XSS? Commented May 28, 2013 at 14:04
  • 1
    Be careful: sanitising/validating in the browser can be bypassed fairly easily if someone wants to hack you. You must also do similar checks in your server-side code. Commented May 28, 2013 at 14:24
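
The text-node suggestion in the comments above can be sketched roughly as follows. This is only an illustration: the function name appendAsText and the doc parameter are my own, not part of any API. The doc parameter defaults to the page's document and exists only so the function can be tried outside a browser.

```javascript
// Hypothetical helper: appends userInput to container as literal text.
// `doc` defaults to the global document; it is a parameter only so the
// function can be exercised with a stand-in object outside a browser.
function appendAsText(container, userInput, doc = document) {
    // A text node is never parsed as HTML, so "<h1>" stays as plain characters.
    const node = doc.createTextNode(userInput);
    container.appendChild(node);
    return node;
}
```

In a page you would call it with the element that should show the text and the textarea's value, for example appendAsText(outputElement, textarea.value), where outputElement and textarea are whatever elements your page uses.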

2 Answers

2

As I commented, if you're going to send that data to a server, you should use one of the various XML parsers available there to strip and validate the input.

If, however, you need to validate purely on the client, I suggest you use document.implementation.createHTMLDocument, which creates a fully fledged HTML document in memory that is not attached to the page. You can then operate on it and return your validated data.

Example:

function validate( input ) {
    // Parse the input in a detached document so it never touches the live page.
    var doc = document.implementation.createHTMLDocument( "validate" );

    doc.body.innerHTML = input;

    // textContent of the body returns all text, in document order, with the
    // markup stripped. Nested text is not repeated and bare text is kept.
    return doc.body.textContent;
}

Call it like this:

validate( "<script>EVIL!</script>" ); // returns "EVIL!"

2 Comments

How is using document.implementation.createHTMLDocument better than using a plain DOM element or a document fragment?
@FlorianMargaine It's in fact very similar to a document fragment. However, you can do anything in it that you could do in your default document: you can literally load entire HTML documents into it and operate on them. It should be far more lightweight than an <iframe>, at least.
1

You need to address this on the server side. If you filter with JavaScript at form submission time, the user can subvert your filter by creating their own page, using telnet, disabling JavaScript, using the browser console (Chrome/Firefox/IE), etc. And if you filter at display time, you haven't mitigated anything; you've only moved the break-in point around on the page.

In PHP, for instance, if you wish to just dump the raw characters out with none of the user's formatting, you can use:

print htmlentities($user_submitted_data, ENT_NOQUOTES, 'utf-8');

In .NET:

someControl.InnerHtml = Server.HtmlEncode(userSubmittedData);

If you're trying to sanitize the content client-side for immediate/preview display, this should be sufficient:

out.innerHTML = user_data.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
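
A slightly fuller version of the same idea, wrapped in a reusable function, might look like the sketch below. The name escapeHtml is my own, not a built-in. It also escapes quotes, which only matters if the value ever ends up inside an HTML attribute. As with the one-liner, treat it as suitable for a preview only, not as a replacement for server-side encoding.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML
// so the browser shows them literally instead of interpreting them.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")   // must run first, or later entities get double-escaped
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")  // only needed inside attribute values
        .replace(/'/g, "&#39;");
}
```

With this in place the preview line becomes out.innerHTML = escapeHtml(user_data);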

3 Comments

Bear in mind, the last suggestion doesn't sanitize the text for sending to other visitors. Its only legitimate purpose is to give the text-entering user an accurate pre-submission preview of their entry.
Okay. If you're using a PHP form and submitting the information via GET, mysql_real_escape_string would be a legitimate way to sanitize the string, right?
It's a legitimate way to sanitize a string for interpolation in a [My]SQL query. You still need to perform HTML/JavaScript sanitization on inserted values before sending them to the client.
