7

I have a small confusion. When I search on Google, almost all articles suggest "Filter input, escape output". Unless I have been confusing the terms escaping and filtering all my life, it should be the opposite.

You get loads of articles which do something like

$username = htmlentities(htmlspecialchars(strip_tags($_POST['username'])));

and suggest doing it.

  1. We should not filter input; we should escape it. (Previously this was done with mysql_real_escape_string; nowadays prepared statements handle it for us.) We should insert the user's submitted data into the database as-is, without changing it with functions like htmlspecialchars. We should always keep the original input in our database, so applying htmlspecialchars on input is wrong. HTML is not harmful to the database.

  2. We should filter output, so malicious code (HTML, JS, whatever) won't run in the browser. This is called XSS filtering, not XSS escaping. For example, {{{ $var }}} in Laravel 4 is called XSS filtering, and it should always be used when outputting user-submitted content.
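The two points above can be sketched roughly as follows. This is only a minimal illustration, assuming an in-memory SQLite database (which needs the pdo_sqlite extension) and a made-up `comments` table; `$rawBody` stands in for something like `$_POST['body']`.

```php
<?php
// Assumption: an in-memory SQLite database, used only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Stand-in for user-submitted data such as $_POST['body'].
$rawBody = "It's <script>alert(1)</script>";

// Point 1: the placeholder keeps the value separate from the SQL text,
// so the original input is stored unchanged and no manual escaping is needed.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $rawBody]);

// Point 2: convert HTML special characters only when writing into an HTML page.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();
$html = '<p>' . htmlspecialchars($stored, ENT_QUOTES, 'UTF-8') . '</p>';

echo $html, PHP_EOL;
```

The database still holds the exact text the user typed, while the page receives a version in which the markup is inert.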

If the statement "Filter input, escape output" is correct, why is the function not called mysql_real_filter_string(), and why isn't preventing XSS called XSS escaping?

Also, ircmaxell once said:

Filtering is not about preventing security vulnerabilities, it's about not populating your database with garbage. If you're expecting a date, make sure it at least looks like a date prior to storing it.

This is called validation, and you can't rely on validation alone (especially on older versions of PHP). You need to both escape and validate input. Filtering may not be meant for preventing security vulnerabilities, but escaping is.
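The "make sure it at least looks like a date" check from the quote could look something like this. The function name `isValidDate` and the `Y-m-d` format are my own assumptions, not anything from the quoted text.

```php
<?php
// Hypothetical helper: returns true only if $input is a real calendar date in $format.
function isValidDate(string $input, string $format = 'Y-m-d'): bool
{
    // The leading "!" resets all fields not present in the format to their defaults.
    $date = DateTime::createFromFormat('!' . $format, $input);

    // Round-trip comparison rejects values PHP would silently roll over,
    // such as 2023-02-30 becoming 2023-03-02.
    return $date !== false && $date->format($format) === $input;
}

var_dump(isValidDate('2023-02-28')); // bool(true)
var_dump(isValidDate('2023-02-30')); // bool(false)
var_dump(isValidDate('not a date')); // bool(false)
```

A check like this keeps garbage out of the database, but the value still has to go through a prepared statement (or be escaped) when it is stored.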

Well, this sums up my confusion. Can someone explain this to me?

4 Answers

7

Looks like my confusion was simple. I thought the output layer was the layer where we start using echo, such as the view layer.

According to Anthony Ferrara, output is the point where data leaves your application, and input is the point where data enters your application.

As such, the input layer is not limited to user-provided content: reading from config files, reading from the file system, retrieving data from third-party APIs, etc. are all considered input.

Output is not limited to echo or print in the view layer. SQL queries also count as output, because data leaves our application and enters the database's scope. Likewise, writing to a file counts as output, and so does running a shell command.

So basically, querying the database is output, while retrieving results from the database is input.
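To illustrate why "escape output" depends on where the data is going, here is a small sketch with the same value leaving the application through three different outputs. The variable names are made up for the example; the functions are standard PHP.

```php
<?php
// Stand-in for data that entered the application from anywhere (user, file, API, database).
$value = "it's <b>";

// Output to an HTML page: convert HTML special characters.
$forHtml = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

// Output to a shell command: quote the value as a single shell argument.
$forShell = 'echo ' . escapeshellarg($value);

// Output to JSON: let the encoder handle quoting, and also hex-encode tags and apostrophes.
$forJson = json_encode(['value' => $value], JSON_HEX_TAG | JSON_HEX_APOS);

echo $forHtml, PHP_EOL;  // it&#039;s &lt;b&gt;
echo $forShell, PHP_EOL; // exact quoting depends on the platform
echo $forJson, PHP_EOL;
```

Each destination has its own escaping rules, which is why the escaping is done at the moment the data leaves, not when it comes in.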

When you think of it like that, "Filter input, escape output" sounds correct. If anybody else was confused like me, I hope this helps it make sense.


3

First off: htmlentities or htmlspecialchars actually don't escape a string, they convert specific characters to html entities!

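  1. First you should take your user input and remove the pseudo/automatic "safety" like magic quotes. (This only matters on old PHP versions: magic quotes were removed in PHP 5.4, and get_magic_quotes_gpc() itself was removed in PHP 8.0.)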

    if (get_magic_quotes_gpc())
    {
        $lastname = stripslashes($_POST['lastname']);
        // ...
    }
    

    This is so you have the "pure" or raw user input.

  2. Then filtering means for example not allowing something like fooBar as an email address!

    if (!my_own_email_validity_check($_POST['email'])) die(); // maybe a bit extreme
    
  3. Afterwards escaping the user input to be stored (eg in your database)

    $city = $mysqli->real_escape_string($city);
    

    Or preferably use prepared statements (e.g. with PDO), which handle it "automatically" :-)

  4. But the really important part is when displaying that data from your database to the user, to make sure you run it all through htmlspecialchars() since you can't be sure that anything in there is sane!!!
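For step 2, a possible body for my_own_email_validity_check() is sketched below. Using filter_var() with FILTER_VALIDATE_EMAIL is just one assumption about what such a check might do; the answer does not say how the function is implemented.

```php
<?php
// Hypothetical implementation of the check named in step 2.
function my_own_email_validity_check(string $email): bool
{
    // filter_var() returns the address on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(my_own_email_validity_check('fooBar'));           // bool(false)
var_dump(my_own_email_validity_check('user@example.com')); // bool(true)
```

Rejecting the value here only keeps bad data out; steps 3 and 4 are still needed when the value is stored and displayed.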

Now there are other opinions saying you should run htmlentities immediately when you get your raw data, but that makes working with it horrible, and is not the recommended way. But it might even depend on WHAT you are doing, like with so many things.

So to summarize it, in general:

  • You need to escape user input when storing it so you are safe against Injection
  • You need to convert stored data when displaying it to be safe against XSS

Edit: There are also a lot of naming differences: sometimes people call it filtering when something is being escaped, or call it escaping in general when something is being sanitized, etc. So don't be confused by the naming, just understand what is happening and you will be fine ;-)

Edit 2: To answer your question:

It is called "filter input, escape output" because ...

  • Filter in this case actually means not to allow "wrong" data in your database. (Like point 2, email validation, ZIP codes, things you CAN rule out. Things that might also mess up your data processing later on!)
  • "Escape output" is meant to prevent XSS, so it actually means converting to HTML entities; here it is a matter of naming
  • In "filter input, escape output" there is either no concern for the escaping to prevent SQL Injection, or it is even summarized with "filtering" (which would not be the correct term (imho), just as you said)

In my opinion the problem is that the naming is not consistent.

7 Comments

To be honest, this is what I said in my topic.
Yes, I agree with you! I was just going through the steps and naming every one of them for clarity. For example, on your question why it is not called "mysql_real_filter_string()": it does of course escape (3), not filter. So you are right! I will edit my answer to supply an actual answer to your question, instead of just a demonstration ...
To be honest...Levit said it much better.
Can't we do trim, strip_tags and htmlspecialchars, and then escape and insert into the database? Or is that not the correct approach?
@Luka: Well htmlspecialchars or htmlentities is actually for html to prevent xss attacks etc. If you have it like this in the database, you have to realize that all "non-output-to-html operations" with the data, might not be as expected (e.g. searching within that text, or comparison, etc.).
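A quick sketch of what goes wrong with non-HTML operations when the encoded form is what gets stored (the sample string is made up):

```php
<?php
$original = 'Tom & Jerry <3';

// What would end up in the database if htmlspecialchars() ran before the insert.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8'); // Tom &amp; Jerry &lt;3

// Length checks no longer match what the user typed.
$originalLength = strlen($original); // 14
$encodedLength  = strlen($encoded);  // 21

// Searching for the text the user actually typed no longer finds it.
$found = strpos($encoded, 'Tom & Jerry') !== false;

var_dump($originalLength, $encodedLength, $found);
```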
1

Filter input and escape output: filtering prevents storing untrusted, injected bad data, while escaping prevents cross-site scripting (XSS).


0

Seeing the context in which the phrase was originally used might help:

http://shiflett.org/blog/2005/filter-input-escape-output
http://shiflett.org/blog/2005/more-on-filtering-input-and-escaping-output

Filtering input doesn't mean that you don't sanitize, say, SQL inserts by escaping. It's just a catchy, succinct best practice to remember to be conscientious. Chris Shiflett didn't say that you'd never be escaping input or filtering output.
