Javascript - Sanitize Malicious code from file (string)

Question

I have a JavaScript data file that is dynamically added to a website via some custom code. The file comes from a third-party vendor, who could potentially add malicious code to it.

Before this file is added to the website, I would like to parse it and look for malicious code, such as redirects or alerts, that would be executed as soon as the file is included in the project/website.

For example, my JS file could look like this:

alert ('i am malicious');
var IAmGoodData = 
[
{ Name :'test', Type:'Test2' },
{ Name :'test1', Type:'Test21' },
{ Name :'test2', Type:'Test22' }
]

I load this file into an object via an XMLHttpRequest call, and when this call returns, I can use the variable (which holds my file text) and search it for words:

var client = new XMLHttpRequest();
client.open('GET', 'folder/fileName.js');

client.onreadystatechange = function() 
{
        ScanText(client.responseText);
}
client.send();

function ScanText(text)
{
        alert(text);
        var index = text.search('alert');  //Here i can search for keywords
}

The last line would return an index of 0, as the word alert is found at index 0 in the file.

Questions:

Is there a more efficient way to search for keywords in the file?
What specific keywords should I be searching for to prevent malicious code from being run (i.e. redirects, popups, sounds, etc.)?

Your approach is useless. You will never be able to detect every possible malicious action. You need to reconsider how you're doing things. Perhaps provide more context on what this file is and how you need to use it?
– Dark Falcon, Commented Aug 30, 2013 at 17:02
JavaScript is dynamic enough that this is extremely difficult. Use Google Caja.
– SLaks, Commented Aug 30, 2013 at 17:02
@DarkFalcon It is a file that is filled with map nodes and info (location, name, etc.) about those nodes. We display them on a map. These nodes are delivered to us via a third party.
– ayla, Commented Aug 30, 2013 at 17:07
@SLaks Thanks for the recommendation, I will look up Google Caja. In the meantime, do you have any examples of malicious code?
– ayla, Commented Aug 30, 2013 at 17:08

Ian · Accepted Answer · 2013-08-30 17:41:29Z

5

Instead of having them include var IAmGoodData =, make them simply provide JSON (which is basically what the rest of the file is, or seems to be). Then you parse it as JSON, using JSON.parse(). If it fails, they either didn't follow the JSON format well, or have external code, and in either case you would ignore the response.

For example, you'd expect data from the external file like:

[
{ Name :'test', Type:'Test2' },
{ Name :'test1', Type:'Test21' },
{ Name :'test2', Type:'Test22' }
]

which needs to be properly serialized as JSON (double quotes instead of single quotes, and double quotes around the keys). In your code, you'd use:

var json;
try {
    json = JSON.parse(client.responseText);
} catch (ex) {
    // Invalid JSON: json stays undefined, so the response is ignored below
}

if (json) {
    // Do something with the response
}

Then you could loop over json and access the Name and Type properties of each.
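
The parse-then-loop idea above could be sketched roughly as follows. This is a minimal sketch, not the answer author's code: parseNodes is a hypothetical helper name, and the rule that every entry must have string Name and Type properties is an assumption based on the sample data in the question.

```javascript
// Hypothetical helper: returns the parsed node list, or null when the text
// is not valid JSON or does not have the assumed shape.
function parseNodes(text) {
    var data;
    try {
        data = JSON.parse(text);
    } catch (ex) {
        return null; // Not JSON at all, e.g. the file contains alert(...)
    }
    if (!Array.isArray(data)) {
        return null; // Assumption: the top level must be an array of nodes
    }
    for (var i = 0; i < data.length; i++) {
        var node = data[i];
        // Assumption: each node is an object with string Name and Type
        if (node === null || typeof node !== 'object' ||
            typeof node.Name !== 'string' || typeof node.Type !== 'string') {
            return null;
        }
    }
    return data;
}
```

JSON.parse never executes the text it reads, so a file that starts with alert(...) is simply rejected as invalid JSON. Note that the single-quoted, unquoted-key form shown in the question is also rejected until the vendor serializes it as real JSON.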

Random Note:

In your client.onreadystatechange callback, make sure you check client.readyState === 4 && client.status === 200, to know that the request was successful and is done.
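
As a rough sketch of that check: readyState 4 means the request is done, and status 200 is the HTTP "OK" response. The helper name handleStateChange is hypothetical, introduced here only so the check can be shown in isolation.

```javascript
// Hypothetical helper: calls onDone with the response text only when the
// request has finished (readyState 4) and succeeded (HTTP status 200).
function handleStateChange(client, onDone) {
    if (client.readyState === 4 && client.status === 200) {
        onDone(client.responseText);
    }
}

// Browser usage, based on the code in the question:
// var client = new XMLHttpRequest();
// client.open('GET', 'folder/fileName.js');
// client.onreadystatechange = function () {
//     handleStateChange(client, ScanText);
// };
// client.send();
```

Without the check, the callback runs on every state change, so the handler would also see empty or partial responses.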

answered Aug 30, 2013 at 17:41

Ian



3 Comments

ayla Over a year ago

Hey Ian, What does setting the readyState and status of the client object do explicitly? Is there a listing of the states and status and what they mean somewhere?

ayla Over a year ago

Also, I cannot seem to get JSON.parse to ever work; it always falls through into the catch clause. Is there something I have to include to use it?

ayla Over a year ago

Oops, never mind, it was just picking up incorrect JSON and failing. The problem is that even when I just have JSON data, it still doesn't work. It sees the first Name node and bombs.

federicot · Accepted Answer · 2013-08-30 17:10:09Z

0

This is extremely difficult to do. There are no intrinsically malicious keywords or functions in JavaScript; there are only malicious applications of them. You could get false positives for "malicious" activity and prevent legitimate code with a real purpose from being executed. At the same time, anyone with a little imagination could bypass any "preventive" method you implement.

I'd suggest you look for a different approach. This is one of those problems (like CAPTCHA) that is trivial for a human to solve but practically impossible for a machine. You could try having a moderator or some other human evaluator review the code and accept it.

answered Aug 30, 2013 at 17:10

federicot


2 Comments

ayla Over a year ago

Hey, thanks for the info. This file only consists of data nodes for a map. There will never be a case where we have legitimate code with a real purpose, so I can safely search for code and, if any is found, always assume it's malicious.

federicot Over a year ago

@jordan.peoples Ah, I thought it was code in general. Then I'd suggest, instead of "black listing" all the possible malicious code, "white listing" what it is that you expect. For example, use a RegEx to match data nodes, and accept the data only if it matches.
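
A whitelist along those lines might look like the sketch below. Everything specific here is an assumption for illustration: the helper name isExpectedNode, the SAFE_VALUE pattern (letters, digits, spaces and a few punctuation marks, 1 to 100 characters), and the rule that a node has exactly the two keys Name and Type. It is applied to already-parsed objects rather than to the raw file text.

```javascript
// Assumed pattern for allowed values; adjust to whatever the real data needs.
var SAFE_VALUE = /^[A-Za-z0-9 _.,'-]{1,100}$/;

// Hypothetical helper: true only for objects that look exactly like the
// expected data nodes. Anything unexpected is rejected, not "cleaned".
function isExpectedNode(node) {
    if (node === null || typeof node !== 'object' || Array.isArray(node)) {
        return false;
    }
    if (Object.keys(node).length !== 2) {
        return false; // Assumption: only Name and Type are allowed
    }
    return typeof node.Name === 'string' && SAFE_VALUE.test(node.Name) &&
           typeof node.Type === 'string' && SAFE_VALUE.test(node.Type);
}
```

The point of the design is that the pattern describes what good data looks like, so there is no list of "bad" keywords to keep up to date.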

SLaks · Accepted Answer · 2013-08-30 17:30:09Z

0

You should have them provide valid JSON rather than arbitrary Javascript.
You can then call JSON.parse() to read their data without any risk of code execution.

In short, data is not code, and should not be able to contain code.

answered Aug 30, 2013 at 17:30

SLaks


4 Comments

ayla Over a year ago

We all know this, but what if the other group turns malicious (like getting fired, which happens everyday) and on their last upload they change the data file to contain code, as it is simply text.... what then huh?

SLaks Over a year ago

@jordan.peoples: That's exactly why you should always treat it as text, so that it cannot contain code.

ayla Over a year ago

anything can contain code and everything is text in javascript.

SLaks Over a year ago

@jordan.peoples: As long as you never treat it as code – never call eval, and always concatenate safely, it doesn't matter.

Niet the Dark Absol · Accepted Answer · 2013-08-30 17:02:06Z

-3

You shouldn't. The user should be allowed to type whatever they want, and it's your job to display it.

It all depends on where it is being put, of course:

Database: mysql_real_escape_string or equivalent for whatever engine you're using.
HTML: htmlspecialchars in PHP, createTextNode or .replace(/</g,"&lt;") in JavaScript
JavaScript: json_encode in PHP, JSON.stringify in JavaScript.
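
For the HTML case, a slightly fuller version of the replace approach could be sketched like this. The function name escapeHtml is hypothetical, and escaping these five characters is a common convention rather than something specified in this answer.

```javascript
// Hypothetical helper: escapes the characters that have special meaning in
// HTML so untrusted text is displayed as text instead of being parsed as markup.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // must run first to avoid double escaping
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}
```

In the browser, createTextNode (or setting textContent) avoids the need for manual escaping, because the value is never parsed as HTML in the first place.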

At the end of the day, just don't be Yahoo

answered Aug 30, 2013 at 17:02

Niet the Dark Absol

