I am planning to load JS files using AJAX and then eval them to execute the code, but I am worried about using eval. To see how jQuery implements the getScript method, I went through its source code and found this:

rcleanScript = /^\s*<!(?:\[CDATA\[|\-\-)/;

jQuery.globalEval( ( elem.text || elem.textContent || elem.innerHTML || "" )
      .replace( rcleanScript, "/*$0*/" ) );

globalEval is a method which evaluates the script in the global (window) context and takes care of cross-browser compatibility. But I did not understand the replace part. By the name, it looks like rcleanScript is used to clean the script so that it is secure to execute, but I did not understand how it works.

Can someone explain this?

EDIT: I know it is replacing some CDATA section with /*$0*/. But how does that make it secure? In essence, how would it be insecure to execute the script without replacing the CDATA part?

3 Answers

2

Basically rcleanScript is a regular expression that finds parts of the code that could be harmful and the "/*$0*/" means replace what rcleanScript finds with a comment section using /**/.

2 Comments

Yes I know this. But what is it replacing and how does replacing that statement with a comment make it secure? And what is $0 in /*$0*/ ?
It is replacing any CDATA or comment sections. $0 is the first match the regular expression finds.
1

That regex matches one of two different things:

<![CDATA[
<!--

These constructs are frequently used at the beginning of script elements to make sure the page validates and that it renders properly in browsers that don't support JavaScript. The regex comments them out by putting them within JavaScript comment blocks /* ... */. This prevents them from causing errors: they are not valid JavaScript, so they can't be evaluated as such.

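In the second argument to replace, $0 is meant to represent the whole substring matched by the regular expression (strictly speaking, JavaScript's token for that is $&; $0 is the syntax in some other languages). So /*$0*/ is meant to say "put everything matched by that regex within comments".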
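
To see concretely what the pattern picks up, here is a small sketch you can paste into any JavaScript console. Only rcleanScript comes from the jQuery source quoted in the question; the sample strings are made up for illustration.

```javascript
// Pattern quoted from the jQuery source in the question.
var rcleanScript = /^\s*<!(?:\[CDATA\[|\-\-)/;

// Hypothetical script bodies, as they might appear inside a <script> element.
var samples = [
  "<!--\nalert(1);\n//-->",              // HTML comment opener at the start
  "  <![CDATA[\nalert(1);\n]]>",         // CDATA opener after leading whitespace
  "alert('<!-- not at the start -->');"  // opener appears later, so no match
];

// Collect the matched prefix for each sample, or null when nothing matches.
var results = samples.map(function (text) {
  var m = rcleanScript.exec(text);
  return m ? m[0] : null;
});

console.log(results);
```

Because of the ^ anchor, only an opener at the very start of the text (optionally preceded by whitespace) is matched; the same characters later in the script are left alone.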

3 Comments

But I have only seen CDATA used in HTML files. Do JavaScript files also contain CDATA sections? If not, why is that being replaced?
@Cracker The code you quote is from the evalScript method, which is used when appending script elements with JS content to the page. The cleanup is necessary then. So you could do $('<script/>').html(yourCode).appendTo(document.body) and jQuery would sort all this out for you.
Oh, my bad. I picked the wrong thing then! :( I wanted to know how jQuery loads scripts dynamically using getScript. Does it just eval the script to execute it? I got confused and picked the wrong code from the source. Should I ask another question?
1

I just stumbled upon this question, and I don't think anyone really answered what you asked. They explained what the regex does, but not "how does that make it secure?".

The answer to that is it doesn't. The point of the regex is not to strip out any type of javascript attack, but to work around a bug in IE that causes errors if you include HTML comments or CDATA sections inside of script tags in some HTML you want to add to the page via javascript.

For instance, in most JavaScript engines (including the one used in IE) "<!--" is the start of a valid JavaScript single-line comment, but IE will still give an error if you try to call eval on a script that includes "<!--" (which is what the jQuery DOM manipulation methods do if you add a script element to your page). The purpose of rcleanScript is to work around that.
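
If you want to check the single-line-comment behaviour yourself in a current engine, something like the following should do. It is only a sketch: the code string is invented for illustration, and the old IE error described above cannot be reproduced this way.

```javascript
// A made-up script body that starts with an HTML-style comment opener.
// The engine is expected to skip everything from "<!--" to the end of that line.
var code = "<!-- hide from old browsers\nvar answer = 6 * 7;\nanswer;";

// Indirect eval runs the code in the global scope, similar in spirit to
// what a globalEval helper does.
var value = (0, eval)(code);

console.log(value);
```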

If you actually look at the JavaScript spec, HTML-style comments are not mentioned and are therefore not valid, but the jQuery guys wanted to catch the common case of people wrapping the contents of script tags in HTML comments (which is a method of hiding JS code from legacy browsers that don't support it).

Also, replacing the comment with "/*$0*/" was just a bug, for more than one reason. What the developer actually wanted to say was "/*$&*/", which is the proper syntax for inserting the entire match in a JS replacement string ($0 is the syntax in other languages). Furthermore, that doesn't even work around the IE issue, since IE still complains about "<!--" even if it is wrapped in a comment. In a later version both of these issues were fixed by just deleting the match.
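
Here is a quick sketch of the difference between the three replacement strings. The regex is the one quoted in the question; the input text and variable names are just examples I made up.

```javascript
// Pattern quoted from the jQuery source in the question.
var rcleanScript = /^\s*<!(?:\[CDATA\[|\-\-)/;

// Hypothetical script body that starts with an HTML comment opener.
var text = "<!--\nvar x = 1;";

// "$0" has no special meaning in a JavaScript replacement string,
// so it is inserted literally.
var withDollarZero = text.replace(rcleanScript, "/*$0*/");

// "$&" is the JavaScript token for the whole match.
var withDollarAmp = text.replace(rcleanScript, "/*$&*/");

// Simply deleting the match, as described for the later fix.
var deleted = text.replace(rcleanScript, "");

console.log(JSON.stringify(withDollarZero)); // "/*$0*/\nvar x = 1;"
console.log(JSON.stringify(withDollarAmp));  // "/*<!--*/\nvar x = 1;"
console.log(JSON.stringify(deleted));        // "\nvar x = 1;"
```

So the original code accidentally ended up removing the opener anyway (it became the literal comment /*$0*/), which is probably why the bug went unnoticed.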
