How does a browser render this inline JavaScript within an encoded tag?

Question

I was trying to perform a reflected XSS attack on a tutorial website. The webpage basically consists of a form with an input field and a submit button. On submitting the form, the contents of the input field are displayed on the same webpage.

I figured out that the website blacklists the script tag and some JavaScript methods in order to prevent an XSS attack. So I decided to encode my input and then tried submitting the form. I tried two different inputs; one of them worked and the other didn't.

When I tried:

<body onload="&#97lert('Hi')"></body>

It worked and an alert box was displayed. However, when I encoded some characters in the HTML tag, something like:

&#60body onload="&#97lert('Hi')"&#62&#60/body&#62

It didn't work! It simply printed <body onload="alert('Hi')"></body> as it is on the webpage!

I know that browsers execute inline JavaScript as they parse an HTML document (please correct me if I'm wrong). But I'm not able to understand why the browser behaved differently for the two inputs that I've mentioned.

-------------------------------------------------------------Edit---------------------------------------------------------

I tried the same with a more basic XSS tutorial with no XSS protection. Again:

<script>alert("Hi")</script> -> Worked!

&#60s&#99ript&#62&#97lert("Hi")&#60/s&#99ript&#62 -> Didn't work! (Got printed as string on the Web Page)

So basically, if I encode anything in JavaScript, it works. But if I'm encoding anything that is HTML, it's not executing the JavaScript within that HTML!

"I know that the browsers execute inline JavaScript as they parse an HTML document" that is correct, but what you have isn't inline javascript, it's an onload event. <script>alert("foobar!")</script> would be inline javascript. Attributes do get converted to a string with html entities replaced with the actual characters, which is why your alert on page load works. — user400654
– user400654, Commented May 14, 2014 at 14:10
-1?! Too broad?! Seriously?! I don't think the question is too broad. I just need an answer with respect to the case mentioned in the question. I'm not asking for the complete rendering process! — Rahil Arora
– Rahil Arora, Commented May 14, 2014 at 15:15

user400654 · Accepted Answer · 2014-05-14 14:21:32Z

2

I can't come up with words to describe this properly, so I'll just give you an example. Let's say we have this string:

<div>Hello World! &lt;span id="foo"&gt;Foobar&lt;/span&gt;</div>

When this gets parsed, you end up with a div element that contains the text:

Hello World! <span id="foo">Foobar</span>

Note that while there is something that looks like HTML inside the text, it is still just text, not HTML. For that text to become HTML, it would have to be parsed again.

Attributes work a little differently: HTML entities in attribute values do get decoded during that first parse.
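To make the ordering concrete, here is a rough sketch that runs in Node without a browser. It is a toy model, not a real HTML parser: `decodeNumericRefs` and `tokenize` are made-up helpers, and the regular expressions only cover the two inputs from the question. The point it tries to show is that tags are recognized from literal `<` characters first, and character references are decoded afterwards inside text and attribute values.

```javascript
// Toy model only: real HTML tokenization is far more involved.

// Decode numeric character references such as "&#60;" or "&#60"
// (the trailing semicolon is treated as optional, as in the question's inputs).
function decodeNumericRefs(s) {
  return s.replace(/&#(\d+);?/g, (_, code) => String.fromCodePoint(Number(code)));
}

// Split the input into text, start-tag and end-tag tokens.
// Only a literal "<" in the source can begin a tag.
function tokenize(html) {
  const tokens = [];
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)((?:\s+[a-zA-Z-]+="[^"]*")*)\s*>/g;
  let last = 0;
  let m;
  while ((m = tagPattern.exec(html)) !== null) {
    if (m.index > last) {
      // Text between tags: references are decoded, but the result stays text.
      tokens.push({ type: "text", value: decodeNumericRefs(html.slice(last, m.index)) });
    }
    const attrs = {};
    for (const a of m[3].matchAll(/([a-zA-Z-]+)="([^"]*)"/g)) {
      attrs[a[1]] = decodeNumericRefs(a[2]); // attribute values are decoded too
    }
    tokens.push({ type: m[1] ? "endTag" : "startTag", name: m[2].toLowerCase(), attrs });
    last = tagPattern.lastIndex;
  }
  if (last < html.length) {
    tokens.push({ type: "text", value: decodeNumericRefs(html.slice(last)) });
  }
  return tokens;
}

const literalTag = tokenize(`<body onload="&#97lert('Hi')"></body>`);
const encodedTag = tokenize(`&#60body onload="&#97lert('Hi')"&#62&#60/body&#62`);

console.log(literalTag[0].attrs.onload); // alert('Hi')
console.log(encodedTag[0].type);         // text
console.log(encodedTag[0].value);        // <body onload="alert('Hi')"></body>
```

In the second input there is no literal `<`, so the model never produces a tag token, and the decoded characters end up as one piece of text.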

tl;dr:

If the service you are using is stripping out tags, there's nothing you can do about it unless the script is poorly written in a way that results in the string getting parsed twice.

Demo: http://jsfiddle.net/W6UhU/ Note how, after setting the div's inner HTML equal to its inner text, the span becomes an HTML element rather than a string.
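As a hedged illustration of what "parsed twice" could look like, the sketch below models a hypothetical server-side flow as plain string functions (the linked demo does the analogous thing in the DOM). `stripScriptTags`, `embedAsIs` and `embedAfterDecoding` are invented names, not anything the tutorial site is known to use. The assumption is a filter that checks the still-encoded input and a later step that decodes it before writing it into the page.

```javascript
// Hypothetical flow, for illustration only.

// Decode numeric character references, semicolon optional.
function decodeNumericRefs(s) {
  return s.replace(/&#(\d+);?/g, (_, code) => String.fromCodePoint(Number(code)));
}

// Assumed blacklist filter: removes literal <script> and </script> tags.
function stripScriptTags(s) {
  return s.replace(/<\/?script[^>]*>/gi, "");
}

// Single parse: the filtered input is embedded unchanged, so the browser's
// one and only decode turns the references into text.
function embedAsIs(input) {
  return "<div>" + stripScriptTags(input) + "</div>";
}

// Double parse: the server decodes first, so the page source now contains
// literal "<" characters that the browser will treat as markup.
function embedAfterDecoding(input) {
  return "<div>" + decodeNumericRefs(stripScriptTags(input)) + "</div>";
}

const payload = '&#60s&#99ript&#62&#97lert("Hi")&#60/s&#99ript&#62';

console.log(embedAsIs(payload));
// <div>&#60s&#99ript&#62&#97lert("Hi")&#60/s&#99ript&#62</div>
console.log(embedAfterDecoding(payload));
// <div><script>alert("Hi")</script></div>
```

Only the second output would hand the browser a real script tag; the first one is displayed as text, which matches what the question observed.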

answered May 14, 2014 at 14:21

user400654



5 Comments

Rahil Arora Over a year ago

I understand that. But is there any difference in the way the browser parses an HTML tag and an encoded HTML tag? The alert in the non-encoded HTML body tag gets executed, but the one in the encoded tag does not!

user400654 Over a year ago

The encoded HTML tag is left as is, as you have already found out by performing said test. Attributes of HTML tags get parsed; strings that are not HTML tags do not.

Rahil Arora Over a year ago

Okay, I think I got it this time. Since the tag is not encoded, its attribute is parsed, and this executes the alert method. In the other case, since the tag is encoded, the browser leaves the tag as it is, so the attribute is never parsed. Please correct me if I'm wrong.

user400654 Over a year ago

That is correct. The attribute isn't an attribute if the tag isn't a tag.

Rahil Arora Over a year ago

Thanks! Let me wait a while longer for more explanations from others. I'll accept your answer if I'm not able to find a better one.

Ruan Mendes · Accepted Answer · 2014-05-14 15:16:04Z

1

When an HTML page says &#60body, it treats it the same as if it said &lt;body

That is, it displays the decoded characters as text and doesn't parse them as HTML, so you're not creating a new tag with an onload attribute: http://jsfiddle.net/SSfNw/1/

alert(document.body.innerHTML);
// When an HTML page says &lt;body It treats it the same as if it said &lt;body

So in your case, you're never creating a body tag, just content that ends up getting moved into the existing body tag: http://jsfiddle.net/SSfNw/2/

alert(document.body.innerHTML)
// &lt;body onload="alert('Hi')"&gt;&lt;/body&gt;

In the case <body onload="&#97lert('Hi')"></body>, the parser is able to create the body tag and, once inside the tag, the onload attribute as well. Inside the attribute value, character references are decoded and the result is treated as a plain string, so &#97lert('Hi') becomes alert('Hi') before it is run as the handler's code.
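A small sketch of that last step, runnable in Node. It assumes the browser decodes the attribute value to a string and then compiles that string as the handler body. `decodeNumericRefs` is a made-up helper, and `alert` is passed in as a stub parameter here only because Node has no `alert`.

```javascript
// Sketch under stated assumptions; not how a browser is implemented internally.

// Decode numeric character references, semicolon optional.
function decodeNumericRefs(s) {
  return s.replace(/&#(\d+);?/g, (_, code) => String.fromCodePoint(Number(code)));
}

// The raw attribute value as written in the page source.
const rawAttribute = "&#97lert('Hi')";

// Step 1: the HTML parser decodes the value; by now it is ordinary text.
const handlerSource = decodeNumericRefs(rawAttribute);

// Step 2: the decoded string is compiled as JavaScript. The JavaScript
// engine never sees "&#97"; it only sees the letter "a".
const calls = [];
const fakeAlert = (message) => { calls.push(message); };
const handler = new Function("alert", handlerSource);
handler(fakeAlert);

console.log(handlerSource); // alert('Hi')
console.log(calls);         // [ 'Hi' ]
```

This is why encoding part of the JavaScript inside an attribute still works: the decoding happens in the HTML layer, before anything is interpreted as script.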

edited May 14, 2014 at 15:16

answered May 14, 2014 at 14:45

Ruan Mendes


1 Comment

Rahil Arora Over a year ago

So, the problem is that the encoded HTML characters are not being parsed as markup? However, if I encode the alert method, the browser is still able to detect it as JavaScript. That's what I find a bit confusing.
