Main Question

Is parsing HTML files with XMLHttpRequest, using responseType = "document", a potential security issue?

Examples can be found on MDN here: HTML in XMLHttpRequest

When setting responseType = "document", the browser parses the response from the URL (an HTML file in our case) into DOM nodes and returns the resulting document.
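For reference, the pattern being asked about can be sketched roughly as follows. This is a minimal sketch, not the MDN example verbatim: the function name loadHtmlDocument and the XhrCtor parameter are hypothetical, and the parameter exists only so the function can be exercised outside a browser.

```javascript
// Sketch: load an HTML file as a parsed Document via XMLHttpRequest.
// `loadHtmlDocument` and `XhrCtor` are illustrative names, not part of any API.
function loadHtmlDocument(url, XhrCtor = globalThis.XMLHttpRequest) {
  return new Promise((resolve, reject) => {
    const xhr = new XhrCtor();
    xhr.open("GET", url);
    // Ask the browser to parse the response body into a Document.
    xhr.responseType = "document";
    xhr.onload = () => {
      // xhr.response is null when the body could not be parsed as a document.
      if (xhr.status === 200 && xhr.response) {
        resolve(xhr.response);
      } else {
        reject(new Error("Could not load " + url + " (status " + xhr.status + ")"));
      }
    };
    xhr.onerror = () => reject(new Error("Network error while loading " + url));
    xhr.send();
  });
}
```

In a browser this would be called as loadHtmlDocument("/fragment.html") (a hypothetical path), and the promise would resolve to a Document whose nodes can be queried. As far as I understand the XMLHttpRequest specification, that document is parsed with scripting disabled, so the risk mostly arises once nodes from it are inserted into the live page.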

Let's say we have a man-in-the-middle attack situation (i.e. not using HTTPS), and the HTML file is swapped out. Are we at risk?

Bonus Question

Let's say we are loading a JSON file instead of an HTML file. Is using responseType = "text" and then calling JSON.parse safe, i.e. is the code guaranteed not to be evaluated?

1 Answer

I am not a developer, but a security practitioner, so please excuse any inaccuracies. The short answer from my side is yes: whenever you fetch and interpret external data there will be security risks. This holds not only for HTML, but also when parsing XML, or when including any form of content that goes through an interpreter. For example, in an AJAX application the code that handles the XMLHttpRequest result may perform some action on behalf of the user. If the file is swapped out, an attacker could influence that action.

When building an application you will not be able to eliminate all risk, but you want to bring it down to acceptable levels. For example, instead of including external code, host the code yourself.

This also applies to your XMLHttpRequest fetch: where does the data come from? More risk comes with third parties, and with fetching across domains. Avoid both if you can. You should consider restricting cross-origin resource sharing by policy, through the Access-Control-Allow-Origin header. HTTPS does not eliminate the risk either, as you may not be able to trust the third party anyway, and HTTPS does not completely eliminate man-in-the-middle attacks.
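One way to act on this in client code is to refuse to fetch anything that is not same-origin and served over HTTPS. The following is a minimal sketch under that assumption; isTrustedUrl is a hypothetical helper, and www.example.com stands in for your own site.

```javascript
// Sketch: only allow fetches from our own origin, and only over HTTPS.
// `isTrustedUrl` is an illustrative helper, not a browser API.
function isTrustedUrl(url, pageOrigin) {
  try {
    // Relative URLs are resolved against the page's own origin.
    const target = new URL(url, pageOrigin);
    const page = new URL(pageOrigin);
    return target.protocol === "https:" && target.origin === page.origin;
  } catch (err) {
    // Unparseable URLs are treated as untrusted.
    return false;
  }
}

const ownSite = "https://www.example.com";
const sameOrigin = isTrustedUrl("/data/page.html", ownSite);
const thirdParty = isTrustedUrl("https://cdn.example.net/page.html", ownSite);
const plainHttp = isTrustedUrl("http://www.example.com/page.html", ownSite);
```

A check like this only narrows where data is fetched from. It does not make the fetched content itself trustworthy.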

If however you are fetching something that you are hosting yourself and have a trusted channel to obtain, you may argue that the remaining risk is small.

As for the bonus question, I am not sure whether this will make a big difference. I assume that with responseType = "text" you will end up with a long string of text which is actually an HTML document. Then what? If you still plan to parse it as HTML, scripts may run. JSON.parse is a text parser, which will not load or execute scripts, but as far as I can understand you need to expose yourself to the parsing of HTML here anyway. The solution is probably to make sure you can trust the source.
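To illustrate the point about JSON.parse being a pure text parser, here is a small sketch. The helper name parseJsonText and the sample payloads are made up for illustration. Input that is not valid JSON, such as a swapped-in script, is rejected with an error instead of being run.

```javascript
// Sketch: JSON.parse either returns data or throws; it never executes its input.
// `parseJsonText` is an illustrative wrapper, not a built-in.
function parseJsonText(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // Anything that is not valid JSON ends up here as a SyntaxError.
    return { ok: false, error: err.message };
  }
}

// A well-formed JSON response is turned into plain data.
const good = parseJsonText('{"user": "alice", "admin": false}');

// A swapped-out response containing code is rejected, not evaluated.
const bad = parseJsonText('alert("swapped payload")');

// Markup inside a JSON string stays an inert string after parsing.
const inert = parseJsonText('{"html": "<img src=x onerror=steal()>"}');
```

The parsed values are still attacker-controlled if the response was tampered with. The last case only becomes dangerous if the string is later inserted into the page as HTML, for example through innerHTML.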

4 Comments

Awesome, thanks. I don't fully understand why hosting the file on the same server as the requesting code is safe(r). I mean, let's say I'm Sneaky Pete. I go to a web application, let's say www.example.com. There, I somehow figure out that the code loads a local HTML file. I somehow manage to get hold of the request and swap out the payload before it reaches the requesting code. Is that possible? Also, side question: I thought we were restricted to the same origin in any case, by the "same origin policy". Is that not correct? "For security reasons, browsers restrict cross-origin HTTP requests..."
This article seems to state that "same origin policy" applies to browsers by default: developer.mozilla.org/en-US/docs/Web/HTTP/CORS
Maybe I was not quite on point with what form of risk you see here. My point was that, generally, you want to fetch code only when you can trust the source. Obviously, fetching information that a third party controls is less safe than fetching something you host and control access to, simply because you cannot trust the security at the other end, but you can trust your own.
As for the attack you picture, this seems a bit unclear to me. Sneaky Pete intercepts a response from another user's session, you mean? This seems infeasible without exploiting additional vulnerabilities in the application or the network infrastructure. Unless, of course, the attacker can take control of the data source, which is why third-party fetching is less safe (see previous comment).
