I encountered this seemingly safe function for extracting the text content from an HTML string:

function getText(html) {
  const div = document.createElement('div')
  div.innerHTML = html
  return div.textContent
}

It uses innerHTML, but the div is never appended to the DOM, so I assumed it was harmless.

And it indeed works fine normally:

const text = getText('<b>some text</b>')
console.log(text) // prints "some text"

function getText(html) {
  const div = document.createElement('div')
  div.innerHTML = html
  return div.textContent
}

But it can also lead to XSS:

// opens an alert
const text = getText('<b>some text</b><img src="" onerror="alert(1)">')
// prints "some text"
console.log(text)

function getText(html) {
  const div = document.createElement('div')
  div.innerHTML = html
  return div.textContent
}

Even weirder things happen when the HTML is prefixed with "<script></script>":

// throws an error and injects html into the page
const text = getText('<script></script><b>some text</b>')
console.log(text)

function getText(html) {
  const div = document.createElement('div')
  div.innerHTML = html
  return div.textContent
}

Why does it load the HTML if the div is never appended to the DOM?

Why does <script></script> cause it to inject HTML into the page?

1 Answer

Why does <script></script> cause it to inject HTML into the page?

Because the actual source of the snippet that gets run (via "Run Snippet" -> "Full Page" -> View Source) is

<!DOCTYPE html>
<html>
<head>
    <style>
        
    </style>
    <script src="/scripts/snippet-javascript-console.min.js?v=1"></script>
</head>
<body>

<script type="text/javascript">
        // throws an error and injects html into the page
const text = getText('<script></script><b>some text</b>')
console.log(text)

function getText(html) {
  const div = document.createElement('div')
  div.innerHTML = html
  return div.textContent
}
    </script>
</body>
</html>

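The </script> before <b>some text closes the <script type="text/javascript"> element early. The JavaScript is cut off in the middle of the string literal, which is why an error is thrown, and everything after that point is interpreted as HTML.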

This would not happen if the script were in an external file, because the contents of an external file are never run through the HTML parser.
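If the code has to stay inline, a common workaround is to make sure the literal closing tag never appears in the page source. The sketch below shows two ways of writing the string from the question; the variable names are only for illustration.

```javascript
// Two ways to write the string from the question so that the HTML parser
// never sees a literal closing script tag inside an inline script.

// 1. Escape the slash: inside a string literal, "\/" is just "/" to the
//    JavaScript engine, but the HTML parser no longer matches a closing tag.
const escaped = '<script><\/script><b>some text</b>'

// 2. Split the tag across a concatenation.
const split = '<script></sc' + 'ript><b>some text</b>'

// Both evaluate to the same string at runtime.
console.log(escaped === split) // true
```

Either form can be passed to getText unchanged, since the function receives the same string at runtime.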


4 Comments

If you can't use an external file, you can also escape the closing script tag. Note that this wouldn't happen with non-constant strings (since those aren't in the page source = not parsed as html), so no need to escape those with a replace.
Should this be considered a StackSnippets bug?
@Bergi This isn't exclusive to StackSnippets. Ye-olde workaround (not had to use it for xx years..) is to break the script tags so they are not interpreted, eg '<script></sc' + 'ript>'
@fdomn-m Sure, that's what I'd do if I were to write HTML. But the StackSnippets editor has a separate pane for JavaScript, and I would expect that code to be properly loaded into the snippet page - escaped if necessary.
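For illustration only, here is a rough sketch of what "escaped if necessary" could look like on the side of a tool that pastes JavaScript into an inline script. The function escapeScriptClose is hypothetical and is not how StackSnippets works; it also assumes that every "</script" in the source sits inside an ordinary string literal, where the extra backslash does not change the value.

```javascript
// Hypothetical helper: rewrites "</script" as "<\/script" in JavaScript source
// text so the HTML parser does not end the inline script early.
// Assumption: every match is inside an ordinary string literal.
function escapeScriptClose(source) {
  // The i flag catches </SCRIPT> and similar; $1 keeps the original casing.
  return source.replace(/<\/(script)/gi, '<\\/$1')
}

const source = "const text = getText('<script></script><b>some text</b>')"
console.log(escapeScriptClose(source))
// prints: const text = getText('<script><\/script><b>some text</b>')
```

Strings built at runtime never appear in the page source, so they do not need this treatment.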
