I need to strip any script tags from a string while keeping the styles.

If I sanitize the style of this string:

getSanitized(s: string) {
    const safeStyle: any = this.sanitizer.bypassSecurityTrustStyle(s);
    return safeStyle.changingThisBreaksApplicationSecurity;
}

const s = '<span style="font-size:18px;color:blue">This is a title</span>';
console.log(this.getSanitized(s));

I get the same string, as it only contains styles, and that seems to work fine.

But if the string contains a script, such as

const s = `<script>alert(1);</script>
           <span onclick="javascript:alert(2);"
                 style="font-size:18px;color:blue">This is a title</span>`;
console.log(this.getSanitized(s));

The script tag and the onclick attribute are not removed from the string. Why are they not removed if I'm sanitizing at the style level?

  • bypass means that an object is created that will allow the template to render the contents and bypass the sanitizer. It does not modify anything. You should only use it with trusted data. This does not sound like you trust the data. Commented Jun 5, 2019 at 18:05
  • I let users customize the Highcharts tooltip, that's HTML markup they can enter, but I need to allow only styles and filter any scripts. Any suggestions? Commented Jun 5, 2019 at 20:30
  • That's a tough XSS question, because there are many attack vectors in CSS. People can get JavaScript to run if you give them access to run CSS on a page. Commented Jun 5, 2019 at 20:34
  • dougrathbone.com/blog/2013/10/30/… Commented Jun 5, 2019 at 20:34

1 Answer

The bypassSecurityTrust... methods of Angular's DomSanitizer do not modify the content you pass in and do not strip anything from it; they only mark the value as trusted. You need to remove the unwanted parts yourself. For example, you can parse the HTML, remove the unwanted nodes and attributes, and serialize the result back to a string.

The htmlparser2 package can build an AST from HTML, so you can use it for the parsing step. To serialize the AST back to a string, you can use the dom-serializer package.

Using these packages (or similar ones), your getSanitized function might look like this:

async getSanitized(s: string): Promise<string> {
  // 1. make an AST from HTML in a string format
  const dom = await this.getAST(s);
  // 2. remove unwanted nodes from the AST
  const filteredDOM = this.filterJS(dom);
  // 3. serialize the AST back to a string
  const result: string = serializer(filteredDOM);
  return result;
}

The getAST function just uses the htmlparser2 API to get an AST from a string:

getAST(s: string): Promise<DomElement[]> {
  return new Promise((res, rej) => {
    const parser = new Parser(
      new DomHandler((err, dom) => {
        if (err) {
          rej(err);
        } else {
          res(dom);
        }
      })
    );
    parser.write(s);
    parser.end();
  });
}

The filterJS function removes the unwanted nodes. https://astexplorer.net/ is an online visualizer that can show the AST htmlparser2 generates, which makes it easy to see what conditions you need to filter on. The filterJS function may be implemented as:

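filterJS(dom: DomElement[]): DomElement[] {
  return dom.reduce((acc: DomElement[], node) => {
    if (node.type === 'tag') {
      // Strip event-handler attributes, then filter the children recursively.
      node.attribs = this.filterAttribs(node.attribs);
      node.children = this.filterJS(node.children);
    }
    // htmlparser2 gives <script> elements the node type 'script'; drop them.
    if (node.type !== 'script') {
      acc.push(node);
    }
    return acc;
  }, []);
}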
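To see the recursion in isolation, here is a minimal standalone sketch that runs without htmlparser2. The SimpleNode interface and the dropScripts name are hypothetical; SimpleNode only mirrors the fields used above (type, attribs, children), and real htmlparser2 nodes carry more. Unlike filterJS, this variant copies nodes instead of mutating them.

```typescript
// Hypothetical minimal node shape; real htmlparser2 nodes have more fields.
interface SimpleNode {
  type: 'tag' | 'script' | 'style' | 'text';
  name?: string;
  data?: string;
  attribs?: Record<string, string>;
  children?: SimpleNode[];
}

// Drop every node typed 'script', recursing into any node that has children.
function dropScripts(nodes: SimpleNode[]): SimpleNode[] {
  const kept: SimpleNode[] = [];
  for (const node of nodes) {
    if (node.type === 'script') {
      continue;
    }
    // Copy instead of mutating, so the input tree stays untouched.
    kept.push(
      node.children ? { ...node, children: dropScripts(node.children) } : node
    );
  }
  return kept;
}

// Hand-built tree: one top-level script and one script nested in a div.
const tree: SimpleNode[] = [
  {
    type: 'script',
    name: 'script',
    attribs: {},
    children: [{ type: 'text', data: 'alert(1);' }],
  },
  {
    type: 'tag',
    name: 'div',
    attribs: {},
    children: [
      { type: 'script', name: 'script', attribs: {}, children: [] },
      { type: 'text', data: 'This is a title' },
    ],
  },
];

const cleaned = dropScripts(tree);
console.log(JSON.stringify(cleaned));
```

Both script nodes are gone from cleaned, while the original tree still has them, which is handy if you want to compare the input and output while debugging.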

In short, it removes script tags and calls the filterAttribs function to remove event-handler attributes. The filterAttribs function may be:

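filterAttribs(attribs: { [s: string]: string }): { [s: string]: string } {
  return Object.entries(attribs).reduce((acc: { [s: string]: string }, [key, value]) => {
    // Keep everything except event handlers such as onclick or onload.
    if (!key.startsWith('on')) {
      acc[key] = value;
    }
    return acc;
  }, {});
}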

Basically, it removes attributes whose names start with 'on', i.e. event handlers.
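Event-handler attributes are not the only place where script can hide: a URL attribute such as href can hold a javascript: URL. Below is a stricter standalone sketch, under my own assumptions rather than anything htmlparser2 provides. The filterAttribsStrict name and the URL_ATTRIBS list are hypothetical, and the list is not exhaustive. It still does not inspect the style values themselves.

```typescript
// Attribute names that can carry a URL; a hypothetical, non-exhaustive list.
const URL_ATTRIBS = new Set(['href', 'src', 'action', 'formaction', 'xlink:href']);

function filterAttribsStrict(attribs: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const key of Object.keys(attribs)) {
    const name = key.toLowerCase();
    const value = attribs[key];
    // Event handlers: onclick, onload, onerror, ...
    if (name.startsWith('on')) {
      continue;
    }
    // Strip whitespace and control characters before testing the scheme,
    // because browsers ignore them in values like " java\tscript:".
    const compact = value.replace(/[\u0000-\u0020]/g, '').toLowerCase();
    if (URL_ATTRIBS.has(name) && compact.startsWith('javascript:')) {
      continue;
    }
    safe[key] = value;
  }
  return safe;
}

const result = filterAttribsStrict({
  onclick: 'javascript:alert(2);',
  ONMOUSEOVER: 'alert(3)',
  href: ' JavaScript:alert(4)',
  style: 'font-size:18px;color:blue',
});
console.log(result); // only the style attribute remains
```

An allowlist of permitted attribute names (for example, only style) would be a tighter design than this blocklist if you know exactly which attributes your users need.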

The serializer function is the default export of the dom-serializer library.

Don't forget to import htmlparser2 and dom-serializer:

import { DomHandler, Parser, DomElement } from 'htmlparser2';
import serializer from 'dom-serializer';

For a better TypeScript experience, type definitions for htmlparser2 are available in the @types/htmlparser2 package.

You can find a working example at https://stackblitz.com/edit/angular-busvys.
