How can I remove a <style type="text/css"> ... </style> inside my variable with Javascript and regex

Question

I have a JavaScript variable that contains the contents of an HTML page. I would like to remove an inline <style type="text/css"> ... </style> block from it. I asked before and it was suggested that I add this to the DOM.

Is there a simpler way to remove this using a regular expression? I need to match <style> as the start and </style> as the end. I have heard about regex, but I'm not even sure whether it can be used with JavaScript.

JavaScript has its own regex support for sure, but why don't you put everything in the <style></style> into one or more CSS classes (which also makes them more reusable)? Then you can remove them easily with the jQuery removeClass() function.
– Venzentx, Commented Jun 19, 2014 at 15:12
If all else fails, you can always use substring to remove it.
– Huangism, Commented Jun 19, 2014 at 15:12

Joseph Myers · Accepted Answer · 2014-06-19 15:52:17Z

2

Ingmars has the right idea, but his regex is missing an important question mark (to make the match lazy) and some additional HTML/XML possibilities, such as whitespace allowed after the tag name in both tags and attributes in the opening tag. It also replaces the match with a message, whereas I'm assuming that you just want to delete it completely.

This will work unless an attribute value contains ">", which is a calculated risk. The code assumes that htmlString is the variable containing your HTML document.

htmlString = htmlString.replace(/<style\b[^<>]*>[\s\S]*?<\/style\s*>/gi, '');
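As a quick check of the pattern above, here is a minimal sketch run against a made-up sample string. The `sample` content and the `stripStyles` helper name are illustrative assumptions, not part of the original answer. The sample covers attributes on the opening tag, mixed case, whitespace before the closing `>`, and a `>` inside the CSS body.

```javascript
// Hypothetical helper wrapping the regex from the answer above.
function stripStyles(htmlString) {
  return htmlString.replace(/<style\b[^<>]*>[\s\S]*?<\/style\s*>/gi, '');
}

// Made-up sample input, for illustration only.
var sample =
  '<p>one</p>' +
  '<style type="text/css" media="screen">.foo > .bar { color: red; }</style>' +
  '<p>two</p>' +
  '<STYLE>\np { margin: 0; }\n</STYLE >' +
  '<p>three</p>';

var cleaned = stripStyles(sample);
console.log(cleaned); // <p>one</p><p>two</p><p>three</p>
```

Note that the `>` in `.foo > .bar` is harmless here because the body is matched with `[\s\S]*?`, which accepts any character up to the first closing tag.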

edited Jun 19, 2014 at 15:52

answered Jun 19, 2014 at 15:39

Joseph Myers


7 Comments

jupenur Over a year ago

Your first * still looks a bit too greedy. And it'll match <styleasdfg....

Joseph Myers Over a year ago

It's OK for the first [^<>] to be greedy, because there is no chance for it to get beyond the end of the tag since both > and < are not allowed (the second one is also illegal). As far as matching your example, there is no tag name beginning with the substring style other than style, so we are safe in isolating the matching of style tags. You are right that no validation is being done of the HTML in the document, but it is well known that such a task is impossible in regular expressions as they are.

jupenur Over a year ago

This: "such a task is impossible in regular expressions". See the comment I left on the question?

Samantha J T Star Over a year ago

@JosephMyers - I just rechecked and there is <style type="text/css">. I missed out the type="text/css" by accident. Will your version also check for this?

Joseph Myers Over a year ago

@SamanthaJ Yes, my version will also check for this (as well as any other attributes there might be, like media).


Matt · Accepted Answer · 2014-06-19 15:16:38Z

1

If it's just one set of <style> tags, then a JavaScript regular expression would work just fine:

var re = /(<style\b[^>]*>)[^<>]*(<\/style>)/i; // To remove ALL style tags, change the i at the end to gi.
var html = "<!DOCTYPE html>..."; // Your HTML string

html = html.replace(re, "");

This solution isn't practical where you want to target specific <style> tags though (i.e. You can only remove the first match, or all matches).
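To make the "first match or all matches" distinction concrete, here is a small sketch using the same pattern with and without the g flag. The `twoBlocks` input is made up for illustration, and its CSS deliberately contains no `<` or `>` characters, since this pattern does not allow them inside the style body.

```javascript
// Made-up input with two style blocks, for illustration only.
var twoBlocks = '<style>a{}</style><p>x</p><style>b{}</style>';

// Flag "i" only: replace() stops after the first match.
var firstOnly = twoBlocks.replace(/(<style\b[^>]*>)[^<>]*(<\/style>)/i, '');

// Flags "gi": replace() keeps going and removes every match.
var allRemoved = twoBlocks.replace(/(<style\b[^>]*>)[^<>]*(<\/style>)/gi, '');

console.log(firstOnly);  // <p>x</p><style>b{}</style>
console.log(allRemoved); // <p>x</p>
```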

answered Jun 19, 2014 at 15:16

Matt


5 Comments

Samantha J T Star Over a year ago

Can you explain what you mean by "You can only remove the first match, or all matches"? In your example, would it remove the first or all?

jupenur Over a year ago

What about something like <style>.foo > .bar { color: red; }</style>? See the > there?

Matt Over a year ago

Sure. A regular expression stops at the first match unless you specify the g flag (g, or gi together with case-insensitivity) at the end of the pattern. If g is specified, it continues after the first match and finds everything in the string that matches.

Matt Over a year ago

@jupenur Well spotted. Didn't consider that character.

Samantha J T Star Over a year ago

@Matt - I just rechecked and there is <style type="text/css">. I missed out the type="text/css" by accident. Will your version also check for this?

Ingmars · Accepted Answer · 2014-06-19 16:22:40Z

1

Simple regex which will wipe it with no regrets:

var a = 'aaaa <style type="text/css" favouriteAnimal="horse">style</StYlE> bbbbb <styLE>another style</STyle> cccc';
var b = a.replace( /<style[\s\S]*?>[\s\S]*?<\/style>/gi, '' );
console.log( b );

EDIT: updating my answer to handle current question specifics.
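The lazy quantifiers (`*?`) in the pattern above matter. The following sketch compares a greedy variant with the lazy one on a made-up `page` string that has two style blocks with content between them. Both the input and the variable names are illustrative assumptions.

```javascript
// Made-up input: two style blocks with a paragraph between them.
var page = '<style>a{}</style><p>keep me</p><style>b{}</style><p>tail</p>';

// Greedy: [\s\S]* runs on to the LAST </style>, so the paragraph
// between the two blocks is swallowed as well.
var greedy = page.replace(/<style[\s\S]*>[\s\S]*<\/style>/gi, '');

// Lazy: [\s\S]*? stops at the first closing tag, so each block is
// matched and removed on its own.
var lazy = page.replace(/<style[\s\S]*?>[\s\S]*?<\/style>/gi, '');

console.log(greedy); // <p>tail</p>
console.log(lazy);   // <p>keep me</p><p>tail</p>
```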

edited Jun 19, 2014 at 16:22

answered Jun 19, 2014 at 15:29

Ingmars


3 Comments

Joseph Myers Over a year ago

Your regexp needs to be lazy ([\s\S]*?) or you are going to gobble up everything from the first stylesheet on the page until the end of the last one. On some web pages this will devour the entire web page as well, because they have stylesheets at the top and at the bottom.

Ingmars Over a year ago

@JosephMyers: good catch, I've updated my code, and learned a bit myself. Thanks!

Joseph Myers Over a year ago

Thanks. In fact, my code isn't perfect, either. @jupenur has a good point at his link, that there are always failure cases when trying to do anything with HTML without actually parsing it, and parsing it is impossible with regular expressions.

Community · Accepted Answer · 2017-05-23 11:49:34Z

Following the advice of bobince (as recommended by jupenur), use an XML parser. Then you can find all <style> tags, remove them, and retrieve the HTML. It'll work every time. Here's an example:

var im = document.implementation;
var doc = 'createHTMLDocument' in im ?
    im.createHTMLDocument('') : new ActiveXObject("htmlfile");
if(!doc.body)
    doc.write('<body></body>');
doc.body.innerHTML = '<p><style type="text/css"></style></p><p>Hii</p>';
var temp=doc.getElementsByTagName('style');
while(temp.length)
    temp[0].parentNode.removeChild(temp[0]);
console.log(doc.body.innerHTML); // '<p></p><p>Hii</p>'

If you don't do that, you could unintentionally remove stuff from other places, such as comments or very necessary text inside script tags (e.g. $('body').append('<style>p { color: blue; }</style>');).
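The script-tag failure case described above can be reproduced with a regex-only approach. This is a minimal sketch with a made-up `source` string, using the regex from one of the other answers in this thread. The only `<style>` text in the input sits inside a script string literal, yet the regex removes it anyway.

```javascript
// Made-up page source: the <style> text is part of a script, not a real
// style element. '<\/script>' is just '</script>' written so that this
// snippet can also be pasted into an inline script block safely.
var source =
  '<p>Hi</p>' +
  '<script>$("body").append("<style>p { color: blue; }</style>");<\/script>';

// The regex cannot tell that it is looking inside a script.
var result = source.replace(/<style\b[^<>]*>[\s\S]*?<\/style\s*>/gi, '');

console.log(result); // <p>Hi</p><script>$("body").append("");</script>
```

The script still runs after the replacement, but it now appends an empty string, which silently changes the page's behaviour.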

May the <center> tag hold.
