How to use preg_replace outside <script></script> in PHP

Question

I have a string:

$source = '&
<script type="text/javascript">&</script>
&
<script type="text/javascript">&</script>
&';

The desired result is:

&amp;
<script type="text/javascript">&</script>
&amp;
<script type="text/javascript">&</script>
&amp;

I tried:

echo preg_replace("#&(?!amp;)(?!<\/script>)(?![^<]script.*?>)#i",
                  "&amp;", $source);

But either only the first "&" is replaced, or all of them are.

How can I get this result?

Edit 1:

Now suppose the string is:

$source = '&
<script type="text/javascript">text&text</script>
&
<script type="text/javascript">&</script>
&';

The desired result is:

&amp;
<script type="text/javascript">text&text</script>
&amp;
<script type="text/javascript">&</script>
&amp;

Why do you need to encode things that might contain a <script> tag? If that is user input, you're wide open to all sorts of XSS nastiness.
– Thomas, Commented Jan 28, 2010 at 21:12
I use the Yahoo YUI library, and POST requests via XMLHttpRequest for its data sources don't work.
– Kevin Campion, Commented Jan 28, 2010 at 21:16

Cristian Toma · Accepted Answer · 2010-01-28 21:24:52Z

2

Try this:

$output = preg_replace("/&(?!amp;)(?!<\/script>)(?![^<]script.*?>)/", "&amp;", $source);


3 Comments

Cristian Toma Over a year ago

@Kevin - I tried it on my server and it works as you would expect. What version are you using?

Kevin Campion Over a year ago

I found the case where it doesn't work and have updated my question: I added "text&text". When "&" sits between other characters, the regex doesn't work.
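A small reproduction of the case described in this comment, using the pattern from the answer. The lookaheads only inspect the characters immediately after each "&", so an ampersand followed by ordinary text passes all three checks even when it is inside a script block. The variable names are illustrative only.

```php
<?php
// Pattern copied from the answer above.
$pattern = '/&(?!amp;)(?!<\/script>)(?![^<]script.*?>)/';

// The "&" here is followed by "text", not by "</script>",
// so none of the negative lookaheads rejects it.
$source = '<script type="text/javascript">text&text</script>';

$result = preg_replace($pattern, '&amp;', $source);
echo $result, "\n";
// prints: <script type="text/javascript">text&amp;text</script>
```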

Kevin Campion Over a year ago

OK, I found the answer to my last comment. It's "/^&(?!amp;)(?![^<]script(.*?)>)(?!<\/script>)/"
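Note that `^` without the `m` modifier anchors to the start of the whole string, so a pattern beginning with `^&` can only match an ampersand at position 0. One possible alternative, sketched here under the assumption that script blocks are not nested and are always closed, is to split the string on whole `<script>...</script>` blocks and escape only the pieces in between. The function name `escapeAmpersandsOutsideScripts` is hypothetical and not from this thread.

```php
<?php
// Hypothetical helper: escape bare "&" everywhere except inside <script> blocks.
function escapeAmpersandsOutsideScripts(string $source): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the captured script blocks are kept
    // in the result, so the array alternates: outside, script, outside, ...
    $parts = preg_split(
        '#(<script\b[^>]*>.*?</script>)#is',
        $source,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return $source; // regex error: leave the input untouched
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Even indexes hold text outside script blocks.
            $parts[$i] = preg_replace('/&(?!amp;)/', '&amp;', $part);
        }
    }

    return implode('', $parts);
}

$source = "&\n<script type=\"text/javascript\">text&text</script>\n&\n"
        . "<script type=\"text/javascript\">&</script>\n&";
$escaped = escapeAmpersandsOutsideScripts($source);
echo $escaped, "\n";
```

Because the decision is made per chunk rather than per ampersand, what follows the "&" inside a script block no longer matters.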

Lucas Oman · Answer · 2010-01-28 21:32:51Z

1

Stop it with the regexes already. Please. I can't take it anymore. My head hurts, but only because I'm banging it on my desk.

I would suggest using DOMDocument or SimpleXmlElement to parse the string and then loop through each non-script tag to encode each ampersand.
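A rough sketch of what the DOMDocument route might look like. It is not the answerer's code: the function name `escapeWithDom` and the wrapper `<div>` are assumptions made for illustration. In this sketch no explicit loop over text is needed, because the HTML serializer escapes "&" in ordinary text nodes on output while writing script content as-is. `loadHTML` guesses the character encoding, so non-ASCII input may need an explicit charset hint.

```php
<?php
// Hypothetical helper: round-trip the fragment through DOMDocument.
function escapeWithDom(string $source): string
{
    $doc = new DOMDocument();

    // Collect parser warnings (e.g. for bare "&") instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> is an assumption of this sketch; it keeps the
    // fragment inside one known element.
    $doc->loadHTML('<div>' . $source . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $wrap = $doc->getElementsByTagName('div')->item(0);
    if ($wrap === null) {
        return $source; // parsing failed: leave the input untouched
    }

    // Serialize each child of the wrapper back to HTML.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$source = "&\n<script type=\"text/javascript\">text&text</script>\n&";
$out = escapeWithDom($source);
echo $out, "\n";
```

The parser may normalize other parts of the markup as well, so the output is not guaranteed to be byte-for-byte identical to the input outside the ampersands.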


5 Comments

Kevin Campion Over a year ago

I totally understand what you mean. I plan to use XSLT, but for now I'm stuck with this approach... sorry about your head ;)

Lucas Oman Over a year ago

@Cristian Toma Why not? If it's as small a document as he shows, then it will require minimal processing for parsing. If, however, the string grows (likelihood of which is inversely proportional to how much the dev insists it won't happen), then this solution will scale well. And what dev wants to come in later and maintain that regex?

Cristian Toma Over a year ago

@Lucas - But why not use the regex provided in the accepted answer, which is faster than all the DOMDocument processing? What would be the advantages of using DOMDocument in your opinion?

Lucas Oman Over a year ago

@Cristian Toma I've already listed some good reasons in my previous comment, but here are a couple of specific examples: What if he decides, later, that he also wants to escape angle brackets? Or what if he decides he also wants to skip embed tags? In a large application, maintainability and scalability are far more important than negligible performance improvements.

Cristian Toma Over a year ago

@Lucas - You are absolutely right about maintainability and scalability, but speed is also very important in a large application.

gregseth · Answer · 2010-01-28 21:10:15Z

0

Using the g modifier replaces your match globally (every occurrence).

echo preg_replace("#&(?!amp;)(?!<\/script>)(?![^<]script.*?>)#ig",
                  "&amp;", $source);


1 Comment

Kevin Campion Over a year ago

Doesn't work: preg_replace() [function.preg-replace]: Unknown modifier 'g'
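The error is expected: PHP's PCRE functions have no `g` modifier. `preg_replace` already replaces every match by default, and its optional fourth argument, `$limit`, caps the number of replacements. A minimal illustration with a made-up input string:

```php
<?php
$source = '& & &';

// Default limit (-1): every match is replaced.
$all = preg_replace('/&(?!amp;)/', '&amp;', $source);

// Limit of 1: only the first match is replaced.
$first = preg_replace('/&(?!amp;)/', '&amp;', $source, 1);

echo $all, "\n";   // prints: &amp; &amp; &amp;
echo $first, "\n"; // prints: &amp; & &
```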
