Using regex to remove empty paragraph tags <p> </p> (standard str_replace on "space" not working)

Question

TinyMCE creates empty paragraph tags when you hit Enter twice, like:

<p> </p>

Which is <p>SPACE</p>

In Firebug it calls this space a "&nbsp;", but the HTML code/DB backend just shows a space. When I do "str_replace(' '....." it doesn't find the block. Basically, I think the "space" is somehow not a standard space but some sort of borked encoded space. Is there a regex I can run that will remove this tag? I've been stuck on this for hours... or even something like

regex('LESS THAN THREE CHARS'...)

would probably work

Thank you

regex('.'...) a period may work for this character?
– Smandoli, Commented Jul 20, 2012 at 2:56
Non-breaking space does have a different code point. Can you loop through the string and print the numeric equivalent of each character?
– nhahtdh, Commented Jul 20, 2012 at 3:01
Ok, when I do utf8_encode( ) I am getting Â ...turns out the DB was ISO, but it has now been changed to UTF8 encoding... so now how do I get rid of this garbage data?
– Chad, Commented Jul 20, 2012 at 4:50
possible duplicate of How to remove empty paragraph tags from string?
– feeela, Commented Jan 14, 2014 at 14:03
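
nhahtdh's suggestion of printing each character's numeric value could be sketched roughly as below. The helper name dumpBytes is hypothetical, and the sample assumes the stored data is UTF-8, where a non-breaking space (U+00A0) is the two bytes C2 A0. Read as ISO-8859-1, the C2 byte displays as Â, which would be consistent with the output Chad reports.

```php
<?php
// Hypothetical helper: show each byte of a string as two hex digits.
function dumpBytes(string $s): string
{
    // bin2hex() returns one long hex string; split it into byte-sized pairs.
    return implode(' ', str_split(bin2hex($s), 2));
}

// Assumption: the "space" is a UTF-8 non-breaking space (bytes C2 A0).
$suspect = "<p>\xC2\xA0</p>";
$plain   = "<p> </p>";

echo dumpBytes($suspect), "\n"; // 3c 70 3e c2 a0 3c 2f 70 3e
echo dumpBytes($plain), "\n";   // 3c 70 3e 20 3c 2f 70 3e
```

If the dump shows 20 between the tags, the character is an ordinary space; anything else tells you which bytes the search string has to contain.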

inhan · Accepted Answer · 2012-07-20 03:12:36Z

5

I would use:

$str = preg_replace('~<p>\s*<\/p>~i','',$str);

where \s signifies whitespace of any kind (tab, space, etc.) and * indicates 0 or more occurrences of it. So <p></p>, <p> </p>, and <p>{multiple spaces here}</p> will all be replaced by an empty string. The additional i flag is for case-insensitivity, just in case the <p>'s might instead be <P>'s.
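
As a quick check of the pattern above, the sketch below runs it over a few empty paragraphs. The second half goes beyond the answer and rests on an assumption: if the paragraph holds a UTF-8 non-breaking space rather than ASCII whitespace, adding the u modifier and naming U+00A0 explicitly lets the pattern cover it. The sample strings are made up.

```php
<?php
$html = "<p></p><p> </p><P>\t\n</P><p>Keep me</p>";

// The answer's pattern: optional whitespace between the tags, any case.
$cleaned = preg_replace('~<p>\s*</p>~i', '', $html);
echo $cleaned, "\n"; // <p>Keep me</p>

// Assumption: the editor stored a UTF-8 non-breaking space (bytes C2 A0).
$nbsp = "<p>\xC2\xA0</p><p>Keep me</p>";

// u = treat pattern and subject as UTF-8; \x{00A0} names the NBSP explicitly.
$cleanedNbsp = preg_replace('~<p>(?:\s|\x{00A0})*</p>~iu', '', $nbsp);
echo $cleanedNbsp, "\n"; // <p>Keep me</p>
```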

edited Jul 20, 2012 at 3:12

answered Jul 20, 2012 at 3:00

inhan


3 Comments

Chad Over a year ago

no dice... Warning: preg_replace() [<a href='function.preg-replace'>function.preg-replace</a>]: Unknown modifier 'g'

TurdPile Over a year ago

@inhan There is no g flag in PHP; that one belongs to JavaScript regexes. preg_replace replaces every match in the input by default. Chad, remove the g flag and it should work.

inhan Over a year ago

Sorry, I confused JavaScript and PHP flags :) Editing my post. Thanks @TurdPile

qwazix · Accepted Answer · 2012-09-19 08:34:18Z

2

$text = preg_replace('#<p>&nbsp;</p>#i','<p></p>', $text);

worked for me, as the variable contains the actual string "&nbsp;" and not the non-breaking space Unicode character. Thus neither #.#i worked, nor did copying the non-breaking space character from the character map.
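
The distinction drawn here, between the six-character entity &nbsp; and the actual non-breaking space character, can be illustrated with a short sketch. It assumes UTF-8 data; the variable names and the combined pattern are illustrative, not part of the answer.

```php
<?php
$entity = '<p>&nbsp;</p>';    // literal text: & n b s p ;
$char   = "<p>\xC2\xA0</p>";  // UTF-8 bytes of U+00A0

// The two strings differ, so a search for one never finds the other.
var_dump($entity === $char);  // bool(false)

// Decoding the entity produces the character form.
$decoded = html_entity_decode($entity, ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump($decoded === $char); // bool(true)

// One pattern that accepts either form, plus ordinary whitespace.
$pattern = '~<p>(?:\s|&nbsp;|\x{00A0})*</p>~iu';
$result  = preg_replace($pattern, '', $entity . $char . '<p>x</p>');
echo $result, "\n";           // <p>x</p>
```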

edited Sep 19, 2012 at 8:34

answered Sep 18, 2012 at 14:30

qwazix


Jimski · Accepted Answer · 2016-01-01 14:04:43Z

1

The answers above won't work if the <p> tag has any inline attributes.

Here is a regex to catch it:

#<p[^>]*>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#
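
A sketch of how a pattern along these lines might be applied. Note that this is a slightly tightened variant rather than the answer's exact regex: it requires a closing </p> at the end and whitespace after <p before any attributes. The function name stripEmptyParagraphs and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper; the pattern is a variation on the regex above.
function stripEmptyParagraphs(string $html): string
{
    // <p ...>   opening tag with optional attributes
    // (...)*    any mix of whitespace, &nbsp; entities and <br> tags
    // </p>      closing tag
    $pattern = '#<p(?:\s[^>]*)?>(?:\s|&nbsp;|<br\s*/?>)*</p>#i';

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

$in = '<p style="text-align:center">&nbsp;<br /></p><p class="x">Hi</p><p> </p>';
echo stripEmptyParagraphs($in), "\n"; // <p class="x">Hi</p>
```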

answered Jan 1, 2016 at 14:04

Jimski


ajwilco · Accepted Answer · 2017-07-27 14:21:56Z

0

None of the given answers were working for me, but here's what did:

$str = str_replace('&lt;p&gt;&nbsp;&lt;/p&gt;', '', $str);

Definitely not the most correct way to do things. But if you're working with (against) TinyMCE, specifically inside of SuiteCRM, this should help.
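
The search string above suggests the markup is stored HTML-escaped (&lt;p&gt; rather than <p>). Under that assumption, a regex can be written against the escaped form directly so that whitespace variations are caught as well. A minimal sketch with made-up sample data:

```php
<?php
// Assumption: the field holds escaped markup, as in the answer's search string.
$stored = 'Line one&lt;p&gt;&nbsp;&lt;/p&gt;&lt;p&gt; &lt;/p&gt;Line two';

// Same idea as the literal replace, but tolerant of spaces and repeated &nbsp;.
$cleaned = preg_replace('~&lt;p&gt;(?:\s|&nbsp;)*&lt;/p&gt;~i', '', $stored);

echo $cleaned, "\n"; // Line oneLine two
```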

answered Jul 27, 2017 at 14:21

ajwilco


Patrick · Accepted Answer · 2012-07-20 02:59:09Z

-1

Try this

$string="a bunch of text with <p> </p> in it";

$string=str_replace("/<p> <\/p>/","",$string);

Note a couple things: the forward slashes before and after the string to match, as well as the escaping backslash before the forward slash in the second paragraph tag.
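
For comparison, a small sketch of how the two functions treat the search argument. str_replace looks for the text literally, slashes and backslash included, while preg_replace reads the slashes as pattern delimiters.

```php
<?php
$string = "a bunch of text with <p> </p> in it";

// str_replace treats its first argument as plain text, so the delimiters
// and backslash would have to appear in the subject for a match.
$viaStr = str_replace("/<p> <\/p>/", "", $string);
var_dump($viaStr === $string); // bool(true): nothing was removed

// preg_replace interprets the same text as a delimited pattern.
$viaPreg = preg_replace('/<p> <\/p>/', '', $string);
echo $viaPreg, "\n"; // a bunch of text with  in it
```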

answered Jul 20, 2012 at 2:59

Patrick


3 Comments

TurdPile Over a year ago

This is a poor example for this. Using regular expressions with preg_replace is the way to go.

Patrick Over a year ago

@TurdPile I didn't say it was a good example, but the OP's question was regarding the regex method, not the merits of str_replace. I personally use preg_replace as well.

TurdPile Over a year ago

The problem is that you missed two things that would make that work. First, you should be using \s+ instead of a literal space. Second, the pattern has to go through preg_replace, which replaces every match by default, so no g flag is needed for a global replacement.
