1

TinyMCE creates empty paragraph tags when you hit Enter twice, like:

<p> </p>

Which is <p>SPACE</p>

Firebug shows this space as "&nbsp;", but the HTML source and the DB backend just show a space. When I do "str_replace('<p> </p>'....." it doesn't find the block. I think the "space" is not a standard space but some kind of differently encoded space character. Is there a regex I can run that will remove this tag? I've been stuck on this for hours. Even something like

regex('<p>LESS THAN THREE CHARS</p>'...)

would probably work

Thank you

4
  • 1
    regex('<p>.</p>'...) a period may work for this character? Commented Jul 20, 2012 at 2:56
  • A non-breaking space does have a different code point. Can you loop through the string and print the numeric value of each character? Commented Jul 20, 2012 at 3:01
  • Ok, when I do utf8_encode(<p> </p>) I am getting <p>Â </p>...turns out the DB was ISO, but now has been changed to UTF8 encoding... so now how do I get rid of this garbage data? Commented Jul 20, 2012 at 4:50
  • possible duplicate of How to remove empty paragraph tags from string? Commented Jan 14, 2014 at 14:03
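Following up on the suggestion in the comments to look at the numeric values, here is a minimal PHP sketch that prints the raw bytes of a string as hex. The helper name dump_bytes is made up for illustration, and the three sample strings are assumptions about what the stored data might look like, not values taken from the asker's database.

```php
<?php
// Hypothetical helper (not from the thread): returns the bytes of a string
// as space-separated hex pairs.
function dump_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Assumed sample inputs, one per way the "space" might be stored.
$plain  = "<p> </p>";        // ordinary space, byte 0x20
$latin1 = "<p>\xA0</p>";     // non-breaking space in ISO-8859-1, single byte 0xA0
$utf8   = "<p>\xC2\xA0</p>"; // non-breaking space (U+00A0) in UTF-8, bytes 0xC2 0xA0

echo dump_bytes($plain), "\n";  // 3c 70 3e 20 3c 2f 70 3e
echo dump_bytes($latin1), "\n"; // 3c 70 3e a0 3c 2f 70 3e
echo dump_bytes($utf8), "\n";   // 3c 70 3e c2 a0 3c 2f 70 3e
```

If the byte between the tags is 20, it is an ordinary space; a0 suggests a Latin-1 non-breaking space; c2 a0 suggests a UTF-8 one. A c2 byte displayed as ISO-8859-1 renders as "Â", which would be consistent with the <p>Â </p> output mentioned above.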

5 Answers

5

I would use:

$str = preg_replace('~<p>\s*<\/p>~i','',$str);

where \s matches any whitespace character (tab, space, etc.) and * means zero or more occurrences of it. So <p></p>, <p> </p>, and <p>{multiple spaces here}</p> will all be replaced by an empty string. The i flag makes the match case-insensitive, in case a <p> is written as <P>.
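Since the question suspects a non-breaking space rather than a plain one, here is a hedged variant of the same idea that also covers the U+00A0 character and the literal "&nbsp;" entity. The function name remove_empty_paragraphs is hypothetical, and the sketch assumes the input is valid UTF-8 (the u flag requires that).

```php
<?php
// Hypothetical helper: removes <p> elements that contain only whitespace,
// non-breaking spaces, or &nbsp; entities. Assumes UTF-8 input.
function remove_empty_paragraphs(string $html): string
{
    // \s        ordinary whitespace
    // \x{00A0}  the non-breaking space character (needs the u flag)
    // &nbsp;    the entity written out as text
    $result = preg_replace('~<p>(?:\s|\x{00A0}|&nbsp;)*</p>~iu', '', $html);

    // preg_replace() returns null on error (for example invalid UTF-8
    // with the u flag); fall back to the unchanged input in that case.
    return $result ?? $html;
}

$html = "<p>a</p><p> </p><p>\xC2\xA0</p><p>&nbsp;</p><P></P><p>b</p>";
echo remove_empty_paragraphs($html), "\n"; // <p>a</p><p>b</p>
```

If the data is still ISO-8859-1, the u flag will reject it, so it would need converting first or a byte-level pattern such as \xA0 without the u flag.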


3 Comments

no dice... Warning: preg_replace() [<a href='function.preg-replace'>function.preg-replace</a>]: Unknown modifier 'g'
@inhan There is no g flag in PHP; that is a JavaScript regex flag. preg_replace automatically replaces every match in the input. Chad, remove the g flag and it should work.
Sorry, I confused JavaScript and PHP flags :) Editing my post. Thanks @TurdPile
2
$text = preg_replace('#<p>&nbsp;</p>#i','<p></p>', $text);

worked for me, because the variable contains the literal string "&nbsp;" and not the non-breaking space Unicode character. So neither #<p>.</p>#i nor pasting the non-breaking space character from the character map worked.


1

The answers above won't work if the <p> tag has any attributes, such as <p style="font-weight:bold">.

Here is a regex to catch it:

#<p[^>]*>(\s|&nbsp;|</?\s?br\s?/?>)*</p>#
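As a rough sketch of how a pattern like this could be applied with preg_replace, here is a slightly stricter variant. The function name strip_blank_paragraphs is hypothetical, and the \b after <p (so that tags like <pre> are not matched) and the i flag are my own additions rather than part of the answer.

```php
<?php
// Hypothetical helper: removes <p ...> elements, with or without attributes,
// whose content is only whitespace, &nbsp; entities, or <br> tags.
function strip_blank_paragraphs(string $html): string
{
    $pattern = '#<p\b[^>]*>(?:\s|&nbsp;|</?br\s*/?>)*</p>#i';
    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on error; keep the input in that case.
    return $result ?? $html;
}

$html = '<p style="font-weight:bold"> <br /> &nbsp;</p><p class="x">text</p><pre> </pre>';
echo strip_blank_paragraphs($html), "\n"; // <p class="x">text</p><pre> </pre>
```

This version does not cover the raw non-breaking space character; that would need \x{00A0} and the u flag on UTF-8 input.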


0

None of the given answers were working for me, but here's what did:

$str = str_replace('&lt;p&gt;&nbsp;&lt;/p&gt;', '', $str);

Definitely not the most correct way to do things. But if you're working with (against) TinyMCE, specifically inside of SuiteCRM, this should help.
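The str_replace above only matches one exact string. If the content really is stored entity-encoded like that, a regex over the encoded form might be a bit more forgiving. This is only a sketch under that assumption; the function name strip_encoded_blank_paragraphs is made up, and I have not tested it against SuiteCRM itself.

```php
<?php
// Hypothetical helper: removes entity-encoded empty paragraphs such as
// "&lt;p&gt;&nbsp;&lt;/p&gt;" or "&lt;p&gt;  &lt;/p&gt;".
function strip_encoded_blank_paragraphs(string $text): string
{
    $result = preg_replace('~&lt;p&gt;(?:\s|&nbsp;)*&lt;/p&gt;~i', '', $text);

    // preg_replace() returns null on error; keep the input in that case.
    return $result ?? $text;
}

$stored = 'Hello&lt;p&gt;&nbsp;&lt;/p&gt;World&lt;p&gt;  &lt;/p&gt;';
echo strip_encoded_blank_paragraphs($stored), "\n"; // HelloWorld
```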


-1

Try this

$string="a bunch of text with <p> </p> in it";

$string=str_replace("/<p> <\/p>/","",$string);

Note a couple things: the forward slashes before and after the string to match, as well as the escaping backslash before the forward slash in the second paragraph tag.
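For anyone comparing the two functions, here is a small sketch of how they treat that search string. str_replace() does plain substring matching, so the delimiters and the backslash are searched for literally, while preg_replace() treats them as regex syntax and replaces every match by default (no g flag needed). The variable names are just for illustration.

```php
<?php
$string = "a bunch of text with <p> </p> in it";

// str_replace() looks for the literal text "/<p> <\/p>/", slashes included,
// which does not occur in $string, so nothing is replaced.
$literal = str_replace("/<p> <\/p>/", "", $string);

// preg_replace() reads the slashes as delimiters and \/ as an escaped slash.
$regex = preg_replace("/<p> <\/p>/", "", $string);

// str_replace() with the bare substring also works for an ordinary space.
$plain = str_replace("<p> </p>", "", $string);

var_dump($literal === $string); // bool(true)
var_dump($regex === $plain);    // bool(true)

// preg_replace() replaces all occurrences unless a limit is passed.
echo preg_replace("/<p> <\/p>/", "", "<p> </p>x<p> </p>"), "\n"; // x
```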

3 Comments

This is a poor example for this. Using regular expressions with preg_replace is the way to go.
@TurdPile I didn't say it was a good example, but the OP's question was regarding the regex method, not the merits of str_replace. I personally use preg_replace as well.
The problem is that you missed two big features that would make that work: first, you should be using \s+ instead of a space; second, you should use the g flag to signify a global replacement, otherwise it will replace only the first match it comes across.
