
I've searched extensively but couldn't find a solution.

This is my current text:

Lorem ipsum <strong>dolor</strong> sit <i>amet</i>.

This is what I want:

Lorem ipsum sit.

I do not want to use an HTML parser. I just want to use a simple regex to remove HTML tags and their inner content.

3 Answers


This regular expression matches a pair of HTML tags together with the text between them. In PHP, preg_replace() replaces every match, so no separate global flag is needed.

<[\/\!]*?[^<>]*?>[A-Za-z0-9.,;:]*<[\/\!]*?[^<>]*?>
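As a rough sketch of how this pattern might be applied in PHP: the helper name `removeTagsWithContent` and the two whitespace cleanup steps are assumptions added for illustration, not part of the answer. Removing the matches alone leaves a doubled space and a space before the period, so the sketch tidies those up afterwards.

```php
<?php
// Hypothetical helper; the name and the cleanup steps are assumptions, not part of the answer.
function removeTagsWithContent(string $html): string
{
    // The answer's pattern: a tag, a run of simple characters, then another tag.
    $pattern = '~<[\/\!]*?[^<>]*?>[A-Za-z0-9.,;:]*<[\/\!]*?[^<>]*?>~';

    // preg_replace() replaces every match in the subject string.
    $stripped = preg_replace($pattern, '', $html);

    // Removing the matches leaves doubled spaces and a space before punctuation.
    $collapsed = preg_replace('/\s+/', ' ', $stripped);
    return trim(preg_replace('/\s+([.,;:!?])/', '$1', $collapsed));
}

echo removeTagsWithContent("Lorem ipsum <strong>dolor</strong> sit <i>amet</i>."), PHP_EOL;
// prints: Lorem ipsum sit.
```

One caveat: the character class between the two tags does not include whitespace, so an element whose content contains a space (for example `<b>two words</b>`) is not matched and is left untouched.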

4 Comments

strip_tags just unwraps the content. I want the content gone as well.
Your question has already been answered here stackoverflow.com/questions/1516085/…
Both answers use HTML parsers, something I don't want to use.
Edited the answer with a regular expression for you.

Though @Tommy's answer works for you, that regex is more complicated than it needs to be for this input. As long as every tag is preceded by a space and the tagged content contains no whitespace, you can simply do this:

$str = "Lorem ipsum <strong>dolor</strong> sit <i>amet</i>.";

$r = preg_replace("/ <\S*>/", "", $str);

echo $r;
#=> Lorem ipsum sit.
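If the tagged content can contain spaces, the short pattern above stops matching partway through and leaves part of the markup behind. One possible alternative, which is a different technique from the one in this answer, pairs each opening tag with its closing tag through a backreference. The sketch below assumes well-formed, non-nested elements with matching tag case; the function name `removeElements` is hypothetical.

```php
<?php
// Hypothetical helper; assumes simple, non-nested elements such as <b>...</b>.
function removeElements(string $html): string
{
    // \s*                 optional whitespace before the element
    // <([a-z][a-z0-9]*)   opening tag name, captured as group 1
    // \b[^>]*>            optional attributes up to the end of the opening tag
    // .*?                 the inner content, matched lazily
    // </\1>               a closing tag with the same name as group 1
    // Flags: i = case-insensitive, s = dot also matches newlines.
    $pattern = '~\s*<([a-z][a-z0-9]*)\b[^>]*>.*?</\1>~is';

    return preg_replace($pattern, '', $html);
}

echo removeElements("Lorem ipsum <strong>dolor</strong> sit <i>amet</i>."), PHP_EOL;
// prints: Lorem ipsum sit.

echo removeElements("Keep <b>two words</b> this."), PHP_EOL;
// prints: Keep this.
```

This still is not a parser: nested elements with the same tag name, self-closing tags, and comments are outside what this sketch handles.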

1 Comment

Nice. I just took a regexp from the notes on the documentation for strip_tags(). This looks way nicer.
preg_replace('/(<.*?>)|(&.*?;)/', '', $string)

This one works pretty well for me. It strips all HTML tags and HTML entities (such as &amp;), but note that it keeps the text between the tags. Hope this helps.
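To make the behaviour of this pattern concrete, here is a small sketch run against the question's input. The wrapper name `stripTagsAndEntities` is an assumption for illustration only.

```php
<?php
// Hypothetical wrapper around the answer's pattern; the function name is an assumption.
function stripTagsAndEntities(string $string): string
{
    // <.*?> matches each tag lazily; &.*?; matches from an ampersand to the next semicolon.
    return preg_replace('/(<.*?>)|(&.*?;)/', '', $string);
}

// Tags are removed, but the text between them stays.
echo stripTagsAndEntities("Lorem ipsum <strong>dolor</strong> sit <i>amet</i>."), PHP_EOL;
// prints: Lorem ipsum dolor sit amet.

// Entities are removed as a whole.
echo stripTagsAndEntities("Fish&amp;chips"), PHP_EOL;
// prints: Fishchips
```

Because the inner text survives, this behaves much like strip_tags() plus entity removal rather than producing the "Lorem ipsum sit." result asked for in the question. The entity branch also matches from any bare ampersand to the next semicolon, so ordinary text between them would be removed too.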
