2

I am loading page contents into the variable $content.

I need to strip HTML comments from $content using a regular expression. I tried the following code, but it is not working properly:

$content = preg_replace('/<!--(.|\)*?-->/', '', $content);

3 Answers

6

It looks like you are missing the s after the backslash: (.|\) should be (.|\s).

 $content = preg_replace( '/<!--(.|\s)*?-->/' , '' , $content );

You can test it here http://www.phpliveregex.com/p/1LX


4 Comments

Your solution worked as expected. Thanks.
Any idea why this line of code is causing a "500 Internal Server Error" on my end?
This worked for me: $html = preg_replace("~<!--(.*?)-->~s", "", $html);
It is a very poor, resource-consuming pattern that will fail with longer input.
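One possible reason such failures go unnoticed: when PCRE runs into its backtracking or stack limits, preg_replace returns null instead of the stripped string, so $content silently becomes null. A minimal sketch of guarding against that, assuming a hypothetical helper name strip_html_comments (not from the posts above) and using the simpler /s pattern suggested in the comments:

```php
<?php
// Hypothetical helper (name is an assumption): strips HTML comments and
// falls back to the original input if PCRE reports an error.
function strip_html_comments(string $content): string
{
    $result = preg_replace('/<!--.*?-->/s', '', $content);

    if ($result === null) {
        // preg_last_error() returns a PREG_* constant such as
        // PREG_BACKTRACK_LIMIT_ERROR; log it and keep the input intact.
        error_log('preg_replace failed, PCRE error code: ' . preg_last_error());
        return $content;
    }

    return $result;
}

echo strip_html_comments("<p>a</p><!-- note\nmore --><p>b</p>");
// prints <p>a</p><p>b</p>
```

Whether this is what caused the 500 error mentioned above cannot be told from the thread; checking for null is simply a cheap way to rule it out.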
4

Your backslash is escaping the closing ), so the group is never closed. I'm not sure what you intended (.|\) to do; why not just use .*? and remove the capture group entirely?

Also, you want to set the s modifier to make . match newlines.

Revised code

$content = preg_replace('/<!--.*?-->/s', '', $content);
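To illustrate what the s modifier changes, here is a small sketch with a made-up sample string (the input is an assumption for demonstration only). Without s, a comment that spans several lines is left in place:

```php
<?php
$content = "<p>keep</p><!-- single line --><!--\n  multi\n  line\n--><p>also keep</p>";

// Without s, . does not match "\n", so the multi-line comment survives.
$withoutS = preg_replace('/<!--.*?-->/', '', $content);

// With s, . matches newlines too, so both comments are removed.
$withS = preg_replace('/<!--.*?-->/s', '', $content);

var_dump($withoutS === "<p>keep</p><!--\n  multi\n  line\n--><p>also keep</p>"); // bool(true)
var_dump($withS === "<p>keep</p><p>also keep</p>");                              // bool(true)
```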

http://php.net/manual/en/reference.pcre.pattern.modifiers.php
http://www.regular-expressions.info/


0

Use this:

You have to escape ! because it is part of the regular expression syntax, and you also need the s modifier so that . matches newlines, in case a comment spans more than one line. The U (ungreedy) modifier makes the match as short as possible, which matters when there are multiple comments. Works perfectly.

$content = preg_replace('/<\!--.*-->/Us', '', $content);

2 Comments

The ! does not need to be escaped. While the U modifier is an alternative, the OP has already made the regex match in an ungreedy fashion by using *?.
I said it for my own code
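For comparison, a short sketch of the variants discussed in this thread, run against a made-up sample string (an assumption, not the OP's data). The lazy quantifier and the U modifier give the same result here, the escaped ! makes no difference, and a plain greedy .* swallows everything between the first <!-- and the last -->:

```php
<?php
$content = "a<!-- one -->b<!-- two -->c";

$lazy     = preg_replace('/<!--.*?-->/s', '', $content);  // lazy quantifier
$ungreedy = preg_replace('/<!--.*-->/Us', '', $content);  // U flips greediness
$escaped  = preg_replace('/<\!--.*-->/Us', '', $content); // \! is just a literal !
$greedy   = preg_replace('/<!--.*-->/s', '', $content);   // greedy: first <!-- to last -->

var_dump($lazy);     // string(3) "abc"
var_dump($ungreedy); // string(3) "abc"
var_dump($escaped);  // string(3) "abc"
var_dump($greedy);   // string(2) "ac"
```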
