I'm using the following regular expression with preg_replace to strip the string of any punctuation:

$string = preg_replace("#((?!-|')\pP)+#", '', $string);

But I realized that it corrupts some Unicode characters. When the string is something like "höpöttää?!...", I get back "h�p�ttää": the punctuation is gone, but some of the characters are ruined.

I read the PHP documentation and found advice to use the `...`u modifier. So I tried this:

$string = preg_replace("`#((?!-|')\pP)+#`u", '', $string);

That did fix the problem with the characters, but now it no longer removes the punctuation. With the string "höpöttää?!...", I get back the same "höpöttää?!...".

1 Answer

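I don't know what the backticks are doing there. PHP takes the first character of the pattern string as the delimiter, so with backticks as delimiters the two # characters become literal parts of the pattern, and it only matches a run of punctuation enclosed in # characters. Drop the backticks and keep the u modifier: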

$string = preg_replace("#(?![-'])\pP#u", '', $string);

or

$string = preg_replace("#[^-'\PP]#u", '', $string);
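To see the first pattern in action, here is a minimal sketch. The function name stripPunctuation is made up for illustration, and falling back to the input when preg_replace returns null (for example on invalid UTF-8) is my own choice, not something from the question. It assumes the source file and the input are UTF-8.

```php
<?php
// Hypothetical helper; only the pattern comes from the answer above.
function stripPunctuation(string $s): string
{
    // '#' is the delimiter and 'u' makes PCRE treat pattern and subject as UTF-8.
    // (?![-']) refuses to match at a hyphen or apostrophe; \pP matches one punctuation character.
    $result = preg_replace("#(?![-'])\pP#u", '', $s);

    // preg_replace returns null on failure (e.g. malformed UTF-8);
    // returning the input unchanged in that case is an assumption of this sketch.
    return $result ?? $s;
}

echo stripPunctuation("höpöttää?!..."), "\n";        // höpöttää
echo stripPunctuation("rock-'n'-roll, baby!"), "\n"; // rock-'n'-roll baby
```

As far as I can tell, the version without u mangles "ö" because PCRE then works byte by byte, and the second byte of "ö" (0xB6) on its own is read as "¶", which counts as punctuation.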
