3

We have a regex that strips out non-alphanumeric characters except for '#', '&' and '-'. Here is what it looks like:

preg_replace('/[^a-zA-Z0-9#&-*]/', '', strtolower($title));

Now we need to support traditional Chinese strings, and the above call won't work for them. How can I implement similar functionality for traditional Chinese?

Thanks,

    So which Chinese characters are "alpha numeric"? Commented Aug 12, 2011 at 19:50

2 Answers

3

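Use the u modifier, which tells PCRE to treat both the pattern and the subject as UTF-8, and add the Chinese characters you want to keep to the class (诶 below is just an example):

preg_replace('/[^a-zA-Z0-9#&-*诶]/u', '', $string);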

By the way, don't use strtolower() here: it is not multibyte-aware and, depending on the PHP version and locale, can corrupt UTF-8 text. Use mb_strtolower() instead:

mb_strtolower($string, 'UTF-8');
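Listing every Chinese character by hand does not scale, so here is a minimal sketch that combines both points, assuming UTF-8 input and the mbstring extension. The function name strip_title() is made up for illustration. It swaps the hand-listed characters for the Unicode script property \p{Han}, which is my substitution and not something from the question. The hyphen is moved to the end of the class so it is a literal '-'; in the original pattern, &-* is read as a range from '&' to '*'.

```php
<?php
// Hypothetical helper: keeps ASCII letters and digits, '#', '&', '-',
// and Han (Chinese) characters; removes everything else.
function strip_title(string $title): string
{
    // Lowercase with a multibyte-aware function so UTF-8 bytes stay intact.
    $lower = mb_strtolower($title, 'UTF-8');

    // \p{Han} matches Han ideographs; the /u modifier makes PCRE treat
    // the pattern and the subject as UTF-8. The trailing '-' is literal.
    $clean = preg_replace('/[^a-z0-9#&\p{Han}-]/u', '', $lower);

    // preg_replace() returns null on error, e.g. for invalid UTF-8 input.
    return $clean ?? '';
}

echo strip_title('Hello, 世界! C#-&-PHP 繁體中文'), "\n";
// prints: hello世界c#-&-php繁體中文
```

Note that \p{Han} covers both traditional and simplified characters, so if you need to reject simplified forms you would have to filter those separately.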

0

Have you tried mb_ereg_replace() instead of preg_replace()? That might do the trick.

http://www.php.net/manual/en/function.mb-ereg-replace.php
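For what it's worth, a rough sketch of that approach could look like the following. It assumes UTF-8 input and that mbstring was built with its regex (Oniguruma) support, and that \p{Han} is available there; the function name strip_title_mb() is made up for illustration.

```php
<?php
// Hypothetical helper showing the mb_ereg_replace() variant.
function strip_title_mb(string $title): string
{
    // mbregex has its own encoding setting, separate from PCRE's /u modifier.
    mb_regex_encoding('UTF-8');

    $lower = mb_strtolower($title, 'UTF-8');

    // Unlike preg_replace(), the pattern takes no delimiters.
    // The escaped hyphen is a literal '-' inside the class.
    $clean = mb_ereg_replace('[^a-z0-9#&\p{Han}\-]', '', $lower);

    // mb_ereg_replace() returns false or null on failure.
    return is_string($clean) ? $clean : '';
}

echo strip_title_mb('Hello, 世界! C#-&-PHP'), "\n";
```

The main practical differences from preg_replace() are the missing delimiters and the separate mb_regex_encoding() call.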

2 Comments

ereg should be avoided in general, even if this particular one's in the mb_ section and not marked as deprecated. preg's the standard regex interface in PHP now, and ereg itself will vanish some day.
There is no preg_replace in the mb_-section though, which is why I suggested the ereg from there.
