0

Regular expressions are not my strong point. I can do simple stuff, but this one has really got my goat! Could someone give me a hand with it?

Here's the comment in the code:

// If utf8 detection didnt work before, strip those weird characters for an underscore, as a last resort.

eregi_replace("[^a-z0-9 \-\.\(\)\/\\]","_",$str);

I'm trying to convert that call to preg_replace. Here's what I tried:

preg_replace("{[^a-z0-9 \-\.\(\)\/\\]}i","_",$str);

Any regex pros out there who can give me a hand?

2
  • Never mind, I got it. It becomes preg_replace("{[^a-z0-9]\-\.()\/\\/}i","_",$str) Commented Oct 20, 2011 at 19:30
  • I'd be careful, though - I don't think the eregi_replace and your proposed preg_replace expressions are anywhere near equivalent. I would recommend testing it thoroughly; then, if you still feel it is the answer, post it as an answer and accept it. Commented Oct 21, 2011 at 14:25

3 Answers

1

You need to specify a regex delimiter such as # or /

preg_replace("#[^a-z0-9 \-\.\(\)\/\\]#i","_",$str);

So you should enclose your regular expression in those delimiter characters.


1 Comment

PHP has supported both types of delimiter for a while, so that is not the problem with the expression, though it's probably not a bad idea to change them anyway!
1

First, I believe { and } are fine as delimiters separating the expression from the flags, but I know some regex flavors don't support them, so it might be a good idea to just use something like ! or #.
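A minimal sketch of the delimiter point, assuming a made-up input string and a simplified character class (neither is from the question). It wraps the same class in three delimiter styles and checks that the results agree:

```php
<?php
// Hypothetical input, chosen only for illustration.
$str = 'report (final) v2!.txt';

// The same character class wrapped in three different delimiter styles.
$patterns = [
    '{[^a-z0-9 .()]}i',  // bracket-style delimiters
    '#[^a-z0-9 .()]#i',
    '![^a-z0-9 .()]!i',
];

$results = [];
foreach ($patterns as $pattern) {
    $results[] = preg_replace($pattern, '_', $str);
}

// Prints bool(true) when every delimiter style produced the same string.
var_dump(count(array_unique($results)) === 1);
```

If the results match, the delimiter choice is a matter of taste. The main practical concern is picking one that does not appear inside the pattern itself.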

Second, I am not sure how the original expression worked, because AFAIK escaping with a \ character does not work inside ERE bracket expressions. You have to represent special characters like ^, -, and ] by their position within the class (^ cannot be the first character, ] must be the first character, and - must be either the first or the last character). The - character in the first expression would be interpreted as a range specifier (in this case, the range from \ to \). Additionally, the \ characters are treated literally, so you've got a confusing-looking and largely redundant regex.

The replacement expression, however, needs to be in preg notation/flavor, so there are rule changes:

  • Very few things need to be escaped in a character class, even with the new rules
  • The \ character needs to be escaped twice - once for the string, and then one more time for the regex - otherwise, it will escape the closing bracket ]
  • Assuming you want to match a dash (or rather, match something OTHER than a dash), it needs to be moved to the end of the class
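The backslash rule above can be sketched roughly as follows. The input string is hypothetical, and the @ operator is used here only to keep the compile warning out of the output:

```php
<?php
// Hypothetical input, used only for illustration. The actual string is: a\b-c?d
$str = 'a\\b-c?d';

// Double-quoted: "\\]" reaches PCRE as \] which escapes the closing bracket,
// so the character class never ends and the pattern fails to compile.
$broken = "{[^a-z0-9 \-\.\(\)\/\\]}i";
$bad = @preg_replace($broken, '_', $str);  // @ suppresses the compile warning
var_dump($bad);                             // NULL

// Single-quoted: four backslashes become two in the string, which PCRE
// reads as one literal backslash. The dash sits at the end of the class.
$fixed = '{[^a-z0-9 .()/\\\\-]}i';
$good = preg_replace($fixed, '_', $str);
var_dump($good);                            // only the ? is replaced
```

A NULL return from preg_replace is the signal that the pattern itself was rejected, which is worth checking for before trusting the result.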

So, here is some code that I believe does what you need it to do:

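// Sample input with a mix of allowed and disallowed characters.
$source = 'hello! @#$%^&* wazzup-dawg?.()/\\[]{}<>:"';
// Four backslashes in a single-quoted string reach the regex as \\, i.e. one literal backslash.
$blah = preg_replace('![^a-z0-9 .()/\\\\-]!i','_',$source);

print($blah);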


0

preg_replace("{[^a-z0-9]-.()/\/}i","_",$str)

works just fine.

I tried it with #, /, and { as delimiters, and they all worked.
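In the spirit of the caution raised in the comments on the question, here is a quick equivalence check on a sample string. The input is hypothetical, and the second pattern is borrowed from another answer in this thread as an assumed stand-in for the intended behavior:

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$str = 'hello! wazzup?';

// The pattern from this answer. Outside the class, "-" is a literal dash,
// "." matches any character, "()" is an empty group, and "/\/" matches "//".
$a = preg_replace("{[^a-z0-9]-.()/\/}i", '_', $str);

// A class-only pattern, taken from another answer in this thread.
$b = preg_replace('{[^a-z0-9 .()/\\\\-]}i', '_', $str);

var_dump($a);
var_dump($b);
```

Both patterns compile without error, so comparing the two outputs on real data is the more telling test of whether a replacement pattern behaves like the original.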
