4

I want to strip invalid characters from a string with js.

My regex currently is as below:

var newString = oldString.replace(/([^a-z0-9 ]+)/gi, '');

i.e. find anything except a-z, 0-9 and spaces, regardless of case, and replace it with nothing. However, I also want to allow underscore (_), hyphen (-) and dot (.).

I tried updating my regex as below, but it is not working as expected: after the change, I found that brackets () in strings were no longer being stripped.

var newString = oldString.replace(/([^a-z0-9 .-_]+)/gi, '');

Am I missing something simple?

2 Comments
  • I always use regex101 to test any assumptions. I've made it multi-line for the sake of the example: regex101.com/r/xY1aL3/1 Commented May 6, 2015 at 12:02
  • @benembery - thanks for the link - really useful - never came across it before Commented May 6, 2015 at 12:08

4 Answers

8
var newString = oldString.replace(/([^a-z0-9 ._-]+)/gi, '');

                                               ^^

Keep the - at the end of the character class, because a - placed between two characters inside [] forms a range. In your pattern, .-_ is read as the range from . to _. Alternatively, you can escape the hyphen:

var newString = oldString.replace(/([^a-z0-9 ._\-]+)/gi, '');
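To see the accidental range in action, here is a small sketch. The sample string is hypothetical; the two patterns are the ones discussed here, minus the capture group. In ASCII, .-_ spans 0x2E to 0x5F, which covers characters such as @, ?, [ and ], while the hyphen itself (0x2D) falls just outside it.

```javascript
// Hypothetical sample input, not taken from the question.
const oldString = 'my_file-v1.0 [draft] <a@b>?';

// Broken: ".-_" is parsed as a range from "." (0x2E) to "_" (0x5F),
// so "[", "]", "<", ">", "@" and "?" are allowed and "-" is not.
const broken = oldString.replace(/[^a-z0-9 .-_]+/gi, '');

// Fixed: the hyphen comes last, so it is treated as a literal.
const fixed = oldString.replace(/[^a-z0-9 ._-]+/gi, '');

console.log(broken); // my_filev1.0 [draft] <a@b>?
console.log(fixed);  // my_file-v1.0 draft ab
```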

0

You have to escape a dot and hyphen:

var newString = oldString.replace(/([^a-z0-9 \.\-_]+)/gi, '');

3 Comments

You do not have to escape a dot in character classes.
Actually dots have no meaning in character ranges. The dash is the culprit.
Within [] you don't need to escape.

0

Use backslash to escape . - _

This should work:

.replace(/([^a-z0-9 \.\_\-]+)/gi, '');

Also, you can use \w to represent letters, numbers and underscore:

[a-zA-Z0-9_] == \w
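As a sketch of how that shorthand could be applied to the original problem (the sample input is hypothetical): since \w already covers both letter cases, digits and underscore, the class only needs the space, the dot and a trailing literal hyphen, and the i flag becomes unnecessary.

```javascript
// Hypothetical sample input, for illustration only.
const input = 'Report_2015-05.txt (final)!';

// \w stands in for [A-Za-z0-9_]; the hyphen is last so it stays literal.
const cleaned = input.replace(/[^\w .-]+/g, '');

console.log(cleaned); // Report_2015-05.txt final
```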

6 Comments

Within [] you don't need to escape . - _
I think maybe you do for . and hyphen.
Hmm, a nice point made by vks below! A bare - may be read as a range, so you might need \ for - but not for .
I agree with you on that one, @Wishy.
Not according to popular opinion or reality even, but ok if you say so... w3schools.com/jsref/jsref_regexp_wordchar.asp
0

Put the literal dash last in the character class, or escape it with a backslash. Right now it's allowing an ASCII range of . to _.

var newString = oldString.replace(/[^a-z0-9 ._-]+/gi, '');

Side note: you don't need the parentheses unless you're storing the match for something (and if the parentheses cover the whole match you don't need them then either, because $& in a JavaScript replacement string already refers to the entire match).
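A short sketch of that side note, using a hypothetical input: in a JavaScript replacement string the whole match is available as $&, so no capture group is needed even when the matched text is reused.

```javascript
// Hypothetical input; wraps each run of disallowed characters in
// square brackets instead of deleting it, without any capture group.
const marked = 'a#b!!c'.replace(/[^a-z0-9 ._-]+/gi, '[$&]');

console.log(marked); // a[#]b[!!]c
```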
