0

I need help coming up with a regex that only allows letters, numbers, spaces, or the empty string.

^[_A-z0-9]*((-|\s)*[_A-z0-9])*$

This one is the closest I've found, but it also allows underscores and hyphens.

  • Can you just remove the underscores if it does everything else you want? ^[A-z0-9]*((-|\s)*[A-z0-9])*$ Commented Jan 27, 2020 at 17:49
  • I tried that but I am not sure how the syntax should look - do you know? Commented Jan 27, 2020 at 17:51
  • The syntax should be exactly the same except you take out _. Commented Jan 27, 2020 at 17:51
  • The hyphen is because of (-|\s)*. Why do you have that if you don't want to allow hyphens? Commented Jan 27, 2020 at 17:53
  • I don't understand how you could have written this regexp in the first place. Why would you put those characters in if you didn't want them? Did you just copy this from somewhere without actually understanding how it works? Commented Jan 27, 2020 at 17:54

3 Answers

3

Only letters, numbers, space, or empty string?
Then 1 character class will do.

^[A-Za-z0-9 ]*$
^ : start of the string or line (depending on the flag)
[A-Za-z0-9 ]* : zero or more upper-case or lower-case letters, or digits, or spaces.
$ : end of the string or line (depending on the flag) 
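
As a minimal sketch of how that pattern could be used for validation, here it is in Python (my choice of language, the question doesn't name one). The helper name is_allowed is made up for illustration. It uses re.fullmatch instead of explicit ^ and $, because Python's $ also matches just before a trailing newline.

```python
import re

# Same character class as the pattern above. fullmatch() anchors both ends,
# so the explicit ^ and $ are not needed here.
ALLOWED = re.compile(r"[A-Za-z0-9 ]*")


def is_allowed(text: str) -> bool:
    """Return True if text is empty or contains only ASCII letters, digits, and spaces."""
    return ALLOWED.fullmatch(text) is not None


print(is_allowed(""))            # True
print(is_allowed("abc 123"))     # True
print(is_allowed("snake_case"))  # False
print(is_allowed("a-b"))         # False
```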

The A-z range contains more than just letters.
In the ASCII table, the characters [ \ ] ^ _ ` sit between Z and a, so A-z matches them too.
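
If you'd rather check that than look it up, here is a small Python sketch (Python is just my pick for illustration) that lists the characters between Z and a and tests them against both ranges.

```python
import re

# Every character whose code point lies strictly between 'Z' and 'a'.
between = "".join(chr(c) for c in range(ord("Z") + 1, ord("a")))
print(between)  # [\]^_`

wide = re.compile(r"[A-z]")       # the range from the question
narrow = re.compile(r"[A-Za-z]")  # letters only

# The wide range accepts all six extra characters, the narrow one accepts none.
print([ch for ch in between if wide.fullmatch(ch)])
print([ch for ch in between if narrow.fullmatch(ch)])
```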

Note that \s matches any whitespace character, including tabs and line breaks, not just the space.
If you also want to allow those, use \s in place of the space.

^[A-Za-z0-9\s]*$
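
To make the difference between the two patterns concrete, here is a rough Python comparison (the variable names are only for illustration). It again uses fullmatch, so the anchors are left out.

```python
import re

space_only = re.compile(r"[A-Za-z0-9 ]*")   # literal space only
any_space = re.compile(r"[A-Za-z0-9\s]*")   # any whitespace

for sample in ["a b", "a\tb", "a\nb"]:
    # Prints the sample, then whether each pattern accepts it.
    print(repr(sample),
          space_only.fullmatch(sample) is not None,
          any_space.fullmatch(sample) is not None)
# 'a b' True True
# 'a\tb' False True
# 'a\nb' False True
```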

Also, [A-Za-z] only covers the plain ASCII letters.
Depending on the regex engine/dialect that your language/tool uses, you could use \p{L} instead to match any Unicode letter.


1

Your regex is more complicated than it needs to be. The first part is the right idea: it allows letters and digits, and you can simply add the space character to that same class.

Then the * quantifier, which means zero or more, takes care of the empty string case:

/^[a-z0-9 ]*$/gmi

Notice that I'm not using A-z like you were, because that range covers every ASCII character from A (octal 101, decimal 65) to z (octal 172, decimal 122). That means it also matches the characters in between (octal 133 to 140, i.e. [ \ ] ^ _ `), which are neither digits nor letters. I've used a-z instead, which allows lowercase letters, together with the i flag, which makes the match case-insensitive.
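
For comparison, here is roughly how the same idea could look in Python, as a sketch and not a direct port. The g and m flags are dropped because fullmatch checks one whole string. The re.ASCII flag is my own addition: without it, Python's case-insensitive [a-z] also matches a few non-ASCII letters such as the Kelvin sign.

```python
import re

# a-z plus IGNORECASE covers both cases. ASCII keeps the match to plain
# A-Z / a-z instead of Unicode case-folded look-alikes.
pattern = re.compile(r"[a-z0-9 ]*", re.IGNORECASE | re.ASCII)

print(pattern.fullmatch("Hello World 42") is not None)  # True
print(pattern.fullmatch("") is not None)                # True
print(pattern.fullmatch("under_score") is not None)     # False
```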

1

Matching only certain characters is equivalent to not matching any other character, so you could use the regex r = /[^a-z\d ]/i to determine if the string contains any character other than the ones permitted. In Ruby that would be implemented as follows.

"aBc d01e e$9" !~ r #=> false
"aBc d01e ex9" !~ r #=> true

In this situation there may not be much to choose between this approach and attempting to match /\A[a-z\d ]*\z/i, but in other situations the use of a negative match can simplify the regex considerably.
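
The negative-match idea is not specific to Ruby. As a hedged sketch, the same check could look like this in Python. The name only_permitted is hypothetical, and re.ASCII is added so that \d means 0-9 only.

```python
import re

# Matches any single character that is NOT an ASCII letter, digit, or space.
FORBIDDEN = re.compile(r"[^a-z\d ]", re.IGNORECASE | re.ASCII)


def only_permitted(text: str) -> bool:
    """Return True if no forbidden character occurs anywhere in text."""
    return FORBIDDEN.search(text) is None


print(only_permitted("aBc d01e e$9"))  # False
print(only_permitted("aBc d01e ex9"))  # True
print(only_permitted(""))              # True
```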
