I am trying to write a Regex validator (Python 3.8) to accept strings like these:

foo
foo,bar
foo, bar
foo , bar
foo    ,      bar
foo, bar,foobar

This is what I have so far (but it matches only the first two cases):

^[a-zA-Z][0-9a-zA-Z]+(,[a-zA-Z][0-9a-zA-Z]+)*$|^[a-zA-Z][0-9a-zA-Z]+

However, when I add the whitespace match \w, it stops matching altogether:

^[a-zA-Z][0-9a-zA-Z]+(\w+,\w+[a-zA-Z][0-9a-zA-Z]+)*$|^[a-zA-Z][0-9a-zA-Z]+

What pattern should I use, and why does my second pattern above not match?

  • I assume this is a purely academic exercise because judicious use of split() and strip() would solve this problem with ease. Also, you say you're writing a "validator" which begs the question "What constitutes an invalid string?" Commented Aug 16, 2022 at 10:36
  • @Stuart this is part of a Django project - so to an extent it is academic - in the sense that I could simply override the save() method and do my checks there - but I like the idea of abstracting and encapsulating validation in another class, so it can be used on other fields and other models too. Valid strings are strings that can be split by a delimiter into a list of alphanumeric tokens. Commented Aug 16, 2022 at 10:47
  • Therefore the string 'foo' would be considered invalid as there's no apparent delimiter - is that right? Commented Aug 16, 2022 at 10:50
  • I'm simplifying things a bit here - but a single alphanumeric string is the exception (bad choice of word), to the general rule I gave previously. Commented Aug 16, 2022 at 10:53

1 Answer

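\w matches word characters ([0-9a-zA-Z_], plus Unicode letters and digits by default in Python 3); it does not match whitespace. To match whitespace, use \s instead.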

What you need is this regex:

^[a-zA-Z][0-9a-zA-Z]*(?:\s*,\s*[a-zA-Z][0-9a-zA-Z]*)*$


RegEx Details:

  • ^: Start
  • [a-zA-Z][0-9a-zA-Z]*: Match a text starting with a letter followed by 0 or more alphanumeric characters
  • (?:: Start non-capture group
    • \s*,\s*: Match a comma optionally surrounded by 0 or more whitespace characters on either side
    • [a-zA-Z][0-9a-zA-Z]*: Match a text starting with a letter followed by 0 or more alphanumeric characters
  • )*: End non-capture group. Repeat this group 0 or more times
  • $: End
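
As a minimal sketch of how this pattern could be used from Python, the snippet below builds the same regex and checks it against the strings from the question. The names TOKEN, PATTERN and is_valid_token_list are hypothetical, chosen only for illustration. It uses re.fullmatch rather than ^...$, on the assumption that a validator should also reject a trailing newline (in Python, $ can match just before a final "\n").

```python
import re

# Hypothetical names, for illustration only.
# A token is a letter followed by zero or more alphanumeric characters.
TOKEN = r"[a-zA-Z][0-9a-zA-Z]*"

# One token, then any number of ", token" groups with optional
# whitespace around each comma. No ^ or $ is needed with fullmatch().
PATTERN = re.compile(rf"{TOKEN}(?:\s*,\s*{TOKEN})*")


def is_valid_token_list(value: str) -> bool:
    """Return True if the whole string is a comma-separated token list."""
    return PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    samples = [
        "foo",
        "foo,bar",
        "foo, bar",
        "foo , bar",
        "foo    ,      bar",
        "foo, bar,foobar",
        "foo,",      # trailing comma
        "1foo",      # token starts with a digit
        "foo bar",   # missing comma
    ]
    for sample in samples:
        print(repr(sample), is_valid_token_list(sample))
```

The six strings from the question are accepted, while the last three samples are rejected. Note that leading or trailing whitespace around the whole string is also rejected by this sketch; call strip() first if that should be allowed.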