I'd like to use the following regex in order to validate project version numbers:

(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$

DEMO

Valid inputs:

  • 1.0.0-SNAPSHOT
  • 1.0.0.RC
  • 1.0.0

I'm trying to use a script as follows:

#!/bin/bash

r=true;
p="(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$"
while [ $r == true ]; do
    echo "get_v: "
    read v;
    if [[ $v =~ $p ]]; then
        echo "ok";
        r=false
    else
        echo "nok"
    fi
done

But it prints nok even for valid inputs.

What am I doing wrong?

  • There are some examples on regex demo. Commented Oct 5, 2014 at 12:32
  • I've just updated the regex (but still doesn't work in bash) Commented Oct 5, 2014 at 12:36
  • somewhat related: Backus–Naur Form Grammar for Valid SemVer Versions Commented Oct 5, 2014 at 12:39
  • Where did you get that regex? Commented Oct 5, 2014 at 13:10
  • @vdenotaris Sorry, I once was blind and missed the link. The regex isn't right, as Bash does not support PCRE :] Commented Oct 5, 2014 at 14:01

2 Answers

Bash doesn't support PCRE (Perl-compatible regular expressions). Its =~ operator uses POSIX extended regular expressions (ERE), which have no look-arounds and no \d shorthand.
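
To make the ERE side concrete, here is a minimal sketch of how common PCRE shorthands can be written as POSIX bracket expressions inside [[ ... =~ ... ]]. The helper names is_digits and is_word are hypothetical, chosen only for this illustration, and the \w mapping assumes plain ASCII input.

```bash
#!/bin/bash

# POSIX bracket expressions that stand in for PCRE shorthands:
#   \d -> [0-9] or [[:digit:]]
#   \s -> [[:space:]]
#   \w -> [[:alnum:]_]   (assumption: ASCII-only input)
digits_re='^[[:digit:]]+$'
word_re='^[[:alnum:]_]+$'

# Hypothetical helpers: exit status 0 on match, 1 otherwise.
is_digits() { [[ $1 =~ $digits_re ]]; }
is_word()   { [[ $1 =~ $word_re ]]; }

is_digits 2014 && echo "2014: digits only"
is_digits 1.0  || echo "1.0: not digits only"
is_word RC_1   && echo "RC_1: word characters only"
```

Keeping the pattern in a variable and expanding it unquoted on the right-hand side of =~ avoids quoting surprises, because quoted parts of the pattern are matched literally.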

If you want to keep the PCRE pattern, you can check your string with grep -P, like this:

while read -r ver
do
    res=$(grep -oP '(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$' <<<"$ver")
    echo "ver:$ver status:$?  result:=$res="
done <<EOF | column -t
1
1-release
1.2
1.2-dev1
1.0.0-release
1.0.0.3
Q
x-release
1.x
z.2
1.x-dev1
1.x.0-dev3
.1.0-dev3
v1
v1.3-SNAPSHOT
EOF

However, recheck your regex, because the above prints:

ver:1              status:1  result:==
ver:1-release      status:1  result:==
ver:1.2            status:0  result:=1.2=
ver:1.2-dev1       status:1  result:==
ver:1.0.0-release  status:1  result:==
ver:1.0.0.3        status:0  result:=1.0.0.3=
ver:Q              status:1  result:==
ver:x-release      status:1  result:==
ver:1.x            status:1  result:==
ver:z.2            status:1  result:==
ver:1.x-dev1       status:1  result:==
ver:1.x.0-dev3     status:1  result:==
ver:.1.0-dev3      status:1  result:==
ver:v1             status:1  result:==
ver:v1.3-SNAPSHOT  status:0  result:=1.3-SNAPSHOT=

I would use

r='((?<=\A)|(?<=\s))v?\d+(\.\d+)*(-\w+)?(?=(\s|\z))'

e.g.:

r='((?<=\A)|(?<=\s))v?\d+(\.\d+)*(-\w+)?(?=(\s|\z))'
while IFS= read -r ver
do
    res=$(grep -oP "$r" <<<"$ver")
    printf "ver:%-15.15s status:%s result:=%s=\n" "$ver" $?  "$res"
done <<EOF
1
 1-release
  1.2
1.2-dev1
1.0.0-release
1.0.0.3
Q
x-release
1.x
z.2
1.x-dev1
1.x.0-dev3
.1.0-dev3
v1
v1.3-SNAPSHOT
EOF

prints:

ver:1               status:0 result:=1=
ver: 1-release      status:0 result:=1-release=
ver:  1.2           status:0 result:=1.2=
ver:1.2-dev1        status:0 result:=1.2-dev1=
ver:1.0.0-release   status:0 result:=1.0.0-release=
ver:1.0.0.3         status:0 result:=1.0.0.3=
ver:Q               status:1 result:==
ver:x-release       status:1 result:==
ver:1.x             status:1 result:==
ver:z.2             status:1 result:==
ver:1.x-dev1        status:1 result:==
ver:1.x.0-dev3      status:1 result:==
ver:.1.0-dev3       status:1 result:==
ver:v1              status:0 result:=v1=
ver:v1.3-SNAPSHOT   status:0 result:=v1.3-SNAPSHOT=

If you don't want to allow a leading v (e.g. v1.1), remove the v? from the regex.

If you want a more restrictive regex, use

r='((?<=\A)|(?<=\s))\d+(\.\d+){1,2}(-[A-Z]+)?(?=(\s|\z))'

which allows only 2 or 3 numeric components and only uppercase letters after the -.

Finally, if you want pure bash, use an ERE like this:

r='^[0-9]+(\.[0-9]+)+(-[A-Z]+)?$'
while read -r ver
do
    [[ $ver =~ $r ]] && echo "$ver: ok" || echo "$ver: no"
done <<EOF | column -t
1
1-RELEASE
1.2
1.2-DEV
1.2-DEV2
1.0.0-RELEASE
1.0.0  
1.0.0.3
Q
x-RELEASE
1.x
z.2
1.x-DEV
1.x.0-DEV
.1.0-DEV
v1
v1.3-SNAPSHOT
EOF

prints

1:              no
1-RELEASE:      no
1.2:            ok
1.2-DEV:        ok
1.2-DEV2:       no
1.0.0-RELEASE:  ok
1.0.0:          ok
1.0.0.3:        ok
Q:              no
x-RELEASE:      no
1.x:            no
z.2:            no
1.x-DEV:        no
1.x.0-DEV:      no
.1.0-DEV:       no
v1:             no
v1.3-SNAPSHOT:  no
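
If the parts of the version are needed after validation, bash stores the capture groups of the last =~ match in the BASH_REMATCH array. The following is only a sketch: it uses the same shape as the ERE above with extra parentheses added, and split_version is a hypothetical helper name.

```bash
#!/bin/bash

# Same shape as the ERE above, with extra groups so the parts can be captured:
#   group 1 = numeric part, group 4 = qualifier without the leading "-"
re='^([0-9]+(\.[0-9]+)+)(-([A-Z]+))?$'

# Hypothetical helper: prints "<numeric part> <qualifier>" for a valid version
# ("-" stands in for a missing qualifier) and returns 1 for anything else.
split_version() {
    [[ $1 =~ $re ]] || return 1
    echo "${BASH_REMATCH[1]} ${BASH_REMATCH[4]:--}"
}

split_version 1.0.0-RELEASE                      # prints: 1.0.0 RELEASE
split_version 1.2                                # prints: 1.2 -
split_version 1.2-DEV2 || echo "1.2-DEV2: no"    # prints: 1.2-DEV2: no
```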

Your regex uses look-arounds, which bash doesn't support. But you don't need look-arounds, once your regex has some "problems" fixed.

The look-arounds can be removed without changing what matches.

The leading look ahead is impossible to fail:

(?!\.)(\d+...

Because the regex starts with a digit, it's unnecessary to assert that the first character isn't a dot.

The trailing look ahead is also impossible to fail:

 (?![\d.])$

It is immediately followed by $, so there is no digit or dot left for it to reject.

You also have unnecessary parentheses. With all the unnecessary parts removed (keeping the trailing $, which still anchors the match to the end of the input), we get:

\d+(\.\d+)+([-.][A-Z]+)?$

But bash doesn't support \d, so try using [0-9] instead of \d:

[0-9]+(\.[0-9]+)+([-.][A-Z]+)?$

That should work.
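
For illustration, here is a minimal sketch of the question's loop using this pattern, anchored with ^ and $ (the leading ^ is an assumption, added so that the whole input must match and something like v1.0.0 is rejected). The function names is_valid_version and prompt_for_version are hypothetical.

```bash
#!/bin/bash

# ERE form of the pattern; the ^ anchor is an assumption (see above).
version_re='^[0-9]+(\.[0-9]+)+([-.][A-Z]+)?$'

# Hypothetical helper: exit status 0 when $1 looks like a valid version.
is_valid_version() {
    [[ $1 =~ $version_re ]]
}

# Reads lines from stdin until one validates, printing ok/nok like the
# question's loop. Returns 1 if stdin ends before a valid version is read.
prompt_for_version() {
    local v
    while read -r -p "get_v: " v; do
        if is_valid_version "$v"; then
            echo "ok"
            return 0
        fi
        echo "nok"
    done
    return 1
}
```

Calling prompt_for_version in an interactive shell prompts repeatedly until a version such as 1.0.0-SNAPSHOT, 1.0.0.RC or 1.0.0 is entered.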

1 Comment

bash supports extended regular expressions; what it doesn't support is Perl regular expressions like the (?!\.) used in the question. However, \d is not part of the extended regular expression language either. You'll have to use [0-9] instead.
