Here is a small unit-test script for bash's regex match operator, =~:

#!/bin/bash

# From "man bash"
# An additional binary operator, =~, is available, with the same
# precedence as == and !=. When it is used, the string to the right of
# the operator is considered an extended regular expression  and  matched
# accordingly (as  in regex(3)).  The return value is 0 if the string
# matches the pattern, and 1 otherwise.  If the regular expression
# is syntactically incorrect, the conditional expression's return value
# is 2.

# The above should say regex(7) of course

match() {
   local REGEX=$1
   local VAL=$2
   [[ $VAL =~ $REGEX  ]]
   RES=$?
   case $RES in
      0) echo "Match of '$VAL' against '$REGEX': MATCH" >&2 ;;
      1) echo "Match of '$VAL' against '$REGEX': NOMATCH" >&2 ;;
      2) echo "Error in regex expression '$REGEX'" >&2 ;;
      *) echo "Unknown returnvalue $RES" >&2 ;;
   esac
   echo $RES
}

v() {
   SHALL=$1
   IS=$2
   if [ "$SHALL" -eq "$IS" ]; then echo "OK"; else echo "NOT OK"; fi
}

unit_test() {
   v 0 "$(match A                A  )"
   v 0 "$(match A.               AB )"
   v 0 "$(match A[:digit:]?      A  )"
   v 0 "$(match A[:digit:]       A6 )"
   v 0 "$(match \"A[:digit:]*\"  A6 )"  # enclosing in quotes needed otherwise fileglob happens
   v 0 "$(match A[:digit:]+      A6 )"
   v 0 "$(match A                BA )"
   v 1 "$(match ^A               BA )"
   v 0 "$(match ^A               Ab )"
   v 0 "$(match 'A$'             BA )"
   v 1 "$(match 'A$'             Ab )"
}

unit_test

It looks pretty straightforward, but running it yields:

Match of 'A' against 'A': MATCH
OK
Match of 'AB' against 'A.': MATCH
OK
Match of 'A' against 'A[:digit:]?': MATCH
OK
Match of 'A6' against 'A[:digit:]': NOMATCH
NOT OK
Match of 'A6' against 'A[:digit:]*': MATCH
OK
Match of 'A6' against 'A[:digit:]+': NOMATCH
NOT OK
Match of 'BA' against 'A': MATCH
OK
Match of 'BA' against '^A': NOMATCH
OK
Match of 'Ab' against '^A': MATCH
OK
Match of 'BA' against 'A$': MATCH
OK
Match of 'Ab' against 'A$': NOMATCH
OK

One would expect

Match of 'A6' against 'A[:digit:]'

and

Match of 'A6' against 'A[:digit:]+'

to succeed.

What am I doing wrong?

  • [:digit:] must be enclosed between square brackets: [[:digit:]]. Commented Jan 26, 2017 at 13:25
  • Otherwise it is seen as [:digt] or [id:tg] ..., that is, a plain list of the characters :, d, i, g and t. Commented Jan 26, 2017 at 13:29
  • Aside: All-caps variable names are bad form. See pubs.opengroup.org/onlinepubs/009695399/basedefs/…, fourth paragraph, specifying all-caps names for variables with meaning to the shell or operating system and reserving names with at least one lowercase character for application use. While that spec is specifically for environment variables, assigning a shell variable will implicitly overwrite any like-named environment variable that's already present, making the convention apply in both places. Commented Jan 26, 2017 at 16:52
  • You might also run your code through shellcheck.net -- you've got a few quoting bugs. Commented Jan 26, 2017 at 17:10
  • ...another aside: function foo { ...; } is needlessly incompatible with other shells -- "needlessly" because unlike other bashisms it offers no compensating advantages for the loss in portability. Consider making a habit of using foo() { ...; }, which is POSIX-compliant and thus will work in every modern shell. Commented Jan 26, 2017 at 17:13
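
Applying these asides to the question's helper might look like the sketch below. It is only one possible rewrite, not the original author's code: the variable names are lowercased and declared local, expansions are quoted, and the patterns in the sample calls are single-quoted and already use the [[:digit:]] form.

```shell
#!/bin/bash

# Sketch of the question's match() helper with lowercase, local variables
# and quoted expansions. Diagnostics go to stderr, the status to stdout.
match() {
   local regex=$1 val=$2 res
   # $regex stays unquoted on the right of =~ so it is used as a regex.
   [[ $val =~ $regex ]]
   res=$?
   case $res in
      0) echo "Match of '$val' against '$regex': MATCH" >&2 ;;
      1) echo "Match of '$val' against '$regex': NOMATCH" >&2 ;;
      2) echo "Error in regex '$regex'" >&2 ;;
      *) echo "Unknown return value $res" >&2 ;;
   esac
   echo "$res"
}

# Patterns are single-quoted at the call site so the shell neither
# glob-expands them nor interprets characters such as $ or *.
match 'A[[:digit:]]'  A6
match 'A[[:digit:]]+' A6
match 'A[[:digit:]]*' A
match '^A'            BA
```

Single-quoting every pattern also removes the need for the escaped double quotes used in one of the original test lines.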

3 Answers

Character classes must be enclosed in an additional pair of brackets [] so that they are matched as a list of characters, i.e. written as [[:digit:]]:

string="A6"
[[ $string =~ A[[:digit:]] ]]
echo $?
0

See Bracket Expressions for more details.
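
To see why the unbracketed form fails, it can help to check which strings A[:digit:] actually accepts. The following is a minimal sketch, assuming bash on a system whose regex library reads [:digit:] as an ordinary bracket expression (the question's output suggests this is the case there). The helper name try is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print 0 if the value matches the regex, 1 otherwise.
try() {
   local regex=$1 val=$2
   # The regex is kept in a variable and expanded unquoted so that
   # bash treats it as a regex rather than as a literal string.
   if [[ $val =~ $regex ]]; then echo 0; else echo 1; fi
}

wrong='A[:digit:]'      # bracket expression listing the characters : d i g t
right='A[[:digit:]]'    # bracket expression containing the class [:digit:]

try "$wrong" A6    # 6 is not one of : d i g t
try "$wrong" Ad    # d is in the list
try "$wrong" A:    # so is the colon
try "$right" A6    # 6 is a digit
try "$right" Ad    # d is not a digit
```

Under that assumption, this also accounts for the two tests in the question that passed by accident: with ? or * after the unbracketed class, zero occurrences are allowed, so the A alone is enough for a match.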

5 Comments

The solution is to use [[:digit:]] - the OP hasn't suggested that they want to capture anything.
@TomFenech: Do you think that I am missing something in my answer?
On the contrary, I think that you are adding something extra which distracts from the main point - [:digit:] represents a series of characters that can be put inside a bracket expression - the two things need to be used together. A capturing group is a completely different thing.
I'm not suggesting that you delete the answer, I'm just not sure why you're talking about capture groups at all as they don't seem to be relevant to the question. I don't think you can remove answers that have upvotes and have been accepted.
Seems fine, yeah. I don't really get the bit about "expansion as a range", I guess I'd describe the character class as representing a list of characters. Personally I'd just drop the whole bit about capturing things too but it's not my answer!

You are using [:digit:] in the wrong context. These character classes are meant to be used inside a bracket expression, like [[:digit:][:alnum:]._+-] (for example).

It should be:

if [[ "A6" =~ A[[:digit:]] ]] ; then
    echo "match"
fi
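
A bracket expression can mix classes and literal characters, as in the example above. Here is a small sketch of that idea, assuming the goal is to check that a whole string consists only of letters, digits, dot, underscore, plus and hyphen. The function name is_token and the sample strings are invented for illustration.

```shell
#!/bin/bash

# Hypothetical check: succeed only if the entire string is made of
# alphanumerics, dot, underscore, plus, or hyphen.
is_token() {
   # The hyphen comes last inside the brackets so it is taken literally.
   local regex='^[[:alnum:]._+-]+$'
   [[ $1 =~ $regex ]]
}

for s in 'A6' 'foo.bar-1' 'a b' ''; do
   if is_token "$s"; then
      echo "'$s': match"
   else
      echo "'$s': no match"
   fi
done
```

Keeping the pattern in a single-quoted variable avoids having to escape anything for the shell inside [[ ]].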

As suggested in the comments, the ShellCheck tool shows what the problem is:

[Image: the output of ShellCheck]
