I have a file called test.txt with the following contents:

1 2 3

I have the following script that uses a regular expression to match at least one whitespace character between the numbers:

#!/bin/sh
if ! grep -q -e "1[ \t]+2[ \t]+3" test.txt; then
    echo "not found"
else
    echo "found"
fi

Executing the script prints "not found", but it should have printed "found". Why is that?

2 Answers

Per the grep man page:

Basic vs Extended Regular Expressions

In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).

Try:

#!/bin/sh
if ! grep -q -e "1[ \t]\+2[ \t]\+3" test.txt; then
    echo "not found"
else
    echo "found"
fi

4 Comments

Thanks. An alternative would be to just change the -e to -E.
Hmm, it's still not quite working - the script prints out not found if I change the spaces in test.txt to tabs.
My man page says -P is "highly experimental", so I'll change [ \t] to [[:space:]] instead. Thank you!
@pacoverflow Sorry, I should have caught that! \t won't work because grep uses POSIX regular expressions, which don't define \t as a tab character. You can easily work around this by pasting a literal tab character into your pattern. Alternatively, depending on your environment, you can try the -P flag to tell grep to use Perl-compatible regular expressions. I think there are other solutions as well.
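
For readers who want the literal-tab workaround without pasting an invisible character into the script, here is a minimal sketch. It assumes a POSIX sh and builds the tab with printf; the variable name tab and the helper check are made up for illustration. It also uses the POSIX interval \{1,\} rather than \+, since \+ in basic regular expressions is a GNU extension.

```sh
#!/bin/sh
# Build a literal tab character; inside a POSIX bracket expression,
# \t would be read as a backslash and the letter t, not a tab.
tab=$(printf '\t')

# Hypothetical helper: prints "found" if the given line matches.
check() {
    # \{1,\} is the basic-regex way to say "one or more".
    if printf '%s\n' "$1" | grep -q -e "1[ ${tab}]\{1,\}2[ ${tab}]\{1,\}3"; then
        echo "found"
    else
        echo "not found"
    fi
}

check "1 2 3"                  # space-separated: found
check "$(printf '1\t2\t3')"    # tab-separated: found
```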

Well, I tried to edit the other answer, which is incorrect as it currently stands. But the edit was rejected, so I'll have to post my own answer, given that comments are "second class citizens on the Stack Exchange network, not designed to hold information for all eternity [and] may get cleaned up at any time."

As the other answer points out, grep uses basic regular expressions by default, in which + has no special meaning. (The -e option only introduces the pattern; it does not select a regex syntax.) The -E option should therefore be used to enable extended regular expressions, which support the + metacharacter.

In addition, grep's POSIX regular expressions do not recognize \t as a tab character. The easiest fix that stays readable and avoids experimental grep options (such as -P) is to replace [ \t] with [[:space:]]. Note that [[:space:]] also matches other whitespace, such as carriage returns and form feeds; [[:blank:]] matches only space and tab.

Therefore the fixed script looks like:

#!/bin/sh
if ! grep -q -E "1[[:space:]]+2[[:space:]]+3" test.txt; then
    echo "not found"
else
    echo "found"
fi
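
To check the pattern against both space-separated and tab-separated input without editing test.txt, something like the following sketch may help. The helper name matches is hypothetical, and the input lines are generated with printf rather than read from a file.

```sh
#!/bin/sh
# Hypothetical helper: report whether a line ($2) matches an
# extended regular expression ($1).
matches() {
    if printf '%s\n' "$2" | grep -q -E "$1"; then
        echo "found"
    else
        echo "not found"
    fi
}

pattern='1[[:space:]]+2[[:space:]]+3'

matches "$pattern" "1 2 3"                  # spaces between the numbers
matches "$pattern" "$(printf '1\t2\t3')"    # tabs between the numbers
matches "$pattern" "123"                    # no whitespace at all
```

The first two calls should report a match and the third should not, since + requires at least one whitespace character between the digits.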

2 Comments

@CharlesDuffy It has been unaccepted. BTW, I looked at the 2 questions you cited while closing the question as a duplicate, and I do not think they apply. The first question is about perl regex, which I am not using. The second question is about \s which I am not using and the answer even mentions using \t which is incorrect for my situation.
Gotcha. I read \t as something taken from PCRE engines, but you make a strong enough case -- reopened.
