
I want to use Bash regex matching (with the =~ operator) to match a string which includes quotes. Say for example I have the following string and I want to extract the text between quotes:

foo='"Hello World!"'

My first try was to put the regex in single (strong) quotes, to force the double quotes to be treated as regular characters:

[[ "$foo" =~ '".*"' ]]

That fails because Bash interprets this as a string match rather than a regex.

Then I tried to escape the quotes with \ like so:

[[ "$foo" =~ \".*\" ]]

I thought that failed because the first \ is in plain Bash text and does not escape the quote: the coloring in Vim indicates that the second quote is escaped but not the first, and running my script failed. (EDIT: Actually, it works. It only fails if there is no space between \" and ]]; the version shown here works just fine.)

So is there some way I can escape the " characters without transforming the regex match into a string match?

  • foo='"Hello World!"'; [[ "$foo" =~ \".*\" ]] && echo "match" works for me, and the Vim syntax highlighting (which, by the way, does not affect the code in any way) is fine. Commented Jun 11, 2014 at 18:58
  • Take syntax highlighting with a grain of salt, particularly around regular expressions. Most syntax highlighting engines are not robust lexers or parsers, and (most relevantly) are not the same lexer or parser as used by the actual interpreter that's going to read your code. Commented Jun 11, 2014 at 19:06
  • Ah, the problem was apparently that I didn't put a space between the \" and the ]] when I used it in my real program. Bash handles spaces in confusing ways. Commented Jun 11, 2014 at 19:48

1 Answer


Actually, your second attempt works for me in bash 3 and 4:

$ echo "$BASH_VERSION"
3.2.51(1)-release
$ echo "$foo"
"Hello World!"
$ [[ "$foo" =~ \".*\" ]] && echo $BASH_REMATCH
"Hello World!"

$ echo "$BASH_VERSION"
4.3.18(1)-release
$ echo "$foo"
"Hello World!"
$ [[ "$foo" =~ \".*\" ]] && echo "${BASH_REMATCH[0]}"
"Hello World!"

However, to talk theory for a second, it all has to do with how bash interprets the expression as a whole. As long as the regular-expression characters themselves aren't quoted, the rest of the expression can be quoted without side-effects:

$ [[ $foo =~ '"'.*'"' ]] && echo $BASH_REMATCH
"Hello World!"
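To make the quoting rule concrete, here is a small sketch that compares a fully quoted pattern with a partially quoted one. It assumes Bash 3.2 or later with default compatibility settings; the variable names (literal, quoted_result, and so on) are made up for this illustration and are not from the question.

```shell
#!/usr/bin/env bash
# Sketch only: assumes Bash 3.2+ with default (non-compat31) behavior.
foo='"Hello World!"'

# Entire pattern quoted: every character is literal, so .* is NOT a wildcard.
if [[ $foo =~ '".*"' ]]; then quoted_result=match; else quoted_result=nomatch; fi

# The same quoted pattern does match a string containing the literal text ".*"
literal='say ".*" here'   # hypothetical test string
if [[ $literal =~ '".*"' ]]; then literal_result=match; else literal_result=nomatch; fi

# Only the double quotes are quoted; .* is unquoted and keeps its regex meaning.
if [[ $foo =~ '"'.*'"' ]]; then partial_result=${BASH_REMATCH[0]}; else partial_result=nomatch; fi

printf '%s\n' "$quoted_result" "$literal_result" "$partial_result"
```

On such a Bash this should print nomatch, match, and "Hello World!" (with the quotes), which is the behavior described above: quoting is harmless as long as the regex metacharacters themselves stay outside the quotes.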

Perhaps the easiest way of all, though, is to use a second variable to hold the regex itself:

$ exp='".*"'
$ [[ $foo =~ $exp ]] && echo $BASH_REMATCH
"Hello World!"
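Since the original goal was to extract the text between the quotes, the variable approach can be combined with a capture group. The following is a minimal sketch, not part of the answer as tested above: the names re and inner are my own, and I have swapped .* for [^"]* so the match stops at the first closing quote.

```shell
#!/usr/bin/env bash
foo='"Hello World!"'

# Keep the regex in a variable so no escaping is needed inside [[ ]].
# The parentheses form capture group 1; [^"]* matches any run of non-quote characters.
re='"([^"]*)"'

if [[ $foo =~ $re ]]; then
    # BASH_REMATCH[0] is the whole match (quotes included);
    # BASH_REMATCH[1] is the first capture group (text between the quotes).
    inner=${BASH_REMATCH[1]}
    printf '%s\n' "$inner"
fi
```

Note that $re must stay unquoted on the right-hand side of =~; writing "$re" would turn it back into a literal string match in Bash 3.2 and later.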

2 Comments

As I said in my comment on the OP it turns out that the issue was the lack of a space between \" and ]] in my actual program. I probably should have tested before I said it didn't work. I definitely prefer the method in that third block of code though. That way VIM's text coloring doesn't get screwed up and I don't have to type so many strong quotes to escape the quotes in my RE.
Also note that the quoting behavior changed between Bash 3.1 and 3.2: a quoted right-hand side of =~ is now treated as an exact string match by default, so your best bet is example 3 (the regex in a variable), which handles both newer and older versions.
