1

I have a string in Bash which may or may not start with any number of leading spaces, e.g.

"  foo bar baz"
" foo bar baz"
"foo bar baz"

I want to delete the first instance of "foo" from the string, and any leading spaces (there may not be any).

Following the advice from this question, I have tried the following:

str=" foo bar baz"
regex="[[:space:]]*foo"
echo "${str#$regex}"
echo "${str#[[:space:]]*foo}"

If str has one or more leading spaces, then it will return the result I want, which is _bar baz (underscore = leading space). If the string has no leading spaces, it won't do anything and will return foo bar baz. Both 'echoes' return the same results here.

My understanding is that using * after [[:space:]] should match zero or more instances of [[:space:]], not one or more. What am I missing or doing wrong here?

EDITS

@Raman - I've tried the following, and they also don't work:

echo "${str#[[:space:]]?foo}"
echo "${str#?([[:space:]])foo}"
echo "${str#*([[:space:]])foo}"

None of these three deletes 'foo', whether or not there is a leading space. The only solution that kind of works is the one I posted with the asterisk - it will delete 'foo' when there is a leading space, but not when there isn't.

6
  • @RamanSailopal The docs at GNU say that ? matches zero or one occurrence, and * matches zero or more occurrences. I tried it anyway and it didn't work - will update the question. Commented Dec 1, 2020 at 10:30
  • and they also don't work: enable extglob... Commented Dec 1, 2020 at 10:35
  • What's wrong with just ${str#*foo}? Commented Dec 1, 2020 at 11:13
  • @oguzismail in the case str=oguzfoo, I guess op doesn't want a match. Commented Dec 1, 2020 at 15:31
  • @Lou: just to make sure this isn't an XY-problem: are you trying to split your string at spaces? if that's the case, you should instead use: read -ra ary -d '' < <(printf '%s\0' "$str"), and you'll have the tokens in the array ary. Commented Dec 1, 2020 at 15:35

3 Answers

6

The best thing to do is to use parameter expansions (with extended globs) as follows:

# Make sure extglob is enabled
shopt -s extglob

str=" foo bar baz"
echo "${str##*([[:space:]])}"

This uses the extended glob *([[:space:]]), and the ## parameter expansion (greedy match).

Edit. Since your pattern has the suffix foo, you don't need to use greedy match:

echo "${str#*([[:space:]])foo}"

is enough.
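
As a quick check, here is a minimal sketch that applies the expansion above to the three sample strings from the question. The function name strip_foo is a hypothetical helper introduced only for this illustration; it assumes Bash with extglob available.

```shell
#!/usr/bin/env bash
# extglob must be on for the *(...) pattern to mean "zero or more of".
shopt -s extglob

# strip_foo is a hypothetical helper name, not part of the original answer.
# It removes optional leading whitespace followed by the literal word foo.
strip_foo() {
    local str=$1
    printf '%s\n' "${str#*([[:space:]])foo}"
}

# The three inputs from the question: two, one, and zero leading spaces.
for s in "  foo bar baz" " foo bar baz" "foo bar baz"; do
    printf '[%s]\n' "$(strip_foo "$s")"
done
```

All three inputs should print [ bar baz]. A string such as "xfoo bar" is left unchanged, because the characters before foo are not whitespace.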

Note. you can put foo in a variable too, but just be careful, you'll have to quote it:

pattern=foo
echo "${str#*([[:space:]])"$pattern"}"

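will work. You have to quote it in case the expansion of pattern contains glob characters, for example when pattern="foo[1]". Unquoted, [1] would be read as a bracket expression matching the single character 1, so the pattern would match foo1 rather than the literal text foo[1].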
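
The difference between a quoted and an unquoted pattern variable can be seen in this small sketch. The sample string and the variable names unquoted and quoted are assumptions made up for the illustration.

```shell
#!/usr/bin/env bash
str='foo[1] bar'
pattern='foo[1]'

# Unquoted: the value is treated as a glob, so [1] is a bracket expression.
# The glob matches "foo1", which is not a prefix of str, so nothing is removed.
unquoted=${str#$pattern}

# Quoted: the value is matched literally, so the prefix "foo[1]" is removed.
quoted=${str#"$pattern"}

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

Here the unquoted form leaves the string as foo[1] bar, while the quoted form yields " bar" with its leading space.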

8 Comments

Thanks, this works! Why do you use the longest match instead of the shortest match out of interest?
@Lou: Otherwise it will only remove the first space. But now I realize you also have foo in your pattern, so echo "${str#*([[:space:]])foo}" would be enough. I've edited the answer (and also added a remark about putting the pattern in a variable).
@Lou: 1. you don't need to quote the token foo in the re: this is enough: re="*([[:space:]])foo*([[:space:]])". 2. The pattern must not be quoted! actually, quotes are here to prevent interpretation of the pattern as a pattern! hence: echo "${str#$re}" (without quotes for $re) is correct.
@Lou: not quite :) The pattern variable should be quoted in the parameter expansion only if you don't want it to be interpreted as a pattern. Here's an extremely simple example you can try: str="foo bar"; pattern="f*". (The fact we're using quotes here is irrelevant). Without quotes: echo "${str#$pattern}" you'll get oo bar, because str matches the pattern f* (the shortest match is just f). But with quotes: echo "${str#"$pattern"}", you'll get foo bar, since str doesn't start with the verbatim content of pattern, which is f*.
Ah that makes sense! Thanks for explaining it so clearly :)
3

My understanding is that using * after [[:space:]] should match zero or more instances of [[:space:]], not one or more

That's wrong.

What am I missing

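That a glob is not a regex. In a regex, * matches zero or more repetitions of the preceding character or group. In a glob, * on its own matches any string, including the empty one, and [[:space:]] matches exactly one whitespace character. So [[:space:]]*foo means "one whitespace character, then anything, then foo", which is why it fails when there is no leading space. It's the same as for filename expansion, think along ls [[:space:]]*foo.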
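
To make the contrast concrete, here is a small sketch comparing the same text used as a glob and as a regex. The variable names glob_match, regex_match, s2 and stripped are assumptions introduced for this illustration.

```shell
#!/usr/bin/env bash
str='foo bar baz'      # no leading space

# Glob: [[:space:]] must consume exactly one whitespace character,
# so a string starting with "f" cannot match.
if [[ $str == [[:space:]]*foo* ]]; then glob_match=yes; else glob_match=no; fi

# Regex: [[:space:]]* may match zero characters, so the string matches.
if [[ $str =~ ^[[:space:]]*foo ]]; then regex_match=yes; else regex_match=no; fi

printf 'glob: %s, regex: %s\n' "$glob_match" "$regex_match"

# In the glob, * matches arbitrary text, not "more spaces":
s2=' xyzfoo bar'
stripped=${s2#[[:space:]]*foo}
printf '[%s]\n' "$stripped"
```

The first line printed should be glob: no, regex: yes. The second shows that the glob happily removes " xyzfoo", since * stands for any characters between the single space and foo.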

You can use extended bash glob and do:

shopt -s extglob
str=' foo bar baz'
echo "${str#*([[:space:]])foo}"

To do anything more complicated, actually use a regex.

str=' foo bar baz';
[[ $str =~ ^[[:space:]]*foo(.*) ]];
echo "${BASH_REMATCH[1]}"
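
A slightly more defensive variant of the regex approach might look like the sketch below. The function name strip_prefix_re is hypothetical, and falling back to the unchanged string on a failed match is an assumption about the desired behaviour, not something stated in the question.

```shell
#!/usr/bin/env bash
# strip_prefix_re is a hypothetical helper name used only for illustration.
strip_prefix_re() {
    local str=$1
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='^[[:space:]]*foo(.*)'
    if [[ $str =~ $re ]]; then
        # Group 1 holds everything after the optional whitespace and foo.
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        # Assumed fallback: no match means the string is returned unchanged.
        printf '%s\n' "$str"
    fi
}

strip_prefix_re "  foo bar baz"
strip_prefix_re "foo bar baz"
strip_prefix_re "bar foo"
```

Checking the exit status of [[ ... =~ ... ]] means the code does not rely on the contents of BASH_REMATCH after a failed match. The first two calls should print " bar baz" and the third should print bar foo unchanged.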

3 Comments

Ah brilliant! I did not know about the difference between regexes and globs before. Now it works, and the string trims correctly. Cheers :).
Out of interest though, why get the longest match from the start ##? The shortest match # also works.
Och, I think you wanted to remove the spaces behind foo too.
0

If what you want is a real regex match, you should be using a real regex match:

$: [[ "$str" =~ [[:space:]]*(.*) ]]
$: echo "[${BASH_REMATCH[1]}]"
[foo  bar       baz]

A more pedestrian approach would be to skip the quotes and let the shell's word splitting collapse the whitespace.

$: echo "[$str]"
[ foo bar baz]
$: new=$( echo $str )
$: echo "[$new]"
[foo bar baz]

Be aware that this opens you up to all sorts of messes in any more complex situations. It breaks if you wanted to preserve more than a single consecutive space between values, or a tab instead of just a space, etc.

$: str=' foo  bar'$'\t''baz';
$: echo "[$str]"
[ foo  bar      baz]
$: new=$( echo $str )
$: echo "[$new]"
[foo bar baz]

It can cause other sorts of havoc too, but it's good to know the trick for the cases when it's appropriate.

2 Comments

This isn't what I'm trying to do - I only want to trim leading spaces from the front of the string plus a given word, not from all words in the string.
Which is why I warned about it. The first match solution is the better approach
