
I'm working in bash and want to remove a substring from a string. I use grep to detect the substring, and that works as I want: my if conditions are true, and when I test the pattern in other tools it selects exactly the part of the string I want.

When it comes to removing the element from the string I'm having difficulty.

I want to remove something like ": Series 1", where there could be different numbers (including zero-padded ones), a lowercase s, or extra spaces.

temp='Testing: This is a test: Series 1'

    echo "A. "$temp
    if echo "$temp" | grep -q -i ":[ ]*[S|s]eries[ ]*[0-9]*" && [ "$temp" != "" ]; then
        title=$temp
        echo "B. "$title
        temp=${title//:[ ]*[S|s]eries[ ]*[0-9]*/ }
        echo "C. "$temp
    fi
    # I trim temp for spaces here
    series_title=${temp// /_}   
    echo "D. "$series_title

The problem is at points C and D, which give me:

C. Testing
D. Testing_

  • Do you want to remove everything after the : character? Commented Apr 16, 2019 at 11:33
  • I want to remove the ": series 1" or the slight variations of it (different numbers etc), but not anything before or after it. Commented Apr 16, 2019 at 12:54
  • The ${var//pattern/replacement} construct uses glob wildcard patterns (aka wildcards), not regular expressions. They look similar, but the syntax is quite a bit different, and globs are less powerful (unless you enable extended globs). Commented Apr 16, 2019 at 18:11
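
To illustrate the comment above, here is a minimal sketch (not part of the original thread) that compares the glob from the question with an extended-glob version. It assumes bash with `extglob` available; the variable names `plain` and `fixed` are made up for the example.

```bash
# Sketch only: glob-based removal, assuming bash with extglob available.
shopt -s extglob

temp='Testing: This is a test: Series 1'

# Plain glob from the question: '*' means "any characters" and [ ] is exactly
# one space, so the match starts at the first ':' and runs to the end.
# (Inside brackets, '|' is a literal character, not "or".)
plain=${temp//:[ ]*[S|s]eries[ ]*[0-9]*/}

# Extended glob: *( ) is "zero or more spaces", +([0-9]) is "one or more digits".
fixed=${temp//:*( )[Ss]eries*( )+([0-9])/}

echo "plain: $plain"   # plain: Testing
echo "fixed: $fixed"   # fixed: Testing: This is a test
```

With the extended glob, only the ": Series 1" part is removed, which seems to be what the question is after.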

1 Answer


You can perform regex matching from bash alone without using external tools.

It's not entirely clear what your requirement is, but judging from your code, I think the following will help.

temp='Testing: This is a test: Series 1'

# The following does a regex match and captures the part we need,
# i.e. everything before the final `: Series N`, provided the whole pattern matches.
# On failure, exit with a non-zero status so callers can detect it.
[[ $temp =~ (.*):\ *[Ss]eries\ *[0-9]* ]] || { echo "regex match failed"; exit 1; }

# now you can use the extracted groups as follows    
echo "${BASH_REMATCH[1]}"    # Output = Testing: This is a test

As mentioned in the comments, if you need to keep the parts both before and after the removed section:

temp='Testing: This is a test: Series 1 <keep this>'
[[ $temp =~ (.*):\ *[Ss]eries\ *[0-9]*\ *(.*) ]] || { echo "invalid"; exit 1; }
echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"  # Output = Testing: This is a test <keep this>
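
If this is needed in more than one place, the two capture groups can be wrapped in a small function. The following is only a sketch under a few assumptions: `strip_series` is a hypothetical name, the regex is kept in a variable so the spaces need no escaping, and it requires at least one digit after "Series". The last step reuses the space-to-underscore substitution from the question.

```bash
# Hypothetical helper (not from the answer): drop ": Series N" and join what is left.
strip_series() {
    local input=$1
    # Storing the regex in a variable avoids backslash-escaping the spaces.
    local re='(.*):[[:space:]]*[Ss]eries[[:space:]]*[0-9]+[[:space:]]*(.*)'
    if [[ $input =~ $re ]]; then
        local before=${BASH_REMATCH[1]} after=${BASH_REMATCH[2]}
        # Add the separating space only when something follows the removed part.
        printf '%s\n' "${before}${after:+ $after}"
    else
        # No series marker found: return the input unchanged.
        printf '%s\n' "$input"
    fi
}

title=$(strip_series 'Testing: This is a test: Series 1 <keep this>')
series_title=${title// /_}
echo "$series_title"   # Testing:_This_is_a_test_<keep_this>
```

Returning the input unchanged when nothing matches is a design choice for this sketch; the code above exits instead, which may suit a script better.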

Keep in mind that [0-9]* also matches zero digits. If you need at least one digit, use [0-9]+ instead. The same goes for <space>* (zero or more spaces) and the other starred parts of the pattern.
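
A quick way to see the difference is to test both quantifiers against a string that has ": Series" but no number. This is a small illustrative sketch; the sample string and the variable names are invented for the example.

```bash
s='Testing: Series finale'   # contains ": Series" but no number

re_star=': *[Ss]eries *[0-9]*'   # digits optional
re_plus=': *[Ss]eries *[0-9]+'   # at least one digit required

if [[ $s =~ $re_star ]]; then star="matched"; else star="no match"; fi
if [[ $s =~ $re_plus ]]; then plus="matched"; else plus="no match"; fi

echo "[0-9]*: $star"   # [0-9]*: matched
echo "[0-9]+: $plus"   # [0-9]+: no match
```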
