5

I have a dot-separated string in a Linux shell:

$example=This.is.My.String

I want to

1. Add some string before the last dot. For example, I want to add "Goood.Long" before the last dot, so I get:

This.is.My.Goood.Long.String

2. Get the part after the last dot, so I get:

String

3. Turn every dot except the last one into an underscore, so I get:

This_is_My.String

If you have time, please explain a little; I am still learning regular expressions.

Thanks a lot!

4
  • Based on your tags, it sounds like this is specifically a sed question, right? (Except that $example in the first line, which makes it look perl...) Commented Nov 9, 2010 at 20:09
  • Yes, it is about sed. I haven't installed perl on my Linux. Commented Nov 9, 2010 at 20:21
  • Is this really a question about sed? It looks from the question that it's just about shell scripting. Commented Nov 10, 2010 at 7:00
  • I suggest you change the title. All threads on this site deal with a "Linux Shell Script Problem" :-) . Commented May 29, 2016 at 15:15

6 Answers

10

I don't know what you mean by 'Linux Shell' so I will assume bash. This solution will also work in zsh, etcetera:

example=This.is.My.String
before_last_dot=${example%.*}     # strip the shortest trailing match of ".*"
after_last_dot=${example##*.}     # strip the longest leading match of "*."
echo "${before_last_dot}.Goood.Long.${after_last_dot}"
This.is.My.Goood.Long.String

echo "${before_last_dot//./_}.${after_last_dot}"
This_is_My.String

The interim variable names before_last_dot and after_last_dot should explain my use of the % and ## operators. I think the // operator is also self-explanatory, but I'd be happy to clarify if you have any questions.

This doesn't use sed (or even regular expressions), only bash's built-in parameter expansion. I prefer to stick to just one language per script, with as few forks as possible :-)
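
For readers new to these operators, here is a small sketch of how the four trimming forms and the two replacement forms differ. The variable name head is only illustrative, and the // and / replacement forms assume bash (or zsh/ksh) rather than a plain POSIX sh:

```shell
#!/usr/bin/env bash
# Illustrative value; any dotted string behaves the same way.
example=This.is.My.String

echo "${example%.*}"     # shortest match of ".*" removed from the end   -> This.is.My
echo "${example%%.*}"    # longest match of ".*" removed from the end    -> This
echo "${example#*.}"     # shortest match of "*." removed from the start -> is.My.String
echo "${example##*.}"    # longest match of "*." removed from the start  -> String

head=${example%.*}       # hypothetical helper variable
echo "${head//./_}"      # // replaces every dot        -> This_is_My
echo "${head/./_}"       # /  replaces the first dot only -> This_is.My
```

The patterns here are shell globs, not regular expressions, so the dot is a literal character and * matches any run of characters.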


2 Comments

I didn't know bash is so powerful! This answer is an eye-opener for me. Thanks a lot!
Happy to help. "Thanks a lot!" is usually expressed on StackOverflow as an up-vote or an accepted answer (for the one that helped you solve your problem).
3

Other users have given good answers for #1 and #2. There are some disadvantages to some of the answers for #3. In one case, you have to run the substitution twice. In another, if your string has other underscores they might get clobbered. This command works in one go and only affects dots:

sed 's/\(.*\)\./\1\n./;h;s/[^\n]*\n//;x;s/\n.*//;s/\./_/g;G;s/\n//'
  1. Splits the line before the last dot by inserting a newline, and copies the result into hold space:

    s/\(.*\)\./\1\n./;h
    
  2. Removes everything up to and including the newline from the copy in pattern space, then swaps hold space and pattern space:

    s/[^\n]*\n//;x
    
  3. Removes everything after and including the newline from the copy that's now in pattern space:

    s/\n.*//
    
  4. Changes all dots into underscores in the copy in pattern space, then appends hold space onto the end of pattern space:

    s/\./_/g;G
    
  5. Removes the newline that the append operation adds:

    s/\n//
    

Then the sed script is finished and the pattern space is output.

At the end of each numbered step (some consist of two actual steps):

Step   Pattern Space          Hold Space
  1.   This.is.My\n.String    This.is.My\n.String
  2.   This.is.My\n.String    .String
  3.   This.is.My             .String
  4.   This_is_My\n.String    .String
  5.   This_is_My.String      .String
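
To try the walk-through, the command can be wrapped in a small function. This is only a sketch: the function name dots_except_last is made up for illustration, and it assumes GNU sed, since \n in the replacement part of s is a GNU extension:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the one-liner above.
# Assumes GNU sed: "\n" in the replacement text of s is a GNU extension.
dots_except_last() {
    printf '%s\n' "$1" |
        sed 's/\(.*\)\./\1\n./;h;s/[^\n]*\n//;x;s/\n.*//;s/\./_/g;G;s/\n//'
}

dots_except_last This.is.My.String     # This_is_My.String
dots_except_last my_file.tar.gz        # my_file_tar.gz (existing underscore untouched)
```

One thing to watch for: on a line with no dot at all, none of the substitutions match, so G appends an unchanged copy and the input comes out doubled. If such lines can occur, restricting the script to lines that contain a dot would be a reasonable guard.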

2 Comments

Nice work; you must have had more time to think about it than I did.
@Jonathan: Possibly, but it's a common pattern in sed - "divide and conquer". Quite similar to Johnysweb's Bash answer.
3

Solution

  1. Two versions of this, too:
    • Complex: sed 's/\(.*\)\([.][^.]*$\)/\1.Goood.Long\2/'
    • Simple: sed 's/.*\./&Goood.Long./' - thanks Dennis Williamson
  2. What do you want?
    • Complex: sed 's/.*[.]\([^.]*\)$/\1/'
    • Simpler: sed 's/.*\.//' - thanks, glenn jackman.
  3. sed 's/\([^.]*\)[.]\([^.]*[.]\)/\1_\2/g'

With 3, you probably need to run the substitute (in its entirety) at least twice, in general.

Explanation

Remember, in sed, the notation \(...\) is a 'capture' that can be referenced as '\1' or similar in the replacement text.

  1. Capture everything up to a string starting with a dot followed by a sequence of non-dots (which you also capture); replace by what came before the last dot, the new material, and the last dot and what came after it.

  2. Ignore everything up to the last dot followed by a capture of a sequence of non-dots; replace with the capture only.

  3. Find and capture a sequence of non-dots, a dot (not captured), followed by a sequence of non-dots and a dot; replace the first dot with an underscore. This is done globally, but the second and subsequent matches won't touch anything already matched. Therefore, I think you need ceil(log2(N+1)) passes, where N is the number of dots to be replaced. One pass deals with 1 dot to replace; two passes deal with 2 or 3; three passes deal with 4-7, and so on.
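
The pass count can be checked empirically with a loop that reapplies the substitution from item 3 until the string stops changing. This is a rough sketch; the function name count_passes is invented for illustration and is not part of the answer:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reapply the item-3 substitution until nothing changes,
# then print the final string and the number of passes that changed it.
count_passes() {
    local current=$1 next passes=0
    while :; do
        next=$(printf '%s\n' "$current" |
            sed 's/\([^.]*\)[.]\([^.]*[.]\)/\1_\2/g')
        [ "$next" = "$current" ] && break
        current=$next
        passes=$((passes + 1))
    done
    printf '%s %d\n' "$current" "$passes"
}

count_passes This.is.My.String    # This_is_My.String 2   (2 dots to replace)
count_passes a.b.c.d.e            # a_b_c_d.e 2           (3 dots to replace)
count_passes a.b.c.d.e.f          # a_b_c_d_e.f 3         (4 dots to replace)
```

Each pass roughly halves the number of dots still to be replaced, which is where the logarithmic pass count comes from.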

6 Comments

Thanks a lot! :) If you could explain a little bit, it would be perfect.
Your #2 can be simpler: sed 's/^.*\.//'
@Glenn: err...yes - that's what comes of fitting it in around a con-call that you're supposed to be paying attention to.
Similarly, #1 can be simplified to: sed 's/\(.*\.\)/\1Goood.Long./'
@Dennis: many eyes make for shallow bugs - or all code, even one-liners, can benefit from constructive code review. Thanks.
3

Here's a version that uses Bash's regex matching (Bash 3.2 or greater).

[[ $example =~ ^(.*)\.(.*)$ ]]
echo ${BASH_REMATCH[1]//./_}.${BASH_REMATCH[2]}

Here's a Bash version that uses IFS (Internal Field Separator).

saveIFS=$IFS
IFS=.
array=($example)              # *   split the string at each dot
lastword=${array[@]: -1}
unset "array[${#array[@]}-1]" # *   ${#array[@]} is the element count; ${#array} would be the length of the first element
IFS=_
echo "${array[*]}.$lastword"  #     The asterisk as a subscript when inside quotes causes IFS (an underscore in this case) to be inserted between each element of the array
IFS=$saveIFS

* use declare -p array after these steps to see what the array looks like.
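
The regex version above uses BASH_REMATCH without checking whether the match succeeded. A guarded variant might look like the sketch below; the function name is hypothetical, and keeping the pattern in a variable is just one common way to avoid quoting surprises with =~:

```shell
#!/usr/bin/env bash
# Hypothetical helper: underscore every dot except the last one,
# and pass the input through unchanged when it contains no dot.
underscore_all_but_last_dot() {
    local re='^(.*)\.(.*)$'
    if [[ $1 =~ $re ]]; then
        # Group 1 is greedy, so it holds everything before the last dot.
        printf '%s.%s\n' "${BASH_REMATCH[1]//./_}" "${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$1"
    fi
}

underscore_all_but_last_dot This.is.My.String   # This_is_My.String
underscore_all_but_last_dot nodots              # nodots
```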

1 Comment

+1 for showing there's more than one way to skin a cat. Some ways are more readable than others, mind.
2

1.

$ echo 'This.is.my.string' | sed 's}[^\.][^\.]*$}Good Long.&}'
This.is.my.Good Long.string

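Pattern: a run of one or more non-dot characters extending to the end of the line, i.e. everything after the last dot. Replacement: the new text followed by &, which stands for whatever the pattern matched.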

2.

$ echo 'This.is.my.string' | sed 's}.*\.}}'
string

sed matches greedily, so the first closure (.*) extends as far as possible, i.e. up to the last dot.

3.

$ echo 'This.is.my.string' | tr . _ | sed 's/_\([^_]*\)$/\.\1/'
This_is_my.string

convert all dots to _, then turn the last _ to a dot.

(caveat: this will turn 'This.is.my.string_foo' to 'This_is_my_string.foo', not 'This_is_my.string_foo')


1

You don't need regular expressions at all (those complex things hurt my eyes!) if you use Awk and are a little creative.

1. echo $example| awk -v ins="Good.long" -F . '{OFS="."; $NF = ins"."$NF;print}'

What this does:
-v ins="Good.long" tells awk to create a variable called 'ins' with "Good.long" as content,
-F . tells awk to use the dot as a separator for your fields for input,
OFS="." (set inside the program, not a command-line option) tells awk to use the dot as the separator for your fields on output,
NF is the number of fields, so $NF represents the last field,
the $NF=... part replaces the last field, it appends the current last string to what you want to insert (the variable called "ins" declared earlier).

2. echo $example| awk -F . '{print $NF}'

$NF is the last field, so that's all!

3. echo $example| awk -F . '{OFS="_"; $(NF-1) = $(NF-1)"."$NF; NF=NF-1; print}'

Here we have to be creative, as Awk AFAIK doesn't allow deleting fields. Of course, we set the output field separator to underscore.

$(NF-1) = $(NF-1)"."$NF: First, we replace the second last field with the last glued to the second last, with a dot between.
Then, we fool awk to make it think the Number of fields is equal to the number of fields minus one, hence deleting the last field!

Note you can't say $NF="", because then it would display two underscores.
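
As a quick check, the three commands can be run against the sample string like this. It is only a sketch: the function names are invented for illustration, and the third one relies on awk rebuilding the record when NF is assigned, which gawk and mawk do but which I would not assume of every awk:

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around the three awk commands above.
insert_before_last() {   # $1 = string, $2 = text to insert
    printf '%s\n' "$1" | awk -v ins="$2" -F . '{OFS="."; $NF = ins "." $NF; print}'
}
last_field() {
    printf '%s\n' "$1" | awk -F . '{print $NF}'
}
underscore_but_last() {
    # Assumes assigning NF truncates the record (true for gawk and mawk).
    printf '%s\n' "$1" | awk -F . '{OFS="_"; $(NF-1) = $(NF-1) "." $NF; NF = NF - 1; print}'
}

example=This.is.My.String
insert_before_last "$example" Goood.Long   # This.is.My.Goood.Long.String
last_field "$example"                      # String
underscore_but_last "$example"             # This_is_My.String
```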
