A Linux Shell Script Problem

Question

I have a dot-separated string in a Linux shell:

$example=This.is.My.String

I want to

1. Add a string before the last dot. For example, adding "Goood.Long" before the last dot gives:

This.is.My.Goood.Long.String

2. Get the part after the last dot, which gives:

String

3. Turn every dot except the last one into an underscore, which gives:

This_is_My.String

If you have time, please explain a little; I am still learning regular expressions.

Thanks a lot!

Based on your tags, it sounds like this is specifically a sed question, right? (Except for that $example in the first line, which makes it look like Perl...)
– Cascabel, Commented Nov 9, 2010 at 20:09
Is this really a question about sed? It looks from the question that it's just about shell scripting.
– johnsyweb, Commented Nov 10, 2010 at 7:00
I suggest you change the title. All threads on this site deal with a "Linux Shell Script Problem" :-) .
– Sopalajo de Arrierez, Commented May 29, 2016 at 15:15

johnsyweb · Accepted Answer · 2010-11-09 22:56:59Z

10

I don't know what you mean by 'Linux Shell', so I will assume bash. This solution also works in zsh and similar shells:

example=This.is.My.String
before_last_dot=${example%.*}
after_last_dot=${example##*.}
echo ${before_last_dot}.Goood.Long.${after_last_dot} 
This.is.My.Goood.Long.String

echo ${before_last_dot//./_}.${after_last_dot} 
This_is_My.String

The interim variables before_last_dot and after_last_dot should explain my use of the % and ## operators: % strips the shortest matching suffix and ## strips the longest matching prefix. The // operator replaces every occurrence of a pattern. I'd be happy to clarify if you have any questions.

This doesn't use sed (or even regular expressions), only bash's built-in parameter expansion. I prefer to stick to just one language per script, with as few forks as possible :-)
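
The three tasks can be sketched together in one short script. This is a minimal illustration that assumes bash (the `//` form is not part of POSIX sh); the variable names `head` and `tail` are mine, not from the answer above.

```shell
#!/usr/bin/env bash
example=This.is.My.String

head=${example%.*}    # % removes the shortest suffix matching ".*" -> This.is.My
tail=${example##*.}   # ## removes the longest prefix matching "*." -> String

# ${var//pattern/replacement} replaces every match; a single / replaces only the first.
echo "${head}.Goood.Long.${tail}"   # This.is.My.Goood.Long.String
echo "${tail}"                      # String
echo "${head//./_}.${tail}"         # This_is_My.String
```

The patterns here are shell globs, not regular expressions, so the dot is a literal character and needs no escaping.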

edited Nov 9, 2010 at 22:56

answered Nov 9, 2010 at 20:44

johnsyweb

143k26 gold badges197 silver badges253 bronze badges


2 Comments

DocWiki Over a year ago

I didn't know bash is so powerful! This answer is an eye-opener for me. Thanks a lot!

johnsyweb Over a year ago

Happy to help. "Thanks a lot!" is usually expressed on StackOverflow as an up-vote or an accepted answer (for the one that helped you solve your problem).

Dennis Williamson · Accepted Answer · 2010-11-09 23:28:25Z

3

Other users have given good answers for #1 and #2. There are some disadvantages to some of the answers for #3. In one case, you have to run the substitution twice. In another, if your string has other underscores they might get clobbered. This command works in one go and only affects dots:

sed 's/\(.*\)\./\1\n./;h;s/[^\n]*\n//;x;s/\n.*//;s/\./_/g;G;s/\n//'

1. It splits the line before the last dot by inserting a newline and copies the result into hold space:
```
s/\(.*\)\./\1\n./;h
```
2. It removes everything up to and including the newline from the copy in pattern space and swaps hold space and pattern space:
```
s/[^\n]*\n//;x
```
3. It removes everything after and including the newline from the copy that's now in pattern space:
```
s/\n.*//
```
4. It changes all dots into underscores in the copy in pattern space and appends hold space onto the end of pattern space:
```
s/\./_/g;G
```
5. It removes the newline that the append operation adds:
```
s/\n//
```

Then the sed script is finished and the pattern space is output.

At the end of each numbered step (some consist of two actual steps):

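| Step | Pattern Space       | Hold Space          |
|------|---------------------|---------------------|
| 1    | This.is.My\n.String | This.is.My\n.String |
| 2    | This.is.My\n.String | .String             |
| 3    | This.is.My          | .String             |
| 4    | This_is_My\n.String | .String             |
| 5    | This_is_My.String   | .String             |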
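
To try the command, it can be wrapped in a small function. This is a sketch that assumes GNU sed, since `\n` in the replacement part of `s///` is a GNU extension; the function name `dots_to_underscores` is hypothetical.

```shell
# Assumes GNU sed: \n in the replacement text is not portable to every sed.
dots_to_underscores() {
    sed 's/\(.*\)\./\1\n./;h;s/[^\n]*\n//;x;s/\n.*//;s/\./_/g;G;s/\n//'
}

echo 'This.is.My.String'   | dots_to_underscores   # This_is_My.String
# Underscores already present in the input are left where they are.
echo 'This.is_my.String_x' | dots_to_underscores   # This_is_my.String_x
```

The script expects at least one dot per line. Without one, the first substitution does nothing and the final `G` appends an unmodified copy, so the line comes out doubled.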

answered Nov 9, 2010 at 23:28

Dennis Williamson

364k95 gold badges386 silver badges446 bronze badges

2 Comments

Jonathan Leffler Over a year ago

Nice work; you must have had more time to think about it than I did.

Dennis Williamson Over a year ago

@Jonathan: Possibly, but it's a common pattern in sed - "divide and conquer". Quite similar to johnsyweb's Bash answer.

Community · Accepted Answer · 2017-05-23 12:31:35Z

3

Solution

1. Two versions of this, too:
   - Complex: sed 's/\(.*\)\([.][^.]*\)$/\1.Goood.Long\2/'
   - Simple: sed 's/.*\./&Goood.Long./' - thanks Dennis Williamson
2. What do you want?
   - Complex: sed 's/.*[.]\([^.]*\)$/\1/'
   - Simpler: sed 's/.*\.//' - thanks, glenn jackman.
3. sed 's/\([^.]*\)[.]\([^.]*[.]\)/\1_\2/g'

With 3, you probably need to run the substitute (in its entirety) at least twice, in general.

Explanation

Remember, in sed, the notation \(...\) is a 'capture' that can be referenced as '\1' or similar in the replacement text.

1. Capture everything up to a string starting with a dot followed by a sequence of non-dots (which you also capture); replace by what came before the last dot, the new material, and the last dot and what came after it.
2. Ignore everything up to the last dot, then capture the sequence of non-dots that follows; replace with the capture only.
3. Find and capture a sequence of non-dots, a dot (not captured), followed by a sequence of non-dots and a dot; replace the first dot with an underscore. This is done globally, but the second and subsequent matches won't touch anything already matched. Therefore, I think you need ceil(log₂(N+1)) passes, where N is the number of dots to be replaced. One pass deals with 1 dot to replace; two passes deal with 2 or 3; three passes deal with 4-7, and so on.
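
The repeated passes needed for the third substitution can be sketched as a loop that re-runs sed until the text stops changing. The helper name `underscore_all_but_last` and its pass counter are hypothetical, added only for illustration.

```shell
# Hypothetical helper: apply the third substitution until the text no longer changes.
underscore_all_but_last() {
    current=$1
    passes=0
    while :; do
        next=$(printf '%s\n' "$current" | sed 's/\([^.]*\)[.]\([^.]*[.]\)/\1_\2/g')
        [ "$next" = "$current" ] && break   # nothing left to replace
        current=$next
        passes=$((passes + 1))
    done
    printf '%s (%d passes)\n' "$current" "$passes"
}

underscore_all_but_last This.is.My.String   # This_is_My.String (2 passes)
underscore_all_but_last a.b.c.d.e.f         # a_b_c_d_e.f (3 passes)
```

Each pass forks a new sed process, so for long inputs one of the single-pass approaches in the other answers is likely cheaper.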

edited May 23, 2017 at 12:31

CommunityBot

11 silver badge

answered Nov 9, 2010 at 20:32

Jonathan Leffler

760k145 gold badges961 silver badges1.3k bronze badges

6 Comments

DocWiki Over a year ago

Thanks a lot!:)If u could explain a little bit then it will be perfect.

glenn jackman Over a year ago

Your #2 can be simpler: sed 's/^.*\.//'

Jonathan Leffler Over a year ago

@Glenn: err...yes - that's what comes of fitting it in around a con-call that you're supposed to be paying attention to.

Dennis Williamson Over a year ago

Similarly, #1 can be simplified to: sed 's/\(.*\.\)/\1Goood.Long./'

Jonathan Leffler Over a year ago

@Dennis: many eyes make for shallow bugs - or all code, even one-liners, can benefit from constructive code review. Thanks.


Dennis Williamson · Accepted Answer · 2010-11-10 06:25:12Z

3

Here's a version that uses Bash's regex matching (Bash 3.2 or greater).

[[ $example =~ ^(.*)\.(.*)$ ]]
echo ${BASH_REMATCH[1]//./_}.${BASH_REMATCH[2]}

Here's a Bash version that uses IFS (Internal Field Separator).

saveIFS=$IFS
IFS=.
array=($example)              # *   split the string at each dot
lastword=${array[@]: -1}
unset "array[${#array[@]}-1]" # *   ${#array[@]} is the element count; ${#array} would be the length of the first element
IFS=_
echo "${array[*]}.$lastword"  #     The asterisk as a subscript when inside quotes causes IFS (an underscore in this case) to be inserted between each element of the array
IFS=$saveIFS

* use declare -p array after these steps to see what the array looks like.
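
The regex version can be wrapped in a function that also checks whether the match succeeded. This is a sketch only: the name `underscore_head` and the choice to return dotless input unchanged are my assumptions, not part of the answer.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the regex version; dotless input is returned unchanged.
underscore_head() {
    # Keeping the regex in a variable sidesteps quoting differences between Bash versions.
    local re='^(.*)\.(.*)$'
    if [[ $1 =~ $re ]]; then
        # Group 1 is everything before the last dot, group 2 everything after it.
        printf '%s.%s\n' "${BASH_REMATCH[1]//./_}" "${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$1"
    fi
}

underscore_head This.is.My.String   # This_is_My.String
underscore_head NoDotsHere          # NoDotsHere
```

Checking the result of `[[ ... =~ ... ]]` matters because BASH_REMATCH is only meaningful after a successful match.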

answered Nov 10, 2010 at 6:25

Dennis Williamson

364k95 gold badges386 silver badges446 bronze badges

1 Comment

johnsyweb Over a year ago

+1 for showing there're more than one way to skin a cat. Some ways are more readable than others, mind.

glenn jackman · Accepted Answer · 2010-11-09 20:57:39Z

2

1.

$ echo 'This.is.my.string' | sed 's}[^\.][^\.]*$}Good Long.&}'
This.is.my.Good Long.string

before: a run of non-dot characters up to the end of the line, that is, everything after the last dot. after: & stands for whatever the pattern matched.

2.

$ echo 'This.is.my.string' | sed 's}.*\.}}'
string

sed matches greedily, so it extends the first closure (.*) as far as possible, i.e. to the last dot.

3.

$ echo 'This.is.my.string' | tr . _ | sed 's/_\([^_]*\)$/\.\1/'
This_is_my.string

convert all dots to _, then turn the last _ to a dot.

(caveat: this will turn 'This.is.my.string_foo' to 'This_is_my_string.foo', not 'This_is_my.string_foo')
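
The greedy match and the caveat can both be checked with a short script. The sample string is the one from the caveat; the `[^.]*` variant is added only for contrast and is not part of the answer.

```shell
s='This.is.my.string_foo'

# Greedy: .* extends to the last dot, so only the final field remains.
echo "$s" | sed 's}.*\.}}'        # string_foo
# For contrast: [^.]* cannot cross a dot, so only the first field is removed.
echo "$s" | sed 's}[^.]*\.}}'     # is.my.string_foo
# The tr-based approach puts the dot at the last underscore, even one from the input.
echo "$s" | tr . _ | sed 's/_\([^_]*\)$/\.\1/'   # This_is_my_string.foo
```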

edited Nov 9, 2010 at 20:57

glenn jackman

249k42 gold badges233 silver badges363 bronze badges

answered Nov 9, 2010 at 20:29

vlabrecque

3461 silver badge4 bronze badges


Hans · Accepted Answer · 2010-11-10 09:52:23Z

You don't need regular expressions at all (those complex things hurt my eyes!) if you use Awk and are a little creative.

1. echo $example| awk -v ins="Good.long" -F . '{OFS="."; $NF = ins"."$NF;print}'

What this does:
-v ins="Good.long" tells awk to create a variable called 'ins' with "Good.long" as content,
-F . tells awk to use the dot as a separator for your fields for input,
OFS="." tells awk to use the dot as the separator between fields on output,
NF is the number of fields, so $NF represents the last field,
the $NF=... part replaces the last field: it puts what you want to insert (the variable called "ins" declared earlier) in front of the current last field, joined by a dot.

2. echo $example| awk -F . '{print $NF}'

$NF is the last field, so that's all!

3. echo $example| awk -F . '{OFS="_"; $(NF-1) = $(NF-1)"."$NF; NF=NF-1; print}'

Here we have to be creative, as Awk AFAIK doesn't allow deleting fields directly. Of course, we set the output field separator to underscore.

$(NF-1) = $(NF-1)"."$NF: First, we replace the second last field with the last glued to the second last, with a dot between.
Then, we fool awk to make it think the Number of fields is equal to the number of fields minus one, hence deleting the last field!

Note you can't say $NF="", because the empty last field would still be printed, leaving a trailing underscore.
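
The difference between shrinking NF and emptying the last field can be seen side by side. This is a small sketch assuming an awk that rebuilds the record when NF is assigned, as gawk and mawk do.

```shell
example=This.is.My.String

# Shrinking NF drops the last field; awk rebuilds $0 with OFS between the remaining fields.
echo "$example" | awk -F . '{OFS="_"; $(NF-1) = $(NF-1)"."$NF; NF=NF-1; print}'   # This_is_My.String
# Emptying the last field keeps it as a field, so a separator is still printed before it.
echo "$example" | awk -F . '{OFS="_"; $(NF-1) = $(NF-1)"."$NF; $NF=""; print}'    # This_is_My.String_
```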
