Remove string between two characters with sed

Question

I have a file of this type:

16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4

and I want to remove all the strings inside square brackets (including the brackets) in order to obtain

16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4

I am trying with sed in this manner:

sed 's/\[.*]//g' file1 > file2

but I obtain

16:00 1.57 4

and with

sed 's/\[.[1234567890]]//g' file1 > file2

the result is wrong if a bracketed number has more than 2 digits.

How can I do this?

Jörg Beyer · Accepted Answer · 2012-02-09 11:58:09Z

1

Your pattern matches exactly two characters between the brackets (any character, then one digit). Putting a star directly after the digit class lets it match any number of digits:

sed 's/\[[0-9]*\]//g' file1 > file2

Alternative:

sed 's/\[[^]]*\]//g' file1 > file2

That means: after the opening "[", any character except "]" is accepted, as many times as it occurs (the "*"), followed by the closing "]".

For further reading on sed: http://www.grymoire.com/Unix/Sed.html

edited Feb 9, 2012 at 11:58

answered Feb 9, 2012 at 11:48

Jörg Beyer


2 Comments

potong Over a year ago

This may work for this solution but will not scale well for all types of string between two characters. The alternative?

user unknown Over a year ago

[1234567890] can be shortened to [0-9]
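To make the scaling concern concrete, here is a small sketch comparing a digits-only pattern with a negated character class. The sample strings are made up for illustration and are not from the question; the parenthesis variant is just one assumed example of other delimiters.

```shell
# Hypothetical sample input, not taken from the question.
line='a [12] b [note] c'

# Digits only: [12] is removed, [note] is left alone.
digits_only=$(printf '%s\n' "$line" | sed 's/\[[0-9]*\]//g')

# Anything up to the next "]": both tags are removed.
any_content=$(printf '%s\n' "$line" | sed 's/\[[^]]*\]//g')

# Same idea for another delimiter pair: "(" is a literal in basic regex.
parens=$(printf '%s\n' 'x (y) z' | sed 's/([^)]*)//g')

printf '%s\n' "$digits_only" "$any_content" "$parens"
```

The general recipe is: opening delimiter, then "anything that is not the closing delimiter" repeated, then the closing delimiter. This only applies when the closing delimiter is a single character.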

TLP · Accepted Answer · 2012-02-11 02:33:50Z

1

Your first regex does not work because the quantifier * is greedy, meaning it matches as many characters as possible. Since . also matches brackets, it continues to match until the last closing bracket ] it can find.

So you basically have two options: Use a non-greedy quantifier or restrict the types of characters you can match. You have tried the second solution. I would go with using a negated character class instead:

sed 's/\[[^]]*\]//g'

I'm not sure if sed has non-greedy quantifiers, but perl does:

perl -lpwe 's/\[.*?\]//g'
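As a quick check of the explanation above, this sketch runs the greedy pattern and the negated-class pattern on the line from the question and keeps both results in variables (the variable names are only for illustration).

```shell
line='16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4'

# Greedy: ".*" runs from the first "[" to the last "]" on the line.
greedy=$(printf '%s\n' "$line" | sed 's/\[.*\]//g')

# Negated class: "[^]]*" stops at the first "]" after each "[".
negated=$(printf '%s\n' "$line" | sed 's/\[[^]]*\]//g')

printf 'greedy:  %s\n' "$greedy"
printf 'negated: %s\n' "$negated"
```

The greedy result should match the unwanted output shown in the question, while the negated-class result is the desired line.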

answered Feb 11, 2012 at 2:33

TLP


John3136 · Accepted Answer · 2012-02-09 11:48:19Z

0

Does escaping the closing ] help?

sed 's/\[.*\]//g' file1 > file2

answered Feb 9, 2012 at 11:48

John3136


1 Comment

potong Over a year ago

\[.*\] is greedy and will swallow up all characters between the first [ and the last ] including other ]['s.
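One way to see exactly what each pattern swallows is to print only the matched text. This sketch assumes a grep that supports the `-o` option (GNU and BSD grep do; it is not required by POSIX).

```shell
line='16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4'

# Greedy pattern: a single match spanning the first "[" to the last "]".
greedy_match=$(printf '%s\n' "$line" | grep -o '\[.*\]')

# Negated class: one match per tag, each printed on its own line.
tags=$(printf '%s\n' "$line" | grep -o '\[[^]]*\]')

printf 'greedy match: %s\n' "$greedy_match"
printf 'separate tags:\n%s\n' "$tags"
```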

Birei · Accepted Answer · 2012-02-09 11:59:25Z

0

You already got the sed answer, so I will add another one using awk:

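awk '
  BEGIN {
    FS = "\\[[^]]*\\]"   # each [...] tag acts as a field separator
  }
  {
    for (i = 1; i <= NF; i++)
      printf "%s", $i    # print the fields back to back, dropping the tags
    printf "\n"          # end every record, so multi-line input stays multi-line
  }
' <<<"16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4"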

Output:

16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4
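A shorter awk variant, offered as a sketch rather than as part of the answer above, uses awk's built-in `gsub` to delete the tags directly instead of splitting on them.

```shell
line='16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4'

# gsub replaces every match in the current record ($0) with the empty string.
cleaned=$(printf '%s\n' "$line" | awk '{ gsub(/\[[^]]*\]/, ""); print }')

printf '%s\n' "$cleaned"
```

Because `print` runs once per record, this handles files with many lines as well as a single line.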

answered Feb 9, 2012 at 11:59

Birei


Community · Accepted Answer · 2020-06-20 09:12:55Z

0

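Using `awk` (each removed tag is replaced by the output field separator, a single space, which is why the output below has doubled spaces):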

$ echo '16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4' | awk -F '\[[0-9]*\]' '$1=$1'
16:00  Al-Najma - Al-Rifaa  5.06  3.55  1.57 4
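If the doubled spaces are unwanted, one possible adjustment is to set the output field separator to the empty string. The sketch below also uses an explicit `print` instead of relying on `$1=$1` as a pattern, since that pattern is false when the first field is empty or numerically zero. This is an assumed variation, not the answer's original command.

```shell
line='16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4'

cleaned=$(printf '%s\n' "$line" | awk '
  BEGIN {
    FS  = "\\[[0-9]*\\]"   # numeric tags such as [61] separate the fields
    OFS = ""               # rejoin the fields with nothing in between
  }
  {
    $1 = $1                # force awk to rebuild the record using OFS
    print
  }
')

printf '%s\n' "$cleaned"
```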

edited Jun 20, 2020 at 9:12

CommunityBot


answered Feb 9, 2012 at 12:06

kev


1 Comment

Kent Over a year ago

same as my solution (didn't post) ;) btw, the two "\" could be saved.

potong · Accepted Answer · 2012-02-09 16:09:03Z

0

This might work for you:

echo "16:00 [61]Al-Najma - Al-Rifaa [62]5.06 [63]3.55 [64]1.57 4" |
sed 's/\[[^]]*\]//g'
16:00 Al-Najma - Al-Rifaa 5.06 3.55 1.57 4

answered Feb 9, 2012 at 16:09

potong
