I have a text file that looks like:

#filelists.txt
a
# aaa
b
#bbb
c #ccc

I want to delete everything from '#' to the end of each line and then, if a line started with '#' (so nothing is left of it), delete the whole line.

So I use the 'sed' command in my shell:

sed -e "s/#*//g" -e "/^$/d" filelists.txt

I expect the result to be:

a
b
c

but the actual result is:

filelists.txt
a
 aaa
b
bbb
c ccc

What's wrong with my "sed" command?

I know that '*' means "any", so I thought '#*' would match the string after "#".

Isn't that right?

  • #* means zero or more of #. To make this work you need #.* where . means any character and the star then gives zero or more of any character. Commented Aug 26, 2019 at 8:48

2 Answers

You may use

sed 's/#.*//;/^$/d' file > outfile

The s/#.*// removes # and the rest of the line, and /^$/d then drops any line that is empty.

See an online test:

s="#filelists.txt
a
# aaa
b
#bbb
c #ccc"

sed 's/#.*//;/^$/d' <<< "$s"

Output:

a
b
c 
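Note that the last output line is "c " with a trailing space, because only the text from # onward is removed. If that space matters, one possible variation (an assumption on my part, not part of the command above) is to also match any whitespace just before the #, using the POSIX class [[:space:]]:

```shell
# Sample input from the question.
input='#filelists.txt
a
# aaa
b
#bbb
c #ccc'

# Remove optional whitespace before '#', then '#' and the rest of the line;
# afterwards delete lines that ended up empty.
trimmed=$(printf '%s\n' "$input" | sed 's/[[:space:]]*#.*//;/^$/d')

printf '%s\n' "$trimmed"
```

With this variation, a line that holds only whitespace before the # also becomes empty and is deleted.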

Another idea: only act on lines that contain #. On those lines, remove # and the rest of the line, then drop the line if it is now empty:

sed '/#/{s/#.*//;/^$/d}' file > outfile


This way, you keep the original empty lines.
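A small sketch of the difference between the two commands, using a made-up input that contains a blank line of its own. The sample text and variable names are only for illustration, and a ; is placed before the closing } so the script should also work with non-GNU sed:

```shell
# Hypothetical input with one genuinely empty line (line 2).
input='a

# aaa
b #bbb'

# First variant: every empty line is deleted, including the original one.
drop_all_empty=$(printf '%s\n' "$input" | sed 's/#.*//;/^$/d')

# Second variant: only lines that contained '#' can be deleted.
keep_original_empty=$(printf '%s\n' "$input" | sed '/#/{s/#.*//;/^$/d;}')

printf '%s\n' "$drop_all_empty"
echo '---'
printf '%s\n' "$keep_original_empty"
```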


2 Comments

Hi @Wiktor Stribiżew: Is ';' equivalent to 'sed -e'?
@curlywei Well, I understand it as an action sequence operator: the operation on its right side is performed after the operation on its left side.
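For these two commands (s and d), the ; form and the -e form should behave the same. A quick way to check this yourself, with the question's sample held in a variable (the variable names are just for this sketch):

```shell
# Sample input from the question.
input='#filelists.txt
a
# aaa
b
#bbb
c #ccc'

# Two commands separated by ';' inside one script.
with_semicolon=$(printf '%s\n' "$input" | sed 's/#.*//;/^$/d')

# The same two commands passed as separate -e expressions.
with_e=$(printf '%s\n' "$input" | sed -e 's/#.*//' -e '/^$/d')

# Report whether both forms produced identical output.
if [ "$with_semicolon" = "$with_e" ]; then
    echo "same output"
else
    echo "different output"
fi
```

Some other sed commands (for example a, i and c) take text up to the end of the line, so ; is not always a drop-in replacement for -e.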

* does not mean "any" (at least not in regular expression context). * means "zero or more of the preceding pattern element", which means you are deleting "zero or more #". Since each of your lines has at most one #, that # is deleted and the rest of the line is left intact.

You need s/#.*//: "delete # followed by zero or more of any character".
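To see the two patterns side by side, here is a short sketch that runs both substitutions on the question's sample (the variable names are only for illustration, and the empty-line deletion is left out so the effect of each pattern is visible):

```shell
# Sample input from the question.
input='#filelists.txt
a
# aaa
b
#bbb
c #ccc'

# '#*' matches zero or more '#' characters, so only the '#' itself goes away.
star_only=$(printf '%s\n' "$input" | sed 's/#*//g')

# '#.*' matches '#' followed by everything up to the end of the line.
hash_to_eol=$(printf '%s\n' "$input" | sed 's/#.*//')

printf '%s\n' "$star_only"
echo '---'
printf '%s\n' "$hash_to_eol"
```

The first result still contains the comment text; the second leaves empty lines where whole-line comments used to be, which is why /^$/d is still needed afterwards.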

EDIT: I was suggesting grep -v, but hadn't noticed the third example (# in the middle of the line).

3 Comments

Hi @Amadan: Does the "." in "#." mean "followed by zero or more of any character"?
. is just "any character". .* is "zero or more of any character".
#. matches # plus the first character after it, #.. matches # plus the next two characters, and #.* matches # plus any number of characters after it (including none).
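A tiny sketch of those three patterns applied to one line from the question (the variable names are made up for the example):

```shell
line='c #ccc'

# '#.' removes '#' and exactly one following character.
one=$(printf '%s\n' "$line" | sed 's/#.//')

# '#..' removes '#' and exactly two following characters.
two=$(printf '%s\n' "$line" | sed 's/#..//')

# '#.*' removes '#' and everything after it on the line.
all=$(printf '%s\n' "$line" | sed 's/#.*//')

# Brackets make the trailing space in the last result visible.
printf '[%s]\n' "$one" "$two" "$all"
```

This should print [c cc], [c c] and [c ], one per line.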
