
I have a text formatted like the following:

2020-05-02
apple
string
string
string
string
string
2020-05-03
pear
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string

Each group has 7 lines = Date, Fruit and then 5 strings.

I would like to delete groups of 7 lines from the file by supplying just the date and the fruit.

So if I choose '2020-05-03' and 'pear'

this would remove:

2020-05-03
pear
string
string
string
string
string

from the file, resulting in this:

2020-05-02
apple
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string

The file contains thousands of lines. I need a command, probably using sed or awk, to:

  1. Search for the date 2020-05-03

  2. Check whether the line after the date is pear

  3. Delete both lines and the following 5 lines

I know I can delete text with sed, e.g. sed 's/string//g', but I am not sure whether it can delete multiple lines.

Note: a given date followed by a given fruit never occurs more than once, so

2020-05-02
pear

would only occur once in the file.

How can I achieve this?

  • Sounds like 2020-05-03\npear\n(?:.*\n){5} Commented Feb 5, 2020 at 20:49
  • thanks @MonkeyZeus i fixed it Commented Feb 5, 2020 at 20:52
  • sed s'/2020-05-03\npear\n(?:.*\n){5}//' doesn't seem to work, but doesn't throw an error Commented Feb 5, 2020 at 20:54
  • I'm not a sed expert but maybe it doesn't support non-capturing groups so try 2020-05-03\npear\n.*\n.*\n.*\n.*\n.*\n Commented Feb 5, 2020 at 20:55
  • I think the issue is the \n , it does not match, eg: sed s'/2020-05-03\npear//' does not remove the first 2 lines Commented Feb 5, 2020 at 20:57

2 Answers

Using awk, you may do this:

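awk -v dt='2020-05-03' -v ft='pear' 'n{n--; next} h{h=0; if($1==ft){n=5; next} print d}
$1==dt{h=1; d=$0; next} 1
END{if(h) print d}' file

2020-05-02
apple
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string

Explanation:

  • -v dt='2020-05-03' -v ft='pear': Supply the 2 values to awk from the command line
  • n{n--; next}: While the counter n is positive, skip the line and count down. This drops the 5 strings of a matched group
  • h{h=0; if($1==ft){n=5; next} print d}: If the previous line was a matching date (flag h), check the fruit. On a match, drop the date and fruit lines and set n to 5; otherwise print the held date line d and carry on with the current line
  • $1==dt{h=1; d=$0; next}: If we find a line with a matching date, hold it in d instead of printing it, because the fruit is not known yet
  • 1: Default action to print line
  • END{if(h) print d}: Print a held date line if the file ends right after it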
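The question asks to remove the group from the file itself, whereas awk only prints to standard output. Below is a minimal sketch of a wrapper that writes the result back, assuming a POSIX shell with mktemp available. The function name delete_group and its argument order are hypothetical. It holds a matching date line back until the fruit on the next line is known, so no stray date line is left behind when the following group starts with a different date.

```shell
# Hypothetical helper: delete_group DATE FRUIT FILE
# Removes the 7-line group that starts with DATE followed by FRUIT, rewriting FILE.
delete_group() {
    tmp=$(mktemp) || return 1
    awk -v dt="$1" -v ft="$2" '
        skip > 0 { skip--; next }                # inside a matched group: drop the line
        held {                                   # previous line was a matching date
            held = 0
            if ($0 == ft) { skip = 5; next }     # fruit matches: drop date, fruit, 5 strings
            print date                           # fruit differs: release the held date
        }
        $0 == dt { held = 1; date = $0; next }   # hold the date until the fruit is known
        { print }
        END { if (held) print date }             # file ended right after a held date
    ' "$3" > "$tmp" && cat "$tmp" > "$3"         # copy back so FILE keeps its permissions
    status=$?
    rm -f "$tmp"
    return $status
}
```

Usage would look like delete_group 2020-05-03 pear file. Whole lines are compared here ($0 rather than $1), which assumes the date and fruit lines carry no extra whitespace.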

This might work for you (GNU sed):

sed '/2020-05-03/{:a;N;s/[^\n]*/&/7;Ta;/^[^\n]*\npear/d}' file

If a line contains 2020-05-03, gather up 7 lines in total; if the 2nd of these lines begins with pear, delete all 7.
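To supply the date and fruit as arguments, the same gather-then-test idea can be wrapped in a function. This is a rough sketch, assuming GNU sed and values without regex metacharacters or slashes. The name drop_group is hypothetical. As a variation on the command above, it counts newlines (6 newlines means 7 lines) and anchors both the date and the fruit to whole lines.

```shell
# Hypothetical filter: drop_group DATE FRUIT FILE (GNU sed; prints to standard output)
drop_group() {
    # /^DATE$/             start only on a line that is exactly the date
    # :a;N;s/\n/&/6;Ta     append lines until the buffer holds 6 newlines, i.e. 7 lines
    # /^[^\n]*\nFRUIT\n/d  delete the buffer if its second line is exactly the fruit
    sed "/^$1\$/{:a;N;s/\n/&/6;Ta;/^[^\n]*\n$2\n/d}" "$3"
}
```

For example, drop_group 2020-05-03 pear file > file.new writes the filtered text to a new file; with GNU sed, the -i option could be added to edit in place instead.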
