E.g.
xyz
A1
B1
C1
D1
End
End
End
X1
X2
X3
Done
I want to extract all lines from the xyz pattern through the End pattern, so the output should be:
xyz
A1
B1
C1
D1
End
End
End
perl -l -0777ne 'print /^(xyz.*?^End$(?:\nEnd$)*)/ms' yourfile
perl -lne '
next unless /xyz/ ... eof;
last if !/End/ and $flag;
$flag ||= 1 if /End/;
print;
' yourfile
sed -e '
/xyz/!d
:a
$q;N
/\nEnd$/!ba
:b
n
/End/bb
d
' yourfile
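In this method the first do-while loop (:a) accumulates lines in the pattern space, starting at /xyz/ and stopping at the first /End/.
The second do-while loop (:b) prints what has been collected and keeps printing for as long as the next line matches /End/; the first line that does not match is deleted.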
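For readers less fluent in sed, the two loops above can be sketched in Python as a small state machine. This is only an illustration, not a translation of the sed script: the function name extract_block and its parameters are made up here, and it compares whole lines for equality where the sed version uses the regular expressions /xyz/ and /End/.

```python
def extract_block(lines, start="xyz", end="End"):
    """Return the lines from `start` through the first run of `end` lines.

    Hypothetical helper for illustration; compares whole lines exactly.
    """
    out = []
    state = "before"  # before -> inside -> trailing
    for line in lines:
        if state == "before":
            # Skip everything until the start marker shows up.
            if line == start:
                out.append(line)
                state = "inside"
        elif state == "inside":
            # Loop :a in the sed script: collect up to the first end marker.
            out.append(line)
            if line == end:
                state = "trailing"
        else:
            # Loop :b in the sed script: keep going only while lines are end markers.
            if line != end:
                break
            out.append(line)
    return out
```

Called on the sample input from the question (one list element per line), this should return the eight lines from xyz through the third End.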
sed -e '
/xyz/,/End/!d
H;/xyz/h;/End/!d
:a
$q;N
/\(.*\)\n\1$/!{g;q;}
s/.*\n//;H
ba
' yourfile
With this method we first select the right range and store its lines in the hold space. The do-while loop (:a) is then set up to keep appending to the hold space for as long as the next line is another /End/; once it is not, the hold space is printed and sed quits.
xyz
A1
B1
C1
D1
End
End
End
This is the kind of job pcregrep is good at:
pcregrep -M 'xyz(.|\n)*End' file
Note that the match is greedy: it runs all the way to the final End in the file, swallowing any earlier Ends (and whatever lies between them) along the way.
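The difference between a greedy and a lazy quantifier is easy to see on a small input. The sketch below uses Python's re module rather than pcregrep, and the text variable is a made-up sample with one extra End after X1; the quantifier behaviour shown is the same in both regex flavours for this pattern.

```python
import re

# Hypothetical sample: two consecutive End lines, then a stray End later on.
text = "xyz\nA1\nEnd\nEnd\nX1\nEnd\nDone\n"

# Greedy: (?:.|\n)* grabs as much as possible, so the match stops at the last End.
greedy = re.search(r"xyz(?:.|\n)*End", text).group(0)

# Lazy: (?:.|\n)*? grabs as little as possible, so the match stops at the first End.
lazy = re.search(r"xyz(?:.|\n)*?End", text).group(0)
```

Here greedy covers everything up to the End that follows X1, while lazy stops right after the first End, so neither one alone gives "first End plus the Ends directly after it".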
Perl to the rescue: Print all the lines between the first xyz and the last End:
perl -ne '
$inside = 1 if /^xyz$/;                   # start buffering at the first xyz
push @buff, $_ if $inside;                # buffer every line from there on
print splice @buff if /^End$/ && @buff;   # at each End, print and empty the buffer
' input-file
From the first occurrence of xyz, we start pushing all lines into a buffer. Once End is encountered, we output and clear the buffer (see splice), but we continue to push lines into the buffer in case there was another End later.
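The same buffer-and-flush idea, sketched in Python for comparison. The function name extract_to_last_end and its parameters are invented for this example; the logic is meant to mirror the Perl above, not to replace it.

```python
def extract_to_last_end(lines, start="xyz", end="End"):
    """Return the lines from the first `start` through the last `end`.

    Hypothetical helper for illustration; compares whole lines exactly.
    """
    out = []
    buffer = []
    inside = False
    for line in lines:
        if not inside and line == start:
            inside = True
        if inside:
            # Hold the line back until we know an end marker follows it.
            buffer.append(line)
            if line == end:
                # An end marker confirms everything buffered so far.
                out.extend(buffer)
                buffer.clear()
    # Whatever is still buffered came after the last end marker and is dropped.
    return out
```

Unlike the approaches that stop after the first run of End lines, this one also keeps a later, non-adjacent End together with the lines leading up to it.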
As you are asking for an sed solution, I'd do it like this:
sed -e '/^xyz$/!d;:a' -e '$!{N;ba' -e '};s/\(.*\nEnd\).*/\1/'
So discard everything before the first pattern (/^xyz$/!d), then loop to collect all remaining lines in the pattern space (:a;$!{N;ba}) and remove everything after the last occurrence of the second pattern (s/\(.*\nEnd\).*/\1/).
Collecting in the pattern space is necessary because range addressing (/xyz/,/End/) is not greedy, whereas .* inside the pattern space is.
awk solution:
awk '/xyz/,/End/{ print $0; n=NR }($0=="End" && n && NR>n && NR-n++ == 1)' file
The output:
xyz
A1
B1
C1
D1
End
End
End
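/xyz/,/End/ - the record range, from the first line matching xyz to the next line matching End
n=NR - stores the current record number for every line of the range, so once the range closes n holds the number of its last record
($0=="End" && n && NR>n && NR-n++ == 1) - after the range, prints a line only if it is exactly End and immediately follows the last printed record, which picks up the consecutive trailing End lines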
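The record-number bookkeeping may be easier to follow in a slightly simplified Python sketch. The function name extract_with_record_numbers is hypothetical, substring tests stand in for the /xyz/ and /End/ regular expressions, and n is simply set to the current record number instead of being post-incremented as in the awk one-liner.

```python
def extract_with_record_numbers(lines):
    """Mimic the awk range plus the 'adjacent End' check (illustrative only)."""
    out = []
    in_range = False
    n = 0  # record number of the last printed line; 0 means nothing printed yet
    for nr, line in enumerate(lines, start=1):
        if not in_range and "xyz" in line:
            in_range = True
        if in_range:
            # Inside the /xyz/,/End/ range: print and remember the record number.
            out.append(line)
            n = nr
            if "End" in line:
                in_range = False
            continue
        # Outside the range: accept only an End that directly follows
        # the last printed record.
        if line == "End" and n and nr - n == 1:
            out.append(line)
            n = nr
    return out
```

An End that appears later, after some unrelated line, is not adjacent to the last printed record and is therefore skipped.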