
Problem: given a file samplein, it can be split up into multiple pieces as follows:

$ cat samplein
START
Unix
Linux
START
Solaris
Aix
SCO

$ awk '/START/{x="F"++i;}{print > x}' samplein
$ ls F*
F1  F2

$ cat F1
START
Unix
Linux

$ cat F2
START
Solaris
Aix
SCO

The above was recipe 5 from this page.
However, in my case the pattern (START here) doesn't occur on the first line.

If we prepend a line to samplein, the same recipe no longer works:

$ echo -e "firstline\n$(cat samplein)" > samplein
$ cat samplein
$ awk '/START/{x="F"++i;}{print > x}' samplein
awk: cmd. line:1: (FILENAME=samplein FNR=1) fatal: expression for `>' redirection has null string value

Please also explain in the answer how this awk command works in the first place. Until now I had only used awk in the form BEGIN {...} {loop over all lines} END {...}, and this recipe looks slightly different from that.

  • You might want to read some of the information linked at awk.info/?Learn Commented Jan 14, 2016 at 15:40
  • Unless you are using this problem as a learning exercise for awk, you might want to look at csplit if your system provides it Commented Jan 14, 2016 at 15:41

1 Answer

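Just set x="F0" in a BEGIN block so the target file is always defined, even if the first line doesn't contain the pattern. Without it, x is still empty when the first line is printed, which is what the "null string value" error is complaining about: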

awk 'BEGIN { x="F0" ; } /START/{x="F"++i;}{print > x}' samplein

The above breaks down to this pseudo code:

### -> BEGIN { x="F0" ; }
i=0 # implicit
x="F0" # explicit
loop through file

### -> /START/{x="F"++i;}
if ( line contains "START" ) output file is F(next i value) ;

### -> {print > x}
print line to output file

endloop
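
The walk-through above can be checked end to end with a small sketch like the one below. It assumes mktemp is available and works in a scratch directory so that leftover F* files don't get in the way; the input is the question's samplein with the extra first line. The parentheses around ++i are only there for readability.

```shell
# Work in a scratch directory (assumption: mktemp -d is available on this system)
cd "$(mktemp -d)" || exit 1

# Recreate the input, including a first line that does not match the pattern
printf '%s\n' firstline START Unix Linux START Solaris Aix SCO > samplein

# x starts out as "F0"; every line matching /START/ switches x to the next file name,
# and the last rule prints each line to whatever file x currently names
awk 'BEGIN { x = "F0" } /START/ { x = "F" (++i) } { print > x }' samplein

ls F*
cat F0
```

With this input, ls should list F0, F1 and F2, and F0 should hold only the line firstline, since everything before the first START ends up there.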

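Keep in mind that every part of an awk rule is optional: the BEGIN and END blocks can be left out, a { ... } block with no pattern runs for every line, and a pattern with no { ... } block prints the matching line.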
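
To make the optional parts concrete, here is a minimal sketch with one rule of each kind. The counter n and the three input lines are made up for illustration; they are not part of the original recipe.

```shell
# Feed three lines to awk and capture what it prints
out=$(printf '%s\n' firstline START Unix |
awk '
  /START/                   # pattern only: the default action prints the matching line
  { n++ }                   # action only: runs for every input line
  END { print n " lines" }  # END only: runs once after the last line
')

printf '%s\n' "$out"
```

This should print START followed by "3 lines": the first rule prints the one matching line, the second counts all three lines, and the END rule reports the count.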
