
I am an awk newbie and admittedly don't understand how FNR and NR drive looping through files. I have two input files working, and I need to add another (inputFile3).

I am running this from the command line:

awk -f parseField.awk inputFile1.csv inputFile2.csv ./inputFile3.TXT

Currently, I loop through inputFile3 using:

FNR!=NR {...}

I loop through inputFile1 using:

FNR==NR {...}

I need to add another file to the mix (inputFile2). What syntax can I use in my awk script (parseField.awk) to access that third input file?

    FNR is "The input record number in the current input file." NR is "The total number of input records seen so far." So FNR==NR holds for the first file and fails for every other file. What are you trying to do with your third file? Commented Oct 18, 2015 at 20:46

2 Answers


To add to @EtanReisner's good information, you can keep a counter: FNR==1 {file_number++}. This increments the counter whenever the first line of a file is read.

All together, you can say:

#!/bin/awk -f

BEGIN {print "start program"}
NR==1 {print "reading first file"}
FNR==1 {filenum++; print "I am in file number", filenum}
{ ... }
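Applied to the three-file case in the question, the counter can select a separate block per file. This is only a minimal sketch: the sample data, the -F, separator, and the printed labels are assumptions for illustration, not taken from the asker's script.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the asker's three input files.
tmp=$(mktemp -d)
printf 'a,1\nb,2\n' > "$tmp/inputFile1.csv"
printf 'c,3\n'      > "$tmp/inputFile2.csv"
printf 'd,4\ne,5\n' > "$tmp/inputFile3.TXT"

out=$(awk -F, '
    FNR == 1     { filenum++ }                 # first record of each file bumps the counter
    filenum == 1 { print "file1:", $1; next }  # runs only for records of the 1st file
    filenum == 2 { print "file2:", $1; next }  # runs only for records of the 2nd file
    filenum == 3 { print "file3:", $1 }        # runs only for records of the 3rd file
' "$tmp/inputFile1.csv" "$tmp/inputFile2.csv" "$tmp/inputFile3.TXT")

printf '%s\n' "$out"
rm -rf "$tmp"
```

One caveat: an empty file has no first record, so FNR==1 never fires for it and the counter is not incremented for that file.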

In any POSIX awk (thanks Jonathan Leffler) you can also use the FILENAME variable, or the ARGC variable and the ARGV array.
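As a rough sketch of the FILENAME/ARGV route, each block can compare FILENAME against the matching ARGV entry. The file names f1, f2, f3 and the labels are made up for the example, and it assumes no file is listed twice and no var=value assignments appear among the operands.

```shell
#!/bin/sh
# Hypothetical one-line files, just to show which block fires.
tmp=$(mktemp -d)
printf 'a\n' > "$tmp/f1"
printf 'b\n' > "$tmp/f2"
printf 'c\n' > "$tmp/f3"

# ARGV[0] is the awk command name; ARGV[1] is the first file operand.
out=$(awk '
    FILENAME == ARGV[1] { print "first:", $0; next }
    FILENAME == ARGV[2] { print "second:", $0; next }
    FILENAME == ARGV[3] { print "third:", $0 }
' "$tmp/f1" "$tmp/f2" "$tmp/f3")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Unlike the FNR==1 counter, this still identifies the right file when an earlier file is empty.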


Also see information about this in Idiomatic awk:

Another construct that is often used in awk is as follows:

$ awk 'NR == FNR { # some actions; next} # other condition {# other actions}' file1.txt file2.txt

This is used when processing two files. When processing more than one file, awk reads each file sequentially, one after another, in the order they are specified on the command line. The special variable NR stores the total number of input records read so far, regardless of how many files have been read. The value of NR starts at 1 and always increases until the program terminates. Another variable, FNR, stores the number of records read from the current file being processed. The value of FNR starts at 1, increases until the end of the current file is reached, then is set again to 1 as soon as the first line of the next file is read, and so on. So, the condition NR == FNR is only true while awk is reading the first file.
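To see how the two counters move, it can help to print both for every record. A small sketch with two made-up two-line files (the file names and contents are assumptions):

```shell
#!/bin/sh
# Two hypothetical files with two records each.
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/file1.txt"
printf 'z\nw\n' > "$tmp/file2.txt"

# NR keeps counting across files; FNR restarts at 1 for each new file.
out=$(cd "$tmp" && awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' file1.txt file2.txt)

printf '%s\n' "$out"
# file1.txt NR=1 FNR=1
# file1.txt NR=2 FNR=2
# file2.txt NR=3 FNR=1
# file2.txt NR=4 FNR=2
rm -rf "$tmp"
```

Note that if the first file is empty, NR == FNR stays true while the second file is read, so the idiom quietly treats the second file as the first.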


4 Comments

FILENAME is part of POSIX awk. So too is the ARGV array, and ARGC variable — the indexes of ARGV start from 0 (rather than 1), and the arguments recorded exclude the options to awk and the program.
@JonathanLeffler yes, so that's why I suggest using a counter whenever FNR==1 as the most reliable way to do this.
I agree that FNR == 1 is a good way of detecting a change of file. Your comment about GNU Awk is more restrictive than need be (FILENAME is not exclusively in GNU Awk). And knowing that ARGC and ARGV exist can be helpful.
@JonathanLeffler ah, now I see your point. Thanks for it, updated!
1

Not as elegant as the POSIX FILENAME solution, but handy for dusty, old awks that lack such features. You can write a compound statement that manipulates your data before sending it to awk, in a couple of ways...

Option 1

First, you could output the file number on a line of its own before each file that you send to awk. So, if your files look like this:

file1

Line 1 of 1

file2

Line 1 of 2
Line 2 of 2

file3

Line 1 of 3
Line 2 of 3
Line 3 of 3

You could do this:

{ echo 1; cat file1; echo 2; cat file2; echo 3; cat file3; }
1
Line 1 of 1
2
Line 1 of 2
Line 2 of 2
3
Line 1 of 3
Line 2 of 3
Line 3 of 3

and pipe that into awk, picking up the file number whenever the number of fields is 1 (this assumes no data line consists of a single field):

{ echo 1; cat file1; echo 2; cat file2; echo 3; cat file3; } | awk 'NF==1{file=$1;next} {print file,$0}'
1 Line 1 of 1
2 Line 1 of 2
2 Line 2 of 2
3 Line 1 of 3
3 Line 2 of 3
3 Line 3 of 3

Option 2

Or, you could add the file number to the start, or end, of every line so it is available as $1 inside awk, like this:

{ sed 's/^/1 /' file1; sed 's/^/2 /' file2; sed 's/^/3 /' file3; }
1 Line 1 of 1
2 Line 1 of 2
2 Line 2 of 2
3 Line 1 of 3
3 Line 2 of 3
3 Line 3 of 3

So, now you can do:

{ sed 's/^/1 /' file1; sed 's/^/2 /' file2; sed 's/^/3 /' file3; } | awk '{file=$1; ...}'
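One possible way to fill in that action, as a sketch only: save $1 as the file number, then strip the prefix so the rest of the script sees the original line. The bracketed output format is invented for the example, and only two of the sample files are used to keep it short.

```shell
#!/bin/sh
# Recreate two of the sample files from above.
tmp=$(mktemp -d)
printf 'Line 1 of 1\n' > "$tmp/file1"
printf 'Line 1 of 2\nLine 2 of 2\n' > "$tmp/file2"

out=$( { sed 's/^/1 /' "$tmp/file1"; sed 's/^/2 /' "$tmp/file2"; } |
    awk '{
        file = $1               # the number that sed prepended
        sub(/^[^ ]* /, "")      # drop the prefix; $0 is the original line again
        print "[" file "] " $0
    }' )

printf '%s\n' "$out"
rm -rf "$tmp"
```

Because the marker is removed before any further processing, data lines that themselves start with a number are not a problem here.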

I'm still voting for @fedorqui's solution though :-)
