
I was trying to understand this answer on merging two files with awk, and I came up with the following solution for a requirement of my own.

awk 'FNR==NR{a[$1]=$2 FS $3;next} {a[$1]=$2 FS $3}{ print a[$1]}' file2 file1

The contents of file1 and file2 are as follows.

file1:

1 xyz pqr F -
1 abc def A -
1 abc mno G -

file2:

1 abc def A
1 xyz pqr T

I expect the following output:

1 xyz pqr F - T
1 abc def A - A

Basically, I want to match columns 1, 2 and 3 of file2 against file1 and append the last column of file2 to each matching line of file1.

My understanding of my solution is as follows:

  1. FNR==NR{a[$1]=$2 FS $3;next} runs on file2 and stores column 2, a space, and column 3 in the array a, keyed by column 1, until the end of file2.
  2. On file1, {a[$1]=$2 FS $3} should match those rows from file2, giving me all the rows in file1 whose a[$1] value equals column 2, a space, and column 3. This is where the problem comes in.
  3. Having matched them in file1, I don't know how to print the values as expected. I tried printing $0 and a[$1], and they give me the outputs

1 xyz pqr F -
1 abc def A -

xyz pqr
abc def

respectively. My biggest concern is that, since I did not capture the last column of file2 during the FNR==NR pass, its value may not be stored in my array. Or is it stored?

  • Wouldn't it be best to say awk 'FNR==NR{a[$1 FS $2 FS $3]=$4;next} (($1 FS $2 FS $3) in a) {print $0, a[$1 FS $2 FS $3]}' f2 f1? Commented Jun 14, 2016 at 11:39
  • @fedorqui: I missed the a[...]=$4 logic; that would have helped me! Please post it as an answer so that it will be useful for reference. Commented Jun 14, 2016 at 11:48

1 Answer


Use this awk command:

awk 'NR==FNR{a[$2 FS $3]=$4; next} $2 FS $3 in a{print $0, a[$2 FS $3]}' file2 file1

There are two issues in your awk command.

  • Your main concern is $4 from file2, but you never stored it.
  • While reading file1, you reassign the array a with values from file1 (here: a[$1]=$2 FS $3), overwriting whatever was stored from file2.
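
The first point can be checked directly. The sketch below assumes the sample file2 from the question, recreated in a temporary directory (the directory and the variable name stored are illustrative, not part of the original command), and dumps what the original first-pass action leaves in the array:

```shell
# Recreate the question's sample file2 in a scratch directory (illustrative setup).
dir=$(mktemp -d)
printf '%s\n' '1 abc def A' '1 xyz pqr T' > "$dir/file2"

# Same action as the original FNR==NR block, followed by a dump of the array.
stored=$(awk '{a[$1]=$2 FS $3} END{for (k in a) print k " -> " a[k]}' "$dir/file2")
echo "$stored"

rm -rf "$dir"
```

On this data it should print 1 -> xyz pqr. There is a single entry because every row has the same $1, so the second row overwrites the first, and column 4 appears nowhere in the array.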

As suggested by @EdMorton, a more readable form:

awk '{k=$2 FS $3} NR==FNR{a[k]=$4; next} k in a{print $0, a[k]}' file2 file1
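
As a quick check, the readable form can be run against the sample data from the question. This is only a sketch: the files are recreated in a temporary directory, and the names dir and result are illustrative.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
dir=$(mktemp -d)
printf '%s\n' '1 xyz pqr F -' '1 abc def A -' '1 abc mno G -' > "$dir/file1"
printf '%s\n' '1 abc def A' '1 xyz pqr T' > "$dir/file2"

# First pass (file2): store $4 under the key "$2 $3".
# Second pass (file1): print only lines whose key was seen, with the stored value appended.
result=$(awk '{k=$2 FS $3} NR==FNR{a[k]=$4; next} k in a{print $0, a[k]}' \
    "$dir/file2" "$dir/file1")
echo "$result"

rm -rf "$dir"
```

With these inputs the command should print the two lines the question expects, 1 xyz pqr F - T and 1 abc def A - A. The third line of file1 is dropped because the key abc mno never occurs in file2.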

2 Comments

That is an excellent answer! I will of course accept it; I am just waiting to see whether an even more efficient approach than mine turns up.
@EdMorton: Please post it as an answer for future reference!
