Hi, I have an awk command that combines two files sharing the same key.

awk -v OFS='\t' '
NR==1   { print $0, "Column4", "Column5"; next }
NR==FNR { a[$1]=$0; next}
$1 in a { print a[$1], $2, $3 }
' $1 $2 > $3

This returns only one row per key from each file. For example, given the files below:

File 1

Key    Column1  Column2  Column3  
Test1    500     400     200               
Test1    499     400     200               
Test1    499     399     200               
Test1    498     100     100               
Test2    600     200     150               
Test2    600     199     150               
Test2    599     199     100               

File 2

Test1    Good     Good                    
Test2    Good     Good

the result is

Key    Column1  Column2  Column3  Column4  Column5
Test1    500     400     200       Good      Good   
Test2    600     200     150       Good      Good

but I want every row to be combined, like below.

Key    Column1  Column2  Column3  Column4  Column5
Test1    500     400     200       Good      Good            
Test1    499     400     200       Good      Good             
Test1    499     399     200       Good      Good             
Test1    498     100     100       Good      Good             
Test2    600     200     150       Good      Good             
Test2    600     199     150       Good      Good              
Test2    599     199     100       Good      Good           

Does anyone have an idea how to change the logic simply, using awk? Thank you!

  • I think you are missing ; next in that NR==FNR block. Commented Apr 24, 2015 at 17:42
  • @EtanReisner Yes I modified thanks! Commented Apr 24, 2015 at 17:43

1 Answer

I think you're looking for

join file1 file2
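
One caveat: join only prints lines whose key appears in both files, so a header row that exists only in the first file is dropped. A minimal sketch of one way around that, assuming both files are already sorted on the key and only the first file has a header (the scratch file names and the trimmed sample data are made up for illustration):

```shell
# Hypothetical scratch copies of the sample data (file1 has a header, file2 does not).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Key Column1 Column2 Column3' \
  'Test1 500 400 200' \
  'Test1 499 400 200' \
  'Test2 600 200 150' > "$tmpdir/file1"
printf '%s\n' \
  'Test1 Good Good' \
  'Test2 Good Good' > "$tmpdir/file2"

# Print the extended header by hand, then join only the data rows.
# join reads the header-less data rows from stdin ("-").
joined=$(
  printf '%s %s %s\n' "$(head -n 1 "$tmpdir/file1")" Column4 Column5
  tail -n +2 "$tmpdir/file1" | join - "$tmpdir/file2"
)

printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

Note that join separates output fields with a single space by default, not a tab.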

If you insist on doing it with awk, a good way would be to process the files the other way around, so that you have the parts you want to add ready when you process the main file:

awk -v OFS='\t' '
  FNR == NR { a[$1] = $2 OFS $3; next }
  { $1 = $1 }
  FNR ==  1 { print $0, "Column4", "Column5" }
  FNR !=  1 { print $0, a[$1] }
  ' "$2" "$1" > "$3"
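
To try this outside a script, here is a self-contained sketch that runs the same awk program on a trimmed copy of the sample data. The scratch files and the variable names main, lookup and merged are placeholders, not part of the original command:

```shell
# Hypothetical scratch files holding part of the sample data from the question.
tmpdir=$(mktemp -d)
main="$tmpdir/file1.txt"     # header row, keys repeat
lookup="$tmpdir/file2.txt"   # no header, one row per key

printf '%s\n' \
  'Key    Column1  Column2  Column3' \
  'Test1    500     400     200' \
  'Test1    499     400     200' \
  'Test2    600     200     150' > "$main"
printf '%s\n' \
  'Test1    Good     Good' \
  'Test2    Good     Good' > "$lookup"

# The lookup file is read first (FNR == NR), so its columns are already
# stored in a[] when each row of the main file is printed.
merged=$(awk -v OFS='\t' '
  FNR == NR { a[$1] = $2 OFS $3; next }
  { $1 = $1 }
  FNR == 1 { print $0, "Column4", "Column5" }
  FNR != 1 { print $0, a[$1] }
' "$lookup" "$main")

printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

Every row of the main file is printed once, so repeated keys such as Test1 each get their own output line.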

EDIT: @EtanReisner suggested adding { $1 = $1 }. Assigning to a field forces awk to rebuild the line from its fields, so that input split by a mixture of spaces and tabs comes out uniformly separated by OFS (a tab in this case). If the data is already tab-separated, it is not necessary (but doesn't hurt).
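
A small sketch of the difference, using one space-separated sample row (the variable names line, untouched and rebuilt are just for illustration):

```shell
line='Test1    500     400     200'

# Without the assignment, $0 keeps its original spacing and only the
# appended field is separated by OFS.
untouched=$(printf '%s\n' "$line" | awk -v OFS='\t' '{ print $0, "Good" }')

# With $1 = $1, awk rebuilds $0 and joins all fields with OFS.
rebuilt=$(printf '%s\n' "$line" | awk -v OFS='\t' '{ $1 = $1; print $0, "Good" }')

printf '%s\n' "$untouched" "$rebuilt"
```

The first line printed still contains the original runs of spaces followed by one tab; the second is tab-separated throughout.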

7 Comments

Thank you! But if I use join, the header is deleted, and I want to know how to do it with awk.
Sorry, the second file does not have a header.
@clear.choi This pulls the header from the second file not the first. Notice FNR==1 is after the FNR==NR block.
This is almost verbatim what I was about to post as an answer. My only change was $1=$1 in each block to get columns uniformly split by tabs.
@EtanReisner That's not a bad idea, to be on the safe side. I kind of just assumed that the input data was already tab-delimited.