Combine (Merge) Multiple Rows using AWK

Question

Hi, I have an AWK command that combines two files that share the same key.

awk -v OFS='\t' '
NR==1   { print $0, "Column4", "Column5"; next }
NR==FNR { a[$1]=$0; next}
$1 in a { print a[$1], $2, $3 }
' $1 $2 > $3

This returns only one row per key from each file. For example, given the files below:

File 1

Key    Column1  Column2  Column3  
Test1    500     400     200               
Test1    499     400     200               
Test1    499     399     200               
Test1    498     100     100               
Test2    600     200     150               
Test2    600     199     150               
Test2    599     199     100

File 2

Test1    Good     Good                    
Test2    Good     Good

Then the result will be

Key    Column1  Column2  Column3  Column4  Column5
Test1    500     400     200       Good      Good   
Test2    600     200     150       Good      Good

But I want every row to be combined, like below.

Key    Column1  Column2  Column3  Column4  Column5
Test1    500     400     200       Good      Good            
Test1    499     400     200       Good      Good             
Test1    499     399     200       Good      Good             
Test1    498     100     100       Good      Good             
Test2    600     200     150       Good      Good             
Test2    600     199     150       Good      Good              
Test2    599     199     100       Good      Good

Does anyone have an idea how to change the logic simply, using AWK? Thank you!

I think you are missing ; next in that NR==FNR block.

– Etan Reisner, Commented Apr 24, 2015 at 17:42
@EtanReisner Yes I modified thanks!

– clear.choi, Commented Apr 24, 2015 at 17:43

Wintermute · Accepted Answer · 2015-04-24 17:54:18Z

2

I think you're looking for

join file1 file2
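As a rough sketch of what join does with data shaped like the question's (the temp directory, file names, and the shortened sample rows are made up for this demo): join pairs lines on the first field and expects both inputs to be sorted on it. Because the second file has no header row, the "Key" line has no partner and is not printed.

```shell
# Sketch only: tmp, file1 and file2 are illustrative names.
tmp=$(mktemp -d)

# file1 has a header row; file2 does not (as in the question).
printf '%s\n' \
  'Key Column1 Column2 Column3' \
  'Test1 500 400 200' \
  'Test1 499 400 200' \
  'Test2 600 200 150' > "$tmp/file1"
printf '%s\n' \
  'Test1 Good Good' \
  'Test2 Good Good' > "$tmp/file2"

# Every file1 line whose key also appears in file2 is printed once per match.
# LC_ALL=C keeps the sort-order check independent of the user's locale.
joined=$(LC_ALL=C join "$tmp/file1" "$tmp/file2")
printf '%s\n' "$joined"

rm -r "$tmp"
```

The output fields are separated by single spaces by default, and the header line is missing from the result.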

If you insist on doing it with awk, a good way would be to process the files the other way around, so that you have the parts you want to add ready when you process the main file:

awk -v OFS='\t' '
  FNR == NR { a[$1] = $2 OFS $3; next }
  { $1 = $1 }
  FNR ==  1 { print $0, "Column4", "Column5" }
  FNR !=  1 { print $0, a[$1] }
  ' "$2" "$1" > "$3"
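To try the script above without a wrapper that supplies "$1", "$2" and "$3", here is a self-contained sketch. The file names main.txt and extra.txt, the temp directory, and the shortened sample rows are assumptions for this demo; the awk logic is the same, with the two header/data conditions written as a `next` plus a default rule.

```shell
# Sketch only: tmp, main.txt and extra.txt are illustrative names.
tmp=$(mktemp -d)

printf '%s\n' \
  'Key    Column1  Column2  Column3' \
  'Test1    500     400     200' \
  'Test1    499     400     200' \
  'Test2    600     200     150' > "$tmp/main.txt"
printf '%s\n' \
  'Test1    Good     Good' \
  'Test2    Good     Good' > "$tmp/extra.txt"

merged=$(awk -v OFS='\t' '
  # First file on the command line: remember the extra columns by key.
  FNR == NR { a[$1] = $2 OFS $3; next }
  # Second file: rebuild the record so its fields are joined by OFS.
  { $1 = $1 }
  # Header row of the second file: append the new column names.
  FNR == 1 { print $0, "Column4", "Column5"; next }
  # Data rows: append whatever was stored for this key.
  { print $0, a[$1] }
' "$tmp/extra.txt" "$tmp/main.txt")
printf '%s\n' "$merged"

rm -r "$tmp"
```

Every row of main.txt is printed, so duplicate keys are kept. A key that is missing from extra.txt would get an empty a[$1] and therefore a trailing tab.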

EDIT: @EtanReisner suggested adding { $1 = $1 }. Its purpose is to force awk to rebuild the line from its fields, so that input split by a mixture of whitespace characters comes out uniformly separated by OFS (a tab in this case). If the data is already tab-separated, it is not necessary (but doesn't hurt).
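A minimal illustration of the { $1 = $1 } effect, using a made-up input line and a comma as OFS so the difference is easy to see:

```shell
# Input mixes runs of spaces and a tab between fields (illustrative data).
line=$(printf 'Test1    500 \t 400')

# Without assigning to a field, $0 is printed exactly as it was read.
raw=$(printf '%s\n' "$line" | awk -v OFS=',' '{ print }')

# Assigning to any field makes awk rebuild $0 from its fields, joined by OFS.
rebuilt=$(printf '%s\n' "$line" | awk -v OFS=',' '{ $1 = $1; print }')

printf 'raw:     %s\n' "$raw"
printf 'rebuilt: %s\n' "$rebuilt"
```

The rebuilt line is Test1,500,400, while the raw line keeps its original spacing.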

edited Apr 24, 2015 at 17:54

answered Apr 24, 2015 at 17:37


7 Comments

clear.choi Over a year ago

Thank you! But if I use join, the header is dropped, and I would like to know how to do it with awk.

clear.choi Over a year ago

Sorry, the second file does not have a header.

Etan Reisner Over a year ago

@clear.choi This pulls the header from the second file not the first. Notice FNR==1 is after the FNR==NR block.

Etan Reisner Over a year ago

This is almost verbatim what I was about to post as an answer. My only change was $1=$1 in each block to get columns uniformly split by tabs.

Wintermute Over a year ago

@EtanReisner That's not a bad idea, to be on the safe side. I kind of just assumed that the input data was already tab-delimited.
