Extracting columns from multiple files into a single output file from the command line

Question

Say I have a tab-delimited data file with 10 columns. With awk, it's easy to extract column 7, for example, and output that into a separate file. (See this question, for example.)

What if I have 5 such data files, and I would like to extract column 7 from each of them and make a new file with 5 data columns, one for the column 7 of each input file? Can this be done from the command line with awk and other commands?

Or should I just write up a Python script to handle it?

nu11p01n73R · Accepted Answer · 2014-10-13 15:05:43Z

1

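 awk '{a[FNR] = a[FNR]" " $7}END{for(i=1;i<=FNR;i++) print a[i]}' file1 file2 file3 file4 file5

The array a holds one output row per line number: a[FNR] accumulates column 7 of line FNR from each input file.

FNR is the number of records read so far from the current input file. It is reset at the beginning of each file, so the first record of every file has FNR equal to 1.

END{for(i=1;i<=FNR;i++) print a[i]} prints the contents of array a once all input files have been read. Inside END, FNR still holds the line count of the last file, so this assumes every file has the same number of lines.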
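As a quick check of this approach, the sketch below builds two small tab-delimited files and collects column 7 from each. The file names and contents are made up for illustration. Because FNR is 1 for the first record of a file, the loop runs from 1 to FNR. The -F'\t' option is an addition here, assumed to be appropriate because the question describes tab-delimited data; without it, a field containing spaces would be split.

```shell
#!/bin/sh
# Hypothetical sample data: two tab-delimited files with 7 columns each.
tmp=$(mktemp -d)
printf 'a1\ta2\ta3\ta4\ta5\ta6\ta7\nb1\tb2\tb3\tb4\tb5\tb6\tb7\n' > "$tmp/f1.tsv"
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\nd1\td2\td3\td4\td5\td6\td7\n' > "$tmp/f2.tsv"

# a[FNR] collects column 7 of line FNR from every file, in file order.
# In END, FNR is the line count of the last file read.
result=$(awk -F'\t' '{ a[FNR] = a[FNR] " " $7 }
     END { for (i = 1; i <= FNR; i++) print a[i] }' "$tmp/f1.tsv" "$tmp/f2.tsv")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Each output line starts with a space, because the separator is appended before every value, including the first.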

Etan Reisner · Accepted Answer · 2014-10-13 15:05:01Z

0

If the data is small enough to store it all in memory then this should work:

awk '{out[FNR]=out[FNR] (out[FNR]?OFS:"") $7; max=(FNR>max)?FNR:max} END {for (i=1; i<=max; i++) {print out[i]}}' file1 file2 file3 file4 file5

If it isn't then you would need something fancier which could seek around file streams or read single lines from multiple files (a shell loop with N calls to read could do this).
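The shell loop mentioned above might look roughly like the following sketch, shown for two files rather than five. It is one possible reading of that suggestion, not the answerer's own code. The helper field7, the variable names, and the sample files are all hypothetical. It assumes every line has at least 7 tab-separated fields and that the files have the same number of lines; the loop stops as soon as either file runs out.

```shell
#!/bin/sh
# Sketch: read one line at a time from each file, so memory use stays constant.
tab=$(printf '\t')

# Set $col7 to the 7th tab-separated field of "$1" (assumes >= 7 fields).
field7() {
    col7=$1
    n=1
    while [ "$n" -lt 7 ]; do
        col7=${col7#*"$tab"}   # drop the leading field and its tab
        n=$((n + 1))
    done
    col7=${col7%%"$tab"*}      # drop everything after the 7th field
}

# Hypothetical sample data: two tab-delimited files with 10 columns each.
tmp=$(mktemp -d)
printf '1\t2\t3\t4\t5\t6\tx1\t8\t9\t10\n1\t2\t3\t4\t5\t6\tx2\t8\t9\t10\n' > "$tmp/file1"
printf '1\t2\t3\t4\t5\t6\ty1\t8\t9\t10\n1\t2\t3\t4\t5\t6\ty2\t8\t9\t10\n' > "$tmp/file2"

# File descriptors 3 and 4 each stay positioned at the next unread line,
# so every iteration consumes exactly one line from each file.
result=$(
    while IFS= read -r l1 <&3 && IFS= read -r l2 <&4; do
        field7 "$l1"; c1=$col7
        field7 "$l2"; c2=$col7
        printf '%s\t%s\n' "$c1" "$c2"
    done 3< "$tmp/file1" 4< "$tmp/file2"
)

printf '%s\n' "$result"
rm -rf "$tmp"
```

Extending this to five files means adding three more read calls and file descriptors. A pure shell loop like this is usually much slower than awk on large inputs, so it is mainly worth considering when the data really does not fit in memory.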
