3

File 1

001 00A 892 J27
002 00G 742 M65
003 00B 934 B32
004 00J 876 K57
005 00k 543 N21

File 2 has 1,628,433 columns. I would like to insert all four columns from file 1 after column one of this file.

a 2 T ..........
b 3 C ..........
c 4 G ..........
d 5 A ..........
e 6 B ..........

Desired output

a 001 00A 892 J27 2 T ..........
b 002 00G 742 M65 3 C ..........
c 003 00B 934 B32 4 G ..........
d 004 00J 876 K57 5 A ..........
e 005 00k 543 N21 6 B ..........

I tried the following:

awk 'NR==FNR{a[FNR]=$1,$2,$3,$4} {print $1,a[FNR],$5}' file2 file1
4
  • 3
    Is the first output line supposed to be a 001 00A 892 J27 or a 001 00A 892 J27 2 T .... Commented Jul 12, 2021 at 18:50
  • This works, but the output file is not tab-delimited as files 1 & 2 are. Commented Jul 14, 2021 at 20:29
  • That's a surprising comment: there's nowhere in your question where you say "tab delimited". Commented Jul 14, 2021 at 20:48
  • @glennjackman Sorry, I missed your first question. The first output line is supposed to be the second option you presented: 'a 001 00A 892 J27 2 T ....'. Commented Jul 14, 2021 at 21:32

5 Answers 5

4

This version is lighter on memory: it only reads one line at a time from each file:

awk '{getline f1 < "file1"; $1 = $1 OFS f1; print}' file2
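
One caveat with this approach: when file1 runs out of lines, `getline` returns 0 and leaves `f1` unchanged, so a short file1 would silently repeat its last row. The following is a minimal sketch of a guarded variant, assuming tab-delimited input (as clarified in the comments). The sample data, the scratch directory, and the `f` and `result` variable names are illustrative and not part of the original answer.

```shell
#!/bin/sh
# Build small tab-delimited sample files in a scratch directory (illustrative data).
tmpdir=$(mktemp -d)
printf '001\t00A\t892\tJ27\n002\t00G\t742\tM65\n' > "$tmpdir/file1"
printf 'a\t2\tT\nb\t3\tC\n' > "$tmpdir/file2"

# Read one line of file1 per line of file2; stop with an error if file1 is exhausted.
result=$(awk -F'\t' -v OFS='\t' -v f="$tmpdir/file1" '
    {
        if ((getline f1 < f) <= 0) {
            print "file1 has fewer lines than file2" > "/dev/stderr"
            exit 1
        }
        $1 = $1 OFS f1   # splice the file1 row in after column one
        print
    }' "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Passing the file name in with `-v` keeps the awk program itself free of hard-coded paths. Memory use stays at one line per file, as in the original command.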

2 Comments

This is almost correct, but both input files are tab-delimited and the desired output file is tab-delimited as well. Sorry, I forgot to mention it in my question. I tried adding the tab delimiter to your command, but it still is not working: awk '{getline f1 < "file1"; $1 = $1 OFS="/t" f1; print}' file2
use awk -v OFS='\t' '...
4

With your shown samples, please try the following awk code.

awk 'FNR==NR{arr[FNR]=$0;next} {$1=$1 OFS arr[FNR]} 1' file1 file2

Explanation: The FNR==NR condition is true only while the first file on the command line (file1) is being read. During that pass, each whole line is stored in the array arr, indexed by its line number. While file2 is being read, the stored file1 line for the current line number is appended to the first field, and the trailing 1 prints the modified line.
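
To see why the FNR==NR test singles out the first file, it can help to print both counters for every input line. This is a small sketch with made-up two-line files; the file contents and the `trace` variable are illustrative only.

```shell
#!/bin/sh
# Two tiny sample files in a scratch directory (illustrative data).
tmpdir=$(mktemp -d)
printf '001\n002\n' > "$tmpdir/file1"
printf 'a\nb\n' > "$tmpdir/file2"

# NR counts lines across all files; FNR restarts at 1 for each file.
# They are equal only while the first file is being read.
trace=$(cd "$tmpdir" && awk '{
    print FILENAME, NR, FNR, (FNR == NR ? "first file" : "later file")
}' file1 file2)

printf '%s\n' "$trace"
rm -rf "$tmpdir"
```

One known limitation of the idiom: if the first file is empty, FNR==NR is also true while the second file is read, so the two passes are not distinguished.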

4
$ paste -d' ' <(cut -d' ' -f1 file2) file1 <(cut -d' ' -f2- file2)
a 001 00A 892 J27 2 T ..........
b 002 00G 742 M65 3 C ..........
c 003 00B 934 B32 4 G ..........
d 004 00J 876 K57 5 A ..........
e 005 00k 543 N21 6 B ..........
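
For tab-delimited files (as clarified in the comments), the `-d' '` options can simply be dropped, since both `cut` and `paste` use a tab as their default delimiter. The sketch below assumes tab-delimited sample data and uses intermediate files rather than process substitution so that it also runs in a plain POSIX `sh`. The file names `col1` and `rest` and the `result` variable are illustrative.

```shell
#!/bin/sh
# Tab-delimited sample files in a scratch directory (illustrative data).
tmpdir=$(mktemp -d)
printf '001\t00A\t892\tJ27\n002\t00G\t742\tM65\n' > "$tmpdir/file1"
printf 'a\t2\tT\nb\t3\tC\n' > "$tmpdir/file2"

# Split file2 into its first column and everything after it.
cut -f1  "$tmpdir/file2" > "$tmpdir/col1"
cut -f2- "$tmpdir/file2" > "$tmpdir/rest"

# Join the three pieces line by line; paste separates them with tabs by default.
result=$(paste "$tmpdir/col1" "$tmpdir/file1" "$tmpdir/rest")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

In bash, the process-substitution form shown above should work the same way once `-d' '` is removed, and it avoids writing a second copy of file2 to disk.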

2

Here is a Python version that processes the input files one line at a time:

python3 -c '
import sys
with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
    for l1, l2 in zip(f1,f2):
        lf1,lf2=map(str.split, [l1,l2])
        print(" ".join([lf2[0]]+lf1+lf2[1:]))
' file1 file2 

2
awk -F'\t' -v OFS="\t" '{getline f1 < "file1"; $1 = $1 OFS f1; print}' file2
