I have a very large text file with hundreds of columns. I want to add a header to every column from an independent text file containing a list.

My large file looks like this:

largefile.txt
chrom start end 0 1 0 1 0 0 0 etc
chrom start end 0 0 0 0 1 1 1 etc
chrom start end 0 0 0 1 1 1 1 etc

My list of headers:

headers.txt
h1
h2
h3

Wanted output:

output.txt
                h1 h2 h3 h4 h5 h6 h7 etc..
chrom start end 0 1 0 1 0 0 0 etc
chrom start end 0 0 0 0 1 1 1 etc
chrom start end 0 0 0 1 1 1 1 etc
Comment: So the headers should be added starting from the 4th field, right? (Apr 27, 2017 at 13:23)

3 Answers

1
$ awk 'NR==FNR{h=h OFS $0; next} FNR==1{print OFS OFS h} 1' head large | column -s ' ' -t
                   h1  h2  h3
chrom  start  end  0   1   0   1  0  0  0  etc
chrom  start  end  0   0   0   0  1  1  1  etc
chrom  start  end  0   0   0   1  1  1  1  etc

or if you prefer:

$ awk -v OFS='\t' 'NR==FNR{h=h OFS $0; next} FNR==1{print OFS OFS h} {$1=$1}1' head large
                        h1      h2      h3
chrom   start   end     0       1       0       1       0       0       0       etc
chrom   start   end     0       0       0       0       1       1       1       etc
chrom   start   end     0       0       0       1       1       1       1       etc
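
For readers who want to see why the header lands in the 4th field, here is a minimal sketch of the same idea spread over several lines. The scratch directory `hdr_demo`, the shortened sample rows, and the `skip` variable are assumptions added for illustration; they are not part of the answer above.

```shell
# Hypothetical sample data in a scratch directory (hdr_demo is an assumed name)
mkdir -p hdr_demo
printf '%s\n' h1 h2 h3 > hdr_demo/headers.txt
printf '%s\n' 'chrom start end 0 1 0' 'chrom start end 0 0 0' > hdr_demo/largefile.txt

# skip = number of leading columns that get no header (3 in the question)
awk -v OFS='\t' -v skip=3 '
NR==FNR { h = h OFS $0; next }      # 1st file: append each header, preceded by OFS
FNR==1  {                           # before the first data line of the 2nd file
    for (i = 1; i < skip; i++)      # h already starts with one OFS,
        printf "%s", OFS            # so emit skip-1 more separators
    print h
}
{ $1 = $1 } 1                       # rebuild the record with OFS, then print it
' hdr_demo/headers.txt hdr_demo/largefile.txt > hdr_demo/output.txt

cat hdr_demo/output.txt
```

Because `h` is built with a leading `OFS`, the header line ends up with `skip` separators before `h1`, which puts `h1` in field `skip+1`. Note that the `NR==FNR` test assumes the headers file is not empty.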
5 Comments

yes, I also ended up with similar awk 'NR==FNR{l=l FS $1; next}FNR==1{ printf("%s%s%s%s\n", FS,FS,FS,l); print}1' headers.txt largefile.txt
Never use the letter l as a variable name as it looks far too much like the number 1 (completely indistinguishable in some fonts) and so obfuscates your code (e.g. at a glance we can't tell if FNR==1 is testing for FNR equal to the string you created and stored in the variable l or to the number 1, and is that a test for l or 1 at the end, etc...?). I spent 3 hours helping someone debug a script once where the error turned out to be that their typing teacher had told them to use l instead of 1 as it's faster to type and looks the same - Grrrr...
I didn't post it, so, no fear
Oh nice move that OFS in the NR==FNR{}. Why didn't I think of that. ^
thanks @EdMorton that worked perfectly for my data. Appreciate the help!
1

Well, here's one. OFS is tab for eye candy. From the OP I concluded that the headers should start from the fourth field, hence the +3 in the code.

$ awk -v OFS="\t" '               # tab OFS
NR==FNR { a[NR]=$1; n=NR; next }  # hash headers
FNR==1  {                         # print headers in the beginning of 2nd file
    $1=$1                         # rebuild record for tabs
    b=$0                          # buffer record
    $0=""                         # clear record
    for(i=1;i<=n;i++)             # spread head to fields
        $(i+3)=a[i]
    print $0 ORS b                # output head and buffered first record
    next                          # $0 now holds the head, so skip the last rule
}
{ $1=$1 }1' head data             # implicit print with record rebuild
                        h1      h2      h3
chrom   start   end     0       1       0       1       0       0       0       etc
chrom   start   end     0       0       0       0       1       1       1       etc
chrom   start   end     0       0       0       1       1       1       1       etc

Then again, this would also do the trick:

$ awk 'NR==FNR{h=h (NR==1?"":OFS) $0;next}FNR==1{print OFS OFS OFS h}1' head data
   h1 h2 h3
chrom start end 0 1 0 1 0 0 0 etc
chrom start end 0 0 0 0 1 1 1 etc
chrom start end 0 0 0 1 1 1 1 etc

0

Use paste to pivot the headers into a single line and then cat them together with the main file (- instead of a file name means stdin to cat):

$ paste -s -d' ' headers.txt | cat - largefile.txt

If you really need the headers to line up as in your example output you can preprocess (either manually or with a command) the headers file, or you can finish with sed (for just one option) as below:

$ paste -s -d' ' headers.txt | cat - largefile.txt | sed '1 s/^/                /'
                h1 h2 h3
chrom start end 0 1 0 1 0 0 0 etc
chrom start end 0 0 0 0 1 1 1 etc
chrom start end 0 0 0 1 1 1 1 etc
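
As one way to avoid hard-coding the 16 spaces, the padding can be computed from the first data line. This is only a sketch: it assumes the fields are separated by single spaces, and the scratch directory `hdr_demo`, the shortened sample rows, and the `pad` variable are made up for illustration.

```shell
# Hypothetical sample data in a scratch directory (hdr_demo is an assumed name)
mkdir -p hdr_demo
printf '%s\n' h1 h2 h3 > hdr_demo/headers.txt
printf '%s\n' 'chrom start end 0 1 0' 'chrom start end 0 0 0' > hdr_demo/largefile.txt

# Width of the first three fields plus one trailing space each
pad=$(awk 'NR==1 { print length($1) + length($2) + length($3) + 3; exit }' hdr_demo/largefile.txt)

{
    printf "%${pad}s" ''                    # emit $pad spaces, no newline
    paste -s -d' ' hdr_demo/headers.txt     # headers joined onto one line
    cat hdr_demo/largefile.txt              # data follows unchanged
} > hdr_demo/output.txt

cat hdr_demo/output.txt
```

The alignment only holds for rows whose first three fields are as wide as those on the first line, which is the case in the question's sample.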
