I am reading a file and checking whether the values in columns 1 and 8 match certain criteria. If both match, the line is written to file x. If column 1 matches but column 8 does not, the line is written to file y.
I am defining multiple variables (awk -v v=$var,f1=$file,f2=$output), and I believe the problem is how I reference f1 and f2. If I use the variable name without quotes:
print $0 >> f2
awk: cmd. line:5: (FILENAME=- FNR=2) fatal: expression for `>>' redirection has null string value
If I prefix the name with a $:
print $0 >> $f2
I end up with a bunch of files with odd names that I don't want, and the files I do want are empty (except for the echoed line).
If I put the name in double quotes:
print $0 >> "f2"
The files I want are almost empty, and awk creates a file literally named f2.
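The three symptoms above are consistent with awk treating the whole comma-separated string as the value of v, which leaves f1 and f2 unset. A minimal sketch to check this, using literal values in place of $var, $file and $output (the bracketed printf format is only for illustration):

```shell
#!/bin/bash
# A single -v takes exactly one assignment; the commas become part of the value of v.
result=$(awk -v v=A,f1=A.txt,f2=output.txt \
    'BEGIN { printf "v=[%s] f1=[%s] f2=[%s]\n", v, f1, f2 }')
echo "$result"
# prints: v=[A,f1=A.txt,f2=output.txt] f1=[] f2=[]
```

With f2 empty, `>> f2` has a null string as its target (the fatal error), `$f2` most likely evaluates to `$0` so each line is written to a file named after the line itself, and `"f2"` is simply the literal file name f2.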
#!/bin/bash
output="output.txt"
echo -e "C1\tSeqID\tAminoAcid\tCD1\tCD2\tCD3\tGene\tEnvironment\tFilename" > $output
inputFile="input.txt.gz"
for var in A B C D E F G H I J K L
do
    file=$var".txt"
    echo -e "C1\tSeqID\tAA\tCD1\tCD2\tCD3\tGene\tEnvironment\tFilename" > $file
    #---Wrong, forgot to catch $8 != v
    #zcat $inputFile | awk -v v=$var '{
    #    if ($8 == v && ($1 == "V1" || $1 == "V2" || $1 == "V3" || $1 == "V4" || $1 == "V5" || $1 == "V6" || $1 == "V7" || $1 == "V8" || $1 == "V9" || $1 == "V10"))
    #        print $0
    #    }' | tee -a $file $output
    zcat $inputFile | awk -v v=$var,f1=$file,f2=$output '{
        if ($8 == v && ($1 == "V1" || $1 == "V2" || $1 == "V3" || $1 == "V4" || $1 == "V5" || $1 == "V6" || $1 == "V7" || $1 == "V8" || $1 == "V9" || $1 == "V10"))
            print $0 >> "file"
        else if ($8 != v && ($1 == "V1" || $1 == "V2" || $1 == "V3" || $1 == "V4" || $1 == "V5" || $1 == "V6" || $1 == "V7" || $1 == "V8" || $1 == "V9" || $1 == "V10"))
            print $0 >> "f2"
    }'
    gzip $file
done
gzip $output
I could run through the loop with two separate awk commands that write to different files. However, the input is very large (4 GB compressed), so a single pass like my current approach (or something similar) is more efficient. Any guidance on how to reference the 2nd and 3rd variables is greatly appreciated.
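For reference, a minimal sketch of one way to pass and reference several awk variables: give each its own -v flag, and use the names bare inside the program (no $ and no quotes). The sample rows, the temporary directory, and the regex standing in for the V1..V10 comparisons are assumptions made for illustration; the tab field separator is also assumed from the header lines.

```shell
#!/bin/bash
# Hypothetical tab-separated sample rows; column 1 is the ID, column 8 the environment.
workdir=$(mktemp -d)
printf 'V1\ts1\tx\tx\tx\tx\tg\tA\tn1\n'  > "$workdir/input.txt"
printf 'V2\ts2\tx\tx\tx\tx\tg\tB\tn2\n' >> "$workdir/input.txt"
printf 'X9\ts3\tx\tx\tx\tx\tg\tA\tn3\n' >> "$workdir/input.txt"

var="A"
file="$workdir/$var.txt"
output="$workdir/output.txt"

# One -v per variable. Inside awk, f1 and f2 are plain variables holding file names.
awk -F'\t' -v v="$var" -v f1="$file" -v f2="$output" '
    $1 ~ /^V([1-9]|10)$/ {      # column 1 is one of V1..V10
        if ($8 == v)
            print >> f1         # column 8 matches: per-variable file
        else
            print >> f2         # column 8 differs: shared output file
    }' "$workdir/input.txt"

cat "$file"      # the V1 row
cat "$output"    # the V2 row
```

In the original script the same awk invocation would read from `zcat $inputFile` instead of the sample file. The X9 row is written nowhere because column 1 does not match.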