I have two files such as:

File_1

c1,c2,c3,c4

File_2

c1,c3,c2,c4

DA,CA,DD,CD

I want to build a File_3 from File_2, using File_1 as the model, in Bash:

File_3

c1,c2,c3,c4

DA,DD,CA,CD

In this example, File_1 is the model for the correct column order, and File_2 holds the columns and their data, but in the wrong order. File_3 therefore uses File_1 as a template and rearranges the data from File_2 into the correct order.

The example has only 4 columns, but my real file has 402. So something like

awk -F"," '{print $1","$3","$2","$4}' File_2

will not work, because I don't know where the items of File_1 are located in File_2 (for example, the c1 column could be in the sixth, the second, or the last position in File_2).

I hope you can help me do this in Bash (if possible). I would also appreciate a short explanation of the script, because I am a beginner and do not know many of the commands.

Thanks in advance.

2 Answers

You can make a header index mapping like this:

File_2  =>  File_1
------      ------
1       =>  1
2       =>  3
3       =>  2
4       =>  4

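awk -F, '
    # File_1: remember the target position of every column name.
    FNR==NR{
        for(i=1;i<=NF;i++)
            a[$i]=i
        print
        nextfile
    }
    # Header of File_2: b[p] is the File_2 field that belongs at position p.
    FNR==1{
        for(j=1;j<=NF;j++)
            b[a[$j]]=j
        next
    }
    # Data lines: print the fields in the order given by File_1.
    {
        for(k=1;k<=NF;k++)
            printf( "%s%s",$b[k], k==NF?"\n":",")
    }
' File_{1,2}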

Note: this command assumes that File_1 and File_2 contain no empty lines.
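
The lookup by header name can be tried out on a small case. The following is a minimal sketch, not the exact command above: it assumes both files contain the same set of column names, uses made-up three-column sample data in a rotated order (so it is not just a swap of two columns), and wraps the awk program in a shell function. The names reorder_columns, pos, src and tmpdir are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch only: reorder_columns, pos and src are illustrative names.

# Usage: reorder_columns MODEL_FILE DATA_FILE
# Prints the header of MODEL_FILE, then every data row of DATA_FILE
# with its fields rearranged into the column order of MODEL_FILE.
reorder_columns() {
    awk -F, '
        NR == FNR {                 # still reading the model file
            if (FNR == 1) {
                for (i = 1; i <= NF; i++)
                    pos[$i] = i     # column name -> target position
                n = NF
                print
            }
            next
        }
        FNR == 1 {                  # header line of the data file
            for (j = 1; j <= NF; j++)
                src[pos[$j]] = j    # target position -> field in data file
            next
        }
        {
            for (k = 1; k <= n; k++)
                printf "%s%s", $(src[k]), (k == n ? "\n" : ",")
        }
    ' "$1" "$2"
}

# Made-up sample data with a rotated (not merely swapped) column order.
tmpdir=$(mktemp -d)
printf 'c1,c2,c3\n' > "$tmpdir/File_1"
printf 'c2,c3,c1\nX2,X3,X1\nY2,Y3,Y1\n' > "$tmpdir/File_2"
reorder_columns "$tmpdir/File_1" "$tmpdir/File_2" > "$tmpdir/File_3"
cat "$tmpdir/File_3"
```

With this sample data, File_3 should contain the header c1,c2,c3 followed by the rows X1,X2,X3 and Y1,Y2,Y3. The sketch uses next with an FNR check instead of nextfile, so it does not depend on an awk that supports nextfile.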

2 Comments

Thanks Kev. It worked perfectly. I had more than 100,000 lines and it worked as I wanted, and quickly. Which site would you recommend I read to understand the script?
Just basic awk stuff: for loops, printf, next/nextfile, NR/FNR. I think you can understand them in a few minutes.

If you are free to change the format of file 2 from:

File_2
c1,c3,c2,c4
DA,CA,DD,CD

to:

s/c1/DA/g
s/c3/CA/g
s/c2/DD/g
s/c4/CD/g

you can use sed:

sed -f File_2 File_1 > File_3

Otherwise, you can work with arrays:

key=($(head -n1 File_2 | tr "," " "))   # header names
val=($(tail -n1 File_2 | tr "," " "))   # values from the last line
len=${#key[@]}
# Write one substitution command per column.
for ((i = 0; i < len; i++)); do
    printf 's/%s/%s/g\n' "${key[i]}" "${val[i]}"
done > subst.sed
sed -f subst.sed File_1 > File_3

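The generated sed program is the same as the one above. Note that File_3 then contains only the reordered values, because the header names in File_1 are replaced by them. If the result of one substitution matches the key of a later command, you may get unexpected results; likewise, a key that is a prefix of another (s/c1/DA/g also rewrites the c1 inside c10) will corrupt the output. To match only whole words, the sed commands have to be changed slightly.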
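
One possible way to restrict the substitutions to whole words is sketched below. It is an illustration under stated assumptions, not part of the original approach: it assumes GNU sed (the \b word-boundary escape is a GNU extension), assumes keys and values contain no characters that are special to sed such as / or &, and generates the commands with awk rather than Bash arrays. The name make_word_subst and the sample data are made up.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed (\b is a GNU extension) and keys/values
# free of characters that are special to sed, such as / or &.
# make_word_subst is an illustrative name.

# Usage: make_word_subst DATA_FILE
# Prints one whole-word substitution per column of DATA_FILE,
# pairing the header line with the last line.
make_word_subst() {
    awk '
        NR == 1 { n = split($0, key, ",") }   # header line -> keys
        { split($0, val, ",") }               # overwritten per line; last line wins
        END {
            for (i = 1; i <= n; i++)
                printf "s/\\b%s\\b/%s/g\n", key[i], val[i]
        }
    ' "$1"
}

# Made-up sample where the key c1 is a prefix of the key c10.
tmpdir=$(mktemp -d)
printf 'c10,c1\n' > "$tmpdir/File_1"
printf 'c1,c10\nDA,CA\n' > "$tmpdir/File_2"
make_word_subst "$tmpdir/File_2" > "$tmpdir/subst.sed"
sed -f "$tmpdir/subst.sed" "$tmpdir/File_1" > "$tmpdir/File_3"
cat "$tmpdir/File_3"
```

With the word boundaries in place, the c1 command leaves c10 alone, so this sample should yield CA,DA. The other caveat still applies: a value that happens to equal a later key would be substituted again.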
