Select column based on the row value

Question

I have a file like this (tab-separated):

Name  Data1  Data2  Extra  A   B   C   D  
Test1   A     C       40   23  10  12  5  
Test2   B     C       20   13  3   32  5  
Test3   C     D       44   43  0   1   5  
Test4   A     D       43   2   7   0   5

I need to add a column called Freq based on Data1 and Data2: Freq = (value in the column named by Data2) / (value in the column named by Data2 + value in the column named by Data1). For example, for Test1, Freq = 12/(12+23).

This is easy to calculate when the columns are known in advance, for example for a row where Data1="A" and Data2="C":

  awk '{print $7/($5+$7)}'

But how can I select the column based on the row value?

Expected output

Name  Data1  Data2  Extra  A   B   C   D  Freq
Test1   A     C       40   23  10  12  5  0.34
Test2   B     C       20   13  3   32  5  0.91
Test3   C     D       44   43  0   1   5  0.83
Test4   A     D       43   2   7   0   5  0.71

"But I how can Select the column based on the row value ?" ??? — Luuk
– Luuk, Commented Mar 4, 2022 at 13:40
I mean, if I could refer to the column named "A" based on the row value, it would be easier to get the value.
– Phela, Commented Mar 4, 2022 at 13:43

Ed Morton · Accepted Answer · 2022-03-04 14:36:18Z

3

$ cat tst.awk
BEGIN { FS=OFS="\t" }
NR==1 {
    $(NF+1) = "Freq"
    for (i=1; i<=NF; i++) {
        f[$i] = i
    }
    print
    next
}
{
    d1 = $(f["Data1"])
    d2 = $(f["Data2"])
    numer = $(f[d2])
    denom = numer + $(f[d1])
    $(f["Freq"]) = sprintf( "%.02f", (denom ? numer / denom : 0) )
    print
}

$ awk -f tst.awk file
Name    Data1   Data2   Extra   A       B       C       D       Freq
Test1   A       C       40      23      10      12      5       0.34
Test2   B       C       20      13      3       32      5       0.91
Test3   C       D       44      43      0       1       5       0.83
Test4   A       D       43      2       7       0       5       0.71
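
A note on how the script above finds its columns: the NR==1 block stores each header name's position in the array f, so $(f["Data2"]) reads the Data2 cell and $(f[d2]) reads the column that the cell names. The following is only a minimal sketch of that lookup on one row copied from the question; the function name lookup_demo is made up for illustration.

```shell
# Hypothetical demo: header row plus the Test1 row, tab-separated, fed via a pipe.
lookup_demo() {
    printf 'Name\tData1\tData2\tExtra\tA\tB\tC\tD\nTest1\tA\tC\t40\t23\t10\t12\t5\n' |
    awk 'BEGIN { FS = "\t" }
    NR == 1 {
        # Remember the column number of every header name.
        for (i = 1; i <= NF; i++) f[$i] = i
        next
    }
    {
        d2 = $(f["Data2"])          # the letter stored in Data2, here "C"
        print d2, f[d2], $(f[d2])   # name, column number, value in that column
    }'
}

lookup_demo    # prints: C 7 12
```

Because the lookup goes through the header names, the same script should keep working if the columns are reordered or more lettered columns are added.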

edited Mar 4, 2022 at 14:36

answered Mar 4, 2022 at 14:05

Ed Morton

Luuk · Accepted Answer · 2022-03-04 13:51:20Z

2

To get a copy of the column selected by the value of Data1:

gawk '{
   s=($2=="A"?5:0)+($2=="B"?6:0)+($2=="C"?7:0)+($2=="D"?8:0);
   print $0,s,(s!=0?$s:"") }' inputfile

With your sample input this gives:

Name  Data1  Data2  Extra  A   B   C   D 0
Test1   A     C       40   23  10  12  5 5 23
Test2   B     C       20   13  3   32  5 6 3
Test3   C     D       44   43  0   1   5 7 1
Test4   A     D       43   2   7   0   5 5 2

The value of s refers to the column, so $s gives the value for that column.

BTW: I am using gawk, but this should work in awk too.
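
As a small illustration of $s with a computed s, here is a sketch that swaps the chain of ternaries for index(). That substitution and the helper name pick_field are assumptions made for illustration, not part of the answer above, and the sketch only handles the single letters A to D.

```shell
# Hypothetical helper: print the field whose number is computed at run time.
# Usage: some_line | pick_field LETTER
pick_field() {
    awk -v name="$1" '{
        s = index("ABCD", name)   # 1..4 for A..D, 0 if the letter is not found
        if (s) s += 4             # A..D live in columns 5..8 of the sample file
        print (s ? $s : "")       # $s is the field at position s
    }'
}

echo "Test1 A C 40 23 10 12 5" | pick_field C    # prints 12
```

The point is only that the expression after $ can be any numeric expression, so the column can be chosen per row.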

answered Mar 4, 2022 at 13:51

Luuk

Andreas Louv · Accepted Answer · 2022-03-04 14:12:09Z

1

Something like this might work for you. I have written it a bit verbosely to emphasize what is going on:

$ cat freq_from_col.awk 
function indirect(val) {
        if (val == "A")
                return $col_a
        if (val == "B")
                return $col_b
        if (val == "C")
                return $col_c
        if (val == "D")
                return $col_d

        return 0
}
BEGIN {
        col_name = 1
        col_data1 = 2
        col_data2 = 3
        col_extra = 4
        col_a = 5
        col_b = 6
        col_c = 7
        col_d = 8
}
NR == 1 {
        print $0, "Freq"
        next;
}
{
        n = indirect($col_data1);
        m = indirect($col_data2);

        print $0, sprintf("%.2f", m/(n+m));
}

$ awk -f freq_from_col.awk data.txt
Name  Data1  Data2  Extra  A   B   C   D Freq
Test1   A     C       40   23  10  12  5 0.34
Test2   B     C       20   13  3   32  5 0.91
Test3   C     D       44   43  0   1   5 0.83
Test4   A     D       43   2   7   0   5 0.71

answered Mar 4, 2022 at 14:12

Andreas Louv

1 Comment

Ed Morton Over a year ago

That would fail with a divide-by-zero error if Data1 and Data2 were both 0.
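
One way to avoid that is sketched below, assuming a frequency of 0 is acceptable when both values are 0 (the same guard the tst.awk answer uses). safe_freq is a hypothetical helper name; in freq_from_col.awk the equivalent change would be to test n+m before dividing.

```shell
# Hypothetical helper: compute m/(n+m), falling back to 0 when n+m is 0.
# Usage: safe_freq N M   (N = value selected by Data1, M = value selected by Data2)
safe_freq() {
    awk -v n="$1" -v m="$2" 'BEGIN {
        denom = n + m
        # Only divide when the denominator is non-zero.
        printf "%.2f\n", (denom ? m / denom : 0)
    }'
}

safe_freq 23 12   # prints 0.34
safe_freq 0 0     # prints 0.00
```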
