
I have a CSV file with two columns: column 1 is the file name and column 2 is the access status.

The following are some sample records:

FileA, CREATE
FileA, MODIFY
FileA, DELETE
FileB, CREATE
FileB, MODIFY

I need to merge the values of the second column into a single row for each distinct value of the first column, like this:

FileA, CREATE|MODIFY|DELETE
FileB, CREATE|MODIFY
  • Do you need to keep the order? I mean, is FileA, CREATE|MODIFY|DELETE the same as FileA, DELETE|CREATE|MODIFY? Commented Dec 10, 2018 at 16:01
  • stackoverflow.com/questions/32940758/… Commented Dec 10, 2018 at 16:22
  • No, ordering based on second column is required Commented Dec 11, 2018 at 3:22

5 Answers

1

Try also:

awk '
$1 != LAST      {printf "%s%s ", LD, $1         # print every new COL1 value
                 LAST = $1                      # and remember it
                 LD = RS                        # set the line delimiter (empty at program start)
                 FD = ""                        # unset field delimiter
                }
                {printf "%s%s", FD, $2          # print successive second fields, after field delim
                 FD = "|"                       # set the field delimiter
                }
END             {printf "%s", RS                # last action: new line
                }
' file
FileA, CREATE|MODIFY|DELETE
FileB, CREATE|MODIFY
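
Note that this script only compares each line with the previous one, so it relies on rows for the same file being adjacent, as they are in the sample. If that is not guaranteed, one option is a stable sort on the first column beforehand. A minimal sketch, assuming a `sort` that supports `-s` (GNU, BSD and busybox do; it is not in POSIX). The interleaved sample data and the `group_adjacent` name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical input where rows for the same file are NOT adjacent.
input=$(mktemp)
cat > "$input" <<'EOF'
FileA, CREATE
FileB, CREATE
FileA, MODIFY
FileB, MODIFY
FileA, DELETE
EOF

# Same idea as the awk script above: start a new output line whenever
# the first field changes, otherwise append "|" plus the second field.
group_adjacent() {
    awk '
    $1 != last { printf "%s%s ", nl, $1; last = $1; nl = "\n"; fd = "" }
               { printf "%s%s", fd, $2; fd = "|" }
    END        { if (NR) printf "\n" }
    '
}

# -s keeps the original relative order of the statuses within each key.
result=$(sort -s -t, -k1,1 "$input" | group_adjacent)
printf '%s\n' "$result"
rm -f "$input"
```

Without `-s`, rows with equal keys may be reordered by the rest of the line, which would change the order of the statuses.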
1

If you don't care about the order the statuses are in, you can use (GNU awk, since it relies on arrays of arrays):

$ awk -F', ' '{
            a[$1][$2]++
           }
           END{
            for(i in a){
                printf "%s, ",i;
                for(k in a[i]){
                    printf  "%s|", k
                }
                print ""
                }
            }' file | sed 's/|$//'
FileA, DELETE|CREATE|MODIFY
FileB, CREATE|MODIFY

If you need to keep the order of the statuses, you can apply some perl magic (the keys themselves still come out in hash order):

$ sed 's/ //' file | 
    perl -F, -lne 'push @{$k{$F[0]}},$F[1]; }{ 
    print "$_, ",join "|", @{$k{$_}} for keys(%k);' 
FileB, CREATE|MODIFY
FileA, CREATE|MODIFY|DELETE
1
awk '1 {if (a[$1]) {a[$1] = a[$1]$2"|"} else {a[$1] = $2"|"}} END {for (i in a) { print i,a[i]}}' file | sed 's/.$//'
1

And a fourth one :D (gensub needs GNU awk)

awk '1 {if (a[$1]) {a[$1] = a[$1]$2"|"} else {a[$1] = $2"|"}} END {for (i in a) { print i,gensub( /\|$/,"","1",a[i])}}' file
FileA, CREATE|MODIFY|DELETE
FileB, CREATE|MODIFY
1
  • What does that first 1 do? I'd expect it to print the unmodified line. Commented Dec 10, 2018 at 22:45
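
On the question in the comment: a pattern followed by an action block does not trigger the default print; only a pattern with no action does. So `1 {...}` just runs the block for every line, and the leading `1` is redundant here. A small sketch with made-up two-line input:

```shell
# Pattern 1 WITH an action: the action runs per line, nothing is echoed.
with_action=$(printf 'x\ny\n' | awk '1 { n++ } END { print n }')

# Bare pattern 1 (no action): awk falls back to the default { print }.
bare=$(printf 'x\ny\n' | awk '{ n++ }
1
END { print n }')

printf 'with action:\n%s\n' "$with_action"
printf 'bare 1:\n%s\n' "$bare"
```

The first command prints only the line count, while the second echoes both input lines before the count.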
1

To output in sorted order, with GNU awk:

gawk -F', ' '
    { a[$1] = a[$1] "|" $2 }
    END {
        PROCINFO["sorted_in"] = "@ind_str_asc"
        for (b in a) print b ", " substr(a[b], 2)
    }
' file
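
If gawk is not available, a similar result can probably be had by leaving the END loop unordered and piping the output through sort instead. A sketch, assuming the same ", " separator; the sample data is inlined so the snippet runs on its own, and `join_by_key` is a made-up name:

```shell
# Build "key, v1|v2|..." lines in whatever order awk iterates the array.
join_by_key() {
    awk -F', ' '
        { a[$1] = a[$1] "|" $2 }
        END { for (b in a) print b ", " substr(a[b], 2) }
    '
}

# Sort the finished lines on the first comma-separated field only.
result=$(printf '%s\n' 'FileB, CREATE' 'FileA, CREATE' 'FileB, MODIFY' \
                       'FileA, MODIFY' 'FileA, DELETE' |
    join_by_key | sort -t, -k1,1)
printf '%s\n' "$result"
```

Sorting happens after the grouping, so the order of the statuses within each line is still the input order.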

To output in the original order of the keys:

awk -F', ' '
    !($1 in a) { keys[++count] = $1 }
    { a[$1] = a[$1] "|" $2 }
    END {
        for (i = 1; i <= count; i++)
            print keys[i] ", " substr(a[keys[i]], 2)
    }
' file
