My CSV input currently looks like this:

network_name,network_set
P1MSVmgmtvM,Data_NetworkSet_A
P1MSVvMotion,Data_NetworkSet_A
P2MSVmgmtvM,Data_NetworkSet_B
P2MSVvMotion,Data_NetworkSet_B
E1MSVEDGEiDMZRUE1,Edge_NetworkSet_A
E1MSVEDGEiEXPRUE1,Edge_NetworkSet_A

I want output like the following, where each network set is listed once, followed by its associated network names:

Data_NetworkSet_A
 P1MSVmgmtvM
 P1MSVvMotion
Data_NetworkSet_B
 P2MSVmgmtvM
 P2MSVvMotion
Edge_NetworkSet_A
 E1MSVEDGEiDMZRUE1
 E1MSVEDGEiEXPRUE1

  • ...and what have you tried so far? Commented Apr 14, 2018 at 11:31
  • Actually, it was a big CSV file with a lot of columns. I was able to separate out the network_set and network_name columns using a processLine() function, but I could not work out the logic for comparing the values and producing the desired output without repeating each network_set. Commented Apr 14, 2018 at 11:37

1 Answer

You could use awk:

$ awk -F, 'NR>1{seen[$2]=seen[$2]"\n "$1;} END{for(x in seen) print x seen[x]}' infile
Data_NetworkSet_A
 P1MSVmgmtvM
 P1MSVvMotion
Data_NetworkSet_B
 P2MSVmgmtvM
 P2MSVvMotion
Edge_NetworkSet_A
 E1MSVEDGEiDMZRUE1
 E1MSVEDGEiEXPRUE1

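NR>1 skips the header line. For every other line, seen[$2]=seen[$2]"\n "$1 uses the second column ($2, the network set) as the array key and appends a newline, a space, and the first column ($1, the network name) to whatever is already stored under that key. Lines that share the same network set therefore accumulate their network names in a single array value.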

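awk runs the END block once, after all lines have been read. There, a for loop iterates over the seen array and prints each key followed by its accumulated value. Note that the order in which for (x in seen) visits the keys is unspecified in awk, so the groups are not guaranteed to come out in the same order as in the input.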
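
If the groups need to appear in the order they first occur in the file, one possible variation is to record each new key in a second, numerically indexed array and loop over that in END. This is only a sketch: the function name group_by_set, the sample variable, and the array names order and names are made up for illustration and are not part of the original command.

```shell
# Sample input copied from the question; "sample" is an illustrative name.
sample='network_name,network_set
P1MSVmgmtvM,Data_NetworkSet_A
P1MSVvMotion,Data_NetworkSet_A
P2MSVmgmtvM,Data_NetworkSet_B
P2MSVvMotion,Data_NetworkSet_B
E1MSVEDGEiDMZRUE1,Edge_NetworkSet_A
E1MSVEDGEiEXPRUE1,Edge_NetworkSet_A'

# Hypothetical helper: reads CSV on stdin and prints each network set
# followed by its network names, in order of first appearance.
group_by_set() {
  awk -F, '
    NR > 1 {
      # Remember a network set the first time it is seen.
      if (!($2 in names)) order[++n] = $2
      # Append newline + space + network name under that set.
      names[$2] = names[$2] "\n " $1
    }
    END {
      # Walk the sets in insertion order instead of using for-in.
      for (i = 1; i <= n; i++) print order[i] names[order[i]]
    }
  '
}

printf '%s\n' "$sample" | group_by_set
```

With a real file, the same function can be called as group_by_set < infile.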

  • Thanks a lot, @afshin. You saved my day. I have a related query; it would be great if you could advise on that as well. Commented Apr 14, 2018 at 22:53
