I would like to remove duplicate lines from a file (duplicates based on column 2), keeping the complete first line for each duplicated value.

Example input:

10.4.14.1,201s-1-S
10.4.16.1,201s-1-S
10.4.17.1,40-MDF-S
10.4.18.1,201s-1-S
10.4.19.1,201s-1-S
10.4.20.1,201s-1-S
10.4.21.1,201s-1-S
10.4.22.1,201s-1-S
10.4.23.1,201s-1-S
10.4.24.1,MDF-S

Desired result:

10.4.14.1,201s-1-S
10.4.17.1,40-MDF-S
10.4.24.1,MDF-S

So far I have tried

awk '!k[$5]++' file

and

awk '!_[$5]++' file

but this does not yield my desired output.

4 Answers

Using a Perl one-liner:

perl -aF, -lne 'print if ! $seen{$F[1]}++' data.txt

Outputs:

10.4.14.1,201s-1-S
10.4.17.1,40-MDF-S
10.4.24.1,MDF-S

Explanation:

Switches:

  • -a: Splits each line (on whitespace by default, or on the -F pattern) and loads the fields into the array @F
  • -F/pattern/: split() pattern for the -a switch (the //'s are optional); here -F, splits on commas
  • -l: Enables automatic line-ending processing
  • -n: Creates a while(<>){..} loop for each line in your input file.
  • -e: Tells perl to execute the code given on the command line.
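
Put together, the one-liner behaves roughly like the following standalone script (a sketch of what the switches expand to, not a literal transcription; run it as perl script.pl data.txt):

#!/usr/bin/perl
use strict;
use warnings;

# Rough equivalent of: perl -aF, -lne 'print if ! $seen{$F[1]}++' data.txt
my %seen;
while ( my $line = <> ) {                 # -n: loop over every input line
    chomp $line;                          # -l: strip the line ending
    my @F = split /,/, $line;             # -a with -F,: split on commas into @F
    print "$line\n" if !$seen{ $F[1] }++; # print only the first line for each value of field 2
}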

1 Comment

Unless you hate using unless, you can write perl -F, -lane 'print unless $seen{$F[1]}++' data.txt; alternatively, perl -F, -lane '$seen{$F[1]}++||print' data.txt also works. :P

You need to set the delimiter to , (the default delimiter is whitespace) and use the correct column ($2) for the "seen" array.

$ awk -F, '!seen[$2]++' file
10.4.14.1,201s-1-S
10.4.17.1,40-MDF-S
10.4.24.1,MDF-S

You could also use sort for this (note that, unlike the awk approach, this reorders the output by the second column rather than keeping the original line order):

$ sort -t, -k2 -u file
10.4.14.1,201s-1-S
10.4.17.1,40-MDF-S
10.4.24.1,MDF-S

This might work for you (GNU sed):

sed -rn '1!G;/^[^,]*(,[^\n]*)\n.*\1/!P;h' file

For every line after the first, append the hold space (all previously seen lines) to the pattern space; if the second field of the current line does not appear in any earlier line, print the current line (P prints up to the first newline); then copy the whole pattern space back into the hold space so it keeps accumulating lines.
