I can delete duplicate lines in files using the `sort -u` and `uniq` commands. Is the same possible using sed or awk?
Skriptotajs – Feb 27, 2014
If you have sort and uniq, why do you want to use sed or awk?
Rubens – Feb 27, 2014
Well, it is possible, since both are Turing-complete languages, as far as I recall. The question is what you'd use them for, as pointed out by @Skriptotajs.
tripleee – Oct 24, 2018
Possible duplicate of "How can I delete duplicate lines in a file in Unix?"
3 Answers
There's a "famous" awk idiom:
awk '!seen[$0]++' file
It has to keep the unique lines in memory, but it preserves the file order.
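A quick illustration of the idiom (the file name `fruits.txt` is hypothetical): `seen[$0]` is 0 the first time a line appears, so `!seen[$0]++` is true and awk prints the line; the post-increment then makes the expression false for every repeat.

```shell
# Hypothetical input with out-of-order duplicates
printf 'apple\nbanana\napple\ncherry\nbanana\n' > fruits.txt

# Print each line only the first time it is seen, preserving order
awk '!seen[$0]++' fruits.txt
# apple
# banana
# cherry
```

Note that, unlike the sort-based approaches below, this needs no sorted input.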
3 Comments
mherzl
This looks awesome but somehow it's not working for me on macOS Sierra.
Alex Muravyov
This only works for files that fit in memory; if the file is bigger than RAM plus swap, it fails.
glenn jackman
For some definition of "small": it can be measured in gigabytes.
sort and uniq are enough on their own to remove duplicates: cat filename | sort | uniq >> filename2

If the file consists of numbers, use sort -n.
2 Comments
tripleee
Though the cat is useless.
dave58
The uniq is also useless. Just use sort -u filename; the -u flag invokes sort's unique mode. [None of which answered the OP's question...]

After sorting, we can use this sed command:
sed -E '$!N; /^(.*)\n\1$/!P; D' filename
If the file is unsorted, you can combine it with sort:
sort filename | sed -E '$!N; /^(.*)\n\1$/!P; D'
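A short demonstration of the combined pipeline (the sample file is hypothetical; tested with GNU sed — BSD sed's handling of `\n` in regexes may differ, and older GNU sed spells `-E` as `-r`). The sed program reads lines in pairs: `N` appends the next line, the regex checks whether the two halves are identical, `P` prints the first line only when they differ, and `D` deletes it and restarts.

```shell
# Hypothetical unsorted input with a duplicate line
printf 'cherry\napple\nbanana\napple\n' > fruits.txt

# Sorting makes duplicates adjacent, then sed collapses them
sort fruits.txt | sed -E '$!N; /^(.*)\n\1$/!P; D'
# apple
# banana
# cherry
```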
1 Comment
tripleee
Obviously these alternatives are unacceptable if you can't sort the file.