
I can delete duplicate lines in a file using the sort -u or uniq commands. Is the same thing possible using sed or awk?

  • If you have sort and uniq, why do you want to use sed or awk? Commented Feb 27, 2014 at 11:33
  • Well, it is possible, since both are Turing-complete languages, as far as I recall. The question is what you'd use them for, as pointed out by @Skriptotajs. Commented Feb 27, 2014 at 11:34
  • Possible duplicate of How can I delete duplicate lines in a file in Unix? Commented Oct 24, 2018 at 4:45

3 Answers


There's a "famous" awk idiom:

awk '!seen[$0]++' file

It has to keep the unique lines in memory, but it preserves the file's original order.
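To see the idiom in action on a small made-up file (the sample filename and contents here are just illustrative): `seen[$0]++` evaluates to 0 (false) the first time a line appears and to a positive count afterwards, so `!seen[$0]++` is true only for first occurrences, and awk's default action prints those lines.

```shell
# Hypothetical sample file; the one-liner keeps the first occurrence
# of each line and preserves the original order.
printf 'b\na\nb\nc\na\n' > /tmp/dupes.txt
awk '!seen[$0]++' /tmp/dupes.txt
# b
# a
# c
```

Because the filter keys on the whole line ($0), trailing whitespace differences count as distinct lines.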


3 Comments

This looks awesome but somehow it's not working for me on macOS Sierra.
Only works for small files; if the file is bigger than RAM + swap, it fails.
For some definition of "small", measured in GB.

sort and uniq are all you need to remove duplicates: cat filename | sort | uniq >> filename2

If the file consists of numbers, use sort -n.
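A quick sketch of this pipeline on a hypothetical numeric file (filename and data are illustrative): sort -n orders the lines numerically, and uniq then drops the now-adjacent duplicates.

```shell
# Hypothetical numeric input; sort -n groups duplicates together,
# and uniq removes adjacent repeats.
printf '3\n1\n2\n1\n3\n' > /tmp/nums.txt
cat /tmp/nums.txt | sort -n | uniq
# 1
# 2
# 3
```

Note that uniq only removes *adjacent* duplicates, which is why the sort must come first.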

2 Comments

Though the cat is useless.
The uniq is also useless. Just use sort -u filename; the -u flag invokes sort's unique mode. [None of which answers the OP's question...]

After sorting, we can use this sed command:

sed -E '$!N; /^(.*)\n\1$/!P; D' filename

If the file is unsorted, you can combine it with sort:

sort filename | sed -E '$!N; /^(.*)\n\1$/!P; D'
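To illustrate the combined pipeline on a small made-up file (filename and contents are hypothetical): the sed program appends the next line with N, and P prints the first line of the pattern space only when it differs from the one that follows, so runs of identical sorted lines collapse to one.

```shell
# Hypothetical unsorted input; sort groups duplicates together, then
# sed prints each line only when it differs from the next one.
printf 'b\na\nc\na\nb\n' > /tmp/unsorted.txt
sort /tmp/unsorted.txt | sed -E '$!N; /^(.*)\n\1$/!P; D'
# a
# b
# c
```

The -E flag (extended regular expressions) is supported by BSD sed and modern GNU sed; older GNU sed spells it -r.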

1 Comment

Obviously these alternatives are unacceptable if you can't sort the file.
