How to find unique lines in a Unix file by neglecting a specific pattern

Question

I have a file in Unix like the following:

">hello"
"hello"
"newuser"
"<newuser"
"newone"

Now I want to find the unique lines in the file (ignoring the < or > only while comparing) and get this output:

">hello"
"<newuser"
"newone"

uniq does most of this, but not all. You could ignore the > and < by removing them with sed and piping the result through uniq, but then the > and < would not appear in the output.
– Paul, Commented Jun 11, 2013 at 7:04
You can also use an associative array in a language like Perl or Python to keep a cache of the strings seen so far. This cache can be used to decide whether a new line is unique.
– Paul, Commented Jun 11, 2013 at 7:11

falsetru · Accepted Answer · 2013-06-11 07:50:30Z

3

#!/usr/bin/env python

import sys
seen = set()
for line in sys.stdin:
    word = line.strip().replace('>', '').replace('<', '')
    if word not in seen:
        seen.add(word)
        sys.stdout.write(line)

$ ./uniq.py < file1
">hello"
"newuser"
"newone"

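Note that the run above prints "newuser" where the question asks for "<newuser", because the script keeps whichever variant appears first. A minimal sketch of one way to get the requested output follows. It assumes the asker wants the variant carrying a < or > to win when both exist, which is inferred from the desired output rather than stated in the question. The function name unique_prefer_marked and the sample list are hypothetical.

```python
import re

MARKERS = re.compile(r"[<>]")

def unique_prefer_marked(lines):
    """Deduplicate lines while ignoring '<' and '>', in first-seen order.

    Assumption (not stated in the question): when several lines share a
    key, a variant containing a marker replaces an unmarked one.
    """
    chosen = {}  # key (line without markers) -> line to emit
    for line in lines:
        text = line.rstrip("\n")
        key = MARKERS.sub("", text.strip())
        if key not in chosen:
            chosen[key] = text
        elif MARKERS.search(text) and not MARKERS.search(chosen[key]):
            # Overwriting a value keeps the key's original position.
            chosen[key] = text
    return list(chosen.values())

sample = ['">hello"', '"hello"', '"newuser"', '"<newuser"', '"newone"']
result = unique_prefer_marked(sample)
print("\n".join(result))
```

With the sample from the question this should print ">hello", "<newuser" and "newone", each quoted and on its own line. It relies on dict insertion order, so it needs Python 3.7 or later.
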

falsetru · Accepted Answer · 2013-06-11 08:29:04Z

2

$ awk '{ w = $1; sub(/[<>]/, "", w) } word[w] == 0 { word[w]++; print $1 }' file1
">hello"
"newuser"
"newone"


loren · Accepted Answer · 2013-06-11 07:29:01Z

0

Here's that associative array idea in Ruby.

2.0.0p195 :005 > entries= [">hello", "hello", "newuser", "<newuser", "newone"]
 => [">hello", "hello", "newuser", "<newuser", "newone"] 
2.0.0p195 :006 > entries.reduce({}) { |hash, entry| hash[entry.sub(/[<>]/,'')]=entry; hash}.values
 => ["hello", "<newuser", "newone"]
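
The Ruby result starts with "hello" rather than ">hello" because each later entry overwrites the earlier one stored under the same key, so the last variant wins. As a rough Python equivalent (a sketch, not the answerer's code; the name last_wins is made up for illustration), the same behavior can be reproduced with a dict comprehension:

```python
import re

entries = [">hello", "hello", "newuser", "<newuser", "newone"]

# Key each entry by its text with the first '<' or '>' removed
# (count=1 mirrors Ruby's String#sub, which replaces one match).
# A later entry overwrites an earlier one under the same key, while
# the key keeps the position of its first insertion (Python 3.7+).
last_wins = {re.sub(r"[<>]", "", e, count=1): e for e in entries}

print(list(last_wins.values()))  # ['hello', '<newuser', 'newone']
```

Keeping the first variant instead, as the Python and awk answers do, only requires skipping the assignment when the key is already present.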

