
I have a very interesting problem that I need to solve.

Assume that I have a list of strings in the following format:

input = ['opst tops', 'opst opts', 'opst pots', 'eip pie', 'eip epi']

The above list will help me find anagrams of words. Each entry pairs a signature with a word. For example, the word "tops" has the signature "opst", and the word "opts" also has the signature "opst". Hence, all words with the signature "opst" should be grouped together, as follows. The output is the list of anagram classes.

output = ['tops opts pots', 'pie epi']

I am new to Python and I'd appreciate it if you could help. Sorry about the confusion; I hope this makes sense.

  • Did you try something? Commented Feb 16, 2014 at 22:36
  • Is it supposed to be: raw = ['opst tops', 'opst opts', 'opst pots', 'eip pie', 'eip epi']? Commented Feb 16, 2014 at 22:37

2 Answers

3

Use a collections.defaultdict() object to collect your words with little code:

from collections import defaultdict

words = defaultdict(list)
for entry in raw:
    key, word = entry.split()
    words[key].append(word)

raw = [' '.join(v) for v in words.values()]

The defaultdict makes the code cleaner here; it's just a subclass of dict that will call the factory (here set to list) if a key doesn't yet exist. Without a defaultdict you'd have to use:

words = {}

and in the loop:

words.setdefault(key, []).append(word)
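
Put together, the plain-dict variant might look like the sketch below. The function name group_by_signature is made up for illustration, and the sketch assumes every entry is exactly "signature word" separated by whitespace.

```python
def group_by_signature(entries):
    """Return one space-joined string of words per signature."""
    words = {}
    for entry in entries:
        # Assumes exactly two whitespace-separated fields per entry.
        key, word = entry.split()
        # setdefault() returns the list already stored under key,
        # or stores a new empty list and returns that.
        words.setdefault(key, []).append(word)
    return [' '.join(v) for v in words.values()]


raw = ['opst tops', 'opst opts', 'opst pots', 'eip pie', 'eip epi']
result = group_by_signature(raw)
```

The behaviour is the same as the defaultdict version; the only cost is the slightly noisier setdefault() call inside the loop.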

Demo:

>>> from collections import defaultdict
>>> raw = ['opst tops', 'opst opts', 'opst pots', 'eip pie', 'eip epi']
>>> words = defaultdict(list)
>>> for entry in raw:
...     key, word = entry.split()
...     words[key].append(word)
... 
>>> [' '.join(v) for v in words.values()]
['pie epi', 'tops opts pots']

If your input list is already sorted (so that entries with the same signature are adjacent) and order is important, you can also use itertools.groupby():

from itertools import groupby

raw = [' '.join(w.split()[1] for w in words) 
       for key, words in groupby(raw, key=lambda e: e.split()[0])]

Demo:

>>> from itertools import groupby
>>> [' '.join(w.split()[1] for w in words) 
...        for key, words in groupby(raw, key=lambda e: e.split()[0])]
['tops opts pots', 'pie epi']
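
groupby() only merges consecutive entries that share a key, so input that is not already grouped needs sorting first. The following is a minimal sketch of that, not part of the code above: the helper name signature is hypothetical, and the shuffled raw list is made-up sample data.

```python
from itertools import groupby


def signature(entry):
    # Hypothetical helper: the first whitespace-separated field is the signature.
    return entry.split()[0]


# Sample input in which entries with the same signature are NOT adjacent.
raw = ['opst tops', 'eip pie', 'opst opts', 'eip epi', 'opst pots']

# sorted() is stable, so words keep their relative order within each group.
grouped = [' '.join(e.split()[1] for e in entries)
           for _, entries in groupby(sorted(raw, key=signature), key=signature)]
# grouped == ['pie epi', 'tops opts pots']
```

Note that the groups now come out in sorted signature order rather than in the order the signatures first appeared.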

1

Something like:

from collections import defaultdict

d = defaultdict(list)
for s in raw:
    sig, val = s.split(' ')
    d[sig].append(val)
res = [' '.join(val) for val in d.values()]

Note that using a dictionary means the "signatures" may not come out in the order they went in; only Python 3.7 and later guarantee that a dict preserves insertion order.
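
If the order of first appearance matters on any Python version, collections.OrderedDict is one option. This is a sketch under the same "signature word" input assumption, not a change to the code above:

```python
from collections import OrderedDict

raw = ['opst tops', 'opst opts', 'opst pots', 'eip pie', 'eip epi']

# OrderedDict remembers the order in which keys were first inserted.
d = OrderedDict()
for s in raw:
    sig, val = s.split(' ')
    d.setdefault(sig, []).append(val)

res = [' '.join(vals) for vals in d.values()]
# res == ['tops opts pots', 'pie epi']
```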
