2

I've tried using Counter and itertools, but since a list is unhasable, they don't work.

My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]

I would like to know that the list [1,2,3] appears twice, but I cant figure out how to do this. I was thinking of just converting each list to a tuple, then hashing with that. Is there a better way?

3
  • Converting to tuple and using Counter sounds fine, as long as you know what your data looks like and you know it should always be a list of lists. Commented Aug 12, 2014 at 2:25
  • 6
    Is [1, 2, 3] the same as [3, 2, 1]? Commented Aug 12, 2014 at 2:41
  • If you don't want to use Counter you can do something like this, which isn't very smart, but it's simple and works alright for small lists. Edit: The question in the title doesn't match the one in the description. Commented Aug 12, 2014 at 2:43

4 Answers 4

7
>>> from collections import Counter
>>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})

The method that you state also works as long as there are not nested mutables in each sublist (such as [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]:

>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})

If you do have other unhasable types nested in the sub lists lists, use the str or repr method since that deals with all sub lists as well. Or recursively convert all to tuples (more work).

Sign up to request clarification or add additional context in comments.

Comments

2
ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))

Also, if you wanted to count the occurences of a unique* list:

 print(ll.count([1,2,3]))

*value unique, not reference unique)

Comments

2

I think, using the Counter class on tuples like

Counter(tuple(item) for item in li)

Will be optimal in terms of elegance and "pythoniticity": It's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it uses resp. combines standard methods (and thus avoids reinventing the wheel).

The only performance drawback I can see is, that every element has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also the internal hash function on tuples may be suboptimal if you know that list elements will e.g. always be integers.

In order to improve on performance, you would have to

  • Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
  • Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)

At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.

As for the Counter class I do not know if it's standard implementation is in Python or in C, if the latter is the case you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.

So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.

Comments

0
list =  [ [1,2,3], [2,3,4], [1,2,3] ]
repeats = []
unique = 0
for i in list:
    count = 0;
    if i not in repeats:
        for i2 in list:
            if i == i2:
                count += 1
    if count > 1:
        repeats.append(i)
    elif count == 1:
        unique += 1

print "Repeated Items"
for r in repeats:
    print r,

print "\nUnique items:", unique

loops through the list to find repeated sequences, while skipping items if they have already been detected as repeats, and adds them into the repeats list, while counting the number of unique lists.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.