89

I have a list whose elements are themselves lists,

e.g. [[1,2,3,4],[4,5,6,7]]

If I use the built-in set function to remove duplicates from this list, I get the error

TypeError: unhashable type: 'list'

The code I'm using is

TopP = sorted(set(TopP),reverse=True)

where TopP is a list like the example above.

Is this usage of set() wrong? Is there any other way in which I can sort the above list?

2
  • 3
    What would be your desired output for the list you provided? Commented Nov 19, 2012 at 23:24
  • 1
    @arshajii This seems like a bad example since there aren't any duplicates. But anyway, they probably want [[4, 5, 6, 7], [1, 2, 3, 4]], which you can get from sorted(TopP, reverse=True) in this case. Commented Jul 17, 2023 at 0:43

4 Answers

79

Sets require their items to be hashable. Out of types predefined by Python only the immutable ones, such as strings, numbers, and tuples, are hashable. Mutable types, such as lists and dicts, are not hashable because a change of their contents would change the hash and break the lookup code.

Since you're sorting the list anyway, just place the duplicate removal after the list is already sorted. This is easy to implement, doesn't increase algorithmic complexity of the operation, and doesn't require changing sublists to tuples:

def uniq(lst):
    last = object()
    for item in lst:
        if item == last:
            continue
        yield item
        last = item

def sort_and_deduplicate(l):
    return list(uniq(sorted(l, reverse=True)))
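
A usage sketch of the same sort-then-deduplicate idea, written with itertools.groupby from the standard library in place of the hand-written uniq generator. This is a swapped-in variation, not the answer's own code; the function name sort_and_deduplicate_groupby and the sample data are assumptions made for illustration.

```python
from itertools import groupby

def sort_and_deduplicate_groupby(lists):
    # Sorting puts equal sublists next to each other; groupby then
    # yields one key per run of equal adjacent items.
    return [key for key, _ in groupby(sorted(lists, reverse=True))]

# Hypothetical sample data containing a duplicate sublist.
top_p = [[1, 2, 3, 4], [4, 5, 6, 7], [1, 2, 3, 4]]
result = sort_and_deduplicate_groupby(top_p)
print(result)  # [[4, 5, 6, 7], [1, 2, 3, 4]]
```

Like uniq, this only compares neighbouring items with ==, so the sublists never need to be hashable and stay lists throughout.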

9 Comments

+1 for the explanation, because sorting before uniquifying is easier than the other way around, and your answer doesn't require elements to be convertible to tuple. However, it isn't quite true that only immutable types are hashable—only immutable types, and mutable types where a==b implies id(a)==id(b) are hashable. (I forget the exact wording, but there are multiple questions on SO from people who found the wording confusing…)
@abarnert Yeah, saying that only immutable types are hashable is a bit of a simplification because it ignores objects that compare/hash by part of their state. (Objects that compare by identity can still be argued to be immutable as long as their useful state is captured by their identity — think interned enums and sentinels.)
Well, yes, but the Python documentation explicitly talks about mutable objects that compare by identity, so arguing that such objects should really be considered immutable just makes an already-confusing part of the docs even more confusing.
@abarnert Do you have a reference for that? The closest I could find was the definition of hashable, but it doesn't explicitly talk about mutable objects that compare by identity. The documentation of hash and dict didn't help either.
docs.python.org/2/reference/…, under Dictionaries: "The only types of values not acceptable as keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity". If there were no such thing as mutable types that are compared by object identity, this distinction wouldn't be necessary, or even meaningful. IIRC, there's similar text once more in the reference, plus once in the tutorial.
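
The distinction discussed in these comments can be sketched in a few lines. The class names Node and Point are hypothetical, and the behaviour of Point assumes Python 3, where defining __eq__ without __hash__ makes instances unhashable (Python 2 behaves differently here).

```python
class Node:
    """Mutable, but keeps the default identity-based __eq__ and __hash__."""
    def __init__(self, value):
        self.value = value

class Point:
    """Compares by value; in Python 3, defining __eq__ alone sets __hash__ to None."""
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

node = Node(1)
nodes = {node}           # fine: hashed by identity
node.value = 2           # mutating it does not change its hash
still_found = node in nodes

try:
    {Point(1)}           # compared by value, no __hash__ defined
    point_hashable = True
except TypeError:
    point_hashable = False

print(still_found, point_hashable)  # True False
```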
29

Sets store each item only once. To do that they rely on each item's hash, which must not change while the item is in the set. Lists can change after being created (they are termed 'mutable') and are therefore unhashable, so you cannot put them in a set.

Lists have an immutable equivalent, called a 'tuple'. The following line takes a list of lists, removes duplicate lists, and sorts the result in reverse order:

result = sorted(set(map(tuple, my_list)), reverse=True)

Additional note: If a tuple contains a list, the tuple is still unhashable, so it cannot be put in a set either.

Some examples:

>>> hash( tuple() )
3527539
>>> hash( dict() )

Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    hash( dict() )
TypeError: unhashable type: 'dict'
>>> hash( list() )

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    hash( list() )
TypeError: unhashable type: 'list'
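
A tuple that holds a list cannot be hashed either, because hashing a tuple hashes each of its elements. A minimal sketch, with variable names chosen only for illustration:

```python
nested = (1, [2, 3])     # a tuple holding a list

try:
    hash(nested)
    nested_hashable = True
except TypeError as exc:
    nested_hashable = False
    message = str(exc)   # the error names the inner list, not the tuple

flat = (1, (2, 3))       # tuples all the way down
flat_hashable = isinstance(hash(flat), int)

print(nested_hashable, flat_hashable)  # False True
```

So map(tuple, my_list) is enough for a list of flat lists, but more deeply nested lists would need their inner lists converted as well.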

4 Comments

Thanks! Is result here a tuple or a list?
It's a list of tuples; you'll have to convert the elements back to lists.
@user1747696: Converting back to lists is as easy as converting to tuples: result = list(map(list, sorted(set(map(tuple, my_list)), reverse=True))). (At some point, you're going to want to break this up into multiple lines, with names for the intermediate steps…)
Thanks a lot for your explanations around set and mutable vs. immutable lists.
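
As suggested in the comments, the one-liner can be broken into named steps. This is only a sketch: the intermediate names and the sample data are assumptions, and the last step uses a list comprehension so that the result is a real list under Python 3 as well.

```python
# Hypothetical input with one duplicate sublist.
my_list = [[1, 2, 3, 4], [4, 5, 6, 7], [1, 2, 3, 4]]

as_tuples = map(tuple, my_list)                # lists -> hashable tuples
unique_tuples = set(as_tuples)                 # the set drops duplicates
ordered = sorted(unique_tuples, reverse=True)  # list of tuples, descending
result = [list(t) for t in ordered]            # tuples -> lists again

print(result)  # [[4, 5, 6, 7], [1, 2, 3, 4]]
```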
4
    python 3.2

    >>> from itertools import chain
    >>> eg = [[1, 2, 3, 4], [4, 5, 6, 7]]
    >>> sorted(set(chain(*eg)), reverse=True)
    [7, 6, 5, 4, 3, 2, 1]

    eg contains two lists within a list, so if you want to use the set()
    function you should first flatten the list, like [1, 2, 3, 4, 4, 5, 6, 7].
    Note that this removes duplicate numbers, not duplicate sublists.

    >>> res = list(chain(*eg))              # [1, 2, 3, 4, 4, 5, 6, 7]
    >>> res1 = set(res)                     # {1, 2, 3, 4, 5, 6, 7}
    >>> res1 = sorted(res1, reverse=True)   # [7, 6, 5, 4, 3, 2, 1]


4

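Definitely not the ideal solution, but it's easier for me to understand if I convert the lists into tuples, build a set from them, and then sort it.

mylist = [[1,2,3,4],[4,5,6,7]]
mylist2 = []
for thing in mylist:
    # tuples are hashable, so they can go into a set
    mylist2.append(tuple(thing))
# the set drops duplicates; sorted() returns a list of tuples
result = sorted(set(mylist2), reverse=True)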
