2

I have a list made up of arrays. All have shape (2,).

Minimum example: mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]

I would like to get a unique list, e.g. [np.array([1,2]),np.array([3,4])]

or perhaps even better, a dict with counts, e.g. {np.array([1,2]) : 2, np.array([3,4]) : 1}

So far I tried list(set(mylist)), but the error is TypeError: unhashable type: 'numpy.ndarray'

4 Answers 4

4

As the error indicates, NumPy arrays aren't hashable. You can turn them to tuples, which are hashable and build a collections.Counter from the result:

from collections import Counter

Counter(map(tuple,mylist))
# Counter({(1, 2): 2, (3, 4): 1})

If you wanted a list of unique tuples, you could construct a set:

set(map(tuple,mylist))
# {(1, 2), (3, 4)}
Sign up to request clarification or add additional context in comments.

3 Comments

This answer is also good, but I accepted another one as that's what I decided to use in the end.
If you have a list of equally shaped arrays, do construct an array from it, so that you can leverage numpy's functions. np.array(mylist), and hence have a much better performance. Since you posted a list in the question, I assumed the inner arrays could be different in shape @cddt
Thank you for the comprehensive answer. You are correct, I should have converted the list to an array.
2

In general, the best option is to use np.unique method with custom parameters

u, idx, counts = np.unique(X, axis=0, return_index=True, return_counts=True)

Then, according to documentation:

  • u is an array of unique arrays
  • idx is the indices of the X that give the unique values
  • counts is the number of times each unique item appears in X

If you need a dictionary, you can't store hashable values in its keys, so you might like to store them as tuples like in @yatu's answer or like this:

dict(zip([tuple(n) for n in u], counts))

Comments

1

Pure numpy approach:

numpy.unique(mylist, axis=0)

which produces a 2d array with your unique arrays in rows:

numpy.array([
 [1 2],
 [3 4]])

Works if all your arrays have same length (like in your example). This solution can be useful depending on what you do earlier in your code: perhaps you would not need to get into plain Python at all, but stick to numpy instead, which should be faster.

Comments

0

Use the following:

import numpy as np
mylist = [np.array([1,2]),np.array([1,2]),np.array([3,4])]
np.unique(mylist, axis=0)

This gives out list of uniques arrays.

array([[1, 2],
       [3, 4]])

Source: https://numpy.org/devdocs/user/absolute_beginners.html#how-to-get-unique-items-and-counts

1 Comment

You might want to mention that this will internally construct a 2d array from the list, which will fail if one of inner arrays has a different length

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.