
I have two arrays, for example:

A: [[1 2 3][2 2 2][1 2 3][2 3 3][2 2 2][2 3 3][2 3 3]]  
B: [[1 2 3][2 2 2][2 3 3]]

B contains the sorted unique rows of A.
I need:

C: [0 1 0 2 1 2 2]

C holds, for each row of A, the index of the matching row in B. I would like to avoid loops because it needs to be fast even with very large arrays.

The only solutions I found work only for 1D arrays (e.g. "Getting the indices of several elements in a NumPy array at once").
I think this can be solved using np.void, in a similar way to "Find unique rows in numpy.array", but I cannot get my head around it.

I need to use NumPy 1.10 with no other libraries available.

2 Answers


Given A and B, you can generate C using

In [25]: (B[:,None,:] == A).all(axis=-1).argmax(axis=0)
Out[25]: array([0, 1, 0, 2, 1, 2, 2])

Note that this assumes that every row of A is in B. (Otherwise, argmax returns a bogus index of 0 for any row of A without a match, because every equality test for that row is False.)
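To make that caveat concrete, here is a minimal sketch that keeps the boolean match table and uses it to flag rows of A that have no counterpart in B. The sample arrays, the names `match`, `found` and `C_checked`, and the `-1` sentinel are illustrative assumptions, not part of the original answer.

```python
import numpy as np

A = np.array([[1, 2, 3], [2, 2, 2], [9, 9, 9], [2, 3, 3]])  # [9, 9, 9] is not in B
B = np.array([[1, 2, 3], [2, 2, 2], [2, 3, 3]])

# match[i, j] is True where row i of B equals row j of A
match = (B[:, None, :] == A).all(axis=-1)

C = match.argmax(axis=0)    # index into B for each row of A (0 if nothing matched)
found = match.any(axis=0)   # False where a row of A has no match in B

# Replace the misleading 0 with a sentinel (-1 is an arbitrary choice here)
C_checked = np.where(found, C, -1)
print(C_checked)  # [ 0  1 -1  2]
```

The intermediate `match` array has shape `(len(B), len(A))`, so this check costs no more memory than the one-liner already uses.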


Note that if you had NumPy version 1.13 or newer, then you could use np.unique to generate both B and C at the same time:

In [33]: np.unique(A, axis=0, return_inverse=True)
Out[33]: 
(array([[1, 2, 3],
        [2, 2, 2],
        [2, 3, 3]]), array([0, 1, 0, 2, 1, 2, 2]))
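For NumPy versions without the `axis` argument, something similar can be sketched by viewing each row as a single `np.void` element and calling the 1D form of `np.unique`. The helper name `unique_rows_with_inverse` is hypothetical, and I have only reasoned about this on small non-negative integers. The unique rows come back in byte-wise order, which may differ from numeric row order for negative values, large values, or floats.

```python
import numpy as np

def unique_rows_with_inverse(a):
    """Hypothetical helper: unique rows of a 2D array plus inverse indices."""
    a = np.ascontiguousarray(a)
    # One void element per row, spanning all the bytes of that row
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    a1d = a.view(void_dt).ravel()
    # The 1D np.unique works on the void view; no axis argument needed
    _, first_idx, inverse = np.unique(a1d, return_index=True, return_inverse=True)
    return a[first_idx], inverse

A = np.array([[1, 2, 3], [2, 2, 2], [1, 2, 3], [2, 3, 3],
              [2, 2, 2], [2, 3, 3], [2, 3, 3]])
B, C = unique_rows_with_inverse(A)
print(B)
print(C)
```

Whatever order the unique rows come back in, `B[C]` should reproduce `A` row for row.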

Note that Divakar's solution (using np.void) is far faster, particularly if A has many rows:

A = np.random.randint(10, size=(1000, 3))
B, C = np.unique(A, axis=0, return_inverse=True)

In [44]: %%timeit
   ....: A1D, B1D = view1D(A, B)
   ....: sidx = B1D.argsort()
   ....: out = argsort_unique(sidx)[np.searchsorted(B1D, A1D, sorter=sidx)]
   ....: 
1000 loops, best of 3: 271 µs per loop

In [45]: %timeit (B[:,None,:] == A).all(axis=-1).argmax(axis=0)
100 loops, best of 3: 15.5 ms per loop

4 Comments

OP says they are on version 1.10, which AFAIK has no axis argument in unique.
Right, there is no axis argument before 1.13, sadly. I tried to copy this function from the NumPy 1.14 source into my script but could not get it running, and I was too inexperienced to find out why.
@PaulPanzer: Thanks for the heads-up.
(B[:,None,:] == A).all(axis=-1).argmax(axis=0) works very well, thanks. :)

Using void dtypes -

# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()

# https://stackoverflow.com/a/41242285/ @Andras Deak
def argsort_unique(idx):
    n = idx.size
    sidx = np.empty(n,dtype=int)
    sidx[idx] = np.arange(n)
    return sidx

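A1D, B1D = view1D(A, B)
sidx = B1D.argsort()
# searchsorted returns positions in the sorted order B1D[sidx]; indexing sidx
# with them gives row indices in the original B. (argsort_unique(sidx) gives
# the same result only when sidx is its own inverse, e.g. if B is sorted.)
out = sidx[np.searchsorted(B1D, A1D, sorter=sidx)]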
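The lookup step can be traced on a 1D analog, with plain integers standing in for the void-viewed rows. This is only an illustrative sketch; the values are made up, and B1D is deliberately left unsorted so that each intermediate array is distinguishable.

```python
import numpy as np

B1D = np.array([30, 10, 20])       # lookup table, deliberately unsorted
A1D = np.array([10, 30, 20, 10])   # values to locate in B1D

sidx = B1D.argsort()               # [1, 2, 0]: indices that sort B1D
# Positions of each A1D value within the sorted order B1D[sidx] = [10, 20, 30]
pos = np.searchsorted(B1D, A1D, sorter=sidx)   # [0, 2, 1, 0]
# Map sorted positions back to indices in the original B1D
out = sidx[pos]                    # [1, 0, 2, 1]

print(out)
print(B1D[out])                    # reproduces A1D
```

Here the inverse permutation of `sidx` is `[2, 0, 1]`; indexing with it instead would give `[2, 1, 0, 2]`, which does not reproduce `A1D`. The two coincide only when `sidx` is its own inverse, for example when the lookup array is already sorted.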

Sample run -

In [36]: # Let's take OP sample and shuffle them
    ...: # to make for a more generic sample case
    ...: A = np.array([[1, 2, 3], [2, 2, 2], [1, 2, 3], [2, 3, 3], [2, 2, 2], [2, 3, 3], [2, 3, 3]])
    ...: B = np.array([[1, 2, 3], [2, 2, 2], [2, 3, 3]])
    ...:
    ...: np.random.seed(0)
    ...: np.random.shuffle(B)
    ...: indx = np.array([0, 1, 0, 2, 1, 2, 2])  # we need to retrieve these as the desired o/p
    ...: A = B[indx]

In [37]: A
Out[37]: 
array([[2, 3, 3],
       [2, 2, 2],
       [2, 3, 3],
       [1, 2, 3],
       [2, 2, 2],
       [1, 2, 3],
       [1, 2, 3]])

In [38]: B
Out[38]: 
array([[2, 3, 3],
       [2, 2, 2],
       [1, 2, 3]])

In [39]: A1D, B1D = view1D(A, B)
    ...: sidx = B1D.argsort()
    ...: out = sidx[np.searchsorted(B1D, A1D, sorter=sidx)]

In [40]: out
Out[40]: array([0, 1, 0, 2, 1, 2, 2])

3 Comments

Note that np.array([-0.]).view(np.void) != np.array([0.]).view(np.void), so extra care has to be taken if np.issubdtype(arr.dtype, np.floating). In that case arr += 0. fixes the problem. It's not a problem here, but perhaps good to know in general.
This works very well, thank you :) I have accepted the other answer because it is shorter.
@kukuschi note that this answer should scale better. Of course, if you only have a handful of uniques it doesn't matter.
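The negative-zero caveat from the first comment can be illustrated with a small sketch. It compares raw bytes with `tobytes()`, which is what a void view effectively compares; the arrays are made-up examples.

```python
import numpy as np

a = np.array([[0.0, 1.0]])
b = np.array([[-0.0, 1.0]])

# Numerically the rows are equal...
print((a == b).all())                    # True
# ...but their byte patterns differ, so a void view treats them as different rows
print(a.tobytes() == b.tobytes())        # False

# Adding 0. turns -0. into +0. and leaves other values unchanged
b_fixed = b + 0.0
print(a.tobytes() == b_fixed.tobytes())  # True
```

This matters only for floating-point inputs; integer arrays such as the ones in the question have a single representation of zero.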
