
I have a 2D NumPy array A and a list of row numbers row_set. How can I get a new array B such that, if row_set = [0, 2, 5], then B = [A_row[0], A_row[2], A_row[5]]?

I thought of something like this:

def slice_matrix(A, row_set):
    slice = array([row for row in A if row_num in row_set])

but I have no idea how to get row_num.

3 Answers

13

Use take():

In [87]: m = np.random.random((6, 2))

In [88]: m
Out[88]: 
array([[ 0.6641412 ,  0.31556053],
       [ 0.11480163,  0.00143887],
       [ 0.4677745 ,  0.43055324],
       [ 0.49749099,  0.15678506],
       [ 0.48024596,  0.65701218],
       [ 0.48952677,  0.97089177]])

In [89]: m.take([0, 2, 5], axis=0)
Out[89]: 
array([[ 0.6641412 ,  0.31556053],
       [ 0.4677745 ,  0.43055324],
       [ 0.48952677,  0.97089177]])

8

You can index any NumPy array directly with a list or an array of indices.

>>> r = np.random.randint(0,10,(5,5))
>>> r
array([[3, 8, 9, 8, 4],
       [4, 1, 5, 9, 1],
       [3, 6, 8, 8, 0],
       [5, 1, 7, 6, 1],
       [6, 1, 7, 7, 7]])
>>> idx = [0,3,1]
>>> r[idx]
array([[3, 8, 9, 8, 4],
       [5, 1, 7, 6, 1],
       [4, 1, 5, 9, 1]])
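
A short sketch of two details worth knowing about this form of indexing, using a deterministic array instead of random data (the variable names by_list and by_array are just for illustration): a list and an integer array of indices give the same rows, and the result is a copy, so writing to it does not change the original.

```python
import numpy as np

# Deterministic 5x5 array so the selected rows are easy to check by eye.
r = np.arange(25).reshape(5, 5)
idx = [0, 3, 1]

by_list = r[idx]              # index with a plain Python list
by_array = r[np.array(idx)]   # index with an integer ndarray

# Both forms pick rows 0, 3 and 1, in that order.
same_rows = np.array_equal(by_list, by_array)

# Integer-array indexing returns a copy, so this write leaves r untouched.
by_list[0, 0] = -1
print(same_rows, r[0, 0])
```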

2

Speed comparison: take() is faster.

In [1]: m = np.random.random((1000, 2))

In [2]: i = np.random.randint(1000, size=500)

In [3]: %timeit m[i]
10000 loops, best of 3: 27.2 µs per loop

In [4]: %timeit m.take(i, axis=0)
100000 loops, best of 3: 7.24 µs per loop

This remains true for very large m and i.
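
If you want to check this on your own machine outside IPython, the following is a rough sketch using the standard timeit module. The array sizes match the ones above; the seed and the loop count are arbitrary choices. The absolute numbers, and possibly the size of the gap, will depend on your NumPy version and hardware.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, for repeatable inputs
m = rng.random((1000, 2))
i = rng.integers(0, 1000, size=500)   # 500 row indices in [0, 1000)

# Both approaches must select exactly the same rows.
results_match = np.array_equal(m[i], m.take(i, axis=0))

# Total time in seconds for 10,000 calls of each approach.
t_index = timeit.timeit(lambda: m[i], number=10_000)
t_take = timeit.timeit(lambda: m.take(i, axis=0), number=10_000)

print(f"m[i]:              {t_index:.4f} s")
print(f"m.take(i, axis=0): {t_take:.4f} s")
```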
