Select elements row-wise based on single array

Question

Say I have an array d of size (N,T), out of which I need to select elements using index of shape (N,), where the first element corresponds to the index in the first row, etc... how would I do that?

For example

>>> d
Out[748]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]])
>>> index
Out[752]: array([5, 6, 1], dtype=int64)

Expected Output:

array([[5],
       [6],
       [2]])

That is, an array containing the element at index 5 of the first row, the element at index 6 of the second row, and the element at index 1 of the third row (indices are zero-based).

Update

Since N will be much larger in practice, I was interested in how the different methods scale. With N = 30000:

>>> %timeit np.diag(e.take(index2, axis=1)).reshape(N*3, 1)
1 loops, best of 3: 3.9 s per loop
>>> %timeit e.ravel()[np.arange(e.shape[0])*e.shape[1]+index2].reshape(N*3, 1)
1000 loops, best of 3: 287 µs per loop

Finally, you suggest reshape(). Since I want to keep this as general as possible (without knowing N), I use [:,np.newaxis] instead. It seems to increase the duration from 287µs to 288µs, which I'll take :)
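A minimal sketch of the reshape versus np.newaxis point, using the small d and index from the example rather than the larger e and index2 (which are not shown in the post). The variable names flat, col_reshape and col_newaxis are illustrative only.

```python
import numpy as np

d = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
index = np.array([5, 6, 1])

# Flat 1-D selection, same expression as in the timing above.
flat = d.ravel()[np.arange(d.shape[0]) * d.shape[1] + index]

# reshape needs the row count spelled out (or -1 to infer it)...
col_reshape = flat.reshape(-1, 1)
# ...while np.newaxis adds the trailing axis without mentioning N at all.
col_newaxis = flat[:, np.newaxis]

print(col_newaxis.shape)  # (3, 1)
```

Both forms should give the same (N, 1) column, so the choice is mostly a matter of readability.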

Is the final output an array of three different arrays, like you've printed here? Or just one array with three elements?
– TheSoundDefense, Commented Jul 11, 2014 at 15:17
Finally, I want to add the final output to the initial array. The way I printed it here, I can simply do dNew = append(d, expectedOutput, axis=-1). Any other final output that also allows this is equally welcome.
– FooBar, Commented Jul 11, 2014 at 15:21
If speed is important, you should check my second edit to see whether it is also faster on your computer.
– deinonychusaur, Commented Jul 11, 2014 at 16:29
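The append step mentioned in the comments could look roughly like this. It is a sketch on the small example data; selected and d_new are illustrative names, and the selection uses the integer-array indexing form from the answers.

```python
import numpy as np

d = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
index = np.array([5, 6, 1])

# Column vector of the selected elements, shape (N, 1).
selected = d[np.arange(index.size), index][:, np.newaxis]

# Append it as a new last column; np.append returns a copy, d is unchanged.
d_new = np.append(d, selected, axis=-1)

print(d_new.shape)  # (3, 11)
```

The (N, 1) shape matters here: np.append along an axis requires both arrays to have the same number of dimensions.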

Emanuele Paolini · Accepted Answer · 2014-07-11 16:43:46Z

2

This might be ugly but more efficient:

>>> d.ravel()[np.arange(d.shape[0])*d.shape[1]+index]
array([5, 6, 2])
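To see why the expression works, here is the same computation broken into steps. This is only a sketch; the intermediate names row_starts and flat_positions are made up for illustration.

```python
import numpy as np

d = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
index = np.array([5, 6, 1])

n_rows, n_cols = d.shape

# ravel() flattens in row-major order by default, so element (i, j)
# ends up at flat offset i * n_cols + j.
row_starts = np.arange(n_rows) * n_cols   # [ 0, 10, 20]
flat_positions = row_starts + index       # [ 5, 16, 21]

result = d.ravel()[flat_positions]
print(result)  # [5 6 2]
```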

edit

As pointed out by @deinonychusaur, the statement above can be written more cleanly as:

d[np.arange(index.size),index]
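A small runnable sketch of what this indexing does on the example data. The names rows and looped are illustrative, and the loop is there only to spell out the pairing, not as a recommended approach.

```python
import numpy as np

d = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
index = np.array([5, 6, 1])

rows = np.arange(index.size)   # [0, 1, 2]

# Two integer index arrays of equal length are paired element-wise:
# result[i] = d[rows[i], index[i]]
result = d[rows, index]
print(result)  # [5 6 2]

# The same selection written as an explicit Python loop, for comparison only.
looped = np.array([d[i, index[i]] for i in range(index.size)])
```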

edited Jul 11, 2014 at 16:43

answered Jul 11, 2014 at 15:36

Emanuele Paolini


deinonychusaur · Answer · 2014-07-11 15:28 (edited by marc_s, 2017-01-24 16:57:02Z)

2

There might be nicer ways, but a combo of take, diag and reshape would do:

In [137]: np.diag(d.take(index, axis=1)).reshape(3, 1)
Out[137]: 
array([[5],
       [6],
       [2]])
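The intermediate steps of this approach can be inspected as follows. This is a sketch on the example data; the name square is illustrative.

```python
import numpy as np

d = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
index = np.array([5, 6, 1])

# take(..., axis=1) picks columns 5, 6 and 1 from EVERY row,
# so the intermediate result has shape (N, N).
square = d.take(index, axis=1)
print(square)
# [[5 6 1]
#  [5 6 1]
#  [6 7 2]]

# Only the diagonal pairs row i with index[i]; the rest is discarded.
result = np.diag(square).reshape(-1, 1)
print(result.ravel())  # [5 6 2]
```

Because the intermediate array has N*N entries, its cost grows quickly with the number of rows.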

EDIT

Comparison with @Emanuele Paolini's alternative, adding reshape to it to match the sought output:

In [142]: %timeit d.reshape(d.size)[np.arange(d.shape[0])*d.shape[1]+index].reshape(3, 1)
100000 loops, best of 3: 9.51 µs per loop

In [143]: %timeit np.diag(d.take(index, axis=1)).reshape(3, 1)
100000 loops, best of 3: 3.81 µs per loop

In [146]: %timeit d.ravel()[np.arange(d.shape[0])*d.shape[1]+index].reshape(3, 1)
100000 loops, best of 3: 8.56 µs per loop

On this small 3x10 example, the take/diag method is about twice as fast as both alternatives.

EDIT 2: An even better method

Based on @Emanuele Paolini's version, but with fewer operations. It outperforms all the others on large arrays (10k rows by 100 columns).

In [199]: %timeit d[(np.arange(index.size), index)].reshape(index.size, 1)
1000 loops, best of 3: 364 µs per loop

In [200]: %timeit d.ravel()[np.arange(d.shape[0])*d.shape[1]+index].reshape(index.size, 1)
100 loops, best of 3: 5.22 ms per loop

So if speed is of the essence:

d[(np.arange(index.size), index)].reshape(index.size, 1)
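For anyone who wants to repeat the comparison outside IPython, a rough sketch using the standard timeit module follows. The test data is hypothetical: the sizes follow the "10k rows by 100 columns" figure above, but the values are random and not the original arrays. The take/diag variant is left out because its intermediate array would have 10k x 10k entries.

```python
import timeit
import numpy as np

# Hypothetical test data (random values, sizes as mentioned above).
rng = np.random.default_rng(0)
n_rows, n_cols = 10_000, 100
d = rng.integers(0, 100, size=(n_rows, n_cols))
index = rng.integers(0, n_cols, size=n_rows)

def fancy():
    # Integer-array indexing, then reshape to a column.
    return d[np.arange(index.size), index].reshape(index.size, 1)

def flat():
    # Flat-offset indexing on the raveled array, then reshape to a column.
    return d.ravel()[np.arange(d.shape[0]) * d.shape[1] + index].reshape(index.size, 1)

# Both expressions must agree before their timings are worth comparing.
assert np.array_equal(fancy(), flat())

for name, fn in [("fancy", fancy), ("flat", flat)]:
    # Best of 3 runs of 100 calls each, reported per call.
    seconds = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

Absolute numbers will vary with the machine, the NumPy version and the memory layout of d, so the figures quoted above should be treated as indicative only.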

edited Jan 24, 2017 at 16:57

marc_s


answered Jul 11, 2014 at 15:28

deinonychusaur


4 Comments

Emanuele Paolini Over a year ago

I said "more efficient" because your solution passes through an N*N matrix of which you discard everything but the diagonal. So I assume (but I have not checked) that your solution is quadratic in the length of index while mine should be linear.

deinonychusaur Over a year ago

@EmanuelePaolini True that given a large enough size yours should be more efficient.

deinonychusaur Over a year ago

@EmanuelePaolini for 10k rows and 100 columns yours is about 150 times faster so I suppose it's a matter of problem size

deinonychusaur Over a year ago

@EmanuelePaolini I made a tweak to yours that I think is clearer and is more than 10-fold faster.
