Multiple Element Indexing in multi-dimensional array

Question

I have a 3D NumPy array and would like to take the mean over the last two axes, using only certain elements from those two dimensions.

Here is example code that shows my problem:

import numpy as np
myarray = np.random.random((5,10,30))
yy = [1,2,3,4]
xx = [20,21,22,23,24,25,26,27,28,29]
mymean = [ np.mean(myarray[t,yy,xx]) for t in np.arange(5) ]

However, this results in:

ValueError: shape mismatch: objects cannot be broadcast to a single shape

Why does indexing such as myarray[:,[1,2,3,4],[1,2,3,4]] work, but not my code above?

Jaime · Accepted Answer · 2013-08-27 16:24:29Z

5

This is how you fancy-index over more than one dimension:

>>> np.mean(myarray[np.arange(5)[:, None, None], np.array(yy)[:, None], xx],
            axis=(-1, -2))
array([ 0.49482768,  0.53013301,  0.4485054 ,  0.49516017,  0.47034123])

When you use fancy indexing, i.e. a list or array as an index, over more than one dimension, NumPy broadcasts those index arrays to a common shape and uses the result to index the array. For the broadcast to work properly, you need to add extra dimensions of length 1 at the end of the leading index arrays: here the three index arrays have shapes (5, 1, 1), (4, 1) and (10,), which broadcast to (5, 4, 10). The rules of the game are NumPy's broadcasting rules.
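The shapes involved can be checked directly. The following is a minimal sketch, not part of the original answer: the seeded generator and the names t_idx, y_idx, x_idx, block and block_ix are only illustrative. It also shows np.ix_, which builds the same reshaped index arrays from plain 1-D sequences.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only so the sketch is repeatable
myarray = rng.random((5, 10, 30))
yy = [1, 2, 3, 4]
xx = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

# Give each index array its own axis so the three can broadcast together.
t_idx = np.arange(5)[:, None, None]   # shape (5, 1, 1)
y_idx = np.array(yy)[:, None]         # shape (4, 1)
x_idx = np.array(xx)                  # shape (10,)

block = myarray[t_idx, y_idx, x_idx]  # broadcast index shape: (5, 4, 10)
mymean = block.mean(axis=(-1, -2))    # one mean per t, shape (5,)

# np.ix_ reshapes the 1-D sequences in the same way, so the selection is identical.
block_ix = myarray[np.ix_(np.arange(5), yy, xx)]
```

Here block and block_ix hold the same (5, 4, 10) sub-array, so either form can be used before taking the mean.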

edited Aug 27, 2013 at 16:24

answered Aug 27, 2013 at 15:40

Jaime


3 Comments

HyperCube Over a year ago

Even on second glance, this is just bewildering to me. Anyhow, it works and I thank you a lot for the explanation!

Jaime Over a year ago

It is confusing, but if you think about it long enough, you'll eventually see that it is the syntax that makes more sense and gives a more consistent behavior, especially when considering multidimensional indexing arrays. They could have implemented a special case when all indexing arrays are 1D, but as the Zen of Python states "Special cases aren't special enough to break the rules."

HyperCube Over a year ago

In case myarray is a masked array, I get: TypeError: tuple indices must be integers, not tuple. So masked arrays behave differently?

Viktor Kerkez · Accepted Answer · 2013-08-27 16:28:30Z

3

Since you use consecutive elements, you can use a slice:

import numpy as np
myarray = np.random.random((5,10,30))
yy = slice(1, 5)    # same rows as [1, 2, 3, 4]
xx = slice(20, 30)  # same columns as [20, 21, ..., 29]
mymean = [np.mean(myarray[t, yy, xx]) for t in np.arange(5)]
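Because slices do not trigger the broadcasting of index arrays, the loop over t is not strictly needed. As a sketch (the name sub is just illustrative), the same means can be computed in one step by slicing all three axes and reducing over the last two:

```python
import numpy as np

myarray = np.random.random((5, 10, 30))

# Slicing every axis at once keeps the result 3-D: shape (5, 4, 10).
sub = myarray[:, 1:5, 20:30]

# Reduce over the y and x axes, leaving one mean per t.
mymean = sub.mean(axis=(1, 2))
```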

edited Aug 27, 2013 at 16:28

answered Aug 27, 2013 at 15:38

Viktor Kerkez


4 Comments

Daniel Over a year ago

That is how you fix the problem, but why doesn't the example work?

HyperCube Over a year ago

Thank you! What is the solution for non-consecutive elements?

Viktor Kerkez Over a year ago

The problem is that the broadcasting rules have to be followed when indexing with integer arrays. If you want non-consecutive elements, use boolean indexing.

Viktor Kerkez Over a year ago

@Ophion From the docs: "If the index arrays do not have the same shape, there is an attempt to broadcast them to the same shape. If they cannot be broadcast to the same shape, an exception is raised."
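One way the boolean indexing suggested in the comments above could look is sketched here. Note that two separate 1-D boolean masks, one per axis, are converted to integer index arrays and broadcast just like the lists in the question, so this sketch combines them into a single 2-D mask over the last two axes instead. The index values and the names ymask, xmask, mask2d and selected are hypothetical, chosen only for illustration.

```python
import numpy as np

myarray = np.random.random((5, 10, 30))
yy = [1, 3, 4, 8]          # hypothetical non-consecutive rows
xx = [0, 5, 20, 27, 29]    # hypothetical non-consecutive columns

# One boolean mask per axis, True at the wanted positions.
ymask = np.zeros(10, dtype=bool)
ymask[yy] = True
xmask = np.zeros(30, dtype=bool)
xmask[xx] = True

# Outer "and" of the two masks: True for every wanted (y, x) pair.
mask2d = ymask[:, None] & xmask[None, :]   # shape (10, 30)

# A 2-D boolean index consumes the last two axes and flattens the hits.
selected = myarray[:, mask2d]              # shape (5, 20)
mymean = selected.mean(axis=1)             # one mean per t
```

The result matches what integer indexing with properly reshaped index arrays (for example via np.ix_) would give for the same rows and columns.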

BrenBarn · Accepted Answer · 2013-08-27 16:50:26Z

3

To answer your question about why it doesn't work: when you use lists/arrays as indices, NumPy uses a different set of indexing semantics than it does if you use slices. You can see the full story in the documentation and, as that page says, it "can be somewhat mind-boggling".

If you want to do it for nonconsecutive elements, you must grok that complex indexing mechanism.
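To make the difference concrete, here is a small sketch (the names paired and raised are illustrative) of the two cases from the question. Index lists of equal length are paired element by element, whereas lists of length 4 and 10 cannot be broadcast to a common shape. The exception class is an assumption to treat with care: the question shows a ValueError, while newer NumPy releases appear to raise an IndexError, so the sketch catches both.

```python
import numpy as np

myarray = np.random.random((5, 10, 30))

# Equal-length index lists are paired up: the result holds
# myarray[:, 1, 1], myarray[:, 2, 2], ... and has shape (5, 4),
# not the (5, 4, 4) block one might expect.
paired = myarray[:, [1, 2, 3, 4], [1, 2, 3, 4]]

# Lengths 4 and 10 cannot be broadcast together, so indexing fails.
yy = [1, 2, 3, 4]
xx = list(range(20, 30))
try:
    myarray[0, yy, xx]
    raised = False
except (IndexError, ValueError):   # exception type has varied across NumPy versions
    raised = True
```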

edited Aug 27, 2013 at 16:50

answered Aug 27, 2013 at 15:48

BrenBarn
