What's the difference when indexing a numpy array between using an integer and a numpy scalar?

Question

I didn't expect them to be different, until it just cost me 2 hours to find a bug. Here is an example showing the difference I noticed, but I couldn't make sense of it.

>>> a = np.array([[1, 2], [3, 4]])
>>> a[0][0]
1
>>> a[np.array(0)][np.array(0)]
1
>>> a[0][0] = 5
>>> a
array([[5, 2],
       [3, 4]])
>>> a[np.array(0)][np.array(0)] = 6
>>> a
array([[5, 2],
       [3, 4]])

It looks like the element can't be changed when a numpy scalar is used as the index. Is a copy of the original array element being returned instead of a reference?

However, with tuple indexing, the problem is gone.

>>> a[np.array(0), np.array(0)] = 6
>>> a
array([[6, 2],
       [3, 4]])

What's happening here? I understand that, semantically, chained bracket indexing and tuple indexing are different, but in principle shouldn't they both access the same element regardless?

Out of curiosity, I tried it with a one-dimensional array. The result is different.

>>> a = np.array([1, 2])
>>> a[np.array(0)] = 3
>>> a
array([3, 2])

This time the element has been modified.

The lesson I learned is that I should use tuple indexing for numpy arrays as much as possible, just to be safe. But I would really like an explanation for these inconsistent effects. Thanks!

Think of a[i][j] as temp=a[i]; temp[j]. Look at the intermediate value, and its shape if applicable. a[i, j] is one numpy indexing operation; a[i][j] is two. The difference can matter.
– hpaulj, Commented Sep 14, 2018 at 0:45
@hpaulj The problem is that I can change the element if i and j are integers, but if they are numpy scalars I can't, at least with the chained bracket method.
– user8578429, Commented Sep 14, 2018 at 0:48
The difference that I was hinting at in my comment is that (sometimes) the chained indexing gives different dimensions. But what I missed here is that you are trying to set values. In that case, what matters is whether the intermediate value is a view or a copy. If it's a copy, the set doesn't work.
– hpaulj, Commented Sep 14, 2018 at 0:50

hpaulj · Accepted Answer · 2018-09-14 02:20:12Z

Looking at the data buffer location:

In [45]: a.__array_interface__['data']
Out[45]: (44666160, False)
In [46]: a[0].__array_interface__['data']
Out[46]: (44666160, False)

Same location for the a[0] case. Modifying a[0] will modify a.

But with the array index, the data buffer is different: this is a copy. Modifying this copy will not affect a.

In [47]: a[np.array(0)].__array_interface__['data']
Out[47]: (43467872, False)
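
The same check can be made without comparing raw addresses by using np.shares_memory. This is a minimal sketch, not part of the original session, and the variable names are only illustrative. It also shows that a true NumPy integer scalar such as np.int64(0) behaves like a Python int here; it is the 0d array np.array(0) that triggers the copy.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

row_int = a[0]               # basic indexing with a Python int
row_np_int = a[np.int64(0)]  # a NumPy integer scalar is treated like an int
row_0d = a[np.array(0)]      # a 0d array is treated as an array index

print(np.shares_memory(a, row_int))     # True: a view of a
print(np.shares_memory(a, row_np_int))  # True: also a view
print(np.shares_memory(a, row_0d))      # False: an independent copy
```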

a[i,j] indexing is more idiomatic than a[i][j]. In some cases they are the same. But there are enough cases where they differ that it is wise to avoid the latter unless you really know what it does, and why.
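
To make the difference concrete, here is a rough sketch that writes the chained assignment out as the two steps Python actually performs, next to the single-step tuple form. The name temp is just for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
i, j = np.array(0), np.array(0)

# Chained form: a[i][j] = 6 is really two operations.
temp = a[i]      # get with a 0d array index: returns a copy of row 0
temp[j] = 6      # set writes into that copy only
print(a[0, 0])   # 1, the original array is untouched

# Tuple form: a single set operation applied to a itself.
a[i, j] = 7
print(a[0, 0])   # 7
```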

In [49]: a[0]
Out[49]: array([1, 2])
In [50]: a[np.array(0)]   
Out[50]: array([1, 2])
In [51]: a[np.array([0])]
Out[51]: array([[1, 2]])

Indexing with np.array(0), a 0d array, is like indexing with np.array([0]), a 1d array. Both produce a copy whose leading dimensions take the shape of the index: () for the 0d index, giving shape (2,), and (1,) for the 1d index, giving shape (1, 2).
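
As a quick sanity check of that shape rule, the sketch below indexes the same array with integer arrays of different dimensions. The third index, a 2d array, is my own addition for illustration and is not from the session above.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

shapes = []
for idx in (np.array(0), np.array([0]), np.array([[0, 1]])):
    result = a[idx]
    # Leading dimensions come from the index, trailing ones from a.
    assert result.shape == idx.shape + a.shape[1:]
    # An array index always yields a copy, never a view.
    assert not np.shares_memory(a, result)
    shapes.append(result.shape)

print(shapes)
```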

Admittedly this is tricky, and probably doesn't show up except when doing this sort of assignment.

When using np.matrix, the choice of [i][j] versus [i,j] affects shape as well; see the linked question "python difference between the two form of matrix x[i,j] and x[i][j]".
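
A small sketch of the np.matrix case, assuming a NumPy version that still ships np.matrix (it may emit a PendingDeprecationWarning):

```python
import numpy as np

m = np.matrix([[1, 2], [3, 4]])

# A matrix stays 2d under row indexing, so m[0] has shape (1, 2),
# and m[0][0] just selects that single row again.
print(m[0].shape)     # (1, 2)
print(m[0][0].shape)  # (1, 2)
print(m[0, 0])        # 1
```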
