I didn't expect chained indexing (a[i][j]) and tuple indexing (a[i, j]) to behave differently, until the difference cost me two hours of bug hunting. Here is an example of the difference I noticed but couldn't make sense of.

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> a[0][0]
1
>>> a[np.array(0)][np.array(0)]
1
>>> a[0][0] = 5
>>> a
array([[5, 2],
       [3, 4]])
>>> a[np.array(0)][np.array(0)] = 6
>>> a
array([[5, 2],
       [3, 4]])

It looks like the element can't be changed when a numpy scalar such as np.array(0) is used as the index. Is a copy of the original array element being returned instead of a reference?

However, with tuple indexing, the problem is gone.

>>> a[np.array(0), np.array(0)] = 6
>>> a
array([[6, 2],
       [3, 4]])

What's happening here? I understand that, semantically, chained bracket indexing and tuple indexing are different, but in principle shouldn't they both access the same element regardless?

Out of curiosity, I tried it with a one-dimensional array. The result is different.

>>> a = np.array([1, 2])
>>> a[np.array(0)] = 3
>>> a
array([3, 2])

This time the element has been modified.

The lesson I learned is that I should use tuple indexing for numpy arrays as much as possible, just to be safe. But I would really like an explanation for these inconsistent effects. Thanks!

  • Think of a[i][j] as temp=a[i]; temp[j]. Look at the intermediate value, and its shape if applicable. a[i, j] is a single numpy indexing operation; a[i][j] is two. The difference can matter. Commented Sep 14, 2018 at 0:45
  • @hpaulj The problem is I can change the element if i and j are integers, but if they are numpy scalars, I can't, at least with the chain bracket method. Commented Sep 14, 2018 at 0:48
  • The difference that I was hinting at in my comment is that (sometimes) the chained indexing gives different dimensions. But what I missed here is that you are trying to set values. In that case, what matters is whether the intermediate value is a view or a copy. If it is a copy, the set doesn't work. Commented Sep 14, 2018 at 0:50

1 Answer

Look at the data buffer location:

In [45]: a.__array_interface__['data']
Out[45]: (44666160, False)
In [46]: a[0].__array_interface__['data']
Out[46]: (44666160, False)

Same location for the a[0] case. Modifying a[0] will modify a.

But with the array index, the data buffer is different: this is a copy. Modifying this copy will not affect a.

In [47]: a[np.array(0)].__array_interface__['data']
Out[47]: (43467872, False)
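The same check can be made without comparing raw addresses. This is a minimal sketch using np.shares_memory; the variable names (row_view, row_copy and so on) are only illustrative, and the copy behaviour for a 0-d array index is the one observed above.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

row_view = a[0]            # plain integer index: a view onto a's buffer
row_copy = a[np.array(0)]  # 0-d array index: a separate copy

view_shares = np.shares_memory(a, row_view)
copy_shares = np.shares_memory(a, row_copy)

row_copy[0] = 99           # only the copy changes
after_copy_write = a.copy()

row_view[0] = 5            # writes through to a
after_view_write = a.copy()

print(view_shares, copy_shares)
print(after_copy_write)
print(after_view_write)
```

Writing through row_copy leaves a untouched, while writing through row_view changes a[0, 0].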

a[i,j] indexing is more idiomatic than a[i][j]. In some cases they are the same. But there are enough cases where they differ that it is wise to avoid the latter unless you really know what it does, and why.
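To make the difference concrete, here is a rough sketch of what the two assignment forms do, written out step by step. The name temp is just an illustration of the intermediate value, not something numpy exposes.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
i = np.array(0)
j = np.array(0)

# a[i][j] = 6 is effectively two separate operations:
temp = a[i]    # first a read; with a 0-d array index the result is a copy
temp[j] = 6    # then a write into that temporary, which is thrown away
chained_result = a.copy()

# a[i, j] = 6 is a single write applied directly to a
a[i, j] = 6
tuple_result = a.copy()

print(chained_result)
print(tuple_result)
```

Only the tuple form changes a here, because the chained form writes into whatever a[i] returned rather than into a itself.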

In [49]: a[0]
Out[49]: array([1, 2])
In [50]: a[np.array(0)]
Out[50]: array([1, 2])
In [51]: a[np.array([0])]
Out[51]: array([[1, 2]])

Indexing with np.array(0), a 0d array, is like indexing with np.array([0]), a 1d array. Both produce a copy, whose first dimension is sized like the index.
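A small sketch of that shape rule, using the same 2x2 array: the shape of the index array takes the place of the first dimension of a. The 2d index in the last line is an extra case added for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

shape_int = a[0].shape                   # integer index drops the first axis
shape_0d = a[np.array(0)].shape          # index shape () replaces the first axis
shape_1d = a[np.array([0])].shape        # index shape (1,) replaces the first axis
shape_2d = a[np.array([[0, 1]])].shape   # index shape (1, 2) replaces the first axis

print(shape_int, shape_0d, shape_1d, shape_2d)
```

The integer and 0d cases give the same shape, which is why the copy only becomes visible when you try to assign through it.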

Admittedly this is tricky, and probably doesn't show up except when doing this sort of set.

When using np.matrix, the choice of [i][j] versus [i,j] affects the shape as well; see "python difference between the two form of matrix x[i,j] and x[i][j]".
