If you change a view of a numpy array, the original array is also altered. This is intended behaviour.

import numpy as np

arr = np.array([1, 2, 3])
mask = np.array([True, False, False])
arr[mask] = 0
arr
# Out: array([0, 2, 3])

However, if I take a view of such a view, and change that, then the original array is not altered:

arr = np.array([1, 2, 3])
mask_1 = np.array([True, False, False])
mask_1_arr = arr[mask_1]  # Becomes: array([1])
mask_2 = np.array([True])
mask_1_arr[mask_2] = 0
arr
# Out: array([1, 2, 3])

This implies to me that, when you take a view of a view, you actually get back a copy. Is this correct? Why is this?

The same behaviour occurs if I use numpy arrays of numerical indices instead of a numpy array of boolean values. (E.g. arr[np.array([0])][np.array([0])] = 0 doesn't change the first element of arr to 0.)

1 Answer

Selection by basic slicing always returns a view. Selection by advanced indexing always returns a copy. Selection by boolean mask is a form of advanced indexing. (The other form of advanced indexing is selection by integer array.)
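One way to check the view/copy distinction for yourself is np.shares_memory. This is only a minimal sketch; the names sliced, masked and picked are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3])

sliced = arr[0:2]                             # basic slicing
masked = arr[np.array([True, False, False])]  # advanced indexing: boolean mask
picked = arr[np.array([0, 1])]                # advanced indexing: integer array

print(np.shares_memory(arr, sliced))  # True: the slice is a view
print(np.shares_memory(arr, masked))  # False: the mask selection is a copy
print(np.shares_memory(arr, picked))  # False: the integer selection is a copy

sliced[0] = 99   # writes through the view into arr
masked[0] = -1   # only changes the copy
result = arr.tolist()  # [99, 2, 3]
```

Writing through the slice changes arr, while writing to either advanced-indexing result leaves arr alone.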

However, assignment by advanced indexing affects the original array.

So

mask = np.array([True, False, False])
arr[mask] = 0

affects arr because it is an assignment. In contrast,

mask_1_arr = arr[mask_1]

is selection by boolean mask, so mask_1_arr is a copy of part of arr. Once you have a copy, the jig is up. When Python executes

mask_2 = np.array([True])
mask_1_arr[mask_2] = 0

the assignment affects mask_1_arr, but since mask_1_arr is a copy, it has no effect on arr.


|            | basic slicing    | advanced indexing |
|------------+------------------+-------------------|
| selection  | view             | copy              |
| assignment | affects original | affects original  |

Under the hood, arr[mask] = something causes Python to call arr.__setitem__(mask, something). The ndarray.__setitem__ method is implemented to modify arr. After all, that is the natural thing one should expect __setitem__ to do.
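To make the dispatch concrete, here is a small sketch that spells out the method calls by hand. Calling the dunder methods directly is only for illustration (you would not normally write code this way), and the names a, b, c and tmp are made up.

```python
import numpy as np

mask = np.array([True, False, False])

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
a[mask] = 0             # indexing-assignment syntax
b.__setitem__(mask, 0)  # the equivalent explicit method call

# The chained form c[mask][mask_2] = 0 is a __getitem__ on c
# followed by a __setitem__ on whatever __getitem__ returned.
c = np.array([1, 2, 3])
tmp = c.__getitem__(mask)             # advanced indexing: tmp is a copy
tmp.__setitem__(np.array([True]), 0)  # modifies tmp only, never c
```

After this runs, a and b both hold [0, 2, 3], tmp holds [0], and c is still [1, 2, 3].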

In contrast, when arr[indexer] is used as an expression, Python calls arr.__getitem__(indexer). When indexer is a slice, the selected elements are regularly spaced in memory, so NumPy can return a view simply by adjusting the strides and offset. When indexer is an arbitrary boolean mask or an arbitrary array of integers, the selected elements are in general not regularly spaced, so they cannot be described by strides and an offset. Hence a copy must be returned.
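The strides-and-offset point can be inspected directly. The sketch below assumes a 1-D int64 array (8 bytes per element) and reads the data pointer via __array_interface__ purely for illustration.

```python
import numpy as np

arr = np.arange(10, dtype=np.int64)  # 8 bytes per element
view = arr[2::3]                     # elements 2, 5, 8

print(arr.strides)       # (8,)  step of one element
print(view.strides)      # (24,) step of three elements
print(view.base is arr)  # True: the view borrows arr's buffer

# The view starts two elements (16 bytes) into arr's buffer.
offset = (view.__array_interface__["data"][0]
          - arr.__array_interface__["data"][0])

# Indices 0, 1, 7 are not evenly spaced, so no single stride describes them.
irregular = arr[np.array([0, 1, 7])]
shares = np.shares_memory(arr, irregular)  # False: NumPy had to copy
```

The slice needs nothing more than a different start address and step, which is why it costs no copy. The irregular selection has no such description.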

3 Comments

This all makes sense! I guess then that there's no easy way to do a one-liner like arr[x][y] = 1. Right now I'm doing this by assigning an intermediate value, e.g. tmp = arr[x]; tmp[y] = 1; arr[x] = tmp.
If x is a boolean mask, a one-line equivalent would be np.put(arr, np.flatnonzero(x)[y], 1). np.flatnonzero(x) converts the boolean mask to a 1D integer array. You can then select some subset of those integers using np.flatnonzero(x)[y] where y could be a basic slice or advanced indexer. Then np.put(arr, np.flatnonzero(x)[y], 1) works since it is roughly equivalent to arr.flat[np.flatnonzero(x)[y]] = 1.
If arr is 1-dimensional, arr[np.flatnonzero(x)[y]] = 1 also works. The purpose of np.put above is to provide an answer that works even if arr is n-dimensional.
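As a rough sketch of the np.put suggestion on a 2-D array, the example below compares it with the three-step intermediate-array approach. The array contents, the mask x and the slice y are made-up example values.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])
x = arr > 2      # boolean mask selecting 3, 4, 5, 6
y = slice(0, 2)  # keep only the first two of those selected elements

# One-liner: turn the mask into flat indices, subset them, then assign.
one_liner = arr.copy()
np.put(one_liner, np.flatnonzero(x)[y], 0)

# Three-step version with an intermediate array, for comparison.
three_step = arr.copy()
tmp = three_step[x]  # copy of the selected elements
tmp[y] = 0
three_step[x] = tmp  # write the modified copy back

same = np.array_equal(one_liner, three_step)  # True
```

Both versions zero out the elements 3 and 4 and leave the rest untouched.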
