6

Consider a 2D-array:

arr = np.zeros((10,10))
arr[3:7,3:7] = 1

Now I want to replace a part of it with some other value using a mask:

mask = np.ones((5,5)).astype(bool)
arr[5:,5:][mask] = 2

Is it possible to keep the nonzero elements in the original arr and replace only the zero elements using the mask? I would like to avoid doing so by flat indexing since the arrays I deal with are large 3D arrays (about 1000x1000x1000).

EDIT: Some additional information:

I would like to avoid changing the mask, this includes setting it to False where the array is nonzero as well as resizing it. The reason is that this operation needs to be repeated lots of times with placing the mask at different regions of the array. Since the arrays are quite large, it would also be nice to avoid copying of data.

  • You can set the mask to zero where the array is nonzero: mask = np.logical_and(mask, arr == 0) Commented Mar 23, 2016 at 12:51
  • Yes, I forgot to mention that I would also try to avoid that, since this operation needs to be repeated lots of times with the mask placed at different regions of the array. So I would need to regenerate it all the time. Commented Mar 23, 2016 at 12:57

5 Answers

2

Use np.logical_and:

arr = np.zeros((10,10))
arr[3:7,3:7] = 1
mask = np.ones((10,10)).astype(bool) #same shape as the array
mask = np.logical_and(mask, arr == 0)
arr[mask] = 2 # replace 0's with whatever value
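
The snippet above uses a mask with the same shape as the array. As a minimal sketch of how the same idea might apply to the smaller 5x5 mask from the question, the helper below combines the mask with a zero test on a sliced view and leaves the mask itself unchanged. The name fill_zeros_in_window and its start argument are hypothetical, not part of NumPy, and a temporary boolean array of the window's size is still created on each call.

```python
import numpy as np

def fill_zeros_in_window(arr, mask, start, value):
    """Set arr to value where mask is True and arr is 0, inside one window.

    Hypothetical helper for illustration; start holds one offset per axis.
    """
    # Basic slicing returns a view, so writes go through to arr.
    window = tuple(slice(s, s + n) for s, n in zip(start, mask.shape))
    view = arr[window]
    # Temporary boolean array of the window's size; mask is not modified.
    combined = np.logical_and(mask, view == 0)
    view[combined] = value

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1
mask = np.ones((5, 5), dtype=bool)
fill_zeros_in_window(arr, mask, (5, 5), 2)
```

Because the window is built from mask.shape, the same helper should work unchanged for 3D arrays with a 3D mask and a three-element start tuple.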

1

Others have suggested logical_and, but you have objected that it involves too much copying. First, let's set up an iterative case that does this (mask is the 5x5 all-True mask from the question):

In [353]: arr=np.zeros((10,10))
In [354]: arr[3:7,3:7]=1

In [355]: tups=[(slice(5),slice(5)),
                (slice(0,5),slice(3,8)),
                (slice(4,9),slice(1,6))]

In [356]: for i,tup in enumerate(tups):
    mask1=np.logical_and(mask,arr[tup]==0)
    arr[tup][mask1]=i+1
   .....:     

In [357]: arr
Out[357]: 
array([[ 1.,  1.,  1.,  1.,  1.,  2.,  2.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  2.,  2.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  2.,  2.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  2.,  0.,  0.],
       [ 0.,  3.,  3.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  3.,  3.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  3.,  3.,  3.,  3.,  3.,  0.,  0.,  0.,  0.],
       [ 0.,  3.,  3.,  3.,  3.,  3.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

arr[tup]==0 is another mask. It's the only way you can tell numpy that you are interested in changing only the 0s. It does not automatically treat 0s differently from 1s or 3s. I don't see any way around using logical_and to create a new mask at each step.
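
A combined mask is still needed at each step, but it does not have to be a freshly allocated array. As a rough sketch (an assumption about what might help, not something timed here), the ufuncs np.equal and np.logical_and accept an out= argument, so a single scratch buffer of the mask's shape can be overwritten on every iteration. The name buf is illustrative.

```python
import numpy as np

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1
mask = np.ones((5, 5), dtype=bool)
tups = [(slice(5), slice(5)),
        (slice(0, 5), slice(3, 8)),
        (slice(4, 9), slice(1, 6))]

# Scratch buffer allocated once and overwritten on every iteration.
buf = np.empty(mask.shape, dtype=bool)

for i, tup in enumerate(tups):
    view = arr[tup]                      # slicing gives a view, not a copy
    np.equal(view, 0, out=buf)           # buf = (view == 0)
    np.logical_and(buf, mask, out=buf)   # buf = buf & mask, written in place
    view[buf] = i + 1                    # mask itself is never modified
```

This produces the same array as the loop above; whether it is noticeably faster for large 3D windows would need to be measured.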


Application of a boolean mask does involve flat indexing - that is, the result is a 1d array (whether on the right or left hand side)

Look at the result of applying the masks from that last iteration

In [360]: arr[tup][mask]
Out[360]: 
array([ 1.,  1.,  1.,  1.,  1.,  3.,  3.,  1.,  1.,  1.,  3.,  3.,  1.,
        1.,  1.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.])

In [361]: arr[tup][mask1]
Out[361]: array([ 3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.])
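
To make the 1d behaviour concrete, here is a small self-contained sketch on a toy array (the values are made up for illustration). Reading with a boolean mask returns a new 1d copy, while assigning through the same mask writes into the original array.

```python
import numpy as np

arr = np.arange(12.0).reshape(3, 4)
mask = arr % 2 == 0          # 2D boolean mask, same shape as arr

selected = arr[mask]         # reading: a new 1D array (a copy)
selected[:] = -1             # changing the copy leaves arr untouched
after_read = arr.copy()      # snapshot taken before the in-place write

arr[mask] = -1               # assigning: writes into arr in place
```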

Here's an alternative using np.where:

for i,tup in enumerate(tups):
    arr[tup]=np.where(arr[tup]==0,i+1,arr[tup])

That's more concise, but involves writing the whole arr[tup] slice each time.
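
Another option in the same spirit is np.copyto with its where= argument, which writes a value into the destination only where a condition holds. This is a sketch of an alternative, not something from the answer above, and it still builds the combined condition as a temporary array.

```python
import numpy as np

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1
mask = np.ones((5, 5), dtype=bool)
tups = [(slice(5), slice(5)),
        (slice(0, 5), slice(3, 8)),
        (slice(4, 9), slice(1, 6))]

for i, tup in enumerate(tups):
    view = arr[tup]  # a view, so copyto writes into arr
    # Copy the scalar into view only where mask is True and view is 0.
    np.copyto(view, i + 1, where=mask & (view == 0))
```

Unlike the np.where version, only the selected elements are written, not the whole slice.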

In [374]: %%timeit arr=np.zeros((10,10),int);arr[3:7,3:7]=1
   .....: for i,tup in enumerate(tups):
    arr[tup]=np.where(arr[tup]==0,i+1,arr[tup])
   .....: 
1000 loops, best of 3: 134 us per loop

In [375]: %%timeit arr=np.zeros((10,10),int);arr[3:7,3:7]=1
   .....: for i,tup in enumerate(tups):
    mask1=np.logical_and(mask,arr[tup]==0)
    arr[tup][mask1]=i+1
   .....: 
10000 loops, best of 3: 64.9 us per loop

Warning: when using arr[tup][mask]=..., arr[tup] must be a view, such as one produced by slicing. Other indexing (for example, with integer lists or arrays) produces a copy, so the assignment never reaches the original array.
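
A short sketch of that pitfall on a toy 4x4 array (sizes chosen only for illustration): the chained assignment works through a slice but is silently lost through integer-list indexing. np.shares_memory can be used to check which case applies.

```python
import numpy as np

arr = np.zeros((4, 4))
mask = np.ones((2, 4), dtype=bool)

# Basic slicing: arr[1:3] is a view, so the chained assignment reaches arr.
arr[1:3][mask] = 5

# Integer-list ("fancy") indexing: arr[[0, 3]] is a temporary copy,
# so this assignment modifies the copy and arr stays unchanged.
arr[[0, 3]][mask] = 7

# np.shares_memory reports whether two arrays overlap in memory.
slice_is_view = np.shares_memory(arr, arr[1:3])
fancy_is_view = np.shares_memory(arr, arr[[0, 3]])
```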

1 Comment

Thanks for this very clear answer. I guess arr[tup][np.logical_and(mask, arr[tup]==0)] = value is the best shot.
1

If you want to apply a sliding window approach, you could extend @Thiru's approach a bit to make it work:

>>> arr = np.zeros((10,10))
>>> arr[3:7,3:7] = 1
>>> mask = np.ones((5,5)).astype(bool)

Update the array accordingly:

>>> CONSTANT = 2
>>> arr[5:,5:] += np.logical_and(mask, arr[5:, 5:] == 0) * CONSTANT
>>> arr
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.]])

The addition keeps the non-zero elements as they are: logical_and creates a mask that, multiplied by the constant, adds 0 where the array is non-zero (or the mask is False) and CONSTANT where the array is zero and the mask is True.
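
As a quick check of that reasoning, the sketch below compares the addition trick with plain boolean assignment on a view. The lower-triangular mask is chosen only for illustration, so that some mask entries are False. The two agree because the entries being changed are exactly zero beforehand.

```python
import numpy as np

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1
mask = np.tril(np.ones((5, 5), dtype=bool))  # illustrative non-trivial mask
CONSTANT = 2

# Addition trick: adds CONSTANT only where mask is True and the entry is 0.
a = arr.copy()
a[5:, 5:] += np.logical_and(mask, a[5:, 5:] == 0) * CONSTANT

# Boolean assignment on a view, for comparison.
b = arr.copy()
view = b[5:, 5:]
view[np.logical_and(mask, view == 0)] = CONSTANT

same = np.array_equal(a, b)
```

Note that the addition touches every element of the window, including those that receive + 0, whereas the boolean assignment writes only the selected elements.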

0

You can easily do this using pandas. To extend this to a 3D array, you'll need to use a multi-index in pandas.

import pandas as pd
import numpy as np

arr = np.zeros((10,10))
arr[3:7,3:7] = 1    

df = pd.DataFrame(arr)
df.loc[5:,5:] = df.loc[5:,5:].replace(0,2)

2 Comments

Replacing the zeros is straightforward: arr[arr==0] = value. I don't need to replace all zeros but only those that are included in the mask.
Ok try that? Should only do the replace on the 5:5 square you had as your mask.
0

Local problems like this are usually handled with fancy indexing (a True/False mask), which is generally costly because it makes multiple passes over the array.

Numba (or Cython) is often a good source of improvement in this case:

N = 100  # array side length, matching the 100x100 test below

def s1(a):
    a[N//2:,N//2:][a[N//2:,N//2:] == 0] = 30

from numba import jit
@jit(nopython=True)
def s2(a):
    for i in range(N//2,N):
        for j in range(N//2,N):
            if a[i,j]==0 : a[i,j]=30

Tests for a 100x100 array :

In [8]: %timeit s1(a)
10000 loops, best of 3: 65.5 µs per loop

In [9]: %timeit s2(a)
100000 loops, best of 3: 10.5 µs per loop
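
The same loop idea could be extended to the 3D masked case from the question. The following is only a sketch under assumptions: fill_masked_zeros and its offset arguments are hypothetical names, the window is assumed to fit inside the array (no bounds checks), and the fallback decorator simply runs the function as plain, slow Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit          # optional dependency
except ImportError:
    def njit(func):                 # fallback: run as plain (slow) Python
        return func

@njit
def fill_masked_zeros(arr, mask, i0, j0, k0, value):
    """Hypothetical 3D helper: one pass, no temporary mask arrays."""
    ni, nj, nk = mask.shape
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                # Only touch entries selected by mask that are still zero.
                if mask[i, j, k] and arr[i0 + i, j0 + j, k0 + k] == 0:
                    arr[i0 + i, j0 + j, k0 + k] = value

arr = np.zeros((8, 8, 8))
arr[2:6, 2:6, 2:6] = 1
mask = np.ones((4, 4, 4), dtype=bool)
fill_masked_zeros(arr, mask, 4, 4, 4, 2.0)
```

The mask is read but never modified or resized, and nothing is copied, which matches the constraints stated in the question.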
