how to replace only zeros of a numpy array using a mask

Question

Consider a 2D-array:

arr = np.zeros((10,10))
arr[3:7,3:7] = 1

Now I want to replace a part of it with some other value using a mask:

mask = np.ones((5,5)).astype(bool)
arr[5:,5:][mask] = 2

Is it possible to keep the nonzero elements in the original arr and replace only the zero elements using the mask? I would like to avoid doing so by flat indexing since the arrays I deal with are large 3D arrays (about 1000x1000x1000).

EDIT: Some additional information:

I would like to avoid changing the mask, this includes setting it to False where the array is nonzero as well as resizing it. The reason is that this operation needs to be repeated lots of times with placing the mask at different regions of the array. Since the arrays are quite large, it would also be nice to avoid copying of data.

You can set the mask to zero where the array is nonzero: mask = np.logical_and(mask, arr == 0)
– MB-F, Commented Mar 23, 2016 at 12:51
Yes, I forgot to mention that I would also like to avoid that, since this operation needs to be repeated many times with the mask placed at different regions of the array, so I would need to regenerate it every time.
– a.smiet, Commented Mar 23, 2016 at 12:57

Thiru · Accepted Answer · 2016-03-23 12:59:09Z

2

Use np.logical_and:

arr = np.zeros((10,10))
arr[3:7,3:7] = 1
mask = np.ones((10,10)).astype(bool) #same shape as the array
mask = np.logical_and(mask, arr == 0)
arr[mask] = 2 # replace 0's with whatever value
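With an all-True mask the snippet above behaves like arr[arr == 0] = 2. A minimal sketch with a partially True mask (the lower-right corner pattern is only an illustrative assumption) shows that only the zeros selected by the mask are replaced, and that the combined mask can be stored under a new name so the original mask stays reusable:

```python
import numpy as np

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1

# Illustrative mask (an assumption, not from the question):
# True only in the lower-right 5x5 corner.
mask = np.zeros((10, 10), dtype=bool)
mask[5:, 5:] = True

# Combine into a new array instead of overwriting `mask`.
combined = mask & (arr == 0)
arr[combined] = 2

replaced = int((arr == 2).sum())  # zeros inside the masked corner
```

Note that this still allocates two full-size boolean arrays (arr == 0 and combined), which is what the question hopes to avoid for very large arrays.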


hpaulj · Accepted Answer · 2016-03-23 17:31:17Z

Others have suggested logical_and, but you have objected that it involves too much copying. First, let's set up an iterative case that does this, reusing the 5x5 all-True mask from the question:

In [353]: arr=np.zeros((10,10))
In [354]: arr[3:7,3:7]=1

In [355]: tups=[(slice(5),slice(5)),
                (slice(0,5),slice(3,8)),
                (slice(4,9),slice(1,6))]

In [356]: for i,tup in enumerate(tups):
   .....:     mask1=np.logical_and(mask,arr[tup]==0)
   .....:     arr[tup][mask1]=i+1
   .....:     

In [357]: arr
Out[357]: 
array([[ 1.,  1.,  1.,  1.,  1.,  2.,  2.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  2.,  2.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  2.,  2.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  2.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  2.,  0.,  0.],
       [ 0.,  3.,  3.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  3.,  3.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  3.,  3.,  3.,  3.,  3.,  0.,  0.,  0.,  0.],
       [ 0.,  3.,  3.,  3.,  3.,  3.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

arr[tup]==0 is another mask. It's the only way you can tell numpy that you are interested in changing only the 0s. It does not automatically treat 0s differently from 1s or 3s. I don't see any way around using logical_and to create a new mask at each step.
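If the concern is the allocation of a fresh boolean array at every step rather than the comparison itself, one possible variation is to reuse a preallocated buffer through the out= argument of the ufuncs. This is only a sketch: scratch is a hypothetical name, and whether it helps measurably on 1000x1000x1000 arrays is not something measured here.

```python
import numpy as np

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1
mask = np.ones((5, 5), dtype=bool)           # never modified below
scratch = np.empty(mask.shape, dtype=bool)   # hypothetical reusable buffer

tups = [(slice(5), slice(5)),
        (slice(0, 5), slice(3, 8)),
        (slice(4, 9), slice(1, 6))]

for i, tup in enumerate(tups):
    sub = arr[tup]                               # basic slicing: a view, no copy
    np.equal(sub, 0, out=scratch)                # scratch = (sub == 0)
    np.logical_and(scratch, mask, out=scratch)   # scratch &= mask
    sub[scratch] = i + 1                         # writes through the view into arr
```

The original mask is left untouched, so it can be placed at the next region as is.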

Applying a boolean mask does involve flat indexing: the result is a 1d array, whether the mask is used on the right- or left-hand side.

Look at the result of applying the masks from that last iteration:

In [360]: arr[tup][mask]
Out[360]: 
array([ 1.,  1.,  1.,  1.,  1.,  3.,  3.,  1.,  1.,  1.,  3.,  3.,  1.,
        1.,  1.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.])

In [361]: arr[tup][mask1]
Out[361]: array([ 3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.])
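The 1d behaviour can be checked on a small standalone array (the array and mask here are made up for illustration): reading through a mask gives a new 1d array, and assigning through a mask accepts either a scalar or a 1d array with one value per True entry.

```python
import numpy as np

arr = np.arange(12.0).reshape(3, 4)
mask = arr % 2 == 0            # True at the even values

selected = arr[mask]           # right-hand side: a new 1-D array (a copy)
arr[mask] = -selected          # left-hand side: one value per True entry
```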

Here's an alternative using np.where:

for i,tup in enumerate(tups):
    arr[tup]=np.where(arr[tup]==0,i+1,arr[tup])

That's more concise, but involves writing the whole arr[tup] slice each time.
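A different option, not used in the answer above, is np.copyto with its where= argument, which writes only the selected elements of the view instead of rewriting the whole slice. A minimal sketch under the same setup (5x5 all-True mask, same tups):

```python
import numpy as np

arr = np.zeros((10, 10))
arr[3:7, 3:7] = 1
mask = np.ones((5, 5), dtype=bool)

tups = [(slice(5), slice(5)),
        (slice(0, 5), slice(3, 8)),
        (slice(4, 9), slice(1, 6))]

for i, tup in enumerate(tups):
    sub = arr[tup]  # view into arr
    # Copy the scalar i + 1 into `sub` only where the combined condition holds.
    np.copyto(sub, i + 1, where=mask & (sub == 0))
```

It still builds the combined boolean condition at each step, so it changes how the write happens, not the need for the zero test.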

In [374]: %%timeit arr=np.zeros((10,10),int);arr[3:7,3:7]=1
   .....: for i,tup in enumerate(tups):
   .....:     arr[tup]=np.where(arr[tup]==0,i+1,arr[tup])
   .....: 
1000 loops, best of 3: 134 us per loop

In [375]: %%timeit arr=np.zeros((10,10),int);arr[3:7,3:7]=1
   .....: for i,tup in enumerate(tups):
   .....:     mask1=np.logical_and(mask,arr[tup]==0)
   .....:     arr[tup][mask1]=i+1
   .....: 
10000 loops, best of 3: 64.9 us per loop

Warning: when using arr[tup][mask]=..., arr[tup] must be a view, such as one produced by slicing. Other indexing (for example with an index list or a boolean array) produces a copy, so the assignment never reaches the original array.
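A small sketch of that view-versus-copy difference, on a made-up 4x4 array: the chained assignment works after a slice but is silently lost after list indexing.

```python
import numpy as np

arr = np.zeros((4, 4))
mask = np.ones((2, 4), dtype=bool)

# Basic slicing returns a view, so the chained assignment modifies arr.
arr[0:2][mask] = 1
via_slice = arr.sum()          # the 8 elements of rows 0-1 are now 1

# Indexing with a list (advanced indexing) returns a copy; the assignment
# lands in that temporary copy and arr is left unchanged.
arr[[2, 3]][mask] = 1
via_list = arr.sum()           # same total as before

slice_is_view = np.shares_memory(arr, arr[0:2])
list_is_view = np.shares_memory(arr, arr[[2, 3]])
```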

Thanks for this very clear answer. I guess arr[tup][np.logical_and(mask, arr[tup]==0)] = value is the best shot.
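For the 3D case mentioned in the question, that one-liner could be wrapped in a small helper. This is a sketch only; fill_zeros is a hypothetical name, and the 6x6x6 volume stands in for the much larger real arrays.

```python
import numpy as np

def fill_zeros(arr, region, mask, value):
    """Set arr[region] to value where mask is True and the element is 0 (in place)."""
    sub = arr[region]  # region must be a tuple of slices so that this is a view
    sub[np.logical_and(mask, sub == 0)] = value

vol = np.zeros((6, 6, 6))
vol[2:4, 2:4, 2:4] = 1                  # 8 pre-existing non-zero elements
mask = np.ones((3, 3, 3), dtype=bool)   # reused unchanged for every region

fill_zeros(vol, (slice(1, 4), slice(1, 4), slice(1, 4)), mask, 5)
fill_zeros(vol, (slice(3, 6), slice(3, 6), slice(3, 6)), mask, 7)
```

The temporary arrays created per call have the size of the mask, not of the full volume.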

Imanol Luengo · Accepted Answer · 2016-03-23 16:04:13Z

If you want to apply a sliding-window approach, you could extend @Thiru's approach a bit to make it work:

>>> arr = np.zeros((10,10))
>>> arr[3:7,3:7] = 1
>>> mask = np.ones((5,5)).astype(bool)

Update the array accordingly:

>>> CONSTANT = 2
>>> arr[5:,5:] += np.logical_and(mask, arr[5:, 5:] == 0) * CONSTANT
>>> arr
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  1.,  1.,  1.,  1.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.]])

The addition leaves the non-zero elements as they are: logical_and creates a mask that, multiplied by the constant, adds 0 where the array is non-zero (or the mask is False) and CONSTANT elsewhere.
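The trick relies on the targeted elements being exactly 0, so that adding CONSTANT gives the same result as assigning it. A quick sketch comparing it with masked assignment, using a lower-triangular mask chosen here purely for illustration:

```python
import numpy as np

CONSTANT = 2
base = np.zeros((10, 10))
base[3:7, 3:7] = 1
mask = np.tri(5, dtype=bool)   # illustrative lower-triangular mask (an assumption)

# Additive version: adds CONSTANT only where the mask is True and the value is 0.
added = base.copy()
added[5:, 5:] += np.logical_and(mask, added[5:, 5:] == 0) * CONSTANT

# Masked assignment on a view, for comparison.
assigned = base.copy()
sub = assigned[5:, 5:]
sub[np.logical_and(mask, sub == 0)] = CONSTANT

same = np.array_equal(added, assigned)
```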

flyingmeatball · Accepted Answer · 2016-03-23 13:33:05Z

0

You can easily do this using pandas. To handle a 3D array, you'll need to use a MultiIndex in pandas.

import pandas as pd
import numpy as np

arr = np.zeros((10,10))
arr[3:7,3:7] = 1    

df = pd.DataFrame(arr)
df.loc[5:,5:] = df.loc[5:,5:].replace(0,2)

edited Mar 23, 2016 at 13:33

answered Mar 23, 2016 at 12:53

2 Comments

a.smiet Over a year ago

Replacing the zeros is straightforward: arr[arr==0] = value. I don't need to replace all zeros but only those that are included in the mask.

flyingmeatball Over a year ago

OK, try that. It should only do the replacement on the [5:, 5:] square you had as your mask.

B. M. · Accepted Answer · 2016-03-23 17:37:38Z

0

Local problems like this rely on fancy indexing (a True/False mask), which is generally costly because it makes multiple passes over the array.

Numba (or Cython) is often a good source of improvement in this case:

N = 100  # side length, matching the 100x100 test below

def s1(a):
    a[N//2:,N//2:][a[N//2:,N//2:] == 0] = 30

from numba import jit
@jit(nopython=True)
def s2(a):
    for i in range(N//2,N):
        for j in range(N//2,N):
            if a[i,j]==0 : a[i,j]=30

Tests for a 100x100 array:

In [8]: %timeit s1(a)
10000 loops, best of 3: 65.5 µs per loop

In [9]: %timeit s2(a)
100000 loops, best of 3: 10.5 µs per loop
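The loop version above replaces every zero in the region. A possible extension that also honours a mask placed at an offset could look like the sketch below; fill_zeros_masked and its parameters are hypothetical names, and the fallback decorator is only there so the snippet still runs (slowly, as plain Python) when numba is not installed.

```python
import numpy as np

try:
    from numba import njit      # optional: compiles the loop if numba is installed
except ImportError:
    def njit(func):             # fallback: run as plain Python
        return func

@njit
def fill_zeros_masked(a, mask, i0, j0, value):
    """Set a[i0+i, j0+j] = value where mask[i, j] is True and the element is 0."""
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] and a[i0 + i, j0 + j] == 0:
                a[i0 + i, j0 + j] = value

a = np.zeros((10, 10))
a[3:7, 3:7] = 1
mask = np.ones((5, 5), dtype=np.bool_)

# Place the mask with its top-left corner at (5, 5), as in the question.
fill_zeros_masked(a, mask, 5, 5, 2.0)
```

This makes a single pass over the masked region and allocates no temporary arrays; the mask itself is only read.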

