
I have a 2D array like this:

small = np.arange(9).reshape((3, 3))

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

I want to left-pad each row with a variable number of zeros (given per row by the offset array below), then right-pad with zeros so that the resulting 2D array has shape 3 x 8.

offset = np.array([1, 3, 2])

So that the result looks like

array([[ 0.,  0.,  1.,  2.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  3.,  4.,  5.,  0.,  0.],
       [ 0.,  0.,  6.,  7.,  8.,  0.,  0.,  0.]])

What's the best way to achieve that?

Thanks to @Divakar for the solution. I ran some benchmarks on the following methods:

import numpy as np
from skimage.util.shape import view_as_windows  # needed by f3

def f1(small, offset, ncols):
    nrows, num_small_cols = small.shape
    big = np.zeros((nrows, ncols))
    inner = np.empty_like(small, dtype=np.int64)
    for i in range(num_small_cols):
        inner[:, i] = offset + i
    big[np.arange(nrows)[:, None], inner] = small
    return big

def f2(small, offset, ncols):
    n = small.shape[1]
    r = np.arange(ncols)
    offset2 = offset[:,None]
    # This took a lot of time
    mask = (offset2 <= r) & (offset2 + n > r)
    out = np.zeros_like(mask, dtype=np.float64)
    out[mask] = small.ravel()
    return out

def f3(small, offset, ncols):
    n = small.shape[1]
    m = ncols - n
    small_pad = np.zeros((len(small), n + 2*m))
    small_pad[:,m:m+n] = small    
    w = view_as_windows(small_pad, (1,ncols))[:,:,0]
    return w[np.arange(len(offset)), ncols-offset-n]

n = 10000
offset = np.repeat(np.array([1, 3, 2]), n)
small = np.random.rand(n * 3, 5)

%timeit f1(small, offset, 9)
# 1.32 ms

%timeit f2(small, offset, 9)
# 2.24 ms

%timeit f3(small, offset, 9)
# 1.3 ms
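For reference, the column indices that f1 builds in a Python loop can also be produced in one broadcasting step. The following is a minimal sketch, not one of the benchmarked functions: the name pad_with_index_array is hypothetical, and no timing is claimed for it. It assumes offset[i] + small.shape[1] <= ncols for every row.

```python
import numpy as np

def pad_with_index_array(small, offset, ncols):
    """Hypothetical loop-free variant of f1 (not part of the benchmark)."""
    nrows, n = small.shape
    big = np.zeros((nrows, ncols))
    # Row i writes to columns offset[i], offset[i] + 1, ..., offset[i] + n - 1.
    cols = offset[:, None] + np.arange(n)
    # Pair each row index with its own set of target columns.
    big[np.arange(nrows)[:, None], cols] = small
    return big

small = np.arange(9).reshape(3, 3)
offset = np.array([1, 3, 2])
result = pad_with_index_array(small, offset, 8)
print(result)
```

On the 3 x 3 example from the question this reproduces the desired 3 x 8 array.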
  • As it turns out using an array for indexing is faster for the strided one. Edited my post with it. So, could you please re-run those tests? Commented Apr 26, 2018 at 16:14
  • @Divakar Oh, I didn't know np.arange is so much better than range in this case. Commented Apr 26, 2018 at 17:27

2 Answers

4

Approach #1

We can use broadcasting to create a mask of the target positions and then assign into a zeros-initialized array -

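import numpy as np

def pad_offsetpos(small, ncols):
    # Note: offset is read from the enclosing (global) scope.
    n = small.shape[1]
    r = np.arange(ncols)
    # True where column r falls inside [offset, offset + n) for each row
    mask = (offset[:,None] <= r) & (offset[:,None]+n > r)
    out = np.zeros(mask.shape)
    # Boolean assignment fills the True positions in row-major order
    out[mask] = small.ravel()
    return out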

Sample run -

In [327]: small
Out[327]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [328]: offset
Out[328]: array([1, 3, 2])

In [329]: pad_offsetpos(small, ncols=8)
Out[329]: 
array([[0., 0., 1., 2., 0., 0., 0., 0.],
       [0., 0., 0., 3., 4., 5., 0., 0.],
       [0., 0., 6., 7., 8., 0., 0., 0.]])
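To make the broadcasting step more concrete, here is a small sketch that prints only the mask for the sample inputs above. The variable names mirror those in pad_offsetpos. It assumes every row fits, i.e. offset + n <= ncols; otherwise the mask would hold fewer True entries than small has elements and the assignment would fail.

```python
import numpy as np

offset = np.array([1, 3, 2])
n, ncols = 3, 8

r = np.arange(ncols)
# offset[:, None] has shape (3, 1) and r has shape (8,), so the
# comparison broadcasts to a (3, 8) boolean array.
mask = (offset[:, None] <= r) & (r < offset[:, None] + n)
print(mask.astype(int))
# [[0 1 1 1 0 0 0 0]
#  [0 0 0 1 1 1 0 0]
#  [0 0 1 1 1 0 0 0]]
```

Each row contains exactly n True entries, so out[mask] = small.ravel() places the values of small row by row, in order.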

Approach #2

We can also leverage scikit-image's view_as_windows, which is based on np.lib.stride_tricks.as_strided, for efficient patch extraction after padding the input array with enough zeros on either side -

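import numpy as np
from skimage.util.shape import view_as_windows

def pad_offsetpos_strided(small, ncols):
    # Note: offset is read from the enclosing (global) scope.
    n = small.shape[1]
    m = ncols - n
    # Zero-pad m columns on both sides so every offset maps to a window start
    small_pad = np.zeros((len(small), n + 2*m))
    small_pad[:,m:m+n] = small
    # w[i, s] is the ncols-wide window of row i starting at column s
    w = view_as_windows(small_pad, (1,ncols))[:,:,0]
    # Starting at m - offset puts the data at column offset within the window
    return w[np.arange(len(offset)), ncols-offset-n]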
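If scikit-image is not available, the same windowing idea can be sketched with NumPy alone. This swaps in numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or later) in place of view_as_windows. The function name pad_offsetpos_sliding is hypothetical, offset is passed explicitly here, and the sketch assumes 0 <= offset <= ncols - n for every row.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pad_offsetpos_sliding(small, offset, ncols):
    """Hypothetical NumPy-only variant of the strided approach."""
    n = small.shape[1]
    m = ncols - n
    # Zero-pad m columns on each side of the data.
    small_pad = np.zeros((len(small), n + 2 * m))
    small_pad[:, m:m + n] = small
    # windows[i, s] is the length-ncols slice of row i starting at column s;
    # the result has shape (nrows, m + 1, ncols).
    windows = sliding_window_view(small_pad, ncols, axis=1)
    # A window starting at m - offset[i] has the data at column offset[i].
    return windows[np.arange(len(offset)), m - offset]

small = np.arange(9).reshape(3, 3)
offset = np.array([1, 3, 2])
out = pad_offsetpos_sliding(small, offset, 8)
print(out)
```

The windows array is a view, so no extra copies of the padded rows are made. Only the final fancy-indexing step allocates the output.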

0
PaddedMat=numpy.zeros(shape=[3,8],dtype="float")

Then loop over the row indices i to fill it:

PaddedMat[i,offset[i]:offset[i]+3]=small[i,:]
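Spelled out in full, the loop might look like the sketch below. It uses the sample data from the question. The names padded, nrows and n are chosen here for illustration, and the row width is taken from small.shape instead of being hard-coded as 3.

```python
import numpy as np

small = np.arange(9).reshape(3, 3)
offset = np.array([1, 3, 2])
ncols = 8

nrows, n = small.shape
padded = np.zeros((nrows, ncols), dtype=float)
for i in range(nrows):
    start = offset[i]
    # Copy row i of small into columns start .. start + n - 1.
    padded[i, start:start + n] = small[i]

print(padded)
```

This is the simplest version to read, though it loops in Python once per row, so it is unlikely to compete with the vectorized approaches on large inputs.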
