How can I vectorize this for loop in numpy?

Question

The code is below:

import numpy as np
X = np.array(range(15)).reshape(5,3)  # X's element value is meaningless
flag = np.random.randn(5,4)
y = np.array([0, 1, 2, 3, 0])  # Y's element value in range(flag.shape[1]) and Y.shape[0] equals X.shape[0]
dW = np.zeros((3, 4))  # dW.shape equals (X.shape[1], flag.shape[1])
for i in xrange(5):
    for j in xrange(4):
        if flag[i,j] > 0:
            dW[:,j] += X[i,:].T
            dW[:,y[i]] -= X[i,:].T

To compute dW more efficiently, how to vectorize this for loop?

Eric · Accepted Answer · 2016-02-15 18:02:15Z

5

Here's how I'd do it:

# has shape (x.shape[1],) + flag.shape
masked = np.where(flag > 0, X.T[...,np.newaxis], 0)

# sum over the i index
dW = masked.sum(axis=1)

# sum over the j index
np.subtract.at(dW, np.s_[:,y], masked.sum(axis=2))

# dW[:,y] -= masked.sum(axis=2) does not work here

See the documentation of ufunc.at for an explanation of that last comment

edited Feb 15, 2016 at 18:02

answered Feb 15, 2016 at 16:52

Eric

98.1k54 gold badges257 silver badges389 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

hpaulj Over a year ago

With repeat values in the y index, -= can have buffering problems. That's where the ufunc .at can help.

Century Xu Over a year ago

So the first line masked = np.where(flag > 0, X.T[...,np.newaxis], 0) use broadcasting, brilliant! So one way to vectorization is increasing dimension?

Divakar · Accepted Answer · 2016-02-15 17:15:02Z

1

Here's a vectorized approach based upon np.add.reduceat -

# --------------------- Setup output array ----------------------------------
dWOut = np.zeros((X.shape[1], flag.shape[1]))

# ------ STAGE #1 : Vectorize calculations for "dW[:,j] += X[i,:].T" --------
# Get indices where flag's transposed version has > 0
idx1 = np.argwhere(flag.T > 0)

# Row-extended version of X using idx1's col2 that corresponds to i-iterator
X_ext1 = X[idx1[:,1]]

# Get the indices at which we need to columns change
shift_idx1 = np.append(0,np.where(np.diff(idx1[:,0])>0)[0]+1)

# Use the changing indices as boundaries for add.reduceat to add 
# groups of rows from extended version of X
dWOut[:,np.unique(idx1[:,0])] += np.add.reduceat(X_ext1,shift_idx1,axis=0).T

# ------ STAGE #2 : Vectorize calculations for "dW[:,y[i]] -= X[i,:].T" -------
# Repeat same philsophy for this second stage, except we need to index into y.
# So, that would involve sorting and also the iterator involved is just "i".
idx2 = idx1[idx1[:,1].argsort()]
cols_idx1 = y[idx2[:,1]]
X_ext2 = X[idx2[:,1]]
sort_idx = (y[idx2[:,1]]).argsort()
X_ext2 = X_ext2[sort_idx]
shift_idx2 = np.append(0,np.where(np.diff(cols_idx1[sort_idx])>0)[0]+1)
dWOut[:,np.unique(cols_idx1)] -= np.add.reduceat(X_ext2,shift_idx2,axis=0).T

edited Feb 15, 2016 at 17:15

answered Feb 15, 2016 at 17:00

Divakar

222k19 gold badges273 silver badges374 bronze badges

1 Comment

Century Xu Over a year ago

Thanks. Althougn it has lines of codes, it is easier to understand.

datenwolf · Accepted Answer · 2016-03-01 10:50:37Z

-1

You can do this:

ff = (flag > 0) * 1
ff = ff.reshape((5, 4, 1, 1))
XX = ff * X
[ii, jj] = np.meshgrid(np.arange(5), np.arange(4))
dW[:, jj] += XX[ii, jj, ii, :].transpose((2, 0, 1))
dW[:, y[ii]] -= XX[ii, jj, ii, :].transpose((2, 0, 1))

You can further merge and fold these expressions to get a one-liner but it won't add any more performance.

Update #1: Yep, sorry this is not giving correct results, I had a typo in my check

edited Mar 1, 2016 at 10:50

datenwolf

163k13 gold badges197 silver badges316 bronze badges

answered Feb 15, 2016 at 16:56

ALGOholic

6543 silver badges7 bronze badges

2 Comments

Eric Over a year ago

Since X is only ever indexed at [x, y, x, z], can it not be made 3 dimensional?

Eric Over a year ago

I think your first three lines are better spelt XX = np.where(flag[...,np.newaxis,np.newaxis], X, 0)

Collectives™ on Stack Overflow

How can I vectorize this for loop in numpy?

3 Answers 3

2 Comments

1 Comment

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

2 Comments

1 Comment

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related