Numpy ":" operator broadcasting issues

Question

In the following code I have written 2 methods that theoretically(in my mind) should do the same thing. Unfortunately they don't, I am unable to find out why they don't do the same thing per the numpy documentation.

import numpy as np


dW = np.zeros((20, 10))
y = [1 for _ in range(100)]
X =  np.ones((100, 20))

# ===================
# Method 1  (works!)
# ===================
for i in range(len(y)):
  dW[:, y[i]] -=  X[i]


# ===================
# Method 2 (does not work)
# ===================
dW[:, y] -=  X.T

javidcf · Accepted Answer · 2018-02-27 15:39:50Z

1

As indicated, in principle you cannot operate multiple times over the same element in a single operation, due to how buffering works in NumPy. For that purpose there is the at function, which can be used on about any standard NumPy function (add, subtract, etc.). For your case, you can do:

import numpy as np

dW = np.zeros((20, 10))
y = [1 for _ in range(100)]
X =  np.ones((100, 20))
# at modifies in place dW, does not return a new array
np.subtract.at(dW, (slice(None), y), X.T)

answered Feb 27, 2018 at 15:39

javidcf

59.9k7 gold badges87 silver badges134 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Paul Panzer · Accepted Answer · 2018-02-27 16:12:35Z

This is a column-wise version of this question.

The answer there can be adapted to work column-wise as follows:

Approach 1: np.<ufunc>.at

>>> np.subtract.at(dW, (slice(None), y), X.T)

Approach 2: np.bincount

>>> m, n = dW.shape
>>> dW -= np.bincount(np.add.outer(np.arange(m) * n, y).ravel(), (X.T).ravel(), dW.size).reshape(m, n)

Please note that the bincount based solution - even though it involves more steps - is faster by a factor of ~6.

>>> from timeit import repeat
>>> kwds = dict(globals=globals(), number=5000)
>>>
>>> repeat('np.subtract.at(dW, (slice(None), y), X.T); np.add.at(dW, (slice(None), y), X.T)', **kwds)
[1.590626839082688, 1.5769231889862567, 1.5802007300080732]
>>> repeat('_= dW; _ -= np.bincount(np.add.outer(np.arange(m) * n, y).ravel(), (X.T).ravel(), dW.size).reshape(m, n); _ += np.bincount(np.add.outer(np.arange(m) * n, y).ravel(), (X.T).ravel(), dW.size).reshape(m, n)', **kwds)
[0.2582490430213511, 0.25572817400097847, 0.25478115503210574]

user9240949 · Accepted Answer · 2018-02-28 11:55:13Z

1

I found a third solution for this problem. Normal Matrix multiplication:

ind = np.zeros((X.shape[0],dW.shape[1]))
ind[range(X.shape[0]),y] = -1
dW = X.T.dot(ind)

I did some experiment using the proposed methods above on some neural network data. In my example X.shape = (500,3073), W.shape = (3073,10) and ind.shape = (500,10).

The subtract version takes about 0.2 second (slowest). The Matrix multiplication method 0.01 s (fastest). Normal loop 0.015 and then bincountmethod 0.04 s. Note that in the question y is vector of ones. This is not my case. The case with only ones can be solved with a simple sum.

answered Feb 28, 2018 at 11:55

user9240949

Comments

Sunnysinh Solanki · Accepted Answer · 2018-02-27 15:33:36Z

0

Option 1:

for i in range(len(y)):
  dW[:, y[i]] -=  X[i]

This works because you are looping through and updating value which was updated last time.

Option 2:

dW[:, [1,1,1,1,....1,1,1]] -=  [[1,1,1,1...1],
                                [1,1,1,1...1],
                                .
                                .
                                [1,1,1,1...1]]

It does not work because update happens to 1st index at same time in parallel not in serial way. Initially all are 0 so subtracting results in -1.

answered Feb 27, 2018 at 15:33

Sunnysinh Solanki

5514 silver badges10 bronze badges

Collectives™ on Stack Overflow

Numpy ":" operator broadcasting issues

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related