
Would the following two np.dot calls give the same result for a square array x?

import numpy as np
x = np.arange(4 * 4).reshape(4, 4)
np.dot(x, x.T, out=x)  # method 1
x[:] = np.dot(x, x.T)  # method 2

Thanks.

Why I ask:

x += x.T is not the same as x += x.T.copy()

I don't know how np.dot works internally. Does np.dot similarly treat the out argument as a view? Is it OK if out is one of the matrices being multiplied?
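To make the aliasing concern concrete, here is a minimal sketch that only checks whether the operands overlap in memory. It uses np.shares_memory, which is not mentioned in the original post. It says nothing about how np.dot behaves internally; whether an aliased call returns the right answer depends on how the installed NumPy version handles overlapping operands.

```python
import numpy as np

x = np.arange(4 * 4).reshape(4, 4)

# x.T is a view onto the same buffer as x, so passing out=x to
# np.dot(x, x.T, out=x) makes the output overlap both inputs.
shares_with_view = np.shares_memory(x, x.T)

# An explicit copy of the transpose owns its own buffer.
shares_with_copy = np.shares_memory(x, x.T.copy())

print(shares_with_view, shares_with_copy)  # True False
```

This is why x += x.T and x += x.T.copy() can differ in principle: the first reads from memory that the operation is also writing to.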

The NumPy I am using comes from Anaconda and uses MKL as its backend.

  • Did you try it out? You could check the id of the arrays to see if a new object was created. Commented Jan 12, 2019 at 17:18
  • I worry about undefined behavior. Trying it is not enough: there is C behind the Python, and this case is not documented. Commented Jan 12, 2019 at 17:19
  • We don't have undefined behaviour. Seeing that the two methods are equivalent is no different than trusting that a single method has fully defined behaviour for this task Commented Jan 12, 2019 at 17:19
  • When do you expect the results to be different? Are there specific cases when it is not similar? Commented Jan 12, 2019 at 17:22
  • Why are you considering using the second case - x[:] = ...?? Commented Jan 12, 2019 at 17:24

3 Answers


Yes, they are the same, but performance-wise I see interesting results for integer arrays:

import numpy as np
import perfplot

def f1(x):
    x = x.copy()
    np.dot(x, x.T, out=x)
    return x

def f2(x):
    x = x.copy()
    x[:] = np.dot(x, x.T)
    return x    

perfplot.show(
    setup=lambda n: np.arange(n * n).reshape(n, n),
    kernels=[f1, f2],
    labels=['out=...', 're-assignment'],
    n_range=[2**k for k in range(0, 9)],
    xlabel='N',
    equality_check=np.allclose
)

[Figure: perfplot timings for integer arrays, runtime versus N, comparing 'out=...' and 're-assignment'.]

I used perfplot to generate the timing plots.


For float arrays, there is absolutely no difference.

perfplot.show(
    setup=lambda n: np.arange(n * n).reshape(n, n).astype(float),
    kernels=[f1, f2],
    labels=['out=...', 're-assignment'],
    n_range=[2**k for k in range(0, 9)],
    xlabel='N',
    equality_check=np.allclose
)

[Figure: perfplot timings for float arrays, runtime versus N, comparing 'out=...' and 're-assignment'.]


4 Comments

x = x.copy() memory allocation might be the main part of the timing. Or not... numpy can secretly reuse arrays it allocated.
@Rzu I don't think so. Copying is not the bottleneck here: it is linear in the number of elements, i.e. O(N^2), while computing the dot product is O(N^3). In any case, the copy is done in both functions, so the comparison is fair.
I worry more about whether the two methods give the same result. The 1st method should always be at least as fast as the 2nd method, if not faster. By the way, is your NumPy using MKL as a backend?
@Rzu I added equality_check=np.allclose as a parameter to perfplot.show. If the outputs of the two methods were different, the function would throw an AssertionError. No error is thrown, meaning the results match. This was a moot point anyway, because they are identical operations that differ only in how the reassignment is done. And yes, my NumPy uses MKL.
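The same comparison can be run without perfplot. The sketch below reuses f1 and f2 from the answer and compares their outputs for a few sizes and two dtypes; the helper name methods_agree and the chosen sizes are illustrative assumptions, not part of the original answer. It only reports what the installed NumPy does and is not a guarantee for other versions.

```python
import numpy as np

def f1(x):
    x = x.copy()
    np.dot(x, x.T, out=x)  # write the product back into x via out=
    return x

def f2(x):
    x = x.copy()
    x[:] = np.dot(x, x.T)  # compute into a temporary, then copy into x
    return x

def methods_agree(n, dtype):
    """Hypothetical helper: compare f1 and f2 on an n-by-n test array."""
    x = np.arange(n * n).reshape(n, n).astype(dtype)
    return bool(np.allclose(f1(x), f2(x)))

# Sizes and dtypes are arbitrary choices for illustration.
results = {
    (n, np.dtype(dtype).name): methods_agree(n, dtype)
    for n in (1, 2, 4, 16, 64)
    for dtype in (np.int64, np.float64)
}

for key, agree in results.items():
    print(key, agree)
```

A False entry would mean that out=x and re-assignment disagree on that NumPy build for that size and dtype.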

Yes, both methods produce identical arrays.

import numpy as np

def method_1():
    x = np.arange(4 * 4).reshape(4, 4)
    np.dot(x, x.T, out=x)
    return x

def method_2():
    x = np.arange(4 * 4).reshape(4, 4)
    x[:] = np.dot(x, x.T)
    return x

array_1 = method_1()
array_2 = method_2()

print(np.array_equal(array_1, array_2))

gives the output:

True

2 Comments

Is your NumPy using MKL as a backend?
Yes, my numpy build uses mkl.

I have an older version of numpy installed (1.11.0) where method #1 produces some weird output. I understand this is not the expected behavior, and was fixed in later versions; but just in case this happens to someone else:

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609] on linux2
>>> import numpy as np
>>> x = np.arange(4 * 4).reshape(4, 4)
>>> np.dot(x, x.T, out=x)
array([[                  14,                   94,                 1011,
                       15589],
       [              115715,          13389961335,         120510577872,
               1861218976248],
       [              182547,       21820147595568,  1728119013671256390,
         5747205779608970957],
       [              249379,       29808359122268,  7151350849816304816,
        -3559891853923251270]])
>>> np.version.version
'1.11.0'

As far as I can test, method #1 gives the expected output at least since NumPy 1.14.1, whereas method #2 gives it with both versions.
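If code has to run on installations of unknown age, one option is to fall back to method #2 on older releases. The following is a minimal sketch under stated assumptions: the cutoff 1.14.1 is simply the oldest version reported as working in this answer, not a documented boundary, and the names MIN_SAFE_VERSION, numpy_version_tuple and dot_with_transpose_inplace are hypothetical.

```python
import re
import numpy as np

# Assumption: 1.14.1 is the oldest release reported as working above;
# it is not an officially documented cutoff.
MIN_SAFE_VERSION = (1, 14, 1)

def numpy_version_tuple():
    """Return the leading numeric parts of np.__version__ as ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", np.__version__)
    if match is None:
        return (0, 0, 0)  # unknown format: treat as old and take the safe path
    return tuple(int(part) for part in match.groups())

def dot_with_transpose_inplace(x):
    """Overwrite the square array x with np.dot(x, x.T) and return it."""
    if numpy_version_tuple() >= MIN_SAFE_VERSION:
        np.dot(x, x.T, out=x)   # method #1
    else:
        x[:] = np.dot(x, x.T)   # method #2: goes through a temporary
    return x

x = np.arange(4 * 4).reshape(4, 4)
expected = np.dot(x.copy(), x.T.copy())  # reference computed without aliasing
result = dot_with_transpose_inplace(x)
print(np.array_equal(result, expected))
```

Method #2 always allocates a temporary for the product, so the fallback trades some memory for not depending on how out= handles overlap.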

1 Comment

Thank you. I will ask for a numpy version >= 1.14.
