
I am writing a function that accepts a numpy array a of length 200 and a matrix M of size 200 x 200, and performs the following operation:

def my_func(a, M):
    d = np.empty((len(a), len(a)))
    for i in range(len(a)):
        x = a[i]
        for j in range(len(a)):
            y = a[j]
            z = M[i][j]
            d[i][j] = 2 * z/(y+x)
    return d

How can I vectorize this piece of code to boost my runtime?

  • You have to specify the shape of a and M.
  • Please check the updated description.

4 Answers


NumPy's ufuncs all have an outer method that performs the operation "cross-wise" on two arrays. So, to avoid most intermediate calculations and vectorize as far as possible:

def f(M, a):
    return 2 * M / np.add.outer(a, a)
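
A quick sanity check on dummy inputs (this snippet is my addition, not part of the original answer; the names a, M and d_loop are just illustrative):

import numpy as np

a = np.random.random(200)
M = np.random.random((200, 200))

# np.add.outer(a, a)[i, j] == a[i] + a[j], so f reproduces d[i][j] = 2 * z/(y+x)
d_loop = np.empty_like(M)
for i in range(len(a)):
    for j in range(len(a)):
        d_loop[i, j] = 2 * M[i, j] / (a[j] + a[i])

print(np.allclose(f(M, a), d_loop))  # expected: True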

Answer for the old version of the question (left, because it's still useful):

For such things, I find it best to work in steps, gradually rewriting towards the right einsum expression.

# the definition given in the original question,
# before the z / (y + x) update
def f0():
    d = np.empty((3,3))
    for i in range(len(a)):
        x = a[i]
        for j in range(len(a)):
            y = a[j]
            z = M[i][j]
            d[i][j] = 2 * x/(y+z)
    return d

# rewrite things inlined
def f1():
    d = np.empty((3,3))
    for i in range(len(a)):
        for j in range(len(a)):
            d[i, j] = 2 * a[i]/(a[j] + M[i, j])
    return d

# factor out broadcasting
def f2():
    d = np.empty((3,3))
    for i in range(len(a)):
        m = a + M[i, :]
        for j in range(len(a)):
            d[i,j] = 2 * a[i]/m[j]
    return d

# more broadcasting
def f3():
    d = np.empty((3,3))
    m = a + M
    for i in range(len(a)):
        for j in range(len(a)):
            d[i,j] = 2 * a[i]/m[i,j]
    return d

# now turn loops into einsums
def f4():
    d = np.empty((3,3))
    m = 1/(a + M)
    d[:,:] = 2 * np.einsum('i,ij->ij', a, m)
    return d

# collect everything
def f5():
    return np.einsum('i,ij->ij', a, 2 / (a + M))
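
As a sanity check (my addition, using small dummy globals a and M matching the (3, 3) shapes above): f5 reproduces f0, i.e. the pre-edit formula 2 * x/(y+z), so comparing it against the updated 2 * z/(y+x) function will not agree.

import numpy as np

a = np.random.random(3)
M = np.random.random((3, 3))

print(np.allclose(f0(), f5()))  # expected: True — f5 matches the old formula, not the updated one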

Comments

  • Thank you for your inputs! I compared the output of f5 with my function, but the outputs weren't the same (I checked using np.allclose()).
  • That's interesting. It works for me on dummy inputs.
  • Can you inspect the actual differences by eye? Are your results completely off?
  • Yeah, the max value in the 200 x 200 matrix for my solution is ~1.76; with the einsum operation it's > 80000. I'm testing the functions on dummy data generated using np.random.random().
  • How are you testing with dummy inputs?

You could do something like

d = 2*numpy.atleast_2d(a).T/(a+M)

(Note that this computes 2 * x/(y+z), i.e. the original, pre-edit version of the formula, not the updated 2 * z/(y+x).)
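
A small sketch of what this broadcast does (my addition, with dummy data):

import numpy

a = numpy.random.random(200)
M = numpy.random.random((200, 200))

# atleast_2d(a).T has shape (200, 1), so it broadcasts against (a + M) of shape (200, 200):
# d[i, j] == 2 * a[i] / (a[j] + M[i, j])
d = 2 * numpy.atleast_2d(a).T / (a + M)
print(d.shape)  # (200, 200)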



Apart from numpy vectorization, using Numba is also a simple and performant way to speed up loopy code. Example:

import numpy as np
import numba as nb

@nb.njit(fastmath=True, error_model="numpy", parallel=True)
def calc(a, M):
    d = np.empty((a.shape[0], a.shape[0]))
    for i in nb.prange(a.shape[0]):
        x = a[i]
        for j in range(a.shape[0]):
            y = a[j]
            z = M[i, j]
            d[i, j] = 2. * z / (y + x)
    return d

Timings

M=np.random.rand(200,200)
a=np.random.rand(200)

d=calc(a,M) #first call takes longer due to compilation overhead
%timeit d=calc(a,M)
#parallel=True there is only quite limited speedup because of the small problem (200x200)
#11 µs ± 51 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#parallel=False
#21.2 µs ± 191 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

#pure numpy solution (hpaulj)
%timeit d = 2 * M/(a[:,None]+a[None,:])
#75.7 µs ± 386 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

#without compilation
#20.8 ms ± 500 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
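
A quick correctness check (my addition, on the same dummy a and M; the names d_nb and d_np are just illustrative) that the compiled version agrees with the pure numpy broadcast:

d_nb = calc(a, M)
d_np = 2 * M / (a[:, None] + a[None, :])
print(np.allclose(d_nb, d_np))  # expected: True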



With a pair of sample arrays (you should have provided these):

In [196]: a = np.arange(1,4); M = np.arange(1,10).reshape(3,3)                                                       
In [197]: a                                                                                                          
Out[197]: array([1, 2, 3])
In [198]: M                                                                                                          
Out[198]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
In [199]: d = 2 * M/(a[:,None]+a[None,:])                                                                            
In [200]: d                                                                                                          
Out[200]: 
array([[1.        , 1.33333333, 1.5       ],
       [2.66666667, 2.5       , 2.4       ],
       [3.5       , 3.2       , 3.        ]])

The a[None,:] could be simplified to just a, but I wanted to make the broadcasting used to calculate this outer product explicit. There are various tools in numpy for doing this; I like the None indexing because it is simple and idiomatic.
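
A small illustration of the equivalent spellings of that broadcast (my addition, reusing the sample a from above):

# a[:, None] + a[None, :] builds the (len(a), len(a)) table of pairwise sums a[i] + a[j]
np.allclose(a[:, None] + a[None, :], np.add.outer(a, a))   # True
np.allclose(a[:, None] + a, np.add.outer(a, a))            # also True: the second None is optional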

testing against your code (again, you should have provided such a result):

In [202]: def foo(a): 
     ...:     d = np.zeros((3,3)) 
     ...:     for i in range(len(a)): 
     ...:         x = a[i] 
     ...:         for j in range(len(a)): 
     ...:             y = a[j] 
     ...:             z = M[i][j] 
     ...:             d[i][j] = 2 * z/(y+x) 
     ...:     return d 
     ...:                                                                                                            
In [203]: foo(a)                                                                                                     
Out[203]: 
array([[1.        , 1.33333333, 1.5       ],
       [2.66666667, 2.5       , 2.4       ],
       [3.5       , 3.2       , 3.        ]])
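
And a quick confirmation that the two agree (my addition, continuing the same session; the input/output numbering is hypothetical):

In [204]: np.allclose(d, foo(a))
Out[204]: True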

