
The code below works. However, the line calculating rez is probably not written the way the NumPy developers intended. How can I rewrite it in an efficient way, without converting to and from lists?

Also, if you happen to know a good guide on writing mathematical functions in Python that work equally well with plain numbers and np.arrays, please share :)

import numpy as np
from matplotlib import pyplot as plt

def L2normSquared(X, Y, x, y):
    return sum((X-x)**2 + (Y-y)**2)

X = np.random.normal(0, 1, 10)
Y = np.random.normal(0, 1, 10)

PX = np.arange(-1, 1, 0.1)
PY = np.arange(-1, 1, 0.1)
PXm, PYm = np.meshgrid(PX, PY)

rez = np.array([[L2normSquared(X, Y, x, y) for y in PY] for x in PX])

print(rez.shape)

plt.imshow(rez, cmap='jet', interpolation='nearest', origin='lower', extent=[-1, 1, -1, 1])
plt.show()

Edit: Let me explain what I am trying to do.

I generate 10 random points in 2D. Then I define a loss function for an arbitrary point (x, y) in 2D: the result of that function is the sum of squared Euclidean distances from all 10 fixed points to this arbitrary point. Finally, I want to plot this loss function using the 2D imshow method.

Edit 2: Following Nils Werner's answer, one can use 3D arrays and broadcasting, producing the following code:

import numpy as np
from matplotlib import pyplot as plt

def L2normSquared(X, Y, x, y):
    return np.sum((X-x)**2 + (Y-y)**2, axis=0)

X = np.random.normal(0, 1, 10)
Y = np.random.normal(0, 1, 10)

PX = np.arange(-1, 1, 0.1)
PY = np.arange(-1, 1, 0.1)
PXm, PYm = np.meshgrid(PX, PY)


rez = L2normSquared(X[:, None, None], Y[:, None, None], PXm, PYm)

print(rez.shape)

plt.imshow(rez, cmap='jet', interpolation='nearest', origin='lower', extent=[-1, 1, -1, 1])
plt.show()

However, this code is actually slower than the list comprehension (for 10,000 random coordinates and step 0.01, it is roughly 2-3 times slower). For even larger inputs there is a memory crash, which leads me to believe that this method materializes the full 3D intermediate array internally, and that does not scale well in terms of memory allocation.
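One compromise (my own sketch, not from any answer here) is to keep the broadcasting but evaluate the grid in row blocks, so that only a small 3D intermediate is alive at any time; the chunk size is a made-up tuning parameter:

```python
import numpy as np

def L2normSquared(X, Y, x, y):
    return np.sum((X - x) ** 2 + (Y - y) ** 2, axis=0)

def chunked_grid_eval(f, X, Y, PXm, PYm, chunk=64):
    # Evaluate f over the grid in blocks of `chunk` rows, so that only an
    # (n_points, chunk, n_cols) intermediate array is materialized at a time,
    # instead of the full (n_points, n_rows, n_cols) array.
    out = np.empty(PXm.shape)
    for i in range(0, PXm.shape[0], chunk):
        sl = slice(i, i + chunk)
        out[sl] = f(X[:, None, None], Y[:, None, None], PXm[sl], PYm[sl])
    return out
```

This trades one Python-level loop over blocks for bounded memory, while each block is still fully vectorized.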

Edit 3: I am very sorry, but my minimal example was too minimal. In the original problem I am facing, the coordinates X and Y do not decouple, so they cannot be computed separately. One of the original functions is:

def gaussian(X, Y, x, y):
    return sum(np.exp(-(X-x)**2 - (Y-y)**2))
  • I don't understand what operation you are trying to do here Commented Nov 8, 2017 at 14:12
  • Sorry, check edit Commented Nov 8, 2017 at 14:16
  • So, X and Y are 10 elems, but PX and PY are 10000? Commented Nov 8, 2017 at 14:53
  • Easy test: X,Y have 10 elements, PX, PY have steps 0.1 ;;; Medium test: X,Y have 10000 elements, PX, PY have steps 0.1 ;;; Hard test: X,Y have 10000 elements, PX, PY have steps 0.01 ;;; List comprehension can do hard test Commented Nov 8, 2017 at 14:56
  • So, did either of the posted solutions work for you? Commented Nov 21, 2017 at 12:07

3 Answers


The idea behind meshgrid is that you get two or more arrays that you can pass directly into a vectorized operation. So ideally, we wouldn't need a for loop at all.

But since you are doing an "outer difference" between X and PX, after which you sum over the X axis, you also need broadcasting: first form the outer difference, and finally sum over the correct axis:

import numpy as np
from matplotlib import pyplot as plt

def L2normSquared(X, Y, x, y):
    return np.sum((X-x)**2 + (Y-y)**2, axis=0)

X = np.random.normal(0, 1, 10)
Y = np.random.normal(0, 1, 10)

PX = np.arange(-1, 1, 0.1)
PY = np.arange(-1, 1, 0.1)
PXm, PYm = np.meshgrid(PX, PY)

rez = L2normSquared(X[:, None, None], Y[:, None, None], PXm, PYm)
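For the squared-L2 loss specifically, the sum over the fixed points can also be expanded algebraically, which avoids any 3D intermediate entirely (my own sketch, not part of the answer above; it does not apply to the Gaussian loss from Edit 3):

```python
import numpy as np

def L2normSquaredExpanded(X, Y, xm, ym):
    # Expand the sum over the N fixed points:
    #   sum_k (X_k - x)^2 + (Y_k - y)^2
    #     = N*(x^2 + y^2) - 2*x*sum(X) - 2*y*sum(Y) + sum(X^2 + Y^2)
    # so only scalar reductions of X, Y and 2D grid arithmetic remain.
    N = X.size
    return (N * (xm ** 2 + ym ** 2)
            - 2 * xm * X.sum() - 2 * ym * Y.sum()
            + np.sum(X ** 2 + Y ** 2))
```

Here xm, ym can be the PXm, PYm grids directly; memory stays proportional to the grid size regardless of the number of points.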

6 Comments

Thanks for the reply. The output of this and original function is not the same. Columns seem mixed up
Ok, there is still a problem. For small amounts of data, your method is faster than mine, but for large amounts of data (10000 points, step 0.01) the method you suggest is 2x-3x slower than mine. Could it be that you are creating 3D arrays, and memory allocation becomes an issue?
Yes, that might be the case.
As far as I know, a square of sums does not equal a sum of squares. I don't think it works in this case, and my actual use case is more complicated, I don't think it can be easily decomposed in this way
You are right, the results only looked similar :-)

A more systematic approach is to separate the sum from the norm, since your L2normSquared function cannot distinguish the roles of x and X.

def normSquared(X, Y, x, y):
    return (X-x)**2 + (Y-y)**2

X = np.random.normal(0, 1, 10)
Y = np.random.normal(0, 1, 10)
x = y = np.arange(-1, 1, 0.1)

rez=normSquared(*np.meshgrid(X,Y,x,y)).sum(axis=(0,1))

normSquared understands both arrays and numbers; it is the sum which is ambiguous here.

No need to reinvent meshgrid.

EDIT

To postpone the memory problem, you can sum earlier and use sparse meshes:

n=1000000
X = np.random.normal(0, 1, n)
Y = np.random.normal(0, 1, n)
x = np.arange(-1, 1, 0.1)
y = np.arange(-1, 1, 0.1)

Xm,Ym,xm,ym = mesh = np.meshgrid(X,Y,x,y,sparse=True)

rez=((Xm-xm)**2).sum((0,1))+((Ym-ym)**2).sum((0,1))

This solves the problem in one second for n = 1,000,000.
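A quick sanity check of the sparse-mesh expression against a brute-force double loop, for a small n (my own check, assuming the decoupled squared-L2 loss from the original question):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0, 1, 10)
Y = rng.normal(0, 1, 10)
x = np.arange(-1, 1, 0.1)
y = np.arange(-1, 1, 0.1)

# sparse=True returns broadcastable views with singleton dimensions instead
# of fully materialized 4D arrays
Xm, Ym, xm, ym = np.meshgrid(X, Y, x, y, sparse=True)
rez = ((Xm - xm) ** 2).sum((0, 1)) + ((Ym - ym) ** 2).sum((0, 1))

# brute force: rez[i, j] should equal sum_k (X_k - x_i)^2 + (Y_k - y_j)^2
ref = np.array([[np.sum((X - xi) ** 2 + (Y - yj) ** 2) for yj in y]
                for xi in x])
```

Each squared term is summed over the point axis before the two results are broadcast together, so the large intermediate of the non-sparse approach never appears.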

5 Comments

Your method crashes for 10000 random numbers with MemoryError, whereas list comprehension has no problem with that and even larger inputs. 4D arrays is a bad idea, there has to be something less memory-taxing
Hey B.M. thanks for the effort. It looks like my minimal example was too minimal. In my original problem the sum does not decouple, I just made it too easy. One of the original functions is sum(exp(-(X-x)**2 - (Y-y)**2))
For this case sum(exp(a+b)) = sum_XY(exp a · exp b) = sum_X(exp a) * sum_Y(exp b). But for other cases your approach can be necessary, without a big efficiency loss, since the inner loops are vectorized.
Thanks again for the comment. I believe that your mathematical statement is not correct, as the product of two sums has interference terms (a+b)*(c+d)=ac+ad+bc+bd, whereas you only want the diagonal terms ac+bd
No interference here. b=np.exp((Ym-ym)**2).sum(0)*np.exp((Xm-xm)**2).sum(1) , a=np.exp((Xm-xm)**2 + (Ym-ym)**2).sum((0,1)) . np.allclose(a,b) is True.

Use matrix-multiplication magic with numpy.dot for both memory efficiency and performance without messing around with meshgrids, like so -

a = np.exp(-(PX[:,None]-X)**2)
b = np.exp(-(PY[:,None]-Y)**2)
out = a.dot(b.T)
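Why this works (my reading of the trick): exp(-(X-x)**2 - (Y-y)**2) factors into exp(-(X-x)**2) * exp(-(Y-y)**2), and both factors share the same point index k, so the sum over the fixed points is exactly the matrix product out[i, j] = sum_k a[i, k] * b[j, k]. A quick check on small inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0, 1, 10)
Y = rng.normal(0, 1, 10)
PX = np.arange(-1, 1, 0.1)
PY = np.arange(-1, 1, 0.1)

a = np.exp(-(PX[:, None] - X) ** 2)   # shape (len(PX), n_points)
b = np.exp(-(PY[:, None] - Y) ** 2)   # shape (len(PY), n_points)
out = a.dot(b.T)                      # out[i, j] = sum_k a[i, k] * b[j, k]

# brute-force reference using the gaussian loss from the question
ref = np.array([[np.sum(np.exp(-(X - px) ** 2 - (Y - py) ** 2)) for py in PY]
                for px in PX])
```

Unlike the 3D/4D broadcasting variants, the largest arrays here are the two (grid, n_points) factors, and the reduction over points is handed to BLAS.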

Benchmarking

Approaches and relevant functions -

def gaussian(X, Y, x, y):
  return sum(np.exp(-(X-x)**2 -(Y-y)**2))

def org_method(PX, X, PY, Y):
    PXm, PYm = np.meshgrid(PX, PY)
    return np.array([[gaussian(X, Y, x, y) for y in PY] for x in PX])

def matmult_method(PX, X, PY, Y):
    a = np.exp(-(PX[:,None]-X)**2)
    b = np.exp(-(PY[:,None]-Y)**2)
    return a.dot(b.T)

Timings and verification -

In [256]: X = np.random.normal(0, 1, 1000)
     ...: Y = np.random.normal(0, 1, 1000)
     ...: 
     ...: PX = np.arange(-1, 1, 0.01)
     ...: PY = np.arange(-1, 1, 0.01)

In [257]: np.allclose(org_method(PX, X, PY, Y), matmult_method(PX, X, PY, Y))
Out[257]: True

In [258]: %timeit org_method(PX, X, PY, Y)
1 loop, best of 3: 3.76 s per loop

In [259]: %timeit matmult_method(PX, X, PY, Y)
100 loops, best of 3: 12.2 ms per loop

Thus, we are getting a speedup of 300x+ there.

2 Comments

Hey Divakar, thanks for reply. I am sorry for having stated a problem that is significantly easier than the problem I have to solve. Please check Edit3 of my original post
@AleksejsFomins Think I have cracked it. Check out the updates.
