4

I have two arrays of 2D coordinate points (x, y):

a = [ (x1,y1), (x2,y2), ... (xN,yN) ]
b = [ (X1,Y1), (X2,Y2), ... (XN,YN) ]

How can I find the Euclidean distance between each aligned pair, (xi,yi) to (Xi,Yi), as a 1xN array?

The scipy.spatial.distance.cdist function gives me the distances between all pairs, as an NxN array.

If I just use the norm function to calculate the distances one by one, it seems to be slow.

Is there a built-in function to do this?

3 Answers

10

I'm not seeing a built-in, but you could do it yourself pretty easily.

distances = (a-b)**2
distances = distances.sum(axis=-1)
distances = np.sqrt(distances)
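
For reference, here is a minimal self-contained sketch of the same three steps, assuming a and b are NumPy arrays of shape (N, 2). The helper name paired_distances and the sample points are hypothetical and are not part of the original answer.

```python
import numpy as np

def paired_distances(a, b):
    """Row-wise Euclidean distance between two (N, 2) arrays (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b                        # per-pair coordinate differences, shape (N, 2)
    squared = (diff ** 2).sum(axis=-1)  # sum dx^2 + dy^2 for each row, shape (N,)
    return np.sqrt(squared)

# Made-up sample points chosen so the distances are easy to check by hand.
a = np.array([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
b = np.array([(3.0, 4.0), (1.0, 2.0), (2.0, 0.0)])
dist = paired_distances(a, b)
print(dist)  # [5. 1. 0.]
```

The result has shape (N,), one distance per aligned pair, rather than the NxN matrix that cdist returns.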

3 Comments

It amounts to the same, but it is faster to do the squaring and adding with np.dot: delta = a-b; dist = np.dot(delta, delta); dist = np.sqrt(dist)
I don't think dot vectorizes like that; it computes matrix products for 2-d inputs. You could probably do something with einsum, but I don't know the Einstein summation convention, so it's hard for me to give answers using it.
Oops! You are absolutely right, it's inner1d that does it: import numpy.core.umath_tests as ut; delta = a-b; dist = np.sqrt(ut.inner1d(delta, delta)). Alternatively dist = np.sqrt(np.einsum('ij,ij->i', delta, delta)).
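
The einsum variant from the last comment can be sketched as follows. This assumes (N, 2) NumPy arrays; the sample points are made up for illustration. The subscript string 'ij,ij->i' multiplies the two arrays elementwise and sums over the column index j, giving one squared length per row.

```python
import numpy as np

# Made-up sample points, shape (3, 2).
a = np.array([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
b = np.array([(3.0, 4.0), (1.0, 2.0), (2.0, 0.0)])

delta = a - b
# Row-wise dot product of delta with itself: sum_j delta[i, j] * delta[i, j]
squared = np.einsum('ij,ij->i', delta, delta)
dist_einsum = np.sqrt(squared)
print(dist_einsum)  # [5. 1. 0.]
```

This sidesteps the problem with np.dot noted above, which would attempt a matrix product of two (N, 2) arrays instead of a per-row inner product.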
2

hypot is another valid alternative:

a, b = randn(10, 2), randn(10, 2)
ahat, bhat = (a - b).T
r = hypot(ahat, bhat)
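
The snippet above relies on names such as randn and hypot being imported into the namespace already (for example in a pylab-style session). A self-contained sketch with explicit NumPy imports might look like this. The seeded generator and the names dx, dy, and reference are assumptions added for illustration.

```python
import numpy as np

# Seeded generator so the example is reproducible (an assumption, not in the answer).
rng = np.random.default_rng(0)
a = rng.standard_normal((10, 2))
b = rng.standard_normal((10, 2))

# Transposing the (10, 2) difference yields two length-10 rows: all dx, then all dy.
dx, dy = (a - b).T
r = np.hypot(dx, dy)  # sqrt(dx**2 + dy**2), computed elementwise

# Cross-check against the manual calculation.
reference = np.sqrt(((a - b) ** 2).sum(axis=-1))
print(np.allclose(r, reference))  # True
```

Note that np.hypot takes exactly two arguments, so this approach is specific to 2D points.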

timeit results comparing the manual calculation with hypot:

Manual:

timeit sqrt(((a - b) ** 2).sum(-1))
100000 loops, best of 3: 10.3 µs per loop

Using hypot:

timeit hypot(ahat, bhat)
1000000 loops, best of 3: 1.3 µs per loop

Now how about some adult-sized arrays:

a, b = randn(10**7, 2), randn(10**7, 2)
ahat, bhat = (a - b).T

timeit -r10 -n3 hypot(ahat, bhat)
3 loops, best of 10: 208 ms per loop

timeit -r10 -n3 sqrt(((a - b) ** 2).sum(-1))
3 loops, best of 10: 224 ms per loop

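There is not much of a performance difference between the two methods at this size. Note that the hypot timings exclude the cost of computing a - b, which the manual timings include. You can squeeze a little more out of the latter by avoiding pow: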

d = a - b

timeit -r10 -n3 sqrt((d * d).sum(-1))
3 loops, best of 10: 184 ms per loop


0

Try adding [:, np.newaxis, :] to the first operand:

np.linalg.norm(grid[:, np.newaxis, :] - scenario.target, axis=-1)

Ref: "Numpy Broadcast to perform euclidean distance vectorized"
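
To show how the broadcast form relates to the aligned-pair question, here is a small sketch. The arrays are made up and stand in for grid and scenario.target, whose shapes are not given in the answer. Inserting a new axis makes the subtraction broadcast to every combination of rows, so the result is an NxN matrix like the one cdist produces. Dropping the new axis gives the aligned distances directly.

```python
import numpy as np

# Made-up sample points, shape (3, 2).
a = np.array([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
b = np.array([(3.0, 4.0), (1.0, 2.0), (2.0, 0.0)])

# (3, 1, 2) - (3, 2) broadcasts to (3, 3, 2): every row of a against every row of b.
all_pairs = np.linalg.norm(a[:, np.newaxis, :] - b, axis=-1)  # shape (3, 3)

# Without the extra axis the subtraction is row-aligned, giving shape (3,).
aligned = np.linalg.norm(a - b, axis=-1)

print(all_pairs.shape)  # (3, 3)
print(aligned)          # [5. 1. 0.]
```

The aligned distances are the diagonal of the all-pairs matrix, so for the original question the form without np.newaxis avoids computing the N*N - N unused entries.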
