To find minkowski distance between 2 multidimensional arrays in python

Question

I have a dataframe 'df' from which I want to extract values into 2 different arrays, each holding 3D points. Then I want to find the Minkowski distance between the two arrays for every row in the dataset and append those distances (one column per value of p) to the original data frame. But I'm not able to write the function properly.

My df looks like:

         x1        y1        z1        x2        y2        z2
0  0.040928  0.250813  0.258730  0.050584  0.298290  0.273055
1  0.000000  0.174905  0.228518  0.011435  0.215528  0.233548
2  0.990905  0.746038  0.790401  0.972913  0.755414  0.822155
3  0.914052  0.669185  0.707238  0.922316  0.676172  0.734213
4  0.909504  0.480774  0.484074  0.915810  0.503221  0.489242

Then I defined 2 arrays, p1 and p2, as:

p1 = df[["x1", "y1", "z1"]].to_numpy() 
p2 = df[["x2", "y2", "z2"]].to_numpy()

Now I want to calculate the Minkowski distance between both arrays for different values of p:

# calculate minkowski distance
def minkowski_distance(a, b, p):
    return sum(abs(e1-e2)**p for e1, e2 in zip(a,b))**(1/p)

dist = minkowski_distance(p1,p2, 2)
dist
array([13.0317225 ,  9.36364486,  7.56526207])

I want my resultant data frame to look like:

x1  y1  z1  x2  y2  z2  m(1)  m(2)  m(3) ...

where m(1) represents the Minkowski distance for p=1, and so on. Each row of this data frame should hold the distances for that row's own pair of points, i.e.

(x1, y1, z1) <---------m--------> (x2,y2,z2)

It gives a cumulative sort of result, as shown by the variable 'dist', over all values of x1, y1, z1 and x2, y2, z2.
– Sukhmani Kaur Thethi, Commented Jan 22, 2022 at 8:43

mathfux · Accepted Answer · 2022-01-22 08:08:24Z

You could try to calculate Minkowski distance in a vectorised way:

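import numpy as np

def minkowski_distance(a, b, p=2):
    # Sum over axis=1 so that each row pair gets its own distance
    return np.sum(np.abs(a - b)**p, axis=1)**(1/p)

# Append one column per value of p: m(1), m(2), m(3)
for p in range(1, 4):
    df[f'm({p})'] = minkowski_distance(p1, p2, p)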
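
For reference, here is a minimal self-contained sketch of the same approach, assuming only the five sample rows shown in the question (the real dataset is presumably larger). The cross-check against np.linalg.norm is an addition for illustration and covers only the p=2 (Euclidean) case.

```python
import numpy as np
import pandas as pd

# The five sample rows shown in the question (the full dataset is assumed to be larger)
df = pd.DataFrame({
    "x1": [0.040928, 0.000000, 0.990905, 0.914052, 0.909504],
    "y1": [0.250813, 0.174905, 0.746038, 0.669185, 0.480774],
    "z1": [0.258730, 0.228518, 0.790401, 0.707238, 0.484074],
    "x2": [0.050584, 0.011435, 0.972913, 0.922316, 0.915810],
    "y2": [0.298290, 0.215528, 0.755414, 0.676172, 0.503221],
    "z2": [0.273055, 0.233548, 0.822155, 0.734213, 0.489242],
})

p1 = df[["x1", "y1", "z1"]].to_numpy()
p2 = df[["x2", "y2", "z2"]].to_numpy()

def minkowski_distance(a, b, p=2):
    # axis=1 reduces over the x, y, z components, leaving one value per row
    return np.sum(np.abs(a - b)**p, axis=1)**(1/p)

# One new column per value of p
for p in range(1, 4):
    df[f"m({p})"] = minkowski_distance(p1, p2, p)

# Sanity check for p=2 only: the row-wise Euclidean norm should agree
euclid = np.linalg.norm(p1 - p2, axis=1)
print(np.allclose(df["m(2)"], euclid))
print(df)
```

Each result has one entry per row of df, so it can be assigned directly as a new column.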


2 Comments

Sukhmani Kaur Thethi Over a year ago

This worked for me. Thanks a lot, @mathfux. Can you please explain why my code didn't work this way, taking each row of vectors and giving the corresponding values?

mathfux Over a year ago

Alright, let's take an example. You want to sum [np.array([4, 9, 16]), np.array([0, 4, 9]), np.array([4, 1, 9]), np.array([4, 16, 0]), np.array([9, 9, 4])]. That's a bad idea. I didn't expect it to work, but it just adds up your columns. It's equivalent to np.sum(arr, axis=0). You need axis=1. Also, you should avoid iterating over arrays, because numpy is not designed for it.
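
The point in this comment can be checked with a small sketch. The arrays a and b below are hypothetical toy inputs, chosen only so that their squared differences reproduce the five arrays listed in the comment; they are not taken from the question's data.

```python
import numpy as np

# Hypothetical toy inputs: squared differences give the arrays from the comment
a = np.array([[2, 3, 4], [0, 2, 3], [2, 1, 3], [2, 4, 0], [3, 3, 2]])
b = np.zeros_like(a)

# The question's approach: zip(a, b) yields pairs of ROWS, so each term is a
# length-3 array and the built-in sum adds those arrays element-wise
looped = sum(abs(e1 - e2)**2 for e1, e2 in zip(a, b))

# Equivalent vectorised reduction: collapses the rows, one value per column
column_sums = np.sum(np.abs(a - b)**2, axis=0)

# What was actually wanted: collapse the components, one value per row
row_sums = np.sum(np.abs(a - b)**2, axis=1)

print(looped)        # same values as column_sums, shape (3,)
print(column_sums)
print(row_sums)      # shape (5,), one entry per row
```

The loop produces three numbers (one per coordinate) regardless of how many rows there are, which matches the three-element 'dist' array in the question.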
