I am trying to calculate the Euclidean distance between two datasets in Python. I can do this using the following:

np.linalg.norm(df-signal)

Here df and signal are my two datasets. This returns a single numerical value (e.g., 8258155.579535276), which is fine. My issue is that I want it to return the distance for each column in the dataset, labelled by column name. Something like this:

AFNLWGT     4.867376e+10
AGI         3.769233e+09
EMCONTRB    1.202935e+07
FEDTAX      8.095078e+07
PTOTVAL     2.500056e+09
STATETAX    1.007451e+07
TAXINC      2.027124e+09
POTHVAL     1.158428e+08
INTVAL      1.606913e+07
PEARNVAL    2.038357e+09
FICA        1.080950e+07
WSALVAL     1.986075e+09
ERNVAL      1.905109e+09

I'm fairly new to Python, so I would really appreciate any help.

2 Comments
  • As explained in the documentation, you can specify the axis parameter in np.linalg.norm. For column-wise distance use axis=0. Commented Apr 11, 2020 at 11:31
  • That's returning all the numerical values, is it possible to have it similar to the desired way I mention in the question? That's what I get when I use "np.square(np.subtract(df, signal)).mean()". Commented Apr 11, 2020 at 11:36

1 Answer

To get the column-wise norm with column headers, you can use pandas.DataFrame.aggregate together with np.linalg.norm:

import pandas as pd
import numpy as np

norms = (df-signal).aggregate(np.linalg.norm)

Note that, by default, .aggregate operates along axis 0, so np.linalg.norm is applied to each column separately and the result is a Series indexed by column name.
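As a minimal sketch of what this returns on concrete data (the two small frames below are made up for illustration and are not the asker's datasets):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so the norms are easy to check by hand.
df = pd.DataFrame({"a": [3.0, 0.0], "b": [1.0, 2.0]})
signal = pd.DataFrame({"a": [0.0, 4.0], "b": [1.0, 2.0]})

# np.linalg.norm is applied to each column of the difference separately,
# giving one Euclidean distance per column, labelled by column name.
norms = (df - signal).aggregate(np.linalg.norm)
print(norms)
```

Column a has differences 3 and -4, so its norm is 5; column b is identical in both frames, so its norm is 0.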

However, the aggregate approach is much slower than a pure NumPy implementation:

norms = pd.Series(np.linalg.norm(df.to_numpy()-signal.to_numpy(), axis=0), 
                  index=df.columns)

With test data of shape 100x2, the latter was 20x faster.
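To check that the two approaches agree and to compare their speed on your own machine, something like the following sketch could be used. The random data, the column names x and y, and the helper names norms_aggregate and norms_numpy are assumptions for illustration only; the timings will vary by machine and library version. One caveat: df - signal aligns rows and columns by label, while subtracting the to_numpy() arrays is purely positional, so the NumPy version assumes both frames share the same row and column order.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data with the same shape as in the answer (100x2).
rng = np.random.default_rng(0)
cols = ["x", "y"]
df = pd.DataFrame(rng.normal(size=(100, 2)), columns=cols)
signal = pd.DataFrame(rng.normal(size=(100, 2)), columns=cols)


def norms_aggregate():
    # pandas route: apply np.linalg.norm to each column of the difference.
    return (df - signal).aggregate(np.linalg.norm)


def norms_numpy():
    # NumPy route: one vectorised call, then re-attach the column labels.
    values = np.linalg.norm(df.to_numpy() - signal.to_numpy(), axis=0)
    return pd.Series(values, index=df.columns)


# Both routes should give the same per-column distances.
assert np.allclose(norms_aggregate(), norms_numpy())

# Rough timing comparison; absolute numbers depend on the environment.
t_agg = timeit.timeit(norms_aggregate, number=200)
t_np = timeit.timeit(norms_numpy, number=200)
print(f"aggregate: {t_agg:.4f}s, numpy: {t_np:.4f}s")
```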

1 Comment

I also added the way I would actually do it using numpy for performance.
