I have two data frames which represent data pulled from two different pedometers that record how many miles a given person ran for a specific month.

I want to calculate the difference in "Miles Run" for rows in DF 1 and DF 2 that share the same "Month of Year" and "Person". For example, both DF 1 and DF 2 contain the miles Joe ran in January and the miles Bob ran in February. For each of these common rows, I want to calculate how much "Miles Run" differs between the two data frames.

Any idea how to pull out the rows from two data frames that match on two column values?

DF 1:

Month of Year   Miles Run   Person 
   January      6.7458      Joe 
   February     1.3808      Bob
   March        11.2689     Jill  
   April        9.8917      Sarah  

DF 2:

Month of Year   Miles Run   Person 
   November     5.5234      Andrew 
   December     7.4523      Kyle
   January      9.1189      Joe  
   February     7.4343      Bob

1 Answer

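Use set_index on both data frames and let pandas' intrinsic data alignment perform the subtraction. Rows that appear in only one data frame produce NaN, which fillna(0) replaces with 0: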

(DF1.set_index(['Month of Year','Person']) - DF2.set_index(['Month of Year','Person'])).fillna(0)

Output:

                      Miles Run
Month of Year Person           
April         Sarah      0.0000
December      Kyle       0.0000
February      Bob       -6.0535
January       Joe       -2.3731
March         Jill       0.0000
November      Andrew     0.0000
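
Since the question asks only for the rows common to both data frames, the following may be closer to what is wanted. It is a minimal sketch, not part of the original answer: the variable names `df1`, `df2`, `keys`, `common`, and `aligned`, the `_df1`/`_df2` suffixes, and the "Miles Run Diff" column are all assumptions chosen for illustration. It rebuilds the sample data from the question, then shows an inner merge on the two key columns and, as an alternative, the same set_index subtraction with dropna in place of fillna(0).

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Month of Year": ["January", "February", "March", "April"],
    "Miles Run": [6.7458, 1.3808, 11.2689, 9.8917],
    "Person": ["Joe", "Bob", "Jill", "Sarah"],
})
df2 = pd.DataFrame({
    "Month of Year": ["November", "December", "January", "February"],
    "Miles Run": [5.5234, 7.4523, 9.1189, 7.4343],
    "Person": ["Andrew", "Kyle", "Joe", "Bob"],
})

keys = ["Month of Year", "Person"]

# Option 1: inner merge keeps only rows whose keys appear in both frames.
# The suffixes (illustrative names) distinguish the two "Miles Run" columns.
common = df1.merge(df2, on=keys, how="inner", suffixes=("_df1", "_df2"))
common["Miles Run Diff"] = common["Miles Run_df1"] - common["Miles Run_df2"]
print(common)

# Option 2: index alignment as in the answer, but drop the unmatched
# rows (NaN after subtraction) instead of filling them with 0.
aligned = (df1.set_index(keys) - df2.set_index(keys)).dropna()
print(aligned)
```

Both options leave only the Joe/January and Bob/February rows. The merge version also keeps the two original "Miles Run" values side by side, which can help when checking the differences.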

1 Comment

Works perfectly. Thanks
