Comparing the rows of two pandas data frames by the column values

Question

I have two data frames which represent data pulled from two different pedometers that record how many miles a given person ran for a specific month.

I want to calculate the difference in "Miles Run" for rows in DF 1 and DF 2 that share the same "Month of Year" and "Person". For example, both DF 1 and DF 2 contain the miles Joe ran in January and the miles Bob ran in February. For each of these common rows, I want to calculate the difference in "Miles Run".

How can I pull out the rows from two data frames that match on two column values?

DF 1:

Month of Year   Miles Run   Person 
   January      6.7458      Joe 
   February     1.3808      Bob
   March        11.2689     Jill  
   April        9.8917      Sarah

DF 2:

Month of Year   Miles Run   Person 
   November     5.5234      Andrew 
   December     7.4523      Kyle
   January      9.1189      Joe  
   February     7.4343      Bob

Scott Boston · Accepted Answer · 2017-07-19 23:13:50Z

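Use set_index and let pandas' intrinsic data alignment perform the subtraction. Rows are matched on the ('Month of Year', 'Person') index; a pair present in only one frame yields NaN, which fillna(0) then replaces with 0: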

(DF1.set_index(['Month of Year','Person']) - DF2.set_index(['Month of Year','Person'])).fillna(0)
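To try the one-liner above, the two frames first have to exist. The following is a minimal sketch that rebuilds them from the tables in the question; building them from dictionaries is an assumption, since the question does not show how the frames were created, and the names keys and result are only illustrative.

```python
import pandas as pd

# Frames rebuilt from the tables in the question (construction method is assumed).
DF1 = pd.DataFrame({
    "Month of Year": ["January", "February", "March", "April"],
    "Miles Run": [6.7458, 1.3808, 11.2689, 9.8917],
    "Person": ["Joe", "Bob", "Jill", "Sarah"],
})
DF2 = pd.DataFrame({
    "Month of Year": ["November", "December", "January", "February"],
    "Miles Run": [5.5234, 7.4523, 9.1189, 7.4343],
    "Person": ["Andrew", "Kyle", "Joe", "Bob"],
})

keys = ["Month of Year", "Person"]  # illustrative name for the two match columns

# Subtraction aligns on the (month, person) index. Pairs found in only one
# frame become NaN, and fillna(0) replaces those NaN values with 0.
result = (DF1.set_index(keys) - DF2.set_index(keys)).fillna(0)
print(result)
```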

Output:

                      Miles Run
Month of Year Person           
April         Sarah      0.0000
December      Kyle       0.0000
February      Bob       -6.0535
January       Joe       -2.3731
March         Jill       0.0000
November      Andrew     0.0000
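Note that in the output above a 0 can mean either "no difference" or "no matching row". If only the rows common to both frames are wanted, an inner merge on the two key columns is one alternative. This is a sketch of a different technique than the one in the answer; the suffixes "_1" and "_2", the variable common, and the column name "Miles Diff" are hypothetical choices made for illustration.

```python
import pandas as pd

# Frames rebuilt from the tables in the question (construction method is assumed).
DF1 = pd.DataFrame({
    "Month of Year": ["January", "February", "March", "April"],
    "Miles Run": [6.7458, 1.3808, 11.2689, 9.8917],
    "Person": ["Joe", "Bob", "Jill", "Sarah"],
})
DF2 = pd.DataFrame({
    "Month of Year": ["November", "December", "January", "February"],
    "Miles Run": [5.5234, 7.4523, 9.1189, 7.4343],
    "Person": ["Andrew", "Kyle", "Joe", "Bob"],
})

keys = ["Month of Year", "Person"]

# An inner join keeps only the (month, person) pairs present in both frames.
# The suffixes distinguish the two "Miles Run" columns after the merge.
common = DF1.merge(DF2, on=keys, how="inner", suffixes=("_1", "_2"))

# Difference for the matching rows only (DF1 minus DF2, as in the answer).
common["Miles Diff"] = common["Miles Run_1"] - common["Miles Run_2"]
print(common)
```

With the question's data this leaves two rows, Joe in January and Bob in February, so unmatched rows never show up as zeros.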

edited Jul 19, 2017 at 23:13

answered Jul 19, 2017 at 23:11


1 Comment

Shabina Rayan Over a year ago

Works perfectly. Thanks
