I have two dataframes with exactly the same structure. I need to compare them to find any records that differ because a column value is different.

I am using the code below, and it correctly reports whether the two dataframes tie out.

import pandas as pd

# Stack both frames; a row present in both then appears twice.
df = pd.concat([df1, df2]).reset_index(drop=True)
df_gpby = df.groupby(list(df.columns))
# Keep the index of every row whose group has a single member.
idx = [x[0] for x in df_gpby.groups.values() if len(x) == 1]
diff = df.reindex(idx)
if diff.empty:
    print('everything is good.')
else:
    print('things do not tie out')
    diff.to_csv('diff.csv', index=False)

Although diff.csv tells me which records are missing or different, it doesn't tell me which dataframe each record originally came from, or which column values differ between the two dataframes for a given record. Is there a way to include this information in my final output?

Sample dataframes (df1 first, then df2):

   Name | Age| Gender
0| Naxi | 27 | Male
1| Karan| 25 | Male
2| Tanya| 27 | Female


   Name | Age| Gender
0| Naxi | 27 | Male
1| Tanya| 27 | Female
2| Karan| 24 | Male

Desired output:

   Name | Age| Gender | Dataframe
   Karan| 24 | Male   | df2
   Karan| 25 | Male   | df1
  • can you add sample dataframes? Commented Apr 19, 2021 at 14:59
  • @Nk03 added the dataframes Commented Apr 19, 2021 at 15:15

1 Answer

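You can add a column to each dataframe that records its source, then ignore that column when dropping duplicates after pd.concat. With keep=False, every row that appears in both dataframes is dropped, so only the rows that differ remain.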

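# Tag each row with the dataframe it came from.
df1['Dataframe'] = 'df1'
df2['Dataframe'] = 'df2'
df = pd.concat([df1, df2])
# Compare on the data columns only; keep=False drops every copy of a duplicated row.
diff_df = df.drop_duplicates(subset=['Name', 'Age', 'Gender'], keep=False)
print(diff_df)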

Output:

    Name  Age Gender Dataframe
1  Karan   25   Male       df1
2  Karan   24   Male       df2

The index in the output is each row's original index in its source dataframe, so you can use it to locate the row there.
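
The question also asks which column values differ for a given record, which the snippet above does not report. Below is a minimal sketch of one way to get both pieces of information. It swaps in a different technique (an outer merge with indicator=True, plus DataFrame.compare, which needs pandas 1.1 or later) and leaves df1 and df2 unmodified. The helper names tag_differences and column_differences are hypothetical, and the second helper assumes that Name uniquely identifies a record in both dataframes, which the question does not state.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({
    "Name": ["Naxi", "Karan", "Tanya"],
    "Age": [27, 25, 27],
    "Gender": ["Male", "Male", "Female"],
})
df2 = pd.DataFrame({
    "Name": ["Naxi", "Tanya", "Karan"],
    "Age": [27, 27, 24],
    "Gender": ["Male", "Female", "Male"],
})


def tag_differences(left, right, left_name="df1", right_name="df2"):
    """Return rows found in only one frame, labelled with their source.

    Hypothetical helper: merges on all shared columns, so both frames
    are assumed to have identical column names.
    """
    # indicator=True adds a `_merge` column holding
    # 'left_only', 'right_only' or 'both' for every row.
    merged = left.merge(right, how="outer", indicator=True)
    only = merged[merged["_merge"] != "both"].copy()
    labels = {"left_only": left_name, "right_only": right_name}
    only["Dataframe"] = only["_merge"].astype(str).map(labels)
    return only.drop(columns="_merge").reset_index(drop=True)


def column_differences(left, right, key):
    """For records sharing `key`, show only the columns whose values differ.

    Assumption: `key` is unique within each frame.
    """
    a = left.set_index(key).sort_index()
    b = right.set_index(key).sort_index()
    # compare() requires identically labelled frames, so restrict
    # both sides to the keys they have in common.
    common = a.index.intersection(b.index)
    return a.loc[common].compare(b.loc[common])


print(tag_differences(df1, df2))
print(column_differences(df1, df2, key="Name"))
```

In the compare() result, the "self" columns hold the values from the first dataframe and the "other" columns the values from the second. One difference from the concat approach: drop_duplicates with keep=False also discards a row that is repeated within a single dataframe but absent from the other one, whereas the merge reports it as belonging to one side only.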
