I have two data frames that share a column (A), but its values only partly overlap: some appear in both frames and some in only one. I want to compare the two columns and keep only the common values.

df1 :

  A B C
  1 1 1
  2 4 6
  3 7 9
  4 9 0
  6 0 1

df2 :

  A D E
  1 5 7
  5 6 9
  2 3 5
  7 6 8
  3 7 0

This is what I expect after the comparison:

df2 :

  A D E
  1 5 7
  2 3 5
  3 7 0
  • Perhaps this can be useful? stackoverflow.com/questions/26921943/… Commented Jul 26, 2019 at 11:53
  • I guess you want to use something like merging df1 and df2 based on the column A Commented Jul 26, 2019 at 11:56
  • I don't want to merge them, I want to save the common points only. Commented Jul 26, 2019 at 12:20

1 Answer

You can use pd.Index.intersection() to find the columns the two frames have in common, do an inner merge on them, and finally reindex() to keep only df2's columns:

# columns present in both frames (here only "A")
match = df2.columns.intersection(df1.columns).tolist()
# inner merge on the shared columns, then keep only df2's columns
df2.merge(df1, on=match).reindex(df2.columns, axis=1)

   A  D  E
0  1  5  7
1  2  3  5
2  3  7  0
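
If the goal is to filter df2 rather than merge the two frames, a boolean mask built with Series.isin may be closer to what the question describes. The following is a minimal sketch, assuming A is the only column being compared; it rebuilds the sample frames from the question, and the name common is only illustrative:

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"A": [1, 2, 3, 4, 6], "B": [1, 4, 7, 9, 0], "C": [1, 6, 9, 0, 1]})
df2 = pd.DataFrame({"A": [1, 5, 2, 7, 3], "D": [5, 6, 3, 6, 7], "E": [7, 9, 5, 8, 0]})

# Keep the rows of df2 whose A value also occurs somewhere in df1["A"].
# No columns from df1 are pulled in, so no reindex step is needed.
common = df2[df2["A"].isin(df1["A"])].reset_index(drop=True)
print(common)
```

Because isin only tests membership, each row of df2 appears at most once in the result, however often its A value occurs in df1.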

3 Comments

Using this, I am getting the merged values and not the common ones.
@Bhavishya What is the difference? Can you elaborate? This gives exactly what you posted as the desired output.
I have around 250k entries in df1 and around 1.1m entries in df2, so after keeping just the common values I should get at most 250k entries in df2, but I'm getting around 1m entries instead.
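
One possible explanation for the row count, which the thread does not confirm, is that column A contains repeated values. An inner merge emits one row per matching pair, so a key that occurs twice in df1 duplicates the matching df2 row. The sketch below uses small hypothetical frames (not the asker's data) to compare the merge with two alternatives that do not multiply rows:

```python
import pandas as pd

# Hypothetical data: the key 1 occurs twice in df1.
df1 = pd.DataFrame({"A": [1, 1, 2], "B": [10, 11, 12]})
df2 = pd.DataFrame({"A": [1, 2, 9], "D": [5, 3, 6]})

# Inner merge: the df2 row with A == 1 matches two df1 rows and is repeated.
merged = df2.merge(df1, on="A").reindex(df2.columns, axis=1)

# Membership filter: each df2 row is kept at most once.
filtered = df2[df2["A"].isin(df1["A"])]

# Merge against the de-duplicated keys of df1: also one output row per df2 row.
deduped = df2.merge(df1[["A"]].drop_duplicates(), on="A")

print(len(merged), len(filtered), len(deduped))  # 3 2 2
```

Note that if df2 itself repeats values of A, the filtered result can still contain more rows than df1 has, since every df2 row with a common value is kept.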
