I have two DataFrames and want to remove the rows in df1 whose value in column 'a' also appears in df2. Moreover, each matching value in df2 should remove only one row from df1.

import pandas as pd
df1 = pd.DataFrame({'a':[1,1,2,3,4,4],'b':[1,2,3,4,5,6],'c':[6,5,4,3,2,1]})
df2 = pd.DataFrame({'a':[2,4,2],'b':[1,2,3],'c':[6,5,4]})
# Expected result:
result = pd.DataFrame({'a':[1,1,3,4],'b':[1,2,4,6],'c':[6,5,3,1]})
  • Check with isin ~ Commented Aug 24, 2020 at 16:08

3 Answers

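Use Series.isin + Series.duplicated to create a boolean mask that marks the first row in df1 for each value of 'a' that also appears in df2, then use this mask to filter those rows out of df1. Note that this removes at most one row per distinct value, however many times that value occurs in df2:

# True where 'a' is present in df2 and this is the first time the value appears in df1
m = df1['a'].isin(df2['a']) & ~df1['a'].duplicated()
# Keep every row that is not marked
df = df1[~m]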

Result:

print(df)
   a  b  c
0  1  1  6
1  1  2  5
3  3  4  3
5  4  6  1
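
The mask above drops at most one row per distinct value. If a value that occurs twice in df2 should remove two rows from df1, one possible sketch is to number the repeats in each frame with groupby(...).cumcount() and then anti-join on the pair (value, occurrence number). The helper column name occ and the variable names left, right, merged and keep are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 1, 2, 3, 4, 4], 'b': [1, 2, 3, 4, 5, 6], 'c': [6, 5, 4, 3, 2, 1]})
df2 = pd.DataFrame({'a': [2, 4, 2], 'b': [1, 2, 3], 'c': [6, 5, 4]})

# Number each repeat of a value within its own frame: 0 for the first, 1 for the second, ...
# ('occ' is a hypothetical helper column name)
left = df1.assign(occ=df1.groupby('a').cumcount())
right = df2[['a']].assign(occ=df2.groupby('a').cumcount())

# A row of df1 is removed only when df2 holds an occurrence with the same (a, occ) pair.
# The (a, occ) pairs in `right` are unique, so the left merge keeps df1's row count and order.
merged = left.merge(right, on=['a', 'occ'], how='left', indicator=True)
keep = (merged['_merge'] == 'left_only').to_numpy()

result = df1[keep]
print(result)
```

On the sample data this keeps the same rows as the mask (index 0, 1, 3 and 5), because df1 contains the value 2 only once, so the second 2 in df2 has nothing left to remove.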

Try this:

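import pandas as pd

df1 = pd.DataFrame({'a':[1,1,2,3,4,4],'b':[1,2,3,4,5,6],'c':[6,5,4,3,2,1]})
df2 = pd.DataFrame({'a':[2,4,2],'b':[1,2,3],'c':[6,5,4]})

# Working copy of df2's values; entries are consumed as they are matched
df2a = df2['a'].tolist()

def remove_df2_dup(x):
    # Drop the row (return False) only while an unused copy of x remains in df2a
    if x in df2a:
        df2a.remove(x)
        return False
    return True

df1[df1.a.apply(remove_df2_dup)]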

It creates a list from df2['a'], then checks each value of df1['a'] against that list. Every match removes one entry from the list, so each occurrence of a value in df2 drops at most one row from df1.
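
The same bookkeeping can be sketched with collections.Counter, which replaces the list scan and list.remove with a per-value counter. This is only a variation on the idea above; the names remaining and keep_row are illustrative assumptions:

```python
from collections import Counter

import pandas as pd

df1 = pd.DataFrame({'a': [1, 1, 2, 3, 4, 4], 'b': [1, 2, 3, 4, 5, 6], 'c': [6, 5, 4, 3, 2, 1]})
df2 = pd.DataFrame({'a': [2, 4, 2], 'b': [1, 2, 3], 'c': [6, 5, 4]})

# How many rows each value of 'a' is still allowed to remove
remaining = Counter(df2['a'].tolist())

def keep_row(value):
    """Return False (drop the row) while df2 still has an unused copy of value."""
    if remaining[value] > 0:
        remaining[value] -= 1
        return False
    return True

result = df1[df1['a'].apply(keep_row)]
print(result)
```

Like the list version, keep_row is stateful, so remaining has to be rebuilt from df2 before the filter is run a second time.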

Try this:

df1=pd.DataFrame({'a':[1,1,2,3,4,4],'b':[1,2,3,4,5,6],'c':[6,5,4,3,2,1]})
df2=pd.DataFrame({'a':[2,4,2],'b':[1,2,3],'c':[6,5,4]})

for x in df2.a:
    # `x in df1.a` would test the index labels, not the values, so check .values instead
    if x in df1.a.values:
        df1.drop(df1[df1.a==x].index[0], inplace=True)

print(df1)
