1

I have two dataframes: 1 major df and 1 with rows that I want to delete in the major one (dfmatch). The major df has more columns than the dfmatch.

I only want to delete the rows in major df if column1, column2 AND column3 equals with the value in the corresponinding columns of dfmatch.

Column extra1 and extra2 should be available in dfnew as well.

My current script only shows the column headers instead of the remaining rows:

file = 'testdf.csv'
colnames=['column1', 'column2', 'column3', 'extra1', 'extra2'] 
df = pd.read_csv(file, names=colnames, header=None)

file = 'testdfmatch.csv'
colnames=['column1', 'column2', 'column3'] 
dfmatch = pd.read_csv(file, names=colnames, header=None)

dfnew = pd.concat([dfmatch,df,df], sort=False).drop_duplicates(['column1', 'column2', 'column3'], keep=False)

2 Answers 2

2

Sample data would have been useful. Lets try pd.merge, indicator=

dfnew  = pd.merge(df, dfmatch, how='left', indicator='Exist')
dfnew  = dfnew .loc[dfnew ['Exist'] != 'both']
dfnew.drop(columns=['Exist'], inplace=True) 
print(dfnew)
Sign up to request clarification or add additional context in comments.

Comments

0

The above code does what you want.

dfnew=df.append(dfmatch,ignore_index=True)
defnew.drop_duplicates(subset=['column1', 'column2', 'column3'],
                 keep = 'first', inplace = True)

It adds dfmatch below df creating dfnew. Then it removes the duplicate rows only using column1, 2 and 3 as a subset. It keeps only the first occurrence that corresponds to the initial rows from df which include extra1 and extra2.

I wouldn't suggest though using float values as a subset due to the float precision handling in python. Rows with NaN on extra1 and extra2 indicate that were originally on dfmatch.

4 Comments

The "duplicate" rows should be removed, instead of keeping the first occurence.
@Scripter Then you can set keep=False to remove all duplicates.
Thanks. Does this answer has memory advantages compared to the other answer?
@Scripter I am not sure about that. I ll have to try the previously suggested solutions myself to see if it does.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.