Consider the following two DataFrames in Python:

df:

code_1 other
19001 white
19009 blue
19008 red

df_1:

code_1 code_2
19001 00001
19001 00002
19009 00003
19008 00004

I want to merge df with df_1:

    df_merge = pd.merge(df, df_1, how="left", on=['code_1'])

df_merge:

code_1 other code_2
19001 white 00001
19001 white 00002
19009 blue 00003
19008 red 00004

I want the merge to keep only one row per code_1, using the first matching row from df_1. I could call drop_duplicates on [other, code_1] afterwards, but I would like to know whether the merge function accepts a parameter that does this directly.

Expected result:

code_1 other code_2
19001 white 00001
19009 blue 00003
19008 red 00004

1 Answer

As far as I know, there is no specific parameter of pandas.merge() that fits your needs, but you can get the same result by dropping duplicates before merging, assuming the only duplicates are in df_1:

    df_merge = df.merge(df_1.drop_duplicates('code_1'), how="left", on=['code_1'])
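
For reference, here is a self-contained sketch of the same approach built from the sample data in the question. Two things are assumptions on my part: code_2 is stored as strings so the leading zeros survive, and the 19008 row of df_1 uses 00004 to match the merge output shown in the question. The `validate="many_to_one"` argument is optional; it only makes pandas raise an error if the right-hand keys are still duplicated, it does not deduplicate anything by itself.

```python
import pandas as pd

# Sample data from the question; code_2 is kept as strings (assumption)
# so that the leading zeros are preserved.
df = pd.DataFrame({
    "code_1": [19001, 19009, 19008],
    "other": ["white", "blue", "red"],
})
df_1 = pd.DataFrame({
    "code_1": [19001, 19001, 19009, 19008],
    # Last value assumed to be "00004", matching the question's merge output.
    "code_2": ["00001", "00002", "00003", "00004"],
})

# Keep only the first df_1 row for each code_1 (keep="first" is the default).
first_per_code = df_1.drop_duplicates("code_1")

# Left merge; validate raises MergeError if code_1 is not unique on the right.
df_merge = df.merge(first_per_code, how="left", on="code_1",
                    validate="many_to_one")

print(df_merge)
```

Deduplicating before the merge also avoids building the larger intermediate result that a merge followed by drop_duplicates would create.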