I have a dataframe with two columns. The first column, say A, has duplicates, the second does not.

I have tried

df["A"].drop_duplicates(inplace=True)

but df still has the same number of rows afterwards. How can I drop the rows that repeat a value already seen in column "A", keeping only the first occurrence?

Example:

John Miller
John Smith
Mark Robinson
Jeffrey Robinson

should return

John Miller
Mark Robinson
Jeffrey Robinson

1 Answer

Use drop_duplicates with parameter subset:

df.drop_duplicates(subset=['A'], inplace=True)
print(df)
         A         B
0     John    Miller
2     Mark  Robinson
3  Jeffrey  Robinson

Docs:

subset : column label or sequence of labels, optional

Only consider certain columns for identifying duplicates, by default use all of the columns
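For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt from the example in the question, and the column name B and the variable names deduped, last_kept and renumbered are only assumptions for illustration. It assigns the result instead of using inplace=True, which behaves the same for this purpose. Note that the attempt in the question calls drop_duplicates on the Series df["A"], not on df, so the rows of df are not removed.

```python
import pandas as pd

# Sample data rebuilt from the question; the column name "B" is an assumption.
df = pd.DataFrame({
    "A": ["John", "John", "Mark", "Jeffrey"],
    "B": ["Miller", "Smith", "Robinson", "Robinson"],
})

# Compare rows on column A only; keep="first" is the default,
# so the first row seen for each value of A survives.
deduped = df.drop_duplicates(subset=["A"])
print(deduped)

# keep="last" would retain the last row for each value of A instead.
last_kept = df.drop_duplicates(subset=["A"], keep="last")

# The original index labels are preserved; reset them if a clean 0..n-1 index is wanted.
renumbered = deduped.reset_index(drop=True)
```

If the index gaps left by the removed rows (0, 2, 3 here) do not matter, the reset_index step can be skipped.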

1 Comment

Great, this is what I wanted.
