47

I have a DataFrame that has duplicated rows. I'd like to get a DataFrame with a unique index and no duplicates. It's OK to discard the duplicated values. Is this possible? Would it be done with groupby?

2 Answers

86
In [29]: df.drop_duplicates()
Out[29]: 
   b  c
1  2  3
3  4  0
7  5  9
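For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The sample data is an assumption (a frame with columns b and c and one fully repeated row), chosen so that the result matches what is shown.

```python
import pandas as pd

# Assumed sample data: the row (b=2, c=3) appears twice.
df = pd.DataFrame({"b": [2, 2, 4, 5], "c": [3, 3, 0, 9]}, index=[1, 1, 3, 7])

# drop_duplicates compares column values and keeps the first occurrence by default.
deduped = df.drop_duplicates()
print(deduped)
```

Note that the call returns a new DataFrame; df itself is left unchanged unless the result is assigned back.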

4 Comments

It's worthwhile to note this takes either the first or last occurrence. So you need to sort by some other quantity first (if you're lucky) or do some complicated groupby logic anyway.
This is wrong. drop_duplicates acts on the values only (at least in my version). You need to reset_index if you want to drop on index and values or just work with the index if you want to have a unique index. Maybe there is another way besides groupby to enforce unique index?
Use df.drop_duplicates(inplace=True) if you don't want to assign a new variable.
This does not give a DataFrame with a unique index; the solution by @Adam Greenhall below, however, works for that.
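To illustrate the point raised in these comments, here is a small sketch with assumed sample data in which two rows share an index label but differ in their values. It also shows one alternative to groupby for enforcing a unique index, using Index.duplicated; this is offered as one possible approach, not as the method from either answer.

```python
import pandas as pd

# Assumed sample data: index label 1 appears twice with different values in c.
df = pd.DataFrame({"b": [2, 2, 4], "c": [3, 8, 0]}, index=[1, 1, 3])

# drop_duplicates looks only at the column values, so both rows labelled 1 survive.
by_values = df.drop_duplicates()

# Index.duplicated marks repeated labels; negating the mask keeps the first row per label.
unique_index = df[~df.index.duplicated(keep="first")]

print(by_values.index.is_unique)
print(unique_index.index.is_unique)
```

Passing keep="last" instead would keep the final row for each label, so sorting beforehand controls which row survives.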
11

Figured out one way to do it by reading the split-apply-combine documentation examples.

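import pandas

df = pandas.DataFrame({'b': [2, 2, 4, 5], 'c': [3, 3, 0, 9]}, index=[1, 1, 3, 7])
# Group rows by index label and take the first entry of each column per label.
df_unique = df.groupby(level=0).first()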

df
   b  c
1  2  3
1  2  3
3  4  0
7  5  9

df_unique
   b  c
1  2  3
3  4  0
7  5  9
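One caveat that may be worth checking on your own data: GroupBy.first() takes the first non-null entry of each column within a group, so when the duplicated rows contain NaN it can combine values from different rows. The sketch below uses assumed sample data to show the difference from simply keeping the first row per label.

```python
import numpy as np
import pandas as pd

# Assumed sample data: the first row labelled 1 has a missing value in b.
df = pd.DataFrame({"b": [np.nan, 2, 4], "c": [3, 8, 0]}, index=[1, 1, 3])

# first() skips NaN per column, so label 1 gets b from the second row and c from the first.
merged = df.groupby(level=0).first()

# Keeping the first row per label leaves the NaN in place.
kept = df[~df.index.duplicated(keep="first")]

print(merged)
print(kept)
```

If the duplicated rows are exact copies, as in the example above, both approaches give the same rows.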

3 Comments

This relies on the row index being duplicated wherever the data fields (b, c) are duplicated, which effectively makes the index part of the row vector that you want to be unique.
If you have duplicated index entries, this is the answer you want.
I was getting ValueError: Index contains duplicate entries, cannot reshape when doing unstack on a MultiIndex. This solution works for that too, only I had to do df_unique = df.groupby(level=[0,1]).first()
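The MultiIndex case from the last comment can be sketched as follows. The index names (outer, inner), the column name val, and the data are all hypothetical, chosen only to give a frame with one repeated index pair.

```python
import pandas as pd

# Hypothetical two-level index in which the pair ("a", "x") appears twice.
idx = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")],
    names=["outer", "inner"],
)
df = pd.DataFrame({"val": [1, 1, 2, 3]}, index=idx)

# Group on both index levels so that each (outer, inner) pair appears once.
df_unique = df.groupby(level=[0, 1]).first()

# With a unique index, unstack can pivot the inner level into columns.
wide = df_unique.unstack()
print(wide)
```

Combinations that do not occur in the data, such as ("b", "y") here, come out as NaN after unstack.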
