How to remove duplicate values using pandas and keep any one [duplicate]

Question

I have a data-frame which looks like:

A       B       C       D       E
a       aa      1       2       3
b       aa      4       5       6
c       cc      7       8       9
d       cc      11      10      3
e       dd      71      81      91

Rows 1 and 2, and rows 3 and 4, have duplicate values in column B. I want to keep only one row from each pair.

The Final output should be:

A       B       C       D       E
a       aa      1       2       3
c       cc      7       8       9
e       dd      71      81      91

How can I use pandas to accomplish this?

pradeexsu · Accepted Answer · 2020-10-03 18:37:41Z

3

DataFrame.drop_duplicates(subset="B", keep='first')

keep: controls which of the rows with a duplicated value are kept.

It accepts three values, and the default is 'first'.
If 'first', the first occurrence is kept and the later rows with the same value are dropped.
If 'last', the last occurrence is kept and the earlier rows with the same value are dropped.
If False, every row whose value occurs more than once is dropped, so none of them is kept.
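
To make the three settings concrete, here is a minimal sketch that rebuilds the question's data and applies each value of keep to column B. The variable names (df, first, last, neither) are only illustrative.

```python
import pandas as pd

# The data frame from the question.
df = pd.DataFrame({
    "A": ["a", "b", "c", "d", "e"],
    "B": ["aa", "aa", "cc", "cc", "dd"],
    "C": [1, 4, 7, 11, 71],
    "D": [2, 5, 8, 10, 81],
    "E": [3, 6, 9, 3, 91],
})

# Keep the first row of each group of equal B values.
first = df.drop_duplicates(subset="B", keep="first")
# Keep the last row of each group instead.
last = df.drop_duplicates(subset="B", keep="last")
# Drop every row whose B value appears more than once.
neither = df.drop_duplicates(subset="B", keep=False)

print(first["A"].tolist())    # ['a', 'c', 'e']
print(last["A"].tolist())     # ['b', 'd', 'e']
print(neither["A"].tolist())  # ['e']
```

With keep='first' the result matches the output requested in the question.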

BENY · Accepted Answer · 2020-10-03 18:32:27Z

3

Try drop_duplicates

df = df.drop_duplicates('B')
   A   B   C   D   E
0  a  aa   1   2   3
2  c  cc   7   8   9
4  e  dd  71  81  91
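
Note that the result keeps the original index labels (0, 2, 4). If a fresh 0, 1, 2 index is preferred, one option is the ignore_index parameter. This is a sketch that assumes pandas 1.0 or later, where drop_duplicates accepts ignore_index; on older versions, calling reset_index(drop=True) on the result should give the same effect.

```python
import pandas as pd

# The data frame from the question.
df = pd.DataFrame({
    "A": ["a", "b", "c", "d", "e"],
    "B": ["aa", "aa", "cc", "cc", "dd"],
    "C": [1, 4, 7, 11, 71],
    "D": [2, 5, 8, 10, 81],
    "E": [3, 6, 9, 3, 91],
})

# Drop duplicates on B and renumber the remaining rows from 0.
result = df.drop_duplicates(subset="B", ignore_index=True)

print(result.index.tolist())  # [0, 1, 2]
```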

Sivaram Rasathurai · Accepted Answer · 2020-10-03 18:37:55Z

2

In the general case, you may need to drop duplicates based on several columns together. In that case, pass a list of column names as follows:

df.drop_duplicates(subset=['A', 'C'], keep='first')

The subset argument names the columns to compare, and the keep argument says which of the duplicate rows to keep:

first : Drop duplicates except for the first occurrence.
last : Drop duplicates except for the last occurrence.
False : Drop all duplicates.
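
As a rough illustration of the multi-column case, the sketch below uses a small made-up frame (the data and the names df2 and deduped are hypothetical, not from the question). Two rows count as duplicates only when both A and C match.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 share the same (A, C) pair.
df2 = pd.DataFrame({
    "A": ["x", "x", "x", "y"],
    "C": [1, 1, 2, 1],
    "D": [10, 20, 30, 40],
})

# Rows are duplicates only if they agree on BOTH A and C.
deduped = df2.drop_duplicates(subset=["A", "C"], keep="first")

print(deduped["D"].tolist())  # [10, 30, 40]
```

Row 1 is dropped because its (A, C) pair repeats row 0. Rows 2 and 3 survive because each matches row 0 in only one of the two columns.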
