Remove duplicate columns in pandas

Question

I am trying to delete columns that contain duplicate data in pandas. For example, in the following DataFrame the columns 'one' and 'three' hold the same data under different column names:

import pandas as pd
df1 = pd.DataFrame({'one': [1, 2, 3, 4], 'two': ['a', 'b', 'c', 'd'], 'three': [1, 2, 3, 4]})
   one two  three
0    1   a      1
1    2   b      2
2    3   c      3
3    4   d      4

I would like to get this result:

  one two
0   1   a
1   2   b
2   3   c
3   4   d

The method I use now is:

df2 = df1.T.drop_duplicates().T

But this is too inefficient. Is there a better way?

Any help is appreciated, thanks.

halfer · Accepted Answer · 2021-05-08 12:06:00Z


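I tried to improve efficiency a little by transposing only the integer columns. If they collapse to a single row after drop_duplicates, all integer columns are identical, so only the first one is kept: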

In [935]: df_int = df1.select_dtypes(include=['int'])
In [933]: df_other = df1.select_dtypes(exclude=['int'])

In [949]: if df_int.T.drop_duplicates().shape[0] == 1:
     ...:     res = pd.concat([df_int.iloc[:,0], df_other], axis=1)
     ...: 

In [950]: res
Out[950]: 
   one two
0    1   a
1    2   b
2    3   c
3    4   d
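The check above only handles the case where every integer column is a copy of the others. A minimal sketch of a more general variant, assuming the duplicates only occur among the integer columns: deduplicate just the integer block with a transpose, then put the columns back in their original order. Transposing a single-dtype block keeps its integer dtype, whereas transposing the whole mixed frame yields object columns. The names `int_unique` and `res_general` are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'one': [1, 2, 3, 4],
                    'two': ['a', 'b', 'c', 'd'],
                    'three': [1, 2, 3, 4]})

df_int = df1.select_dtypes(include=['int'])
df_other = df1.select_dtypes(exclude=['int'])

# Transpose only the integer block, drop duplicate rows (the original
# columns), and transpose back. The block has one dtype, so it stays integer.
int_unique = df_int.T.drop_duplicates().T

# Reattach the untouched columns and restore the original column order.
res_general = pd.concat([int_unique, df_other], axis=1)
res_general = res_general[[c for c in df1.columns if c in res_general.columns]]

print(res_general)
print(res_general.dtypes)
```

This still transposes, but only the integer part of the frame, so it may be cheaper than `df1.T.drop_duplicates().T` when many columns are non-integer.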

To avoid the transpose entirely, you can do something like this:

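In [995]: import numpy as np
In [997]: # Differences between adjacent columns are all zero only if every
     ...: # integer column is identical. Testing each difference (rather than
     ...: # their sum) avoids false positives when +1 and -1 cancel out.
     ...: if (np.diff(df_int.values, axis=1) == 0).all():
     ...:     res = pd.concat([df_int.iloc[:,0], df_other], axis=1)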
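As a sketch of a transpose-free approach that does not assume all integer columns are identical, each column can be compared against the columns already kept using `Series.equals`, which compares values and dtype and ignores the column name. The helper name `drop_duplicate_columns` is hypothetical, and this is one possible approach rather than the method from the answer above.

```python
import pandas as pd

def drop_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first of each group of columns that hold identical values."""
    keep = []  # positions of the columns to keep
    for i in range(df.shape[1]):
        col = df.iloc[:, i]
        # Series.equals is False when the dtypes differ, so an int column
        # and a float column with the same numbers are both kept.
        if not any(col.equals(df.iloc[:, j]) for j in keep):
            keep.append(i)
    return df.iloc[:, keep]

df1 = pd.DataFrame({'one': [1, 2, 3, 4],
                    'two': ['a', 'b', 'c', 'd'],
                    'three': [1, 2, 3, 4]})
result = drop_duplicate_columns(df1)
print(result)
```

Each comparison is vectorized, but in the worst case every column is compared with every kept column, so the cost grows with the square of the number of distinct columns. Whether this beats the transpose on a large frame would need to be measured on the actual data.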

edited May 8, 2021 at 12:06

halfer


answered Sep 28, 2020 at 8:49

Mayank Porwal



2 Comments

Sakura Maozi Over a year ago

Thank you for your help. This does improve efficiency, but the data I need to process is very large, and the transposition takes too much time. If possible, I would prefer not to use a transpose.

Mayank Porwal Over a year ago

I've updated my answer to not have transpose at all. Let me know if this helps you.
