Return entries with common columns values in pandas DataFrame - python

Question

I have a pandas DataFrame with several rows of integer values, for example:

   A  B  C  D  E  F  G  H
0  1  2  1  0  1  2  1  2  
1  0  1  1  1  1  2  1  2
2  1  2  1  2  1  2  1  3
3  0  1  1  1  1  2  1  2 
4  2  2  1  2  1  2  1  3

I would like to return only the rows whose values match another row in every column. The result should be:

   A  B  C  D  E  F  G  H  
1  0  1  1  1  1  2  1  2
3  0  1  1  1  1  2  1  2

Thanks in advance

EdChum · Accepted Answer · 2017-05-02 10:28:59Z

2

You can index with the boolean mask returned by duplicated, passing keep=False:

In [3]:
df[df.duplicated(keep=False)]

Out[3]:
   A  B  C  D  E  F  G  H
1  0  1  1  1  1  2  1  2
3  0  1  1  1  1  2  1  2

Here is the mask showing which rows are duplicates. Passing keep=False marks every duplicated row as True; the default, keep='first', leaves the first occurrence unmarked and flags only the later repeats:

In [4]:
df.duplicated(keep=False)

Out[4]:
0    False
1     True
2    False
3     True
4    False
dtype: bool
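
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the question's frame can be rebuilt from the values shown there; the names mask and result are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
df = pd.DataFrame(
    [
        [1, 2, 1, 0, 1, 2, 1, 2],
        [0, 1, 1, 1, 1, 2, 1, 2],
        [1, 2, 1, 2, 1, 2, 1, 3],
        [0, 1, 1, 1, 1, 2, 1, 2],
        [2, 2, 1, 2, 1, 2, 1, 3],
    ],
    columns=list("ABCDEFGH"),
)

# keep=False marks every row that has an identical twin somewhere in the frame.
mask = df.duplicated(keep=False)

# Boolean indexing keeps only the marked rows (index labels 1 and 3 here).
result = df[mask]
print(result)
```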

answered May 2, 2017 at 10:28


jezrael · Accepted Answer · 2017-05-02 10:50:20Z

1

Use duplicated with keep=False to flag all duplicates, then filter with boolean indexing:

print (df.duplicated(keep=False))
0    False
1     True
2    False
3     True
4    False
dtype: bool

df = df[df.duplicated(keep=False)]
print (df)
   A  B  C  D  E  F  G  H
1  0  1  1  1  1  2  1  2
3  0  1  1  1  1  2  1  2

If you want to leave out the first or last occurrence of each duplicated row, use:

df1 = df[df.duplicated()]
# same as keep='first', which is the default, so it can be omitted
#df1 = df[df.duplicated(keep='first')]
print (df1)
   A  B  C  D  E  F  G  H
3  0  1  1  1  1  2  1  2

df2 = df[df.duplicated(keep='last')]
print (df2)
   A  B  C  D  E  F  G  H
1  0  1  1  1  1  2  1  2
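
Note that the snippets above reassign df to the filtered frame before calling duplicated again. As a rough sketch, the three keep options can also be compared side by side on the original, unfiltered data; the frame is rebuilt here from the question's values, and the names first_kept, last_kept, all_marked and summary are only illustrative.

```python
import pandas as pd

# Original frame from the question, not the filtered one.
df = pd.DataFrame(
    [
        [1, 2, 1, 0, 1, 2, 1, 2],
        [0, 1, 1, 1, 1, 2, 1, 2],
        [1, 2, 1, 2, 1, 2, 1, 3],
        [0, 1, 1, 1, 1, 2, 1, 2],
        [2, 2, 1, 2, 1, 2, 1, 3],
    ],
    columns=list("ABCDEFGH"),
)

first_kept = df.duplicated(keep="first")  # first occurrence left unmarked
last_kept = df.duplicated(keep="last")    # last occurrence left unmarked
all_marked = df.duplicated(keep=False)    # every duplicated row marked

# One column per option, one row per original index label.
summary = pd.DataFrame(
    {"first": first_kept, "last": last_kept, "False": all_marked}
)
print(summary)
```

On this data the same rows come out either way, since rows 1 and 3 are the only pair of identical rows.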

edited May 2, 2017 at 10:50

answered May 2, 2017 at 10:29
