3

I have DataFrame in Python like below where we can see duplicates for some ID:

ID COL1 COL2 COL3
123 XX 111 ENG
123 abc 111 ENG
444 ccc 2 o
444 ccc 2 o
67 a 89 xx

And I need to select rows where is situation like for ID = 123, where rows are duplicated but in some column / columns we have different value, so as an output I need something like below:

ID COL1 COL2 COL3
123 XX 111 ENG
123 abc 111 ENG

How can I do that in Python Pandas? I can add that in my real dataset I have many many more columns so I need to create solution whoch will be good for more columns not only ID,COL1,COL2,COL3 :)

3

3 Answers 3

1

first drop duplicates for all columns then find duplicates for id column. finally select same ids.

df = df.drop_duplicates()
mask = df.duplicated(subset=['ID'],keep=False)
df = df[mask]
Sign up to request clarification or add additional context in comments.

Comments

0

here is one way to do it

# drop the duplicates
df.drop_duplicates(inplace=True)

# groupby ID and filter the ones where group size is greater than 1
df[df.groupby('ID')['ID'].transform('size')>1]

    ID  COL1    COL2    COL3
0   123     XX  111     ENG
1   123     abc     111     ENG

alternately,

# preserve the original DF and create a secondary DF with non-duplicate rows
df2=df.drop_duplicates()

# using loc, select the rows in DF2 that has a group size exceeding 1
df2.loc[df2.groupby('ID')['ID'].transform('size')>1]

Comments

0

Using .query

df = df.query("ID.eq(123)").drop_duplicates().reset_index(drop=True)
print(df)

    ID COL1  COL2 COL3
0  123   XX   111  ENG
1  123  abc   111  ENG

Unless you aren't also trying to filter:

df = df.drop_duplicates().reset_index(drop=True)
print(df)

    ID COL1  COL2 COL3
0  123   XX   111  ENG
1  123  abc   111  ENG
2  444  ccc     2    o
3   67    a    89   xx

2 Comments

This only works when the id field is equal to 123.
Yeah hard to understand what you wanted exactly thats why I also gave a non filtering version.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.