
I have a CSV file and need to remove duplicate values under street_name. For example, a single row can contain hwy-1w multiple times.

I used this line to build the file, which joins every street_name per roadId, duplicates included:

joinedResult.groupby('roadId')['street_name'].apply(', '.join).reset_index().to_csv(f'./2{areaId}.csv', index=False)
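A minimal sketch of the situation (the column values here are invented; only roadId and street_name come from the question):

import pandas as pd

# Hypothetical stand-in for joinedResult
joinedResult = pd.DataFrame({
    'roadId': [1, 1, 1, 2],
    'street_name': ['hwy-1w', 'hwy-1w', 'main st', 'hwy-1w'],
})

areaId = 42  # placeholder value
out = joinedResult.groupby('roadId')['street_name'].apply(', '.join).reset_index()
print(out)
#    roadId              street_name
# 0       1  hwy-1w, hwy-1w, main st   <- duplicated hwy-1w
# 1       2                   hwy-1w
out.to_csv(f'./2{areaId}.csv', index=False)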

• Can you please provide a minimal reproducible example so others can reproduce this problem? Screenshots cannot be reproduced, and one can't see a thing in this one anyway. Assuming you have them in a column of a DataFrame, you can use df["street_name"].unique(). See the pandas docs. – Commented Mar 23, 2022 at 21:42
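For reference, a minimal sketch of the unique() suggestion from that comment (sample data is invented):

import pandas as pd

df = pd.DataFrame({'street_name': ['hwy-1w', 'hwy-1w', 'main st']})
print(df['street_name'].unique())  # ['hwy-1w' 'main st'] -- duplicates dropped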

1 Answer


If you want unique values per row, this question might be of help. If you want to keep the data in the row and don't care about the order of the strings afterwards, maybe this could help:

df['street_name'] = df['street_name'].apply(lambda x: ', '.join(set(x.split(', '))))

Converting to a set is a quick way to remove duplicates.
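For example, on a plain string (the sample value is invented), duplicates disappear but the order may change, since set iteration order is arbitrary:

names = 'hwy-1w, main st, hwy-1w'
print(', '.join(set(names.split(', '))))
# e.g. 'main st, hwy-1w' -- duplicates removed, order not guaranteed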

If you need to preserve order, you can use a Counter. It will be slower than using sets though:

from collections import Counter
df['street_name'] = df['street_name'].apply(lambda x: ', '.join(Counter(x.split(', ')).keys()))
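A quick usage sketch on an invented one-row DataFrame:

import pandas as pd
from collections import Counter

df = pd.DataFrame({'street_name': ['hwy-1w, hwy-1w, main st']})
df['street_name'] = df['street_name'].apply(lambda x: ', '.join(Counter(x.split(', ')).keys()))
print(df['street_name'].iloc[0])  # 'hwy-1w, main st' -- first-seen order preserved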

