Here I have a dataframe like the following:

Variable    Groups
1           [0-10]
1           [0-10]
2           [0-10]
2           [0-10]
3           [0-10]
3           [10-20]
4           [10-20]
4           [10-20]
5           [10-20]
5           [10-20]

I'd like to keep only unique values in the Variable column, but without losing duplicates that fall in different Groups. For example:

Variable    Groups
1           [0-10]
2           [0-10]
3           [0-10]
3           [10-20]
4           [10-20]
5           [10-20]

Note there is still a duplicate 3 because there was one in each group. I've tried

df_unique = df['Groups'].groupby(df['Variable']).unique().apply(pd.Series)

but this is just returning a complete mess. Not sure what to do, help appreciated.

3 Answers

You can use SeriesGroupBy.unique() together with .explode() and .reset_index(), as follows:

df.groupby('Variable')['Groups'].unique().explode().reset_index()

Another solution is to use GroupBy.first(), as follows:

df.groupby(['Variable', 'Groups'], as_index=False).first()

Result:

   Variable   Groups
0         1   [0-10]
1         2   [0-10]
2         3   [0-10]
3         3  [10-20]
4         4  [10-20]
5         5  [10-20]
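A self-contained sketch reproducing the question's data and both approaches (column names `Variable` and `Groups` taken from the question):

```python
import pandas as pd

df = pd.DataFrame({
    "Variable": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "Groups": ["[0-10]"] * 5 + ["[10-20]"] * 5,
})

# Approach 1: unique Groups per Variable, then flatten back to long form
out1 = df.groupby("Variable")["Groups"].unique().explode().reset_index()

# Approach 2: collapse to one row per (Variable, Groups) pair
out2 = df.groupby(["Variable", "Groups"], as_index=False).first()
```

Both give six rows: the duplicate 3 survives because it appears once in each group.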


Here is another option (using the question's column names):

df.groupby(['Variable', df['Groups'].explode()]).head(1)

2 Comments

You don't need to use .explode() here. It has no effect.
I believe you can't group by lists, so explode converts them to object dtype.
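As the comments note, `.explode()` only matters when a column holds lists; with the plain string labels from the question, grouping directly and keeping the first row of each pair works as well (a sketch, not the answerer's exact code):

```python
import pandas as pd

df = pd.DataFrame({
    "Variable": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "Groups": ["[0-10]"] * 5 + ["[10-20]"] * 5,
})

# Keep the first row of each (Variable, Groups) pair; since Groups holds
# plain strings here, no .explode() is needed before grouping.
out = df.groupby(["Variable", "Groups"]).head(1)
```

Note that `head(1)` preserves the original index, unlike the `reset_index()` variants above.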
You need to write an expression that combines the two columns and apply the uniqueness check to the combination.
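This answer doesn't spell out the expression; one way to read it (an assumption on my part) is to de-duplicate on both columns at once with `drop_duplicates`:

```python
import pandas as pd

df = pd.DataFrame({
    "Variable": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "Groups": ["[0-10]"] * 5 + ["[10-20]"] * 5,
})

# A row counts as a duplicate only if BOTH Variable and Groups match,
# so the 3 in [0-10] and the 3 in [10-20] are both kept.
out = df.drop_duplicates(subset=["Variable", "Groups"]).reset_index(drop=True)
```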
