How to replace index in a data frame

Question

I have a data frame as follows:

import pandas as pd

df = pd.DataFrame({'year': [2010, 2011, 2012, 2015, 2016, 2017],
                   'sales': [10, 12, 13, 9, 11, 7],
                   'Groups': ['AA', 'BB', 'AA', 'AA', 'CC', 'CC']})

What I am trying to do is map the 'Groups' column to an integer index value, so that members of the same group are assigned the same index number. Something like this:

Index year  sales Groups
1     2010     10     AA
2     2011     12     BB
1     2012     13     AA
1     2015      9     AA
3     2016     11     CC
3     2017      7     CC

I was thinking of using set_index, but I am not sure if that is the right approach.

Thanks for any help.

BENY · Accepted Answer · 2019-04-30 02:15:13Z

Using groupby with ngroup:

df.index = df.groupby('Groups').ngroup() + 1

Or using factorize or cat.codes:

df.index = pd.factorize(df.Groups)[0] + 1

df.index = df.Groups.astype('category').cat.codes + 1
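As a rough illustration, the three options can be run side by side on the data frame from the question. The variable names (ngroup_ids, factorize_ids, cat_ids, unsorted and so on) are made up for this sketch. One point that may matter: with default settings, ngroup and cat.codes number the groups by sorted key, while factorize numbers them by order of first appearance. The results coincide here only because the groups already appear in sorted order.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2010, 2011, 2012, 2015, 2016, 2017],
    'sales': [10, 12, 13, 9, 11, 7],
    'Groups': ['AA', 'BB', 'AA', 'AA', 'CC', 'CC'],
})

# Each call yields a 0-based group number per row; +1 makes it 1-based.
ngroup_ids = df.groupby('Groups').ngroup() + 1
factorize_ids = pd.factorize(df['Groups'])[0] + 1
cat_ids = df['Groups'].astype('category').cat.codes + 1

# Hypothetical data where the first group seen is not the smallest key.
unsorted = pd.Series(['BB', 'AA', 'BB'], name='Groups')
by_sorted_key = unsorted.to_frame().groupby('Groups').ngroup() + 1
by_appearance = pd.factorize(unsorted)[0] + 1

# Assign plain values so the index does not inherit a Series name.
df.index = ngroup_ids.to_numpy()
df.index.name = 'Index'
print(df)
```

If the numbering should follow first appearance rather than sorted order, factorize (or groupby with sort=False) is probably the closer fit.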

Aditya Patel · Accepted Answer · 2019-04-30 02:23:01Z

Is there a reason you aren't sorting first?

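Otherwise, you can try this:

df = df.sort_values('Groups')
df['index'] = df['Groups'].rank(method='dense').astype(int)

A dense rank gives every row in the same group the same number, with no gaps between groups. rank returns floats, hence the cast to int. Note that this stores the numbers in a column named 'index' rather than in the data frame's index.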
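A minimal sketch of the rank-based approach, assuming the goal is to put the numbers in the actual index as the question asks. The name group_ids is made up for this example. Sorting is left out here because rank compares values rather than row positions, so the original row order is kept.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2010, 2011, 2012, 2015, 2016, 2017],
    'sales': [10, 12, 13, 9, 11, 7],
    'Groups': ['AA', 'BB', 'AA', 'AA', 'CC', 'CC'],
})

# Dense ranking: equal values share a rank and ranks have no gaps.
# rank() returns floats, so cast to int for a clean integer index.
group_ids = df['Groups'].rank(method='dense').astype(int)

# Use the raw values so the index is not also named 'Groups'.
df.index = group_ids.to_numpy()
print(df)
```

Like ngroup, this numbers the groups in sorted key order, not in order of first appearance.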
