Creating a sub-index in pandas dataframe [duplicate]

Question

Okay, this is tricky. I have a pandas DataFrame of machine log data. The data already has an index, but the DataFrame contains several jobs. I want to give each job an index of its own so that I can compare the jobs with each other. That is, I want another column holding a counter that starts at zero, runs to the end of the job, and resets to zero when a new job starts. Or do I have to do this line by line?

Please look at stackoverflow.com/questions/20109391/… and learn how to ask a good pandas question. You need to show your data and your expected output. We can't construct examples from paragraphs of explanation.
– cs95, Commented Sep 8, 2017 at 7:07

jezrael · Accepted Answer · 2017-09-08 07:12:38Z

4

I think you need set_index with cumcount, which numbers the rows within each group starting from zero (replace 'Job Columns' with the name of your job column):

df = df.set_index(df.groupby('Job Columns').cumcount(), append=True)

Sample:

import numpy as np
import pandas as pd

np.random.seed(456)
df = pd.DataFrame({'Jobs': np.random.choice(['a', 'b', 'c'], size=10)})

#solution with sorting
df1 = df.sort_values('Jobs').reset_index(drop=True)
df1 = df1.set_index(df1.groupby('Jobs').cumcount(), append=True)
print (df1)
    Jobs
0 0    a
1 1    a
2 2    a
3 0    b
4 1    b
5 2    b
6 3    b
7 0    c
8 1    c
9 2    c

#solution with no sorting
df2 = df.set_index(df.groupby('Jobs').cumcount(), append=True)
print (df2)
    Jobs
0 0    b
1 1    b
2 0    c
3 0    a
4 1    c
5 2    c
6 1    a
7 2    b
8 2    a
9 3    b
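
Since the question asks for another column rather than a second index level, the same cumcount call can probably be assigned to a column instead. The following is only a minimal sketch under assumptions: the frame `log` and the column names `job`, `value` and `step` are hypothetical and not taken from the question. It also shows one possible way to line the jobs up for comparison with pivot.

```python
import pandas as pd

# Hypothetical log data; the names 'job' and 'value' are assumptions.
log = pd.DataFrame({
    'job':   ['a', 'a', 'a', 'b', 'b', 'c'],
    'value': [10, 11, 12, 20, 21, 30],
})

# cumcount numbers the rows of each group 0, 1, 2, ... in their existing order,
# so the counter restarts at zero for every job.
log['step'] = log.groupby('job').cumcount()

# One column per job, aligned on the per-job counter, for side-by-side comparison.
# Jobs with fewer rows are padded with NaN.
wide = log.pivot(index='step', columns='job', values='value')

print(log)
print(wide)
```

Keeping the counter as a regular column leaves the original index untouched, which may be convenient if the log timestamps live there.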

edited Sep 8, 2017 at 7:12

answered Sep 8, 2017 at 7:05

1 Comment

user3591675 Over a year ago

That solved the problem. You are a pandas genius, I think. Thanks a lot!