0

I have an index in a pandas dataframe which repeats the index value. I want to re-index as multi-index where repeated indexes are grouped.

The indexing looks like such:

enter image description here

so I would like all the 112335586 index values would be grouped under the same in index.

I have looked at this question Create pandas dataframe by repeating one row with new multiindex but here the value can be index can be pre-defined but this is not possible as my dataframe is far too large to hard code this.

I also looked at at the multi-index documentation but this also pre-defines the value for the index.

2 Answers 2

4

I believe you need:

s = pd.Series([1,2,3,4], index=[10,10,20,20])
s.index.name = 'EVENT_ID'
print (s)
EVENT_ID
10    1
10    2
20    3
20    4
dtype: int64

s1 = s.index.to_series()
s2 = s1.groupby(s1).cumcount()
s.index = [s.index, s2]
print (s)
EVENT_ID   
10        0    1
          1    2
20        0    3
          1    4
dtype: int64
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for getting back to me @jezrael. For line two of your code, I am getting an error for series having no attribute set_index.
@sandycaskie - Answer was edited, can you check now?
0

Try this:

df.reset_index(inplace=True)
df['sub_idx'] = df.groupby('EVENT_ID').cumcount()
df.set_index(['EVENT_ID','sub_idx'], inplace=True)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.