Add new indices to particular level of a MultiIndex dataframe pandas

Question

Here is an example of what I am trying to do:

import io
import numpy as np
import pandas as pd
data = io.StringIO('''Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
''')
df_unindexed = pd.read_csv(data)
df = df_unindexed.set_index(['Fruit', 'Color'])

Output:

Out[5]: 
             Count  Price
Fruit Color              
Apple Red        3  $1.29
      Green      9  $0.99
Pear  Red       25  $2.59
      Green     26  $2.79
Lime  Green     99  $0.39

Now let's say I want to number the keys in the 'Color' level within each 'Fruit':

L = []
for i in pd.unique(df.index.get_level_values(0)):
    L.append(range(df.xs(i).shape[0]))

list(np.concatenate(L))

Then I add the resulting list [0,1,0,1,0] as a new column:

df['Bob'] = list(np.concatenate(L))

like so:

             Count  Price  Bob
Fruit Color                   
Apple Red        3  $1.29    0
      Green      9  $0.99    1
Pear  Red       25  $2.59    0
      Green     26  $2.79    1
Lime  Green     99  $0.39    0

My question:

How do I make the Bob column an index on the same level as Color? This is what I want:

                 Count  Price
Fruit Color Bob              
Apple Red   0        3  $1.29
      Green 1        9  $0.99
Pear  Red   0       25  $2.59
      Green 1       26  $2.79
Lime  Green 0       99  $0.39

cs95 · Accepted Answer · 2018-09-24 22:54:43Z

5

Are you looking for cumcount? If so, you can ditch the loop and vectorize your solution.

df = df.set_index(df.groupby(level=0).cumcount(), append=True)
print(df)
               Count  Price
Fruit Color                
Apple Red   0      3  $1.29
      Green 1      9  $0.99
Pear  Red   0     25  $2.59
      Green 1     26  $2.79
Lime  Green 0     99  $0.39
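
To see what cumcount contributes on its own, here is a minimal self-contained sketch. It rebuilds the frame from the question's data; the names counter and out are only illustrative and are not part of the original answer. It also relies on set_index taking the level name from a named Series, so the new level can be named in the same step.

```python
import io
import pandas as pd

data = io.StringIO('''Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
''')
df = pd.read_csv(data).set_index(['Fruit', 'Color'])

# cumcount numbers the rows within each Fruit group, starting at 0,
# and returns them in the frame's original row order.
counter = df.groupby(level=0).cumcount()
print(counter.tolist())        # [0, 1, 0, 1, 0]

# Giving the Series a name (illustrative: 'Bob') means the appended
# index level picks up that name.
out = df.set_index(counter.rename('Bob'), append=True)
print(list(out.index.names))   # ['Fruit', 'Color', 'Bob']
```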

Or, if you'd prefer doing this in one fell swoop,

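# df_unindexed is the frame from the question (re-reading `data` would need data.seek(0) first).
# Group the unindexed frame, where 'Fruit' is still a column.
df = df_unindexed.set_index(['Fruit', 'Color', df_unindexed.groupby('Fruit').cumcount()])
print(df)
               Count  Price
Fruit Color                
Apple Red   0      3  $1.29
      Green 1      9  $0.99
Pear  Red   0     25  $2.59
      Green 1     26  $2.79
Lime  Green 0     99  $0.39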

To rename the index, use rename_axis:

df = df.rename_axis(['Fruit', 'Color', 'Bob'])
print(df)
                 Count  Price
Fruit Color Bob              
Apple Red   0        3  $1.29
      Green 1        9  $0.99
Pear  Red   0       25  $2.59
      Green 1       26  $2.79
Lime  Green 0       99  $0.39

edited Sep 24, 2018 at 22:54

answered Sep 24, 2018 at 22:40


Astrid Over a year ago

But how do you access the cumcount index now? Or how do you give it a name like 'Bob'?
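
One possible way to address the comment, sketched under the assumption that the frame was built as in the answer above: an unnamed level can be reached by its position, and once it has been named with rename_axis it can be reached by name too. The variable names by_position, by_name and first_rows are illustrative only.

```python
import io
import pandas as pd

data = io.StringIO('''Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
''')
df = pd.read_csv(data).set_index(['Fruit', 'Color'])
df = df.set_index(df.groupby(level=0).cumcount(), append=True)

# An unnamed level can be addressed by position (here, the third level).
by_position = df.index.get_level_values(2)

# After naming it, the level can be addressed by name as well.
df = df.rename_axis(['Fruit', 'Color', 'Bob'])
by_name = df.index.get_level_values('Bob')

# Select every row whose 'Bob' value is 0; xs drops that level by default.
first_rows = df.xs(0, level='Bob')
print(first_rows)
```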

sacuL · Accepted Answer · 2018-09-24 22:39:38Z

3

IIUC, use the append argument of set_index:

df.set_index('Bob', append=True, inplace=True)
>>> df
                 Count  Price
Fruit Color Bob              
Apple Red   0        3  $1.29
      Green 1        9  $0.99
Pear  Red   0       25  $2.59
      Green 1       26  $2.79
Lime  Green 0       99  $0.39
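
For completeness, a hedged sketch of the same idea without inplace=True. It assumes the 'Bob' column is built with groupby and cumcount instead of the question's loop, and the name result is illustrative. set_index then returns a new frame and leaves df untouched.

```python
import io
import pandas as pd

data = io.StringIO('''Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
''')
df = pd.read_csv(data).set_index(['Fruit', 'Color'])

# Build the 'Bob' column without an explicit loop.
df['Bob'] = df.groupby(level='Fruit').cumcount()

# Move 'Bob' into the index as an extra level; df itself keeps the column.
result = df.set_index('Bob', append=True)
print(result)
```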

answered Sep 24, 2018 at 22:39
