273

I have a dataframe with 2 index levels:

                         value
Trial    measurement
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

Which I want to turn into this:

Trial    measurement       value

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

How can I best do this?

I need this because I want to aggregate the data as instructed here, but I can't select my columns like that if they are in use as indices.

3
  • 5
    Duplicate: stackoverflow.com/questions/18624039/… You want the first suggestion. .reset_index() Commented Nov 21, 2013 at 1:51
  • 1
    many thanks, I actually browsed around for this a lot, but "make multiindex to column" and similar queries always got me threads which wanted to pivot their dataframes... Commented Nov 21, 2013 at 3:49
  • 3
    Always easier to find an answer when you already know it :) Commented Nov 21, 2013 at 4:00

8 Answers 8

340

The reset_index() is a pandas DataFrame method that will transfer index values into the DataFrame as columns. The default setting for the parameter is drop=False (which will keep the index values as columns).

All you have to do call .reset_index() after the name of the DataFrame:

df = df.reset_index()  
Sign up to request clarification or add additional context in comments.

4 Comments

For my case where I had 3 index levels inplace reset did not work. Alternative is assigning newly resetted dataframe to a new one: df2 = df.reset_index()
To reset only a particular level(s), use df.reset_index(level=[...])
Or the side-effect (probably quicker) way: df.reset_index(inplace=True)
df.reset_index(names=['a', 'b']) to provide names/alternative names to the produced columns.
39

This doesn't really apply to your case but could be helpful for others (like myself 5 minutes ago) to know. If one's multindex have the same name like this:

                         value
Trial        Trial
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

df.reset_index(inplace=True) will fail, cause the columns that are created cannot have the same names.

So then you need to rename the multindex with df.index = df.index.set_names(['Trial', 'measurement']) to get:

                           value
Trial    measurement       

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

And then df.reset_index(inplace=True) will work like a charm.

I encountered this problem after grouping by year and month on a datetime-column(not index) called live_date, which meant that both year and month were named live_date.

1 Comment

How to have your Trial values to repeat themselves? I had the same problem and it works except my values don't repeat themselves.
27

There may be situations when df.reset_index() cannot be used (e.g., when you need the index, too). In this case, use index.get_level_values() to access index values directly:

df['Trial'] = df.index.get_level_values(0)
df['measurement'] = df.index.get_level_values(1)

This will assign index values to individual columns and keep the index.

See the docs for further info.

1 Comment

This is soooooooooo useful! It should be possible to do this using much clearer language, e.g. df['measurement'] = df.index.values(1).
19

As @cs95 mentioned in a comment, to drop only one level, use:

df.reset_index(level=[...])

This avoids having to redefine your desired index after reset.

Comments

5

I ran into Karl's issue as well. I just found myself renaming the aggregated column then resetting the index.

df = pd.DataFrame(df.groupby(['arms', 'success'])['success'].sum()).rename(columns={'success':'sum'})

enter image description here

df = df.reset_index()

enter image description here

Comments

3

Short and simple

df2 = pd.DataFrame({'test_col': df['test_col'].describe()})
df2 = df2.reset_index()

Comments

1

A solution that might be helpful in cases when not every column has multiple index levels:

df.columns = df.columns.map(''.join)

Comments

1

Similar to Alex solution in a more generalized form. It keeps the indexes untouched and adds the index levels as new columns with its name.

for i in df.index.names:
    df[i] = df.index.get_level_values(i)

which gives the new columns 'Trial' and 'measurement'

                   value Trial    measurement
Trial measurement             
    1           0     13     1              0     
                1      3     1              1     
                2      4     1              2     
  ...  

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.