Pandas multi-level index to rows

Question

I am trying to transform this DataFrame:

>>> df
D1     D2  
S1 S2  S1 S2
0  -4  1  -6
1  -5  2  -7
3  -8  3  -9

to something like:

>>> df
    0    1    2    3
D1  -4   -5   NaN  -8
D2  NaN  -6   -7   -9

Essentially the values in columns (D1,S1) and (D2,S1) need to be merged into column names and the values in columns (D1,S2) and (D2,S2) become the only data values.

How could I achieve that simply and efficiently ?

unutbu · Accepted Answer · 2015-08-12 12:22:52Z

The desired DataFrame have the Ds in the index. stack moves column level values into the index. So it is natural to think about calling stack:

In [43]: result = df.stack(level=0); result
Out[43]: 
      S1  S2
0 D1   0  -4
  D2   1  -6
1 D1   1  -5
  D2   2  -7
2 D1   3  -8
  D2   3  -9

The desired DataFrame only has Ds in the index, so let's drop the outer level values:

In [44]: result.index = result.index.droplevel(0); result
Out[44]: 
    S1  S2
D1   0  -4
D2   1  -6
D1   1  -5
D2   2  -7
D1   3  -8
D2   3  -9

We now want the S1 column to become a column level value. That can be done by first moving the S1 column into the index,

In [45]: result = result.set_index('S1', append=True); result
Out[45]: 
       S2
   S1    
D1 0   -4
D2 1   -6
D1 1   -5
D2 2   -7
D1 3   -8
D2 3   -9

and then calling unstack to move the index level to a column level:

In [46]: result = result.unstack('S1'); result
Out[46]: 
    S2          
S1   0  1   2  3
D1  -4 -5 NaN -8
D2 NaN -6  -7 -9

Putting it all together,

import pandas as pd
df = pd.DataFrame({('D1', 'S1'): [0, 1, 3],
 ('D1', 'S2'): [-4, -5, -8],
 ('D2', 'S1'): [1, 2, 3],
 ('D2', 'S2'): [-6, -7, -9]})

result = df.stack(level=0)
result.index = result.index.droplevel(0)
result = result.set_index('S1', append=True)
result = result.unstack('S1')
result.columns = result.columns.droplevel(0)
print(result)

yields

S1   0  1   2  3
D1  -4 -5 NaN -8
D2 NaN -6  -7 -9

For reference: The four fundamental DataFrame reshaping operations are

They are fundamental because a DataFrame is just an array with index and column levels. These operators allow you to move values from the array to index levels to column levels and back. Combining of these 4 operations in various ways can solve a lot of DataFrame reshaping problems.

Collectives™ on Stack Overflow

Pandas multi-level index to rows

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related