pandas pivot multi-indexed columns

Question

I want to pivot a multi-indexed DataFrame, but it fails with:

 Shape of passed values is (3, 4), indices imply (3, 2)

the code:

import pandas as pd

df = pd.DataFrame({
    'foo': [1,2,3], 'bar':[4,5,6], 'dt':['2020-01-01', '2020-01-01', '2020-01-02'], 'cat':['a', 'b', 'b']
})
df = df.groupby(['dt', 'cat']).describe().loc[:, pd.IndexSlice[:, ['count', '50%']]].reset_index()
columns_of_interest = sorted(df.drop(['dt', 'cat'], axis=1, level=0).columns.get_level_values(0).unique())
df.pivot(index='dt', columns='cat', values=columns_of_interest)

How can it be fixed?

edit

Expected result:

from:

           dt cat    foo          bar
                   count  50%   count  50%
0  2020-01-01   a    1.0  1.0     1.0  4.0
1  2020-01-01   b    1.0  2.0     1.0  5.0
2  2020-01-02   b    1.0  3.0     1.0  6.0

to:

value    foo       bar
cat      a    b    a    b
dt
0
1
2

edit 2

Basically, I want to calculate:

v = 'count'
df['foo'][v].reset_index().pivot(index='dt', columns='cat', values = v)

for each column in [foo, bar] and each aggregation in [count, 50%], and get a single combined result back.

I.e.:

piv_values = ['count', '50%']  # the aggregations listed above
for c in columns_of_interest:
    print(c)
    for piv in piv_values:
        print(piv)
        r = df[c][piv].reset_index().pivot(index='dt', columns='cat', values=piv)
        display(r)

I am not sure yet 1) how to recombine the results and 2) how to do this neatly.

workaround

A rather neat workaround is to flatten the level:

df.columns = ['_'.join(col).strip() for col in df.columns.values]
columns_of_interest = df.columns
df.reset_index().pivot(index='dt', columns='cat', values=columns_of_interest)
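For reference, here is a self-contained sketch of the flattening workaround. It assumes the columns are flattened while dt and cat are still in the index (before any reset_index), since joining after reset_index would also rename them to 'dt_' and 'cat_'. The names stats, value_columns and flat are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'foo': [1, 2, 3], 'bar': [4, 5, 6],
    'dt': ['2020-01-01', '2020-01-01', '2020-01-02'],
    'cat': ['a', 'b', 'b'],
})

# Assumption: keep ['dt', 'cat'] in the index while flattening, so only the
# value columns are renamed, e.g. ('foo', 'count') -> 'foo_count'.
stats = df.groupby(['dt', 'cat']).describe().loc[:, pd.IndexSlice[:, ['count', '50%']]]
stats.columns = ['_'.join(col) for col in stats.columns]
value_columns = list(stats.columns)

# With flat column names, pivot accepts the list of value columns directly.
flat = stats.reset_index().pivot(index='dt', columns='cat', values=value_columns)
print(flat)
```

The result has two column levels (flattened name, cat), and combinations that do not occur in the data, such as cat 'a' on 2020-01-02, come out as NaN.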

There are many ways to "fix" this. We have to know what you were expecting.
– piRSquared, Commented May 14, 2020 at 18:10

Ben.T · Accepted Answer · 2020-05-14 18:48:53Z


IIUC, you can use unstack after the groupby (no reset_index):

import pandas as pd

df = pd.DataFrame({
    'foo': [1,2,3], 'bar':[4,5,6],
    'dt':['2020-01-01', '2020-01-01', '2020-01-02'], 'cat':['a', 'b', 'b']
})
df_ = df.groupby(['dt', 'cat']).describe()\
        .loc[:, pd.IndexSlice[:, ['count', '50%']]]\
        .unstack() # unstack instead of reset_index

print(df_)
             foo                  bar               
           count       50%      count       50%     
cat            a    b    a    b     a    b    a    b
dt                                                  
2020-01-01   1.0  1.0  1.0  2.0   1.0  1.0  4.0  5.0
2020-01-02   NaN  1.0  NaN  3.0   NaN  1.0  NaN  6.0
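As a rough cross-check, the per-column pivots from "edit 2" in the question can be recombined with pd.concat and compared against the unstack result. This is only a sketch: the names stats, pieces, combined and unstacked are illustrative, and the loop assumes the frame is still indexed by dt and cat.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'foo': [1, 2, 3], 'bar': [4, 5, 6],
    'dt': ['2020-01-01', '2020-01-01', '2020-01-02'],
    'cat': ['a', 'b', 'b'],
})
stats = df.groupby(['dt', 'cat']).describe().loc[:, pd.IndexSlice[:, ['count', '50%']]]

columns_of_interest = ['foo', 'bar']
piv_values = ['count', '50%']

# One small pivot per (column, aggregation) pair, as in the question's loop.
pieces = {}
for c in columns_of_interest:
    for piv in piv_values:
        pieces[(c, piv)] = (
            stats[c][piv].reset_index().pivot(index='dt', columns='cat', values=piv)
        )

# Tuple keys become the two outer column levels of the combined frame.
combined = pd.concat(pieces, axis=1)

# The single-call alternative from the answer.
unstacked = stats.unstack()

same_values = np.allclose(combined.to_numpy(), unstacked.to_numpy(), equal_nan=True)
print(same_values)
```

If only one aggregation is needed afterwards, something like unstacked.xs('count', axis=1, level=1) should select it across both foo and bar.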

