I want to pivot a multi-indexed DataFrame, but it fails with:

 Shape of passed values is (3, 4), indices imply (3, 2)

the code:

import pandas as pd

df = pd.DataFrame({
    'foo': [1,2,3], 'bar':[4,5,6], 'dt':['2020-01-01', '2020-01-01', '2020-01-02'], 'cat':['a', 'b', 'b']
})
df = df.groupby(['dt', 'cat']).describe().loc[:, pd.IndexSlice[:, ['count', '50%']]].reset_index()
columns_of_interest = sorted(df.drop(['dt', 'cat'], axis=1, level=0).columns.get_level_values(0).unique())
df.pivot(index='dt', columns='cat', values=columns_of_interest)

How can it be fixed?

edit

Expected result:

from:

           dt  cat    foo         bar
                    count  50%  count  50%
0  2020-01-01    a    1.0  1.0    1.0  4.0
1  2020-01-01    b    1.0  2.0    1.0  5.0
2  2020-01-02    b    1.0  3.0    1.0  6.0

to:

value       foo         bar

cat     a       b       a       b
dt

0
1
2

edit 2

basically I want to calculate:

v = 'count'
df['foo'][v].reset_index().pivot(index='dt', columns='cat', values = v)

for each column [foo, bar] and each aggregation [count, 50%] and get a single combined result back.

I.e.:

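# Assumes df is the grouped frame before reset_index(), so dt and cat are still in the index
piv_values = ['count', '50%']
for c in columns_of_interest:
    print(c)
    for piv in piv_values:
        print(piv)
        r = df[c][piv].reset_index().pivot(index='dt', columns='cat', values=piv)
        display(r)  # display() is available in IPython/Jupyter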

I am not sure yet (1) how to recombine the results into a single frame, or (2) whether there is a neater solution.
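One possible way to recombine the per-column, per-aggregation pivots is to collect them in a dict and hand that to pd.concat along the columns. This is only a sketch: the names stats, pieces and combined are made up for illustration, piv_values is assumed to be ['count', '50%'], and the grouped frame is kept with its (dt, cat) index rather than reset.

```python
import pandas as pd

df = pd.DataFrame({
    'foo': [1, 2, 3], 'bar': [4, 5, 6],
    'dt': ['2020-01-01', '2020-01-01', '2020-01-02'], 'cat': ['a', 'b', 'b']
})
# Grouped summary, kept with its (dt, cat) index (no reset_index here).
stats = df.groupby(['dt', 'cat']).describe().loc[:, pd.IndexSlice[:, ['count', '50%']]]

columns_of_interest = ['foo', 'bar']
piv_values = ['count', '50%']  # assumption: the aggregations of interest

# One small pivot per (column, aggregation) pair, keyed by that pair.
pieces = {}
for c in columns_of_interest:
    for piv in piv_values:
        pieces[(c, piv)] = (
            stats[c][piv].reset_index().pivot(index='dt', columns='cat', values=piv)
        )

# The tuple keys become the two outer column levels; 'cat' stays innermost.
combined = pd.concat(pieces, axis=1)
print(combined)
```

The result has three column levels (column, aggregation, cat), with NaN wherever a (dt, cat) combination does not occur in the data.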

workaround

A rather neat workaround is to flatten the level:

df.columns = ['_'.join(col).strip() for col in df.columns.values]
columns_of_interest = df.columns
df.reset_index().pivot(index='dt', columns='cat', values=columns_of_interest)
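For reference, here is a self-contained version of the flattening workaround. It assumes the flattening is applied to the grouped frame while dt and cat are still in the index; if the columns were flattened after reset_index, the ('dt', '') and ('cat', '') labels would join to 'dt_' and 'cat_', because strip() only removes whitespace. The names stats, flat, value_columns and wide are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'foo': [1, 2, 3], 'bar': [4, 5, 6],
    'dt': ['2020-01-01', '2020-01-01', '2020-01-02'], 'cat': ['a', 'b', 'b']
})
stats = df.groupby(['dt', 'cat']).describe().loc[:, pd.IndexSlice[:, ['count', '50%']]]

# Flatten while (dt, cat) are still in the index, so only value columns are renamed.
flat = stats.copy()
flat.columns = ['_'.join(col) for col in flat.columns]
value_columns = list(flat.columns)  # e.g. 'foo_count', 'foo_50%', ...

# With flat, single-level column names, pivot accepts the list of value columns.
wide = flat.reset_index().pivot(index='dt', columns='cat', values=value_columns)
print(wide)
```

The pivoted frame ends up with two column levels: the flattened name and cat.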
  • There are many ways to "fix" this. We have to know what you were expecting. Commented May 14, 2020 at 18:10
  • In a naive way I would say a triple index. Commented May 14, 2020 at 18:15

1 Answer

IIUC, you can use unstack after the groupby (no reset_index):

df = pd.DataFrame({
    'foo': [1,2,3], 'bar':[4,5,6], 
    'dt':['2020-01-01', '2020-01-01', '2020-01-02'], 'cat':['a', 'b', 'b']
})
df_ = df.groupby(['dt', 'cat']).describe()\
        .loc[:, pd.IndexSlice[:, ['count', '50%']]]\
        .unstack() # unstack instead of reset_index

print(df_)
             foo                  bar               
           count       50%      count       50%     
cat            a    b    a    b     a    b    a    b
dt                                                  
2020-01-01   1.0  1.0  1.0  2.0   1.0  1.0  4.0  5.0
2020-01-02   NaN  1.0  NaN  3.0   NaN  1.0  NaN  6.0
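If only one aggregation is needed in the (value, cat) layout sketched in the question, a cross-section on the middle column level of the unstacked frame may be enough. The following is a minimal sketch under that assumption; the name counts is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'foo': [1, 2, 3], 'bar': [4, 5, 6],
    'dt': ['2020-01-01', '2020-01-01', '2020-01-02'], 'cat': ['a', 'b', 'b']
})
df_ = (
    df.groupby(['dt', 'cat']).describe()
      .loc[:, pd.IndexSlice[:, ['count', '50%']]]
      .unstack()
)

# Select one statistic on column level 1; xs drops that level by default,
# leaving columns of the form (value column, cat).
counts = df_.xs('count', axis=1, level=1)
print(counts)
```

Swapping 'count' for '50%' gives the median table in the same layout.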