I have this sample:

import pandas as pd
import numpy as np
dic = {'name':
       ['j','c','q','j','c','q','j','c','q'],
       'foo or bar':['foo','bar','bar','bar','foo','foo','bar','foo','foo'], 
       'amount':[10,20,30, 20,30,40, 200,300,400]}
x = pd.DataFrame(dic)
x
pd.pivot_table(x, 
               values='amount', 
               index='name', 
               columns='foo or bar', 
               aggfunc=[np.mean, np.sum])

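It returns a table with two-level columns: mean and sum on the top level, each split into bar and foo (the original screenshot is not reproduced here).

I'd like to keep only two of those columns: the mean of bar and the sum of foo. Why can't I specify tuples in the aggfunc argument like this?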

pd.pivot_table(x, 
               values='amount', 
               index='name', 
               columns='foo or bar', 
               aggfunc=[(np.mean, 'bar'), (np.sum, 'foo')])

Is using .ix, as in the question "define aggfunc for each values column in pandas pivot table", the only option?

2 Answers

I don't think you can specify tuples for the aggfunc parameter, but you can do something like this:

In [259]: p = pd.pivot_table(x,
   .....:                values='amount',
   .....:                index='name',
   .....:                columns='foo or bar',
   .....:                aggfunc=[np.mean, np.sum])

In [260]: p
Out[260]:
           mean       sum
foo or bar  bar  foo  bar  foo
name
c            20  165   20  330
j           110   10  220   10
q            30  220   30  440

In [261]: p.columns = ['{0[0]}_{0[1]}'.format(col) if col[1] else col[0] for col in p.columns.tolist()]

In [262]: p.columns
Out[262]: Index(['mean_bar', 'mean_foo', 'sum_bar', 'sum_foo'], dtype='object')

In [264]: p[['mean_bar','sum_foo']]
Out[264]:
      mean_bar  sum_foo
name
c           20      330
j          110       10
q           30      440
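
As an alternative sketch, the two columns can probably also be picked straight from the two-level columns by their tuples, without flattening the names first. This assumes a pandas version where aggfunc accepts the string names 'mean' and 'sum', so the top-level labels match the output above; `selected` is just an illustrative name.

```python
import pandas as pd

# Same sample data as in the question.
x = pd.DataFrame({
    'name': ['j', 'c', 'q', 'j', 'c', 'q', 'j', 'c', 'q'],
    'foo or bar': ['foo', 'bar', 'bar', 'bar', 'foo', 'foo', 'bar', 'foo', 'foo'],
    'amount': [10, 20, 30, 20, 30, 40, 200, 300, 400],
})

# Pivot with both aggregations; the columns become a two-level MultiIndex
# of (aggregation, 'foo or bar' value) pairs.
p = pd.pivot_table(x,
                   values='amount',
                   index='name',
                   columns='foo or bar',
                   aggfunc=['mean', 'sum'])

# Select only the wanted (aggregation, category) pairs by tuple.
selected = p.loc[:, [('mean', 'bar'), ('sum', 'foo')]]
print(selected)
```

The result keeps the two-level column labels; the flattening step shown above can still be applied afterwards if plain names such as mean_bar are preferred.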

To give each column its own aggfunc, as in the answer you linked, you first need separate foo and bar columns that hold the amount for the matching rows and NaN elsewhere. You could create them with:

x['foo'] = x.loc[x['foo or bar'] == 'foo', 'amount']
x['bar'] = x.loc[x['foo or bar'] == 'bar', 'amount']

In [81]: x
Out[81]: 
   amount foo or bar name    foo    bar
0      10        foo    j   10.0    NaN
1      20        bar    c    NaN   20.0
2      30        bar    q    NaN   30.0
3      20        bar    j    NaN   20.0
4      30        foo    c   30.0    NaN
5      40        foo    q   40.0    NaN
6     200        bar    j    NaN  200.0
7     300        foo    c  300.0    NaN
8     400        foo    q  400.0    NaN

Then you could use the following:

In [82]: x.pivot_table(values=['foo','bar'], index='name', aggfunc={'bar':np.mean, 'foo':sum})
Out[82]: 
        bar    foo
name              
c      20.0  330.0
j     110.0   10.0
q      30.0  440.0
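
The steps above can be sketched as one self-contained script. It is a minimal variant under a few assumptions, not the exact code from this answer: it builds the helper columns with `Series.where` inside `assign` (so the original `x` is left unchanged) and passes the aggregations as the strings 'mean' and 'sum' instead of `np.mean` and the built-in `sum`. The names `is_foo`, `helper` and `result` are illustrative.

```python
import pandas as pd

# Same sample data as in the question.
x = pd.DataFrame({
    'name': ['j', 'c', 'q', 'j', 'c', 'q', 'j', 'c', 'q'],
    'foo or bar': ['foo', 'bar', 'bar', 'bar', 'foo', 'foo', 'bar', 'foo', 'foo'],
    'amount': [10, 20, 30, 20, 30, 40, 200, 300, 400],
})

# Boolean mask marking the 'foo' rows; every other row is 'bar' in this sample.
is_foo = x['foo or bar'] == 'foo'

# Helper columns: the amount where the row matches, NaN otherwise.
helper = x.assign(foo=x['amount'].where(is_foo),
                  bar=x['amount'].where(~is_foo))

# One aggregation per helper column; NaN values are skipped by mean and sum.
result = helper.pivot_table(values=['foo', 'bar'],
                            index='name',
                            aggfunc={'bar': 'mean', 'foo': 'sum'})
print(result)
```

Because each helper column is NaN outside its own category, the per-column aggregation only ever sees the rows that belong to it.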
