groupby in pandas with different functions for different columns

Question

Best to illustrate by example:

I would like to aggregate a DataFrame by col1 and col2, summing col3 and col4 and averaging col5.

If I just wanted to sum on col3-5 I'd use df.groupby(['col1','col2']).sum()

It would be good to have sample data and the expected result.

– Zero, Commented Oct 19, 2015 at 15:00

Anand S Kumar · Accepted Answer · 2015-10-19 15:11:05Z

20

You can use the GroupBy.agg() (or GroupBy.aggregate()) method for this.

The aggregate() function accepts a dictionary as its argument, in which case it treats the keys as column names and the values as the functions to use for aggregating each column. As stated in the documentation -

By passing a dict to aggregate you can apply a different aggregation to the columns of a DataFrame.

Example -

import numpy as np
result = df.groupby(['col1','col2']).agg({'col3':'sum','col4':'sum','col5':np.average})

Demo -

In [50]: df = pd.DataFrame([[1,2,3,4,5],[1,2,6,7,8],[2,3,4,5,6]],columns=list('ABCDE'))

In [51]: df
Out[51]:
   A  B  C  D  E
0  1  2  3  4  5
1  1  2  6  7  8
2  2  3  4  5  6

In [52]: df.groupby(['A','B']).aggregate({'C':np.sum,'D':np.sum,'E':np.average})
Out[52]:
     C    E   D
A B
1 2  9  6.5  11
2 3  4  6.0   5
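
For readers who want to run the demo as a script rather than in an IPython session, here is a minimal self-contained sketch of the same aggregation. It is not part of the original answer: it assumes a reasonably recent pandas and swaps the NumPy callables for the string names 'sum' and 'mean', which name the same aggregations and which recent pandas releases may prefer over np.sum.

```python
import pandas as pd

# Same sample data as in the demo above.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [1, 2, 6, 7, 8], [2, 3, 4, 5, 6]],
    columns=list("ABCDE"),
)

# Keys are the columns to aggregate; values name the aggregation per column.
# String names are used here instead of np.sum / np.average (an assumption
# of this sketch, not the original answer's exact code).
result = df.groupby(["A", "B"]).agg({"C": "sum", "D": "sum", "E": "mean"})

print(result)
```

The result is indexed by the (A, B) group keys. On recent pandas the output columns should follow the order of the dictionary keys, whereas the older session above shows them in a different order.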


5 Comments

eran Over a year ago

Thanks, is there a default type for all columns not mentioned?

Anand S Kumar Over a year ago

I am sorry, I didn't get your question.

eran Over a year ago

Say I want to sum over two specific columns, and average over all the rest, without specifically naming them

Anand S Kumar Over a year ago

I don't think you can do that, but you can use a dictionary comprehension to create the dictionary, for example - {k: np.sum if k in {'col3','col4'} else np.average for k in df.columns if k not in {'col1','col2'}}

eran Over a year ago

Great. Thank you very much.
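
The dictionary comprehension suggested in the comments above can be sketched as a runnable example. The sample data, the extra column col6, and the names group_cols, sum_cols and agg_map are hypothetical and introduced only for illustration; the string names 'sum' and 'mean' stand in for np.sum and np.average.

```python
import pandas as pd

# Hypothetical sample data; col6 stands for "all the rest" of the columns.
df = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": ["x", "x", "y"],
    "col3": [3, 6, 4],
    "col4": [4, 7, 5],
    "col5": [5.0, 8.0, 6.0],
    "col6": [1.0, 3.0, 10.0],
})

group_cols = ["col1", "col2"]   # grouping keys, excluded from aggregation
sum_cols = {"col3", "col4"}     # columns to sum; every other column is averaged

# Build the column -> aggregation mapping without naming the averaged columns.
agg_map = {
    col: "sum" if col in sum_cols else "mean"
    for col in df.columns
    if col not in group_cols
}

result = df.groupby(group_cols).agg(agg_map)
print(result)
```

With this approach, any column added to the DataFrame later is averaged by default unless it is listed in sum_cols.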
