Python Pandas: efficiently aggregating different functions on different columns and combining the resulting columns together

Question

So far my approach to the task described in the title is quite straightforward, yet it seems somewhat inefficient/unpythonic. An example of what I usually do is as follows:

The original Pandas DataFrame df has 6 columns: 'open', 'high', 'low', 'close', 'volume', 'new dt'

import pandas as pd

df_gb = df.groupby('new dt')

arr_high = df_gb['high'].max()
arr_low = df_gb['low'].min()
arr_open = df_gb['open'].first()
arr_close = df_gb['close'].last()
arr_volume = df_gb['volume'].sum()

df2 = pd.concat([arr_open,
                 arr_high,
                 arr_low,
                 arr_close,
                 arr_volume], axis='columns')

It may already seem efficient at first glance, but when I have 20 functions to apply to 20 different columns, it quickly becomes unpythonic/inefficient.

Is there any way to make it more efficient/pythonic? Thank you in advance.

Buckeye14Guy · Accepted Answer · 2019-08-22 04:33:34Z

1

If you have 20 different functions, you will have to match columns with functions one way or another. "Pythonic" is subjective, so this may not be the definitive answer, but it is potentially useful. In my opinion your approach is already pythonic, and it spells out clearly what is happening.

# as long as the columns are ordered with the proper functions
# you may have to change the ordering here
columns_to_agg = (column for column in df.columns if column != 'new dt')

# if the functions are all methods of pandas.Series just use strings
agg_methods = ['first', 'max', 'min', 'last', 'sum']

# construct a dictionary and use it as aggregator
agg_dict = dict(zip(columns_to_agg, agg_methods))
df_gb = df.groupby('new dt', as_index=False).agg(agg_dict)
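A minimal, self-contained sketch of the same idea, with the column-to-function mapping written out explicitly so it does not depend on column order. The toy data below is invented for illustration and is not from the question.

```python
import pandas as pd

# Toy OHLCV data; values and 'new dt' labels are made up for illustration.
df = pd.DataFrame({
    'open':   [10.0, 11.0, 12.0, 13.0],
    'high':   [10.5, 11.5, 12.5, 13.5],
    'low':    [9.5, 10.5, 11.5, 12.5],
    'close':  [10.2, 11.2, 12.2, 13.2],
    'volume': [100, 200, 300, 400],
    'new dt': ['d1', 'd1', 'd2', 'd2'],
})

# Explicit column -> function mapping, independent of column order.
agg_dict = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum',
}

# One groupby, one agg call, one row per 'new dt' group.
df2 = df.groupby('new dt').agg(agg_dict)
print(df2)
```

With 20 columns, only the dictionary grows; the groupby and agg calls stay the same.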

If you have a custom function you want to apply to, say, volume, you could do:


def custom_f(series):
    return pd.notnull(series).sum()
agg_methods = ['first', 'max', 'min', 'last', custom_f]

Everything else stays the same. You could even apply both sum and custom_f to your volume column:

agg_methods = ['first', 'max', 'min', 'last', ['sum', custom_f]]
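Note that passing a list for one column gives the result two-level (MultiIndex) columns. If flat column names are preferred, one alternative is named aggregation, which assumes pandas 0.25 or later. This is a different technique from the dictionary above, sketched here on invented data; the output names volume_sum and volume_count are hypothetical.

```python
import pandas as pd

# Toy data with one missing volume; values are made up for illustration.
df = pd.DataFrame({
    'volume': [100.0, None, 300.0, 400.0],
    'new dt': ['d1', 'd1', 'd2', 'd2'],
})

def custom_f(series):
    # Count the non-null entries in the group (same function as in the answer).
    return pd.notnull(series).sum()

# Named aggregation: output_name=(input_column, function).
# The output names here are hypothetical; pick whatever suits your data.
out = df.groupby('new dt').agg(
    volume_sum=('volume', 'sum'),
    volume_count=('volume', custom_f),
)
print(out)
```

Each keyword becomes one flat output column, so the same input column can be aggregated several times without nested headers.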

edited Aug 22, 2019 at 4:33

answered Aug 22, 2019 at 4:16

Buckeye14Guy


6 Comments

mathguy Over a year ago

Thanks for your reply. I will try to see if it works.

Buckeye14Guy Over a year ago

np. let me know if it does

mathguy Over a year ago

What happens if some functions are user-defined (as in, not available in pandas as a string like 'first' or 'max')?

Buckeye14Guy Over a year ago

You can have that function as part of the dictionary. Let me update my answer

mathguy Over a year ago

Really appreciate it


Prabhat Sharma · Accepted Answer · 2019-08-22 04:35:06Z

1

In [3]: import pandas as pd                                                     
In [4]: import numpy as np                                                      
In [5]: df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9],
   ...:                    [np.nan, np.nan, np.nan]], columns=['A', 'B', 'C'])

In [6]: df.agg({'A' : ['sum', 'min'], 'B' : ['min', 'max']})                    
Out[6]: 
        A    B
max   NaN  8.0
min   1.0  2.0
sum  12.0  NaN

To get the functions as columns instead, transpose the result:

In [11]: df.agg({'A' : ['sum'], 'B' : ['min', 'max']}).T                        
Out[11]: 
   max  min   sum
A  NaN  NaN  12.0
B  8.0  2.0   NaN

To use custom functions, you can pass them in the list like this:

In [12]: df.agg({'A' : ['sum',lambda x:x.mean()], 'B' : ['min', 'max']}).T      
Out[12]: 
   <lambda>  max  min   sum
A       4.0  NaN  NaN  12.0
B       NaN  8.0  2.0   NaN
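The NaN entries in these outputs mark function/column pairs that were never requested. A small sketch that checks this, using a named function instead of a lambda so the row label is readable. The helper value_range is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [np.nan, np.nan, np.nan]],
    columns=['A', 'B', 'C'],
)

def value_range(series):
    # Hypothetical custom aggregation: spread between max and min (NaN skipped).
    return series.max() - series.min()

# Each column gets its own list of functions; the result is indexed by
# function name, with NaN wherever a function was not applied to a column.
result = df.agg({'A': ['sum', value_range], 'B': ['min', 'max']})
print(result)
```

Look values up by label, for example result.loc['sum', 'A'], rather than by position, since the row order may differ between pandas versions.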

answered Aug 22, 2019 at 4:35

Prabhat Sharma


1 Comment

mathguy Over a year ago

Thanks, very nice to know custom functions can be used in this way
