How to pass arguments to func in `pandas.resampler.agg()` when using dict input?

Question

I am trying to resample a pandas DataFrame and sum some of its columns. Additionally, I want to get None/NaN as the result when there are no rows in a resampling period. For aggregation on a single column, I can do the following:

import pandas as pd

df = pd.DataFrame(index=[pd.to_datetime('2020-01-01')], columns=['value'])
df.resample('5min').agg("sum", min_count=1)

According to the pandas docs, the keyword argument min_count is passed to resample.Resampler.sum, which is the method associated with the string "sum", and the result is what I want:

           value
2020-01-01  None

However, this does not work if I pass a dictionary as the agg input, e.g.

df = pd.DataFrame(index=[pd.to_datetime('2020-01-01')], columns=['value'])
df.resample('5min').agg({'value': 'sum'}, min_count=1)

will output:

           value
2020-01-01     0

I would like to know the right way to pass arguments to the aggregation functions specified inside the dict.

mozway · Accepted Answer · 2025-08-28 11:00:18Z

This is currently not possible. There is/was a similar issue with agg.

Assuming multiple columns:

df = pd.DataFrame(index=[pd.to_datetime('2020-01-01')],
                  columns=['value', 'value2', 'value3'])

If you want to apply the same aggregation to several columns, just select those columns on the resampler before calling agg:

out = df.resample('5min')[['value', 'value2']].agg('sum', min_count=1)

Output:

           value value2
2020-01-01  None   None
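The effect of min_count is easier to see on numeric data with a gap. The following is only a minimal sketch: the sample timestamps and values are made up for illustration, chosen so that the middle 5-minute bin contains no rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two rows in the first 5-minute bin,
# none in the second, one in the third.
idx = pd.to_datetime(['2020-01-01 00:00', '2020-01-01 00:01',
                      '2020-01-01 00:11'])
df = pd.DataFrame({'value': [1.0, 2.0, 3.0],
                   'value2': [10.0, 20.0, 30.0],
                   'value3': [0.5, 0.5, 0.5]}, index=idx)

# Select the columns that share the aggregation, then pass the kwarg once.
out = df.resample('5min')[['value', 'value2']].agg('sum', min_count=1)

# Same selection without min_count, for comparison.
plain = df.resample('5min')[['value', 'value2']].agg('sum')

print(out)    # the empty middle bin is NaN
print(plain)  # the empty middle bin is 0.0
```

With min_count=1 the empty bin stays missing, whereas the default min_count=0 fills it with 0.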

If you need different aggregation functions, use a dictionary and concat:

funcs = {'value': 'sum', 'value2': 'min'}

r = df.resample('5min')
out = pd.concat({k: r[k].agg([v], min_count=1)
                 for k, v in funcs.items()}, axis=1)

Output:

           value value2
             sum    min
2020-01-01  None    NaN

And if you need different aggregation functions and different kwargs:

funcs = {'value': 'sum', 'value2': 'min'}
kwargs = {'value2': {'min_count': 1}}

r = df.resample('5min')

out = pd.concat({k: r[k].agg([v], **kwargs.get(k, {}))
                 for k, v in funcs.items()}, axis=1)

Output:

           value value2
             sum    min
2020-01-01     0    NaN
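If this pattern is needed in several places, it could be wrapped in a small helper. This is just a sketch: resample_agg is a hypothetical name, not a pandas function, and the sample data is made up. Unlike the snippets above, it passes each function name as a plain string rather than a one-element list, so the result has flat column names instead of a two-level header.

```python
import numpy as np
import pandas as pd

def resample_agg(df, rule, funcs, kwargs=None):
    """Hypothetical helper: resample `df` by `rule`, aggregate each column
    in `funcs` with its own function name and optional per-column kwargs."""
    kwargs = kwargs or {}
    r = df.resample(rule)
    # One Series per column; concat with a dict uses the keys as column names.
    return pd.concat(
        {col: r[col].agg(func, **kwargs.get(col, {}))
         for col, func in funcs.items()},
        axis=1,
    )

# Hypothetical sample data with an empty middle 5-minute bin.
idx = pd.to_datetime(['2020-01-01 00:00', '2020-01-01 00:01',
                      '2020-01-01 00:11'])
df = pd.DataFrame({'value': [1.0, 2.0, 3.0],
                   'value2': [10.0, 20.0, 30.0]}, index=idx)

funcs = {'value': 'sum', 'value2': 'max'}
out = resample_agg(df, '5min', funcs, kwargs={'value': {'min_count': 1}})
print(out)
```

Columns without an entry in kwargs are aggregated with the function's defaults.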

3 Comments

KamiKimi Aug 28 at 11:31

Will the method you proposed have much of an impact on performance?

mozway Aug 28 at 11:41

It will have an impact, but using a dictionary already has a significant impact in the first place. For instance, using an input with 10K rows and 3 columns and pre-computing the resampler, this gives r.agg('sum') -> 402 µs ± 40.6 µs; r.agg({'value': 'sum', 'value2': 'sum', 'value3': 'sum'}) -> 1.46 ms ± 112 µs; and for the concat approach -> 2.15 ms ± 60.4 µs.
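For anyone wanting to run a similar comparison, here is a rough sketch. The input (random floats at 1-minute spacing), the number of repetitions, and the use of a plain string instead of a one-element list in the concat variant are all assumptions, so absolute numbers will differ from those quoted above.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed benchmark input: 10K rows at 1-minute spacing, 3 float columns.
idx = pd.date_range('2020-01-01', periods=10_000, freq='1min')
df = pd.DataFrame(np.random.default_rng(0).random((10_000, 3)),
                  index=idx, columns=['value', 'value2', 'value3'])

r = df.resample('5min')  # pre-computed resampler, reused by every candidate
funcs = {'value': 'sum', 'value2': 'sum', 'value3': 'sum'}

candidates = {
    'single agg': lambda: r.agg('sum'),
    'dict agg': lambda: r.agg(funcs),
    'concat': lambda: pd.concat(
        {k: r[k].agg(v, min_count=1) for k, v in funcs.items()}, axis=1),
}

n = 20  # repetitions per candidate (arbitrary choice)
timings = {name: timeit.timeit(fn, number=n) / n
           for name, fn in candidates.items()}

for name, t in timings.items():
    print(f'{name}: {t * 1e3:.3f} ms per call')
```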

mozway Aug 28 at 11:44

If you have multiple columns with the same aggregation/kwargs, then the best option would be to combine those in a single agg call, then concat with the other aggregations. If you want more performance-oriented details, you might want to provide a reproducible example, and maybe open a follow-up question?
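A minimal sketch of that grouping idea, assuming a hand-written spec list: the (columns, function, kwargs) layout and the sample data are made up for illustration and are not a pandas convention.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with an empty middle 5-minute bin.
idx = pd.to_datetime(['2020-01-01 00:00', '2020-01-01 00:01',
                      '2020-01-01 00:11'])
df = pd.DataFrame({'value': [1.0, 2.0, 3.0],
                   'value2': [10.0, 20.0, 30.0],
                   'value3': [0.5, 0.5, 0.5]}, index=idx)

# Hypothetical spec: columns sharing a function and kwargs are grouped
# so that each group needs only one agg call.
spec = [
    (['value', 'value2'], 'sum', {'min_count': 1}),
    (['value3'], 'max', {}),
]

r = df.resample('5min')
# Each agg call returns a DataFrame; concat joins them side by side.
out = pd.concat([r[cols].agg(func, **kw) for cols, func, kw in spec], axis=1)
print(out)
```

This makes one agg call per group rather than one per column, which should reduce the overhead of the concat approach when many columns share the same settings.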
