Let's say I have the following dataframe representing the dietary habits of my pet frog:

date       bugs_eaten_today
2019-01-31 0
2019-01-30 5
2019-01-29 6
2019-01-28 7
2019-01-27 2
...

Now I want to calculate a new column, bugs_eaten_past_20_days:

date       bugs_eaten_today bugs_eaten_past_20_days
2019-01-31 0                48
2019-01-30 5                38
2019-01-29 6                57
2019-01-28 7                63
2019-01-27 2                21
...

How would I go about doing this? (Note that the last 20 rows don't have a full 20 days of history behind them, so they will just be NaN.)

1 Answer

You can do a rolling sum (use a window of 20 rather than the 3 shown here; the second argument is min_periods):

In [11]: df.bugs_eaten_today.rolling(3, 1).sum()
Out[11]:
0     0.0
1     5.0
2    11.0
3    18.0
4    15.0
Name: bugs_eaten_today, dtype: float64

You have to do this in reverse, since the rows are in reverse chronological order (newest first):

In [12]: df[::-1].bugs_eaten_today.rolling(3, 1).sum()
Out[12]:
4     2.0
3     9.0
2    15.0
1    18.0
0    11.0
Name: bugs_eaten_today, dtype: float64

In [13]: df['bugs_eaten_past_20_days'] = df[::-1].bugs_eaten_today.rolling(3, 1).sum()
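
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the five sample rows from the question. The WINDOW constant and the strict variable are names made up for illustration; WINDOW is 3 only so the numbers match the output above, and it would be 20 for the real data. The assignment works because pandas aligns on index labels, so the reversed result lands back on the right rows.

```python
import pandas as pd

# The five sample rows from the question, newest first.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-01-31", "2019-01-30", "2019-01-29", "2019-01-28", "2019-01-27"]
    ),
    "bugs_eaten_today": [0, 5, 6, 7, 2],
})

WINDOW = 3  # illustrative; use 20 for the real data

# Reverse the rows so the window looks back in time, then sum.
# min_periods=1 returns partial sums for the oldest rows.
rolled = df["bugs_eaten_today"][::-1].rolling(WINDOW, min_periods=1).sum()

# Assignment aligns on index labels, so the order of `rolled` does not matter.
df["bugs_eaten_past_20_days"] = rolled

# Leaving min_periods at its default gives NaN wherever the window is
# not yet full, which is the behaviour the question describes.
strict = df["bugs_eaten_today"][::-1].rolling(WINDOW).sum()

print(df)
print(strict)
```

With the default min_periods, the two oldest rows (index 4 and 3) come out as NaN here; with a window of 20 it would be the oldest 19.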

It's probably more robust to use date as the index and roll over a time offset such as '20D' (20 days), which needs the dates to be datetimes rather than strings:

In [21]: df1 = df.set_index('date').sort_index()

In [22]: df1.bugs_eaten_today.rolling('3D', 1).sum()
Out[22]:
date
2019-01-27     2.0
2019-01-28     9.0
2019-01-29    15.0
2019-01-30    18.0
2019-01-31    11.0
Name: bugs_eaten_today, dtype: float64
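
To see why the date-based window is the more robust of the two, here is a small sketch, assuming the date column arrives as strings and that one day is missing from the log. The names gappy, by_time and by_rows are made up for illustration, and '3D' again stands in for '20D'.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2019-01-31", "2019-01-30", "2019-01-29", "2019-01-28", "2019-01-27"],
    "bugs_eaten_today": [0, 5, 6, 7, 2],
})

# Offset windows such as '3D' need a datetime index, so convert first.
df["date"] = pd.to_datetime(df["date"])
df1 = df.set_index("date").sort_index()

# Same result as the session above: 2, 9, 15, 18, 11.
full = df1["bugs_eaten_today"].rolling("3D", min_periods=1).sum()

# Pretend 2019-01-29 was never recorded.
gappy = df1.drop(pd.Timestamp("2019-01-29"))

# A time-based window only counts rows that fall inside the last 3 days...
by_time = gappy["bugs_eaten_today"].rolling("3D", min_periods=1).sum()
# ...while a row-count window reaches back 3 rows regardless of the dates.
by_rows = gappy["bugs_eaten_today"].rolling(3, min_periods=1).sum()

print(by_time)  # 2.0, 9.0, 12.0, 5.0
print(by_rows)  # 2.0, 9.0, 14.0, 12.0
```

The two agree while the data is strictly daily and diverge as soon as a date is skipped, so the offset window is the safer choice if the log might have gaps.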

4 Comments

Thanks for the response, but what if the column I want generated depended on a more complex function, as opposed to just a simple sum? Would there be a more general way to iterate through the dataframe with a sliding 20-unit window (looking either ahead or behind)?
@AlanSTACK There's rolling.apply which accepts a user's function (similar to DataFrame apply): pandas.pydata.org/pandas-docs/stable/reference/api/…
Thank you. Before I accept your answer, do you think you could provide an example of doing what you did in your response with rolling apply, so more novice users can quickly understand what to do?
@AlanSTACK Something like df1.bugs_eaten_today.rolling('3D', 1).apply(lambda x: x.sum(), raw=False). That is a silly example, though: you can pass any Python function, and it receives a Series of those (up to) 20 rows.
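
As a slightly less silly illustration of rolling.apply, here is a sketch that computes the spread between the best and worst day in each window. The function bug_range is a hypothetical example, not something from pandas, and '3D' stands in for '20D'.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-01-31", "2019-01-30", "2019-01-29", "2019-01-28", "2019-01-27"]
    ),
    "bugs_eaten_today": [0, 5, 6, 7, 2],
})
df1 = df.set_index("date").sort_index()


def bug_range(window: pd.Series) -> float:
    """Hypothetical custom statistic: best day minus worst day in the window."""
    return float(window.max() - window.min())


# raw=False passes each window as a Series (with its date index);
# raw=True would pass a plain NumPy array instead, which is usually faster.
spread = (
    df1["bugs_eaten_today"]
    .rolling("3D", min_periods=1)
    .apply(bug_range, raw=False)
)

print(spread)  # 0.0, 5.0, 5.0, 2.0, 6.0
```

The function has to return a single number per window, since rolling.apply builds one output value for each row.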