Reset cumsum if over limit (python)

Question

The following numpy snippet will return a cumsum of the input array, which resets every time a NaN is encountered.

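import numpy as np

v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)   # True where the sum should reset
a = ~n
c = np.cumsum(a)  # running count of non-NaN values
d = np.diff(np.concatenate(([0.], c[n])))  # length of each run that ends at a NaN
v[n] = -d         # replace each NaN with minus the length of the run before it
result = np.cumsum(v)  # [1., 2., 3., 0., 1., 2., 3., 4., 0., 1.]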

In a similar fashion, how can I calculate a cumsum which resets if the cumsum is over some value using vectorized pandas or numpy operations?

E.g. for limit = 5, in = [1,1,1,1,1,1,1,1,1,1], out = [1,2,3,4,5,1,2,3,4,5]

Alex Riley · Accepted Answer · 2014-10-28 22:31:12Z

If the numbers in your array are all positive and the running total always lands exactly on the limit (as it does for your array of ones), it is probably simplest to use cumsum() and then the modulo operator:

>>> a = np.array([1,1,1,1,1,1,1,1,1,1])
>>> limit = 5
>>> x = a.cumsum() % limit
>>> x
array([1, 2, 3, 4, 0, 1, 2, 3, 4, 0])

You can then set any zero values back to the limit to get the desired array:

>>> x[x == 0] = limit
>>> x
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
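For positive inputs whose running total does not land exactly on the limit, a plain loop gives a reference result to check against. The sketch below is not vectorized, and it assumes that "reset" means restarting from the current element whenever adding it would push the total above the limit. The name `cumsum_reset` is hypothetical.

```python
import numpy as np

def cumsum_reset(values, limit):
    """Running total that restarts at the current element when adding it
    would exceed `limit` (assumed semantics, see the note above)."""
    values = np.asarray(values)
    out = np.empty_like(values)
    total = 0
    for i, v in enumerate(values):
        # Restart from the current element if the total would pass the limit.
        total = v if total + v > limit else total + v
        out[i] = total
    return out

ones = np.ones(10, dtype=int)
twos = np.full(8, 2)
print(cumsum_reset(ones, 5))  # [1 2 3 4 5 1 2 3 4 5], same as the modulo result
print(cumsum_reset(twos, 5))  # [2 4 2 4 2 4 2 4]
```

For the array of twos, `a.cumsum() % 5` would give `[2, 4, 1, 3, 0, 2, 4, 1]` instead, because the remainder carries over from one group to the next.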

Here's one possible general solution using pandas' expanding_apply function (removed in later pandas releases in favour of Series.expanding().apply()). I've not tested it extensively.

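First define a modified cumsum function (the limit, 5, is hard-coded):

import numpy as np
import pandas as pd

def cumsum_limit(x):
    # x is the expanding window: every element up to and including the current one
    q = np.sum(x[:-1])  # total of all earlier elements
    if q > 0:
        q = q % 5       # fold a positive running total back into [0, 5)
    r = x[-1]           # current element
    if q + r <= 5:
        return q + r
    elif (q + r) % 5 == 0:
        return 5        # report an exact multiple as the limit rather than 0
    else:
        return (q + r) % 5

a = np.array([1,1,1,1,1,1,1,1,1,1]) # your example array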

Apply the function to the array like this:

>>> pd.expanding_apply(a, lambda x: cumsum_limit(x))
array([ 1.,  2.,  3.,  4.,  5.,  1.,  2.,  3.,  4.,  5.])

Here's the function applied to another more interesting Series:

>>> s = pd.Series([3, -8, 4, 5, -3, 501, 7, -100, 98, 3])
>>> pd.expanding_apply(s, lambda x: cumsum_limit(x)) 
0     3
1    -5
2    -1
3     4
4     1
5     2
6     4
7   -96
8     2
9     5
dtype: float64
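
On current pandas versions `pd.expanding_apply` no longer exists. The same computation can be sketched with `Series.expanding().apply()`. The function body below follows the one above, except that `limit` is made a parameter, which is an addition for illustration and not part of the original answer.

```python
import numpy as np
import pandas as pd

def cumsum_limit(x, limit=5):
    # Same logic as the function above; the `limit` argument is added here.
    q = np.sum(x[:-1])
    if q > 0:
        q = q % limit
    r = x[-1]
    if q + r <= limit:
        return q + r
    elif (q + r) % limit == 0:
        return limit
    else:
        return (q + r) % limit

a = pd.Series([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
s = pd.Series([3, -8, 4, 5, -3, 501, 7, -100, 98, 3])

# raw=True hands each expanding window to the function as a NumPy array,
# so x[-1] and x[:-1] are positional, as in the original call.
res_a = a.expanding().apply(cumsum_limit, raw=True)
res_s = s.expanding().apply(cumsum_limit, raw=True)
print(res_a.tolist())
print(res_s.tolist())
```

Each window is summed from scratch, so the cost grows quadratically with the length of the series.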

edited Oct 28, 2014 at 22:31

answered Oct 28, 2014 at 9:39

1 Comment

Michael Over a year ago

Do you also have a modification that resets the cumulative sum after it has passed the limit? Adjusting the example slightly, it should not result in this:

>>> a = np.array([2,2,2,2,2,2,2,2])
>>> limit = 5
>>> x = a.cumsum() % limit
>>> x
array([2, 4, 1, 3, 0, 2, 4, 1])

but in this:

array([2, 4, 0, 2, 4, 0, 2, 4])
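
One way to read the behaviour asked for in this comment is: add each element to the running total, and drop the total to zero as soon as it exceeds the limit. A minimal loop-based sketch under that assumption follows. It is not vectorized, and the name `cumsum_zero_after_limit` is hypothetical.

```python
import numpy as np

def cumsum_zero_after_limit(values, limit):
    """Running total that is set back to 0 once it has passed `limit`
    (assumed reading of the comment above)."""
    values = np.asarray(values)
    out = np.empty_like(values)
    total = 0
    for i, v in enumerate(values):
        total += v
        if total > limit:
            total = 0  # the element that overshoots is reported as 0
        out[i] = total
    return out

a = np.array([2, 2, 2, 2, 2, 2, 2, 2])
print(cumsum_zero_after_limit(a, 5))  # [2 4 0 2 4 0 2 4]
```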
