The following NumPy snippet returns a cumulative sum of the input array that resets every time a NaN is encountered.

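import numpy as np

v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)                             # mask of NaN positions
a = ~n                                      # mask of valid entries
c = np.cumsum(a)                            # running count of valid entries
d = np.diff(np.concatenate(([0.], c[n])))   # valid entries between consecutive NaNs
v[n] = -d                                   # negative offsets cancel each finished run
result = np.cumsum(v)                       # [1, 2, 3, 0, 1, 2, 3, 4, 0, 1]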
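
Note that `c` counts the valid entries rather than summing them, so the snippet above gives the right totals only because every non-NaN value is 1. Below is a minimal sketch of the same trick for arbitrary values; the function name `cumsum_reset_on_nan` is hypothetical, and NaN positions are assumed to report 0 in the output.

```python
import numpy as np

def cumsum_reset_on_nan(values):
    """Cumulative sum that restarts from zero after each NaN (illustrative helper)."""
    v = np.asarray(values, dtype=float)
    is_nan = np.isnan(v)
    # Input with NaNs treated as zero, so they do not poison the sums.
    adjusted = np.where(is_nan, 0.0, v)
    # Running total of the valid values.
    running = np.cumsum(adjusted)
    # Amount accumulated between consecutive NaNs, one entry per NaN.
    segment_totals = np.diff(np.concatenate(([0.0], running[is_nan])))
    # Place the negated segment total at each NaN so the final cumsum drops to zero there.
    adjusted[is_nan] = -segment_totals
    return np.cumsum(adjusted)

print(cumsum_reset_on_nan([2, 3, np.nan, 4, np.nan, np.nan, 5]))
```

With non-integer inputs the subtraction can leave a small floating-point residue at the reset points instead of an exact zero.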

In a similar fashion, how can I calculate a cumsum which resets if the cumsum is over some value using vectorized pandas or numpy operations?

E.g. for limit = 5, in = [1,1,1,1,1,1,1,1,1,1], out = [1,2,3,4,5,1,2,3,4,5]

1 Answer

If the numbers in your array are all positive and the running sum lands exactly on the limit each time (as in your example), it is probably simplest to use cumsum() and then the modulo operator:

>>> a = np.array([1,1,1,1,1,1,1,1,1,1])
>>> limit = 5
>>> x = a.cumsum() % limit
>>> x
array([1, 2, 3, 4, 0, 1, 2, 3, 4, 0])

You can then set any zero values back to the limit to get the desired array:

>>> x[x == 0] = limit
>>> x
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
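
The two steps can be folded into one small function. This is only a sketch: the name `cumsum_wrap` is made up for illustration, and it assumes positive integers. Because it works on the overall running sum, any overshoot past the limit is carried into the next run rather than discarded.

```python
import numpy as np

def cumsum_wrap(a, limit):
    """Running sum wrapped at `limit` using modulo (illustrative helper)."""
    x = np.cumsum(a) % limit
    # A remainder of 0 means the running sum hit a multiple of the limit.
    return np.where(x == 0, limit, x)

print(cumsum_wrap(np.ones(10, dtype=int), 5))   # runs land exactly on the limit
print(cumsum_wrap(np.full(8, 2), 5))            # overshoot is carried over, not reset
```

For the array of twos the result is [2, 4, 1, 3, 5, 2, 4, 1], so the wrap matches a true reset only when each run hits the limit exactly.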

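Here's one possible general solution using pandas' expanding apply (pd.expanding_apply in old releases, Series.expanding().apply in current ones). I've not tested it extensively, and the limit of 5 is hard-coded in the function.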

First define a modified cumsum function:

import numpy as np
import pandas as pd

def cumsum_limit(x):
    q = np.sum(x[:-1])
    if q > 0:
        q = q%5
    r = x[-1]
    if q+r <= 5:
        return q+r
    elif (q+r)%5 == 0:
        return 5
    else:
        return (q+r)%5

a = np.array([1,1,1,1,1,1,1,1,1,1]) # your example array

Apply the function to the array like this:

>>> pd.Series(a).expanding().apply(cumsum_limit, raw=True).to_numpy()
array([1., 2., 3., 4., 5., 1., 2., 3., 4., 5.])

Here's the function applied to another more interesting Series:

>>> s = pd.Series([3, -8, 4, 5, -3, 501, 7, -100, 98, 3])
>>> s.expanding().apply(cumsum_limit, raw=True)
0     3.0
1    -5.0
2    -1.0
3     4.0
4     1.0
5     2.0
6     4.0
7   -96.0
8     2.0
9     5.0
dtype: float64

1 Comment

Do you also have a modification which resets the cumulative sum after it has passed a limit? That is, adjusting the example slightly so that it does not result in this:

>>> a = np.array([2,2,2,2,2,2,2,2])
>>> limit = 5
>>> x = a.cumsum() % limit
>>> x
array([2, 4, 1, 3, 0, 2, 4, 1])

but in this:

array([2, 4, 0, 2, 4, 0, 2, 4])
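
The behaviour asked for here depends on the running total at each step, so a plain loop is the most direct way to write it down. The following is a sketch only, not a vectorized solution: the function name `cumsum_reset_after_exceeding` is hypothetical, and it assumes the total is set back to 0 at the element that pushes it past the limit.

```python
import numpy as np

def cumsum_reset_after_exceeding(values, limit):
    """Running sum that drops to 0 as soon as it exceeds `limit` (illustrative helper)."""
    out = np.empty(len(values), dtype=np.asarray(values).dtype)
    total = 0
    for i, x in enumerate(values):
        total += x
        if total > limit:
            # Overshoot is discarded rather than carried into the next run.
            total = 0
        out[i] = total
    return out

print(cumsum_reset_after_exceeding(np.array([2, 2, 2, 2, 2, 2, 2, 2]), 5))
```

For the array of twos with a limit of 5 this produces [2, 4, 0, 2, 4, 0, 2, 4].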
