Given a boundary value k, is there a vectorized way to replace each number n with the consecutive descending numbers from n-1 down to k? For example, if k is 0, then I'd like to replace np.array([3,4,2,2,1,3,1]) with np.array([2,1,0,3,2,1,0,1,0,1,0,0,2,1,0,0]). Every item of the input array is greater than k.

I have tried a combination of np.repeat and np.cumsum, but it feels like a roundabout solution:

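import numpy as np

x = np.array([3,4,2,2,1,3,1])
y = np.repeat(x, x)                     # only its length (x.sum()) is used below
t = -np.ones(y.shape[0])                # a step of -1 everywhere (float dtype)
t[np.r_[0, np.cumsum(x)[:-1]]] = x-1    # at the start of each run, step up to n-1
np.cumsum(t)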

Is there any other way? I was hoping for something like an inverse of np.add.reduceat that expands integers into decreasing sequences instead of reducing them.

2 Answers

Here's another way with array-assignment to skip the repeat part -

def func1(a):
    l = a.sum()
    out = np.full(l, -1, dtype=int)
    out[0] = a[0]-1
    idx = a.cumsum()[:-1]
    out[idx] = a[1:]-1
    return out.cumsum()
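
To see why the cumulative sum works, it can help to look at the intermediate arrays for the sample input. The following is only an illustrative trace of the idea in func1, written for k = 0; the names steps and starts are mine and do not appear in the answer.

```python
import numpy as np

a = np.array([3, 4, 2, 2, 1, 3, 1])

# Start with a step of -1 everywhere: inside a run, each value is one
# less than the value before it.
steps = np.full(a.sum(), -1, dtype=int)

# Output positions where a new run begins (the first run begins at 0).
starts = np.r_[0, a.cumsum()[:-1]]

# Every run ends at 0, so reaching the next run's first value (n-1)
# takes a step of exactly n-1. The same holds at position 0, because
# the cumulative sum starts from 0.
steps[starts] = a - 1

result = steps.cumsum()
print(starts)   # where each run begins
print(steps)    # the per-position increments
print(result)   # the descending runs
```

The only positions that differ from -1 are the run starts, which is why a single fancy-index assignment is enough and np.repeat is not needed.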

Benchmarking

# OP's soln
def OP(x):
    y = np.repeat(x, x)
    t = -np.ones(y.shape[0], dtype=int)
    t[np.r_[0, np.cumsum(x)[:-1]]] = x-1
    return np.cumsum(t)

Using the benchit package (a few benchmarking tools packaged together; disclaimer: I am its author) to benchmark the proposed solutions:

import benchit

a = np.array([3,4,2,2,1,3,1])
in_ = [np.resize(a,n) for n in [10, 100, 1000, 10000]]
funcs = [OP, func1]
t = benchit.timings(funcs, in_)
t.plot(logx=True, save='timings.png')

[Figure: timings plot of OP and func1 over input sizes 10 to 10000, log-scaled x-axis (timings.png)]
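
If benchit is not installed, a rough comparison can also be made with the standard-library timeit module. This is a minimal sketch under that assumption, not the benchmark used above; the repeat counts are arbitrary choices of mine and the absolute numbers will vary by machine.

```python
import timeit
import numpy as np

def OP(x):
    # The question's approach, as restated in the answer.
    y = np.repeat(x, x)
    t = -np.ones(y.shape[0], dtype=int)
    t[np.r_[0, np.cumsum(x)[:-1]]] = x - 1
    return np.cumsum(t)

def func1(a):
    # The answer's approach for k = 0.
    l = a.sum()
    out = np.full(l, -1, dtype=int)
    out[0] = a[0] - 1
    idx = a.cumsum()[:-1]
    out[idx] = a[1:] - 1
    return out.cumsum()

a = np.array([3, 4, 2, 2, 1, 3, 1])
for n in (10, 100, 1000, 10000):
    x = np.resize(a, n)
    # Make sure both functions agree before timing them.
    assert np.array_equal(OP(x), func1(x))
    # Best of 3 repeats, 100 calls each; report seconds per call.
    t_op = min(timeit.repeat(lambda: OP(x), number=100, repeat=3)) / 100
    t_f1 = min(timeit.repeat(lambda: func1(x), number=100, repeat=3)) / 100
    print(n, t_op, t_f1)
```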

Extend to take k as arg

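def func1(a, k):
    # each n expands to n-k values (n-1 down to k)
    l = a.sum()+len(a)*(-k)
    out = np.full(l, -1, dtype=int)
    out[0] = a[0]-1
    # start index of every run after the first
    idx = (a-k).cumsum()[:-1]
    # the previous run ended at k, so step up by (n-1)-k
    out[idx] = a[1:]-1-k
    return out.cumsum()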

Sample run -

In [120]: a
Out[120]: array([3, 4, 2, 2, 1, 3, 1])

In [121]: func1(a, k=-1)
Out[121]: 
array([ 2,  1,  0, -1,  3,  2,  1,  0, -1,  1,  0, -1,  1,  0, -1,  0, -1,
        2,  1,  0, -1,  0, -1])
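
One way to convince yourself that the k version is consistent is to note that lowering the boundary to k should give the same result as solving the k = 0 problem for the shifted input a - k and then adding k back. The check below is a small sketch of that reasoning; func1 is copied from above so the snippet runs on its own, and the tested k values are arbitrary choices of mine.

```python
import numpy as np

def func1(a, k):
    # Copied from the answer above.
    l = a.sum() + len(a) * (-k)
    out = np.full(l, -1, dtype=int)
    out[0] = a[0] - 1
    idx = (a - k).cumsum()[:-1]
    out[idx] = a[1:] - 1 - k
    return out.cumsum()

a = np.array([3, 4, 2, 2, 1, 3, 1])

# k = 0 should reproduce the expected output given in the question.
expected_k0 = np.array([2, 1, 0, 3, 2, 1, 0, 1, 0, 1, 0, 0, 2, 1, 0, 0])
print(np.array_equal(func1(a, 0), expected_k0))

# Shift identity: func1(a, k) equals func1(a - k, 0) + k.
for k in (-1, -2, -5):
    print(k, np.array_equal(func1(a, k), func1(a - k, 0) + k))
```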

1 Comment

So, there are no hidden methods or approaches in NumPy like the one I had been hoping for. A 2x speedup is much better though, thanks for the collaboration. This helped me speed up itertools.combinations by up to 1.5 times!

This is concise and probably OK for efficiency. I don't think apply is vectorized here, so you will be limited mostly by the number of elements in the original array (and less so by their values, is my guess):

import numpy as np
import pandas as pd
x = np.array([3,4,2,2,1,3,1])

# one descending range (val-1 down to 0) per element, then join them
values = pd.Series(x).apply(lambda val: np.arange(val-1,-1,-1)).values
output = np.concatenate(values)
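
The same per-element expansion can be written without pandas, and with the boundary k as a parameter. This is a minimal sketch, assuming a hypothetical helper name expand_desc that does not appear in the answer; like apply, it still runs a Python-level loop once per input element.

```python
import numpy as np

def expand_desc(x, k=0):
    # Hypothetical helper, not from the answer above.
    # np.arange(n - 1, k - 1, -1) counts down from n-1 and includes k.
    return np.concatenate([np.arange(n - 1, k - 1, -1) for n in x])

x = np.array([3, 4, 2, 2, 1, 3, 1])
print(expand_desc(x))
print(expand_desc(x, k=-1))
```

Because it follows the problem statement directly, a loop like this can also serve as a reference when checking faster versions.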
