4

I have a numpy array with only -1, 1 and 0, like this:

np.array([1,1,-1,-1,0,-1,1])

I would like a new array that counts the -1 encountered. The counter must reset when a 0 appears and remain the same when it's a 1:

Desired output:

np.array([0,0,1,2,0,1,1])

The solution must be fast on larger arrays (up to 100,000 elements).


Edit: Thanks for your contribution, I've a working solution for now.

I'm still looking for a non-iterative way to solve it (no for loop). Maybe with a pandas Series and the cumsum() method?

1
  • Please add the size of the target array to your question. Commented Dec 9, 2021 at 7:11

5 Answers

2

Maybe with a pandas Series and the cumsum() method?

Yes, use Series.cumsum and Series.groupby:

s = pd.Series([1, 1, -1, -1, 0, -1, 1])

s.eq(-1).groupby(s.eq(0).cumsum()).cumsum().to_numpy()
# array([0, 0, 1, 2, 0, 1, 1])

Step-by-step

  1. Create pseudo-groups that reset when equal to 0:

    groups = s.eq(0).cumsum()
    # array([0, 0, 0, 0, 1, 1, 1])
    
  2. Then groupby these pseudo-groups and cumsum when equal to -1:

    s.eq(-1).groupby(groups).cumsum().to_numpy()
    # array([0, 0, 1, 2, 0, 1, 1])
    

Timings

not time consuming when used with larger array (up to 100,000)

groupby + cumsum is ~8x faster than looping, given np.random.choice([-1, 0, 1], size=100_000):

%timeit series_cumsum(a)
# 3.29 ms ± 721 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit miki_loop(a)
# 26.5 ms ± 925 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit skyrider_loop(a)
# 26.8 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
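
The functions timed above are not defined in this answer. A minimal sketch of how a similar comparison could be reproduced outside IPython is shown here, assuming `series_cumsum` wraps the one-liner above and using a single hypothetical `plain_loop` as a stand-in for the two loop-based answers. Absolute timings will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

def series_cumsum(a):
    # Assumed wrapper around the groupby + cumsum one-liner.
    s = pd.Series(a)
    return s.eq(-1).groupby(s.eq(0).cumsum()).cumsum().to_numpy()

def plain_loop(a):
    # Hypothetical stand-in for the loop-based answers.
    out, count = [], 0
    for x in a:
        if x == -1:
            count += 1
        elif x == 0:
            count = 0
        out.append(count)
    return np.array(out)

rng = np.random.default_rng(0)
a = rng.choice([-1, 0, 1], size=100_000)

# Both approaches should agree before their speed is compared.
assert (series_cumsum(a) == plain_loop(a)).all()

for fn in (series_cumsum, plain_loop):
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(a), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```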

1

Let's first save your numpy array in a variable:

a = np.array([1,1,-1,-1,0,-1,1])

I define a variable, count, to hold the running value, and set it to zero. Then I define a list, l, to hold the new elements. I iterate over the elements of a, naming each element i. Inside each iteration, I implement the logic:

  • if i is -1, increase the counter
  • else, if i is 0, reset the counter
  • otherwise, do nothing

At the end of each iteration, I append the counter to l. After the loop, I convert l to a numpy array, out.

l = []
count = 0
for i in a:
    if i == -1:
        count+=1
    elif i==0: 
        count = 0
    l.append(count)
out = np.array(l)
out

2 Comments

While this code may answer the question, including an explanation of how or why this solves the problem would really help to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.
Dear @ppwater, is it better now?
1

I seem to get a 10x speedup over the pandas solution with numba for this benchmark:

import numpy as np
import pandas as pd
from numba import jit

inp1 = np.array([1,1,-1,-1,0,-1,1], dtype=int)
inp2 = np.random.randint(-1, 10, size=10**6)

@jit
def with_numba(arr):
  val = 0
  put = np.zeros_like(arr)
  for i in range(arr.size):
    if arr[i] == -1:
      val += 1
    elif arr[i] == 0:
      val = 0
    put[i] = val

  return put

def with_pandas(inp):
  s = pd.Series(inp)
  return s.eq(-1).groupby(s.eq(0).cumsum()).cumsum().to_numpy()
  
assert (with_numba(inp1) == with_pandas(inp1)).all()
assert (with_numba(inp2) == with_pandas(inp2)).all()

%timeit with_numba(inp2)
# 100 loops, best of 5: 4.57 ms per loop
%timeit with_pandas(inp2)
# 10 loops, best of 5: 46.3 ms per loop


0

Use a for loop. Set a variable which starts at 1 and reset it each time you encounter a different number. For example:

counter = 1
outputArray = []
for number in npArray:
    if number == -1:
        outputArray.append(counter)
        counter += 1
    elif number == 1:
        outputArray.append(0)
    else:
        outputArray.append(0)
        counter = 1
print(outputArray)

6 Comments

Your solution won't work. When 1 is encountered the counter must stay constant, but your solution will append a new 0 to the outputArray.
If that's a problem, please edit the question to include that.
This code won't work if npArray is like [-1,-1,1,-1,-1]: the output will be [1, 2, 0, 1, 2], but it must be [1,2,0,3,4], if I understand the question correctly.
Now I get it. I'll edit :)
Yes that's right @Miki and thanks Skyrider
0

Here is a fix for @skyrider's code:

npArray = [1,1,-1,-1,0,-1,1]
counter = 0
outputArray = []
for number in npArray:
    if number == -1:
        counter += 1
        outputArray.append(counter)
    elif number == 0:
        outputArray.append(0)
        counter = 0
    else:
        outputArray.append(counter)
print(outputArray)
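
The same loop can be wrapped in a function to make it easier to check against the inputs discussed in this thread. This is only a sketch: the name `count_with_reset` is hypothetical, and the three branches are collapsed into a single append because every branch appends the current counter.

```python
def count_with_reset(values):
    """Running count of -1 that resets at 0 and holds at 1 (hypothetical helper)."""
    counter = 0
    output = []
    for number in values:
        if number == -1:
            counter += 1      # one more -1 seen since the last reset
        elif number == 0:
            counter = 0       # a 0 resets the count
        # Any other value (here: 1) leaves the counter unchanged.
        output.append(counter)
    return output

print(count_with_reset([1, 1, -1, -1, 0, -1, 1]))  # [0, 0, 1, 2, 0, 1, 1]
print(count_with_reset([-1, -1, 1, -1, -1]))       # [1, 2, 2, 3, 4]
```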

5 Comments

The problem is when 1 is encountered in the middle: the counter must stay constant, but your solution will append a new 0 to the outputArray instead.
What do you mean by constant? Do you mean that when 1 is encountered in the middle it should not be included, or...
What I mean is that when 1 is encountered, the output must repeat the current state of the counter. np.array([1,1,-1,-1,0,-1,1]) becomes: np.array([0,0,1,2,0,1,1])
It's more like a cumsum with a reset when 0 appears.
OK, so I think it's fixed now. Check it.
