The problem
I have a 1-dimensional numpy array filled mostly with zeros but also containing some groups of non-zero values.
>> import numpy as np
>> a = np.zeros(10)
>> a[2:4] = 2
>> a[6:9] = 3
>> print a
[ 0. 0. 2. 2. 0. 0. 3. 3. 3. 0.]
I want an array that contains only the last non-zero group; in other words, every non-zero group except the last should be replaced by zeros. (A group may be only one element long.) Like so:
[ 0. 0. 0. 0. 0. 0. 3. 3. 3. 0.]
Non-robust solution
This seems to do the trick: reverse the array, find the first index where the difference between consecutive elements is negative, set every element after that index to zero, and flip the result back. It's a bit long-winded:
>> b = a[::-1]
>> b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
>> c = b[::-1]
>> print c
[ 0. 0. 0. 0. 0. 0. 3. 3. 3. 0.]
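One side effect of the snippet above is worth checking: `a[::-1]` returns a view rather than a copy, so assigning through `b` also overwrites `a`. The following is only a sketch of the same approach that leaves the input untouched, assuming that is what is wanted; the name `b_copy` is introduced here for illustration.

```python
import numpy as np

a = np.zeros(10)
a[2:4] = 2
a[6:9] = 3

# Reversing with a slice gives a view onto the same memory as a
b = a[::-1]
print(np.shares_memory(a, b))  # True

# Work on an explicit copy so that a keeps its original values
b_copy = a[::-1].copy()
b_copy[np.where(np.ediff1d(b_copy) < 0)[0][0] + 1:] = 0
c = b_copy[::-1]
print(c)
print(a)  # still contains both groups
```

This version still raises the same IndexError as the original whenever the reversed array contains no negative step.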
Fails for a specific case
However, it is not robust: it fails in the following case, because np.where returns an empty index array and indexing that with [0] raises an error:
>> a = np.zeros(10)
>> a[0:4] = 2
>> print a
[ 2. 2. 2. 2. 0. 0. 0. 0. 0. 0.]
>> b = a[::-1]
>> b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
>> c = b[::-1]
>> print c
Traceback (most recent call last):
  File "<ipython-input-81-8cba57558ba8>", line 1, in <module>
    runfile('C:/Users/name/test1.py', wdir='C:/Users/name')
  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)
  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "C:/Users/name/test1.py", line 21, in <module>
    b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
IndexError: index 0 is out of bounds for axis 0 with size 0
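To see where the error comes from, it may help to inspect the intermediate values. This is a small sketch of the failing case; the names `diffs` and `drops` are introduced here for illustration only.

```python
import numpy as np

a = np.zeros(10)
a[0:4] = 2

b = a[::-1]                      # zeros first, then the single group
diffs = np.ediff1d(b)            # consecutive differences of the reversed array
drops = np.where(diffs < 0)[0]   # indices where the value decreases

print(diffs)
print(drops.size)  # 0
```

The reversed array never steps down, so `drops` is empty and `drops[0]` has nothing to return. The same thing happens for an all-zero input.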
Fix
So I need to introduce an if clause:
>> b = a[::-1]
>> idx = np.where(np.ediff1d(b) < 0)[0]
>> if len(idx) > 0:
>>     b[idx[0] + 1:] = 0
>> c = b[::-1]
>> print c
[ 2. 2. 2. 2. 0. 0. 0. 0. 0. 0.]
Is there a more elegant way to do it?
UPDATE: Following on from Divakar's excellent answer and mtrw's question, I would like to extend the specification. The method should also work when the input array contains negative non-zero values and when the values change within a non-zero group,
e.g. np.array([1, 0, 0, 4, 5, 4, 5, 0, 0])
This means that methods which find the group boundaries by checking for a positive or negative difference between elements are not reliable.
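One way to meet the extended specification might be to work on a boolean "is non-zero" mask instead of on differences between values. The sketch below is not taken from any of the answers; the function name `keep_last_nonzero_group` and its local variable names are hypothetical, and it assumes a 1-dimensional input.

```python
import numpy as np

def keep_last_nonzero_group(a):
    """Return a copy of 1-D array a with all but the last non-zero group zeroed.

    Hypothetical helper for illustration; assumes a is 1-dimensional.
    """
    a = np.asarray(a)
    nonzero = a != 0
    out = np.zeros_like(a)
    if not nonzero.any():
        return out  # no groups at all, so the result is all zeros

    # Index of the last non-zero element: the end of the last group
    last = np.flatnonzero(nonzero)[-1]

    # The group starts just after the last zero that precedes that element
    zeros_before = np.flatnonzero(~nonzero[:last])
    start = zeros_before[-1] + 1 if zeros_before.size else 0

    out[start:last + 1] = a[start:last + 1]
    return out

print(keep_last_nonzero_group(np.array([1, 0, 0, 4, 5, 4, 5, 0, 0])))
print(keep_last_nonzero_group(np.array([-1, 0, -2, 3, 0])))
```

Because only the zero/non-zero pattern is inspected, the sign of the values and any variation inside a group do not matter, and the case of a single group starting at index 0 needs no special branch beyond the `zeros_before.size` check.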