Vary the elements in a numpy array

Question

I have a numpy array:

a = [3., 0., 4., 2., 0., 0., 0.]

I would like to create a new array from this, in which each non-zero element n is replaced by n zeros and each run of consecutive zeros is replaced by a single number equal to the length of that run, i.e.:

b = [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 3.]

Looking for a vectorized way to do this as the array will have > 1 million elements. Any help much appreciated.
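For reference, the transformation described above can be pinned down with a plain loop. This is only a sketch to make the expected output unambiguous and to check vectorized versions against; the name `expand_loop` is hypothetical, and it assumes the values are non-negative whole numbers.

```python
import numpy as np

def expand_loop(a):
    """Non-vectorized reference version (hypothetical helper, not from the post)."""
    out = []
    run = 0  # length of the current run of zeros
    for x in a:
        if x == 0:
            run += 1
            continue
        if run:
            # A run of zeros just ended: emit its length once
            out.append(run)
            run = 0
        # A non-zero value n becomes n zeros
        out.extend([0] * int(x))
    if run:
        # Flush a run of zeros that reaches the end of the input
        out.append(run)
    return np.array(out, dtype=float)

b = expand_loop([3., 0., 4., 2., 0., 0., 0.])
print(b)
```

This is far too slow for millions of elements, but it is handy as a correctness check on small inputs.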

I am going to be surprised if this can be vectorized but good luck :)
– Hammer, Commented Oct 17, 2013 at 0:40

Bi Rico · Accepted Answer · 2013-10-17 01:46:22Z

8

This should do the trick. It roughly works by 1) finding all the runs of consecutive zeros and counting them, 2) computing the size of the output array and initializing it with zeros, and 3) placing the counts from step 1 in the correct places.

import numpy as np

def cz(a):
    # Work with integers so the values can be used as counts and indices
    a = np.asarray(a, int)

    # Find where sequences of zeros start and end
    wz = np.zeros(len(a) + 2, dtype=bool)
    wz[1:-1] = a == 0
    change = wz[1:] != wz[:-1]
    edges = np.where(change)[0]
    # Take the difference to get the number of zeros in each sequence
    consecutive_zeros = edges[1::2] - edges[::2]

    # Figure out where to put consecutive_zeros
    idx = a.cumsum()
    n = idx[-1] if len(idx) > 0 else 0
    idx = idx[edges[::2]]
    idx += np.arange(len(idx))

    # Create output array and populate with values for consecutive_zeros
    out = np.zeros(len(consecutive_zeros) + n)
    out[idx] = consecutive_zeros
    return out
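To make the three steps easier to follow, here is a sketch that traces the intermediate arrays for the example input from the question. The variable names `starts`, `ends`, `run_lengths` and `targets` are mine, not from the function above, and the values in the comments are what I get when working through the example by hand.

```python
import numpy as np

a = np.asarray([3., 0., 4., 2., 0., 0., 0.], int)

# Step 1: pad the zero mask with False on both sides so that every run of
# zeros has both a start edge and an end edge, even at the array boundaries.
wz = np.zeros(len(a) + 2, dtype=bool)
wz[1:-1] = a == 0
edges = np.where(wz[1:] != wz[:-1])[0]
starts, ends = edges[::2], edges[1::2]   # starts = [1, 4], ends = [2, 7]
run_lengths = ends - starts              # [1, 3]

# Step 2: the output holds one slot per run plus a.sum() zeros.
out = np.zeros(a.sum() + len(run_lengths))

# Step 3: the cumulative sum at a run start is the number of zeros written
# before it; each earlier run shifts the position by one more slot.
targets = a.cumsum()[starts] + np.arange(len(starts))   # [3, 10]
out[targets] = run_lengths

print(out)
```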

edited Oct 17, 2013 at 1:46

answered Oct 17, 2013 at 0:57

Bi Rico


1 Comment

tommw Over a year ago

Excellent, works very well. Orders of magnitude faster than the loops I was attempting to use.

Daniel · Accepted Answer · 2013-10-17 02:04:23Z

4

For some variety:

import numpy as np

# np.int was removed in NumPy 1.24; the builtin int is equivalent here
a = np.array([3., 0., 4., 2., 0., 0., 0.], dtype=int)

inds = np.cumsum(a)

#Find first occurrences and values thereof.
uvals,zero_pos = np.unique(inds,return_index=True)
zero_pos = np.hstack((zero_pos,a.shape[0]))+1

#Gets zero lengths
values =  np.diff(zero_pos)-1
mask = (values!=0)

#Ignore where we have consecutive values
zero_inds = uvals[mask]
zero_inds += np.arange(zero_inds.shape[0])

#Create output array and apply zero values
out = np.zeros(inds[-1] + zero_inds.shape[0])
out[zero_inds] = values[mask]

out
[ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  3.]

The main difference is that np.unique with return_index=True gives the first occurrence of each value in the cumulative sum, which works here because the cumulative sum is monotonically non-decreasing.

edited Oct 17, 2013 at 2:04

answered Oct 17, 2013 at 1:57

Daniel


3 Comments

Bi Rico Over a year ago

Nice answer. I think it's a little off if a has a leading 0, but that should be easy to fix.

askewchan Over a year ago

Both are nice, @BiRico's is still a bit faster on my machine.

Daniel Over a year ago

@BiRico Good point, it is lacking that aspect. Interestingly you need large values in your a array (a>300) before this method becomes faster.
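One possible way to handle the leading-zero case raised in these comments is sketched below. It is not from either answer: the function name `cz_unique` and the `a[0] == 0` adjustment are my own assumptions, and it assumes non-negative whole-number input. The idea is that every group of equal cumulative sums opens with a non-zero element, except a group of leading zeros, so that one group needs its count bumped by one.

```python
import numpy as np

def cz_unique(a):
    """np.unique-based variant with a leading-zero adjustment (hypothetical name)."""
    a = np.asarray(a, dtype=int)
    if a.size == 0:
        return np.zeros(0)

    inds = np.cumsum(a)
    # Each distinct cumulative sum marks one group: a non-zero value
    # followed by the zeros that come after it.
    uvals, first = np.unique(inds, return_index=True)
    group_sizes = np.diff(np.append(first, a.size))

    # Subtract the non-zero element that opens each group.
    values = group_sizes - 1
    # A group of leading zeros has no opening non-zero element.
    if a[0] == 0:
        values[0] += 1

    # Drop groups with no zeros (consecutive non-zero values).
    mask = values != 0
    zero_inds = uvals[mask] + np.arange(mask.sum())

    out = np.zeros(inds[-1] + mask.sum())
    out[zero_inds] = values[mask]
    return out

print(cz_unique([0, 0, 3, 0]))
```

I have not timed this against the other version, so the speed remarks above may not carry over.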
