I have a 1D numpy array. The difference between two successive values in this array is either one or larger than one. I want to split the array into parts wherever the difference is larger than one. Hence:

arr = numpy.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])

should become

[array([77, 78, 79, 80]), array([90, 91, 92]), array([100, 101, 102, 103, 104])]

I have the following code that does the trick, but I have the feeling I am being too complicated here. There has to be a better/more Pythonic way. Does anyone have a more elegant approach?

import numpy

def split(arr, cut_idxs):

  empty_arr = []
  for idx in range(-1, cut_idxs.shape[0]):  
    if idx == -1:
      l, r = 0, cut_idxs[0]
    elif (idx != -1) and (idx != cut_idxs.shape[0] - 1):
      l, r = cut_idxs[idx] + 1, cut_idxs[idx + 1]
    elif idx == cut_idxs.shape[0] - 1:
      l, r = cut_idxs[-1] + 1, arr.shape[0]

    empty_arr.append(arr[l:r + 1]) 

  return empty_arr 


arr = numpy.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
cuts = numpy.where(numpy.ediff1d(arr) > 1)[0]

print(split(arr, cuts))

2 Answers

One Pythonic way would be -

np.split(arr, np.flatnonzero(np.diff(arr)>1)+1)

Sample run -

In [10]: arr
Out[10]: array([ 77,  78,  79,  80,  90,  91,  92, 100, 101, 102, 103, 104])

In [11]: np.split(arr, np.flatnonzero(np.diff(arr)>1)+1)
Out[11]: 
[array([77, 78, 79, 80]),
 array([90, 91, 92]),
 array([100, 101, 102, 103, 104])]
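
To make the +1 in the one-liner easier to follow, here is a minimal sketch that wraps it in a function and exposes the intermediate step. The name split_on_gaps and the max_step parameter are not part of the answer above; they are assumptions introduced only for illustration.

```python
import numpy as np


def split_on_gaps(arr, max_step=1):
    """Split a 1D array wherever consecutive values differ by more than max_step.

    Hypothetical helper: the name and the max_step parameter are illustrative.
    """
    arr = np.asarray(arr)
    # np.diff(arr)[i] is arr[i + 1] - arr[i], so a gap found at position i
    # means the next run starts at index i + 1. That is the reason for the +1.
    starts = np.flatnonzero(np.diff(arr) > max_step) + 1
    # np.split cuts *before* each index in `starts`; with no gaps it
    # returns a list holding the whole array.
    return np.split(arr, starts)


arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
parts = split_on_gaps(arr)
for part in parts:
    print(part)
```

If the array contains no gap at all, the list of split points is empty and the result is a single-element list, so callers do not need a special case for that.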

Another with slicing -

In [16]: cut_idx = np.r_[0,np.flatnonzero(np.diff(arr)>1)+1,len(arr)]
             # Or np.flatnonzero(np.r_[True, np.diff(arr)>1, True])

In [17]: [arr[i:j] for i,j in zip(cut_idx[:-1],cut_idx[1:])]
Out[17]: 
[array([77, 78, 79, 80]),
 array([90, 91, 92]),
 array([100, 101, 102, 103, 104])]

2 Comments

Is there a benefit of using np.flatnonzero instead of np.where?
@TheDude Nope, np.flatnonzero is just more explicit that we are working with 1D arrays.
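
The practical difference raised in these comments can be checked with a short sketch: np.where with a single argument returns a tuple of index arrays (one per dimension), while np.flatnonzero returns the index array directly. The variable names below are chosen only for illustration.

```python
import numpy as np

arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
mask = np.diff(arr) > 1

via_where = np.where(mask)       # a tuple containing one index array
via_flat = np.flatnonzero(mask)  # the index array itself

# For a 1D mask both hold the same indices; np.where just needs the [0].
same_indices = np.array_equal(via_where[0], via_flat)
print(type(via_where).__name__, same_indices)
```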

Another way with slicing, getting the appropriate indices using np.diff:

import numpy as np
def split(arr):
    idx = np.pad(np.where(np.diff(arr) > 1)[0]+1, (1,1),
             'constant', constant_values = (0, len(arr)))
    return [arr[idx[i]: idx[i+1]] for i in range(len(idx)-1)]

Result:

>>> arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
>>> split(arr)
[array([77, 78, 79, 80]), array([90, 91, 92]), array([100, 101, 102, 103, 104])]

In your case, the slicing "map" idx ends up being array([ 0,  4,  7, 12]): the positions where the diff is greater than 1, shifted by one so that they mark the start of each new run (indices 4 and 7), padded with a zero on the left and the length of your array (12) on the right using np.pad.
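
As a rough illustration of that description, the sketch below builds idx step by step for the example array and pairs up neighbouring entries into slice bounds. The names gaps and bounds are assumptions added here and do not appear in the function above.

```python
import numpy as np

arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])

# Start index of every run after the first one.
gaps = np.where(np.diff(arr) > 1)[0] + 1

# Pad with 0 on the left and len(arr) on the right so that every run,
# including the first and the last, has a (start, stop) pair.
idx = np.pad(gaps, (1, 1), 'constant', constant_values=(0, len(arr)))

# Neighbouring entries of idx are the bounds of each slice.
bounds = [(int(i), int(j)) for i, j in zip(idx[:-1], idx[1:])]
print(bounds)

pieces = [arr[i:j] for i, j in bounds]
```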

But np.split, as suggested by @Divakar, seems to be the way to go.
