Cutting numpy array based on index

Question

I have a 1D numpy array. The difference between two consecutive values in this array is either one or larger than one. I want to cut the array into parts at every position where the difference is larger than one. Hence:

arr = numpy.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])

should become

[array([77, 78, 79, 80]), array([90, 91, 92]), array([100, 101, 102, 103, 104])]

I have the following code that does the trick, but I have the feeling I am being too complicated here. There has to be a better, more Pythonic way. Does anyone have a more elegant approach?

import numpy

def split(arr, cut_idxs):

  empty_arr = []
  for idx in range(-1, cut_idxs.shape[0]):  
    if idx == -1:
      l, r = 0, cut_idxs[0]
    elif (idx != -1) and (idx != cut_idxs.shape[0] - 1):
      l, r = cut_idxs[idx] + 1, cut_idxs[idx + 1]
    elif idx == cut_idxs.shape[0] - 1:
      l, r = cut_idxs[-1] + 1, arr.shape[0]

    empty_arr.append(arr[l:r + 1]) 

  return empty_arr 


arr = numpy.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
cuts = numpy.where(numpy.ediff1d(arr) > 1)[0]

print(split(arr, cuts))

Divakar · Accepted Answer · 2018-05-01 16:22:45Z

2

One Pythonic way would be -

np.split(arr, np.flatnonzero(np.diff(arr)>1)+1)

Sample run -

In [10]: arr
Out[10]: array([ 77,  78,  79,  80,  90,  91,  92, 100, 101, 102, 103, 104])

In [11]: np.split(arr, np.flatnonzero(np.diff(arr)>1)+1)
Out[11]: 
[array([77, 78, 79, 80]),
 array([90, 91, 92]),
 array([100, 101, 102, 103, 104])]
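
To see why the `+1` is needed, the one-liner can be unpacked step by step. This is only a minimal sketch on the array from the question; the intermediate names (`steps`, `gaps`, `cut_points`, `parts`) are illustrative and not part of the answer.

```python
import numpy as np

arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])

# np.diff(arr)[i] is arr[i + 1] - arr[i], so it has one element fewer than arr.
steps = np.diff(arr)

# Positions in `steps` where the jump is larger than one.
gaps = np.flatnonzero(steps > 1)

# A jump at position i means the next chunk starts at arr[i + 1],
# hence the +1 before the indices are handed to np.split.
cut_points = gaps + 1

parts = np.split(arr, cut_points)

print(gaps.tolist())        # [3, 6]
print(cut_points.tolist())  # [4, 7]
print([p.tolist() for p in parts])
```

Without the `+1`, each chunk would end one element early and the last value before a gap would land at the start of the following chunk.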

Another with slicing -

In [16]: cut_idx = np.r_[0,np.flatnonzero(np.diff(arr)>1)+1,len(arr)]
             # Or np.flatnonzero(np.r_[True, np.diff(arr)>1, True])

In [17]: [arr[i:j] for i,j in zip(cut_idx[:-1],cut_idx[1:])]
Out[17]: 
[array([77, 78, 79, 80]),
 array([90, 91, 92]),
 array([100, 101, 102, 103, 104])]
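
The two ways of building `cut_idx` mentioned above should give the same boundaries. The following sketch checks that on the question's array; the names `cut_a`, `cut_b` and `pairs` are made up for this illustration.

```python
import numpy as np

arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])

# Variant 1: chunk starts, with 0 prepended and len(arr) appended.
cut_a = np.r_[0, np.flatnonzero(np.diff(arr) > 1) + 1, len(arr)]

# Variant 2: pad the boolean mask with True on both ends. The leading True
# shifts every position by one, which replaces the explicit +1.
cut_b = np.flatnonzero(np.r_[True, np.diff(arr) > 1, True])

# Consecutive boundaries form the (start, stop) pairs used for slicing.
pairs = list(zip(cut_a[:-1].tolist(), cut_a[1:].tolist()))

print(cut_a.tolist())  # [0, 4, 7, 12]
print(cut_b.tolist())  # [0, 4, 7, 12]
print(pairs)           # [(0, 4), (4, 7), (7, 12)]
```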

answered May 1, 2018 at 16:22

Divakar


2 Comments

The Dude Over a year ago

Is there a benefit of using np.flatnonzero instead of np.where?

Divakar Over a year ago

@TheDude Nope, np.flatnonzero is just more explicit that we are working with 1D arrays.
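
As a small illustration of the point in the comments: for a 1D boolean mask, `np.where` returns a tuple holding one index array, while `np.flatnonzero` returns that index array directly. The variable names below are only for this sketch.

```python
import numpy as np

arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
mask = np.diff(arr) > 1

where_result = np.where(mask)       # tuple with one index array per dimension
flat_result = np.flatnonzero(mask)  # a single 1D index array

print(type(where_result).__name__)  # tuple
print(where_result[0].tolist())     # [3, 6]
print(flat_result.tolist())         # [3, 6]
```

So with `np.where` the indices have to be pulled out with `[0]` first, as the other answer does; the resulting indices are the same.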

sacuL · Accepted Answer · 2018-05-01 16:43:12Z

0

Another way with slicing, getting the appropriate indices using np.diff:

import numpy as np
def split(arr):
    idx = np.pad(np.where(np.diff(arr) > 1)[0]+1, (1,1),
             'constant', constant_values = (0, len(arr)))
    return [arr[idx[i]: idx[i+1]] for i in range(len(idx)-1)]

Result:

>>> arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])
>>> split(arr)
[array([77, 78, 79, 80]), array([90, 91, 92]), array([100, 101, 102, 103, 104])]

In your case, the slicing "map" idx ends up being array([ 0, 4, 7, 12]): the positions where a new chunk starts because the diff is greater than 1 (indices 4 and 7), padded with a zero on the left and the length of your array (12) on the right using np.pad.

But np.split, as suggested by @Divakar, seems to be the way to go.
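
As a rough check that the padded-index slicing and np.split agree, the sketch below builds both results for the array from the question and compares them chunk by chunk. The names `starts`, `chunks`, `expected` and `same` are assumptions made for this example.

```python
import numpy as np

arr = np.array([77, 78, 79, 80, 90, 91, 92, 100, 101, 102, 103, 104])

# Index of the first element of every chunk after a gap.
starts = np.where(np.diff(arr) > 1)[0] + 1

# Pad with 0 on the left and len(arr) on the right to get all boundaries.
idx = np.pad(starts, (1, 1), 'constant', constant_values=(0, len(arr)))

# Slice between consecutive boundaries.
chunks = [arr[idx[i]:idx[i + 1]] for i in range(len(idx) - 1)]

# np.split adds the outer boundaries itself, so it only needs `starts`.
expected = np.split(arr, starts)

same = len(chunks) == len(expected) and all(
    np.array_equal(a, b) for a, b in zip(chunks, expected)
)

print(idx.tolist())  # [0, 4, 7, 12]
print(same)          # True
```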

edited May 1, 2018 at 16:43

answered May 1, 2018 at 16:35

sacuL

