Suppose I have two NumPy arrays

x = [[5, 2, 8],
     [4, 9, 1],
     [7, 8, 9],
     [1, 3, 5],
     [1, 2, 3],
     [1, 2, 4]]
y = [0, 0, 1, 1, 1, 2] 

I want to efficiently split the array x into sub-arrays according to the values in y.

My desired outputs would be

z_0 = [[5, 2, 8],
       [4, 9, 1]]
z_1 = [[7, 8, 9],
       [1, 3, 5],
       [1, 2, 3]]
z_2 = [[1, 2, 4]]

Assuming that y starts with zero and is sorted in ascending order, what is the most efficient way to do this?

Note: This is the sorted version of the question "Split a NumPy array into subarrays according to the values (not sorted, but grouped) of another array".

  • Those aren't numpy arrays Commented Mar 19, 2021 at 6:03

1 Answer

If y is grouped (it doesn't have to be sorted), you can use np.diff to get the split points, since the difference is nonzero exactly where the label changes:

indices = np.flatnonzero(np.diff(y)) + 1

You can pass those directly to np.split:

z = np.split(x, indices, axis=0)

If you want to know the labels too:

labels = y[np.r_[0, indices]]
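
For reference, here is a minimal end-to-end sketch of the steps above, assuming x and y from the question are first converted to real NumPy arrays with np.array (the question shows them as plain nested lists, and the label lookup needs an array for y):

```python
import numpy as np

# Data from the question, wrapped in np.array (an assumption; the
# question shows plain lists).
x = np.array([[5, 2, 8],
              [4, 9, 1],
              [7, 8, 9],
              [1, 3, 5],
              [1, 2, 3],
              [1, 2, 4]])
y = np.array([0, 0, 1, 1, 1, 2])

# np.diff(y) is nonzero exactly where the label changes; adding 1
# gives the index of the first row of each new group.
indices = np.flatnonzero(np.diff(y)) + 1

# Split x along the rows at those positions.
z = np.split(x, indices, axis=0)

# The label of each group is the value of y at the start of that group.
labels = y[np.r_[0, indices]]

for label, group in zip(labels, z):
    print(label, group.tolist())
```

With this data, indices comes out as [2, 5], so z holds three sub-arrays with 2, 3 and 1 rows, matching z_0, z_1 and z_2 in the question.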