I want to split a numpy array based on the values of its first two columns. The split should happen at the index after both columns reach their maximum simultaneously. Each column reaches its maximum several times, and sometimes one column is at its maximum while the other is not, but I only want to split where both are at their maximum in the same row. Let's say I have:

arr =  [[ 1., 5, 12],
        [ 1., 9,  5],
        [15., 5,  5],
        [25., 7,  4],
        [25., 9,  4],
        [1.5, 4, 10],
        [ 1., 8,  7],
        [20., 5,  6],
        [25., 8,  3],
        [25., 9,  3]]

I want to get:

arr_1 = [[ 1., 5, 12],
         [ 1., 9,  5],
         [15., 5,  5],
         [25., 7,  4],
         [25., 9,  4]]

arr_2 = [[1.5, 4, 10],
         [ 1., 8,  7],
         [20., 5,  6],
         [25., 8,  3],
         [25., 9,  3]]
  • Added a couple of lines to the end of my answer Commented Sep 25, 2020 at 13:01

2 Answers

Assuming you want the output to be a list of lists, you can iterate over the elements of the original array and look for a "separating" element.

One possible implementation:

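def split_at_max(arr):
    # Maximum of each of the first two columns
    m0 = max(a[0] for a in arr)
    m1 = max(a[1] for a in arr)
    res = [[]]
    for i, a in enumerate(arr):
        res[-1].append(a)
        # Start a new chunk after a row where both columns are at their
        # maximum, unless it is the last row
        if (a[:2] == [m0, m1]) and (i != len(arr) - 1):
            res.append([])
    return res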
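
As a quick check, the function can be run on the sample data from the question, written as a plain list of lists. This is only a usage sketch; the function is repeated so the snippet runs on its own, with `return res` indented at function level.

```python
def split_at_max(arr):
    m0 = max(a[0] for a in arr)
    m1 = max(a[1] for a in arr)
    res = [[]]
    for i, a in enumerate(arr):
        res[-1].append(a)
        # New chunk after a row where both columns peak (but not after the last row)
        if (a[:2] == [m0, m1]) and (i != len(arr) - 1):
            res.append([])
    return res

# Sample data from the question, as a list of lists
arr = [[1., 5, 12], [1., 9, 5], [15., 5, 5], [25., 7, 4], [25., 9, 4],
       [1.5, 4, 10], [1., 8, 7], [20., 5, 6], [25., 8, 3], [25., 9, 3]]

chunks = split_at_max(arr)
print(len(chunks))    # 2
print(chunks[0][-1])  # [25.0, 9, 4]
```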

4 Comments

Dear @N.C, I appreciate your clever solution. Sorry, I made a mistake and showed my data as a list. In fact, they are numpy arrays. Do you know any way to adapt your solution to an array rather than a list? Thanks in advance.
In order to fix the function to handle numpy arrays you can use the condition: (a[0] == m0) and (a[1] == m1) in the if statement. What is the desired type of the output?
I prefer to have my results as numpy arrays. Thanks for replying and giving time.
@Ali_d. I've added a pure numpy solution
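
Putting the suggestion from the comments together, a minimal sketch of the loop adapted to numpy input might look like the following. The name `split_at_max_np` and the choice to return a list of arrays are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def split_at_max_np(arr):
    # Hypothetical numpy variant of the loop-based approach
    m0 = arr[:, 0].max()
    m1 = arr[:, 1].max()
    res = [[]]
    for i, a in enumerate(arr):
        res[-1].append(a)
        # Compare scalars so the condition is a single unambiguous boolean
        if (a[0] == m0) and (a[1] == m1) and (i != len(arr) - 1):
            res.append([])
    # Convert each chunk of rows back into a 2-D array
    return [np.array(chunk) for chunk in res]

arr = np.array([[1., 5, 12], [1., 9, 5], [15., 5, 5], [25., 7, 4], [25., 9, 4],
                [1.5, 4, 10], [1., 8, 7], [20., 5, 6], [25., 8, 3], [25., 9, 3]])

chunks = split_at_max_np(arr)
print([c.shape for c in chunks])  # [(5, 3), (5, 3)]
```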
You can create a boolean mask of all locations where each of the first two columns is equal to its own maximum:

max_val = arr[:, :2].max(axis=0)
mask = arr[:, :2] == max_val

Then make a row mask of all places where all the columns match:

row_mask = mask.all(axis=1)
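
To make the two masks concrete, here is a small sketch that applies them to the sample data from the question (assumed here to be a float numpy array) and shows which rows have both columns at their maximum.

```python
import numpy as np

arr = np.array([[1., 5, 12], [1., 9, 5], [15., 5, 5], [25., 7, 4], [25., 9, 4],
                [1.5, 4, 10], [1., 8, 7], [20., 5, 6], [25., 8, 3], [25., 9, 3]])

max_val = arr[:, :2].max(axis=0)  # per-column maxima of the first two columns
mask = arr[:, :2] == max_val      # max_val is broadcast across all rows
row_mask = mask.all(axis=1)       # True only where both columns match

print(max_val.tolist())           # [25.0, 9.0]
print(np.flatnonzero(row_mask))   # [4 9]
```

Row 1 has the second column at its maximum but not the first, so it is excluded by the `all` step.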

You want the locations of the index after the match, so you can do one of the following:

shifted_row_mask = np.r_[False, row_mask[:-1]]
index = np.flatnonzero(shifted_row_mask)

Or

index = np.flatnonzero(row_mask[:-1]) + 1

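In both cases, the last element of row_mask is dropped so that a match on the final row does not produce a split index equal to len(arr) (which would give an empty trailing piece), and the positions are shifted by one so that each split falls after the matching row.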

Now you can just call np.split:

result = np.split(arr, index, axis=0)
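
The steps so far can be sketched end to end on the sample data from the question. The snippet also checks that the two ways of building the index agree; the name `index_alt` is introduced only for this illustration.

```python
import numpy as np

arr = np.array([[1., 5, 12], [1., 9, 5], [15., 5, 5], [25., 7, 4], [25., 9, 4],
                [1.5, 4, 10], [1., 8, 7], [20., 5, 6], [25., 8, 3], [25., 9, 3]])

row_mask = (arr[:, :2] == arr[:, :2].max(axis=0)).all(axis=1)

# Variant 1: shift the mask down by one row, then take the True positions
index = np.flatnonzero(np.r_[False, row_mask[:-1]])
# Variant 2: take the True positions (ignoring the last row), then add one
index_alt = np.flatnonzero(row_mask[:-1]) + 1

result = np.split(arr, index, axis=0)

print(index)                            # [5]
print([part.shape for part in result])  # [(5, 3), (5, 3)]
```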

This can all be written as a nice, totally illegible, one-liner:

result = np.split(arr, np.flatnonzero((arr[:, :2] == arr[:, :2].max(axis=0)).all(axis=1)[:-1]) + 1, axis=0)

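If you want the output in the exact format you showed, restrict the number of indices to 1, and unpack the result of np.split. Use a one-element slice rather than a single integer, because np.split interprets an integer as a number of equal sections:

arr_1, arr_2 = np.split(arr, index[:1], axis=0)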
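
One point to watch when unpacking into exactly two arrays: np.split reads an integer second argument as a number of equal sections, while a sequence is read as the row positions to cut at. The sketch below uses made-up placeholder data and an assumed `index` value to show the difference.

```python
import numpy as np

arr = np.arange(30.).reshape(10, 3)  # placeholder data, 10 rows
index = np.array([5])                # assumed split position

by_sections = np.split(arr, 5, axis=0)    # integer: 5 equal sections of 2 rows
by_position = np.split(arr, [5], axis=0)  # sequence: one cut before row 5

print(len(by_sections))  # 5
print(len(by_position))  # 2

# A one-element slice keeps the argument a sequence, so unpacking into two works
arr_1, arr_2 = np.split(arr, index[:1], axis=0)
print(arr_1.shape, arr_2.shape)  # (5, 3) (5, 3)
```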
