How to split an array based on maximum values of two columns

Question

I want to split a numpy array based on the values of its first two columns. The split should happen at the index after the row where both columns reach their maximum simultaneously. Each column reaches its maximum several times, and one column can be at its maximum while the other is not, but I only need to split where both are at their maximum value at the same time. Let's say I have

arr =  [[ 1., 5, 12],
        [ 1., 9,  5],
        [15., 5,  5],
        [25., 7,  4],
        [25., 9,  4],
        [1.5, 4, 10],
        [ 1., 8,  7],
        [20., 5,  6],
        [25., 8,  3],
        [25., 9,  3]]

I want to get:

arr_1 = [[ 1., 5, 12],
         [ 1., 9,  5],
         [15., 5,  5],
         [25., 7,  4],
         [25., 9,  4]]

arr_2 = [[1.5, 4, 10],
         [ 1., 8,  7],
         [20., 5,  6],
         [25., 8,  3],
         [25., 9,  3]]

Added a couple of lines to the end of my answer

– Mad Physicist, Commented Sep 25, 2020 at 13:01

N.C · Accepted Answer · 2020-09-25 12:11:57Z

3

Assuming you want the output to be a list of lists, you can iterate over the elements of the original array and look for a "separating" element.

One possible implementation:

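def split_at_max(arr):
    # Maxima of the first two columns over all rows
    m0 = max(a[0] for a in arr)
    m1 = max(a[1] for a in arr)
    res = [[]]
    for i, a in enumerate(arr):
        res[-1].append(a)
        # Start a new chunk after a row where both columns are at their maximum,
        # unless it is the last row (avoids a trailing empty chunk)
        if (a[:2] == [m0, m1]) and (i != len(arr) - 1):
            res.append([])
    return res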
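For NumPy input, the same loop can be adapted by comparing the two columns one at a time and slicing the original array instead of appending rows. The following is only a sketch under those assumptions; the name split_at_max_np is made up for illustration, and it returns a list of arrays rather than a list of lists.

```python
import numpy as np

def split_at_max_np(arr):
    """Split a 2-D array after each row where columns 0 and 1 are both at their maximum."""
    arr = np.asarray(arr)
    m0 = arr[:, 0].max()
    m1 = arr[:, 1].max()
    res = []
    start = 0
    for i, a in enumerate(arr):
        # Compare element by element: `a[:2] == [m0, m1]` would give a
        # two-element boolean array, which is ambiguous inside `if`.
        if a[0] == m0 and a[1] == m1:
            res.append(arr[start:i + 1])
            start = i + 1
    if start < len(arr):
        res.append(arr[start:])  # rows left over after the last joint maximum
    return res

arr = np.array([[ 1., 5, 12],
                [ 1., 9,  5],
                [15., 5,  5],
                [25., 7,  4],
                [25., 9,  4],
                [1.5, 4, 10],
                [ 1., 8,  7],
                [20., 5,  6],
                [25., 8,  3],
                [25., 9,  3]])

parts = split_at_max_np(arr)  # two chunks of five rows each for this data
```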


4 Comments

Link_tester Over a year ago

Dear @N.C, I appreciate your clever solution. Sorry that I made a mistake and showed my data as a list. In fact, they are numpy arrays. Do you know any way to adapt your solution to an array rather than a list? Thanks in advance.

N.C Over a year ago

In order to fix the function to handle numpy arrays you can use the condition: (a[0] == m0) and (a[1] == m1) in the if statement. What is the desired type of the output?

Link_tester Over a year ago

I prefer to have my results as numpy arrays. Thanks for replying and for your time.

Mad Physicist Over a year ago

@Ali_d. I've added a pure numpy solution

Mad Physicist · Accepted Answer · 2020-09-25 13:01:03Z

Assuming arr is a 2-D numpy array (and numpy is imported as np), you can create a boolean mask of all locations where each of the first two columns is equal to its maximum:

max_val = arr[:, :2].max(axis=0)
mask = arr[:, :2] == max_val

Then make a row mask of all places where all the columns match:

row_mask = mask.all(axis=1)
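As a quick check, here is how these masks look on the sample data from the question. This is just an illustrative run, assuming the data is held in a float array named arr.

```python
import numpy as np

arr = np.array([[ 1., 5, 12],
                [ 1., 9,  5],
                [15., 5,  5],
                [25., 7,  4],
                [25., 9,  4],
                [1.5, 4, 10],
                [ 1., 8,  7],
                [20., 5,  6],
                [25., 8,  3],
                [25., 9,  3]])

max_val = arr[:, :2].max(axis=0)   # per-column maxima: 25 and 9
mask = arr[:, :2] == max_val       # max_val is broadcast across every row
row_mask = mask.all(axis=1)        # True only where both columns match

# Column 0 is at its maximum in rows 3, 4, 8, 9 and column 1 in rows 1, 4, 9,
# so only rows 4 and 9 survive the all(axis=1) reduction.
matching_rows = np.flatnonzero(row_mask)
```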

You want the index of the row after each match, so you can do one of the following:

shifted_row_mask = np.r_[False, row_mask[:-1]]
index = np.flatnonzero(shifted_row_mask)

Or

index = np.flatnonzero(row_mask[:-1]) + 1

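In both cases, the last element of row_mask is dropped so that a match in the final row does not produce a split index past the end of the array (which would add an empty trailing chunk), and the positions are shifted by one so that each split falls after the matching row.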
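To see that the two variants agree, here is a small sketch using the row mask that the sample data produces (matches in rows 4 and 9). The names index_a and index_b are only for illustration.

```python
import numpy as np

# Row mask for the sample data: both columns are at their maximum in rows 4 and 9
row_mask = np.array([False, False, False, False, True,
                     False, False, False, False, True])

# Variant 1: shift the mask right by one position, then locate the True entries
shifted_row_mask = np.r_[False, row_mask[:-1]]
index_a = np.flatnonzero(shifted_row_mask)

# Variant 2: locate the True entries (ignoring the last row), then add one
index_b = np.flatnonzero(row_mask[:-1]) + 1

# Both give a single split position, 5; the match in the last row is ignored
```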

Now you can just call np.split:

result = np.split(arr, index, axis=0)

This can all be written as a nice, totally illegible, one-liner:

result = np.split(arr, np.flatnonzero((arr[:, :2] == arr[:, :2].max(axis=0)).all(axis=1)[:-1]) + 1, axis=0)
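If readability matters more than brevity, the same steps could be wrapped in a small helper. This is a sketch only; the function name split_after_joint_max and the ncols parameter are hypothetical and not part of numpy.

```python
import numpy as np

def split_after_joint_max(arr, ncols=2):
    """Split arr after every row where the first ncols columns are all at their column maximum."""
    arr = np.asarray(arr)
    sub = arr[:, :ncols]
    # True for rows in which every selected column equals its own maximum
    row_mask = (sub == sub.max(axis=0)).all(axis=1)
    # Ignore a match in the last row, then shift so splits fall after the match
    index = np.flatnonzero(row_mask[:-1]) + 1
    return np.split(arr, index, axis=0)

arr = np.array([[ 1., 5, 12],
                [ 1., 9,  5],
                [15., 5,  5],
                [25., 7,  4],
                [25., 9,  4],
                [1.5, 4, 10],
                [ 1., 8,  7],
                [20., 5,  6],
                [25., 8,  3],
                [25., 9,  3]])

result = split_after_joint_max(arr)  # a list of two (5, 3) arrays for this data
```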

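If you want the output in the exact format you showed, restrict the number of indices to 1 by slicing (index[:1] rather than index[0], since np.split interprets a plain integer as a number of equal sections), and unpack the result of np.split:

arr_1, arr_2 = np.split(arr, index[:1], axis=0)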
