Slice a numpy 2d array using another 2d array

Question

I have a 2D array of shape (4,5) and another 2D array of shape (4,2). Each row of the second array holds the start and end column indices (both inclusive) to keep from the corresponding row of the first array, i.e. I want to slice the first array row by row using the second array.

np.random.seed(0)
a = np.random.randint(0,999,(4,5))
a
array([[684, 559, 629, 192, 835],
       [763, 707, 359,   9, 723],
       [277, 754, 804, 599,  70],
       [472, 600, 396, 314, 705]])

idx = np.array([[2,4],
                [0,3],
                [2,3],
                [1,3]
               ])

Expected output: either of the following two formats is fine. The only reason for padding with zeros is that variable-length 2D arrays are not supported.

[[629, 192, 835, 0, 0],
 [763, 707, 359, 9, 0],
 [804, 599, 0, 0, 0],
 [600, 396, 314, 0, 0]
]

[[0, 0, 629, 192, 835],
 [763, 707, 359, 9, 0],
 [0, 0, 804, 599, 0],
 [0, 600, 396, 314, 0]
]

PaulS · Accepted Answer · 2024-12-12 10:46:30Z

Another possible solution, which works in these steps:

np.arange creates the column indices 0 to a.shape[1] - 1.
A boolean mask m marks, for each row, the columns that fall within the inclusive range given by idx. np.newaxis adds the axis needed for broadcasting.
np.where builds a_mask, in which the elements of a are replaced with 0 wherever m is False.
np.argsort returns, for each row, the indices that sort the negated mask ~m in ascending order, which moves the selected columns to the front.
np.take_along_axis rearranges the elements of a_mask according to those indices.

cols = np.arange(a.shape[1])
m = (cols >= idx[:, 0, np.newaxis]) & (cols <= idx[:, 1, np.newaxis])

a_mask = np.where(m, a, 0)
sort_idx = np.argsort(~m, axis=1, kind="stable")  # stable sort keeps the selected values in their original order
np.take_along_axis(a_mask, sort_idx, axis=1)

NB: Notice that a_mask contains the unsorted version of the solution (that is essentially the approach followed by @mozway).

Output:

array([[629, 192, 835,   0,   0],
       [763, 707, 359,   9,   0],
       [804, 599,   0,   0,   0],
       [600, 396, 314,   0,   0]])

# a_mask
array([[  0,   0, 629, 192, 835],
       [763, 707, 359,   9,   0],
       [  0,   0, 804, 599,   0],
       [  0, 600, 396, 314,   0]])
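
For reuse, the steps above can be wrapped in a function. This is only a sketch: the name slice_rows_left_aligned is made up for illustration, and the array values are copied from the question rather than regenerated from the seed. It passes kind="stable" to np.argsort because the default sort kind is not guaranteed to be stable, and a stable sort is what keeps the selected values in their original order.

```python
import numpy as np

# Values copied from the question's example array.
a = np.array([[684, 559, 629, 192, 835],
              [763, 707, 359,   9, 723],
              [277, 754, 804, 599,  70],
              [472, 600, 396, 314, 705]])
idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])


def slice_rows_left_aligned(arr, bounds):
    """Hypothetical helper: keep arr[i, start:end+1] per row, left-aligned and zero-padded."""
    cols = np.arange(arr.shape[1])
    # True where the column index lies inside the inclusive [start, end] range.
    mask = (cols >= bounds[:, [0]]) & (cols <= bounds[:, [1]])
    masked = np.where(mask, arr, 0)
    # A stable sort of ~mask puts the selected columns first, in original order.
    order = np.argsort(~mask, axis=1, kind="stable")
    return np.take_along_axis(masked, order, axis=1)


left_aligned = slice_rows_left_aligned(a, idx)
print(left_aligned)
```

With the question's inputs this should reproduce the left-aligned output shown above.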

mozway · Accepted Answer · 2024-12-12 09:37:38Z

Use broadcasting and numpy.arange to compute a boolean mask, then apply numpy.where to keep the values of a where the mask is True and use 0 elsewhere:

i = np.arange(a.shape[1])[None]

out = np.where((i >= idx[:, [0]]) & (i <= idx[:, [1]]), a, 0)

Output:

array([[  0,   0, 629, 192, 835],
       [763, 707, 359,   9,   0],
       [  0,   0, 804, 599,   0],
       [  0, 600, 396, 314,   0]])

Intermediates:

# i
array([[0, 1, 2, 3, 4]])

# (i >= idx[:, [0]]) & (i <= idx[:, [1]])
array([[False, False,  True,  True,  True],
       [ True,  True,  True,  True, False],
       [False, False,  True,  True, False],
       [False,  True,  True,  True, False]])
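
To see why the comparison yields a full (4, 5) mask, it can help to look at the shapes involved. The sketch below uses the question's idx and assumes 5 columns, as in a. The names start, end and n_cols are introduced here only for illustration.

```python
import numpy as np

idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])
n_cols = 5  # number of columns in the question's array `a`

i = np.arange(n_cols)[None]   # shape (1, 5): one row of column indices
start = idx[:, [0]]           # shape (4, 1): one start index per row
end = idx[:, [1]]             # shape (4, 1): one end index per row

# Comparing (1, 5) against (4, 1) broadcasts both sides to (4, 5).
mask = (i >= start) & (i <= end)

print(i.shape, start.shape, end.shape, mask.shape)
```

Indexing with a list, as in idx[:, [0]], keeps the column axis, so it gives the same shape as idx[:, 0, np.newaxis].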

If you want the first output, with the values aligned to the left, the code is a bit longer. You can compute the target indices and use them to fill an array of zeros created with zeros_like:

# like above
i = np.arange(a.shape[1])[None]  # array([[0, 1, 2, 3, 4]])
m = (i >= idx[:, [0]]) & (i <= idx[:, [1]])

# prepare output with 0s
out = np.zeros_like(a)

# get row indices
r, _ = np.where(m)  # r: array([0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 3, 3])

# compute column indices
_, cnt = np.unique(r, return_counts=True) # cnt: array([3, 4, 2, 3])
c = np.arange(cnt.sum()) - np.repeat(np.r_[0, cnt[:-1]].cumsum(), cnt)
# c: array([0, 1, 2, 0, 1, 2, 3, 0, 1, 0, 1, 2])

# fill with valid values
out[r, c] = a[m]

Output:

array([[629, 192, 835,   0,   0],
       [763, 707, 359,   9,   0],
       [804, 599,   0,   0,   0],
       [600, 396, 314,   0,   0]])
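
The column-index computation is the least obvious step, so here it is broken into pieces, using the cnt values from the example. The intermediate names offsets and per_value_offset are introduced here only for illustration.

```python
import numpy as np

cnt = np.array([3, 4, 2, 3])  # number of selected values per row, from the example

# Flat position at which each row's run of selected values begins.
offsets = np.r_[0, cnt[:-1]].cumsum()        # [0, 3, 7, 9]
# Repeat each offset once per selected value in its row.
per_value_offset = np.repeat(offsets, cnt)
# Subtracting from a running counter restarts the count at 0 for every row.
c = np.arange(cnt.sum()) - per_value_offset

print(offsets)
print(c)
```

Each row's values therefore land in columns 0, 1, 2, ... of the output, which is what left-aligns them.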

Suramuthu R · Accepted Answer · 2024-12-12 09:14:19Z

Explanation given as comments inside the code:

import numpy as np

# Input arrays
np.random.seed(0)
a = np.random.randint(0, 999, (4, 5))
idx = np.array([[2, 4],
                [0, 3],
                [2, 3],
                [1, 3]])

# Prepare the output array with zeros
output_padded_start = np.zeros_like(a)
output_padded_end = np.zeros_like(a)

# Slice and place the values
for i in range(a.shape[0]):
    start, end = idx[i]
    sliced_values = a[i, start:end + 1]
    output_padded_start[i, :len(sliced_values)] = sliced_values  # Values moved to the left, zeros at the end
    output_padded_end[i, start:end + 1] = sliced_values  # Values kept at their original columns, zeros elsewhere


print("Padded at the end:")
print(output_padded_start)
print("\nPadded at the start:")
print(output_padded_end)


# Output
Padded at the end:
[[629 192 835   0   0]
 [763 707 359   9   0]
 [804 599   0   0   0]
 [600 396 314   0   0]]

Padded at the start:
[[  0   0 629 192 835]
 [763 707 359   9   0]
 [  0   0 804 599   0]
 [  0 600 396 314   0]]

ThomasIsCoding · Accepted Answer · 2024-12-12 10:14:44Z

You could also try

v = np.arange(a.shape[1])
a*list(map(lambda x: np.isin(v, range(x[0],x[1]+1)), idx))

which gives

[[  0   0 629 192 835]
 [763 707 359   9   0]
 [  0   0 804 599   0]
 [  0 600 396 314   0]]
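
One caveat when zeroing by multiplication, shown on a made-up float array rather than the question's integer data: multiplying by a boolean mask does not clear NaN entries, whereas np.where replaces the masked entries outright. For integer arrays such as a, both approaches give the same result.

```python
import numpy as np

# Hypothetical float data, chosen only to show the difference.
values = np.array([[np.nan, 1.5, 2.5]])
mask = np.array([[False, True, True]])

by_product = values * mask             # nan * False is still nan
by_where = np.where(mask, values, 0)   # masked entries are replaced with 0

print(by_product)
print(by_where)
```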
