I have a 2D array of shape (4, 5) and another 2D array of shape (4, 2). Each row of the second array holds the start and end indices (both inclusive) of the part I want to keep from the corresponding row of the first array, i.e., I want to slice the first array using the second.

import numpy as np
np.random.seed(0)
a = np.random.randint(0,999,(4,5))
a
array([[684, 559, 629, 192, 835],
       [763, 707, 359,   9, 723],
       [277, 754, 804, 599,  70],
       [472, 600, 396, 314, 705]])
idx = np.array([[2,4],
                [0,3],
                [2,3],
                [1,3]
               ])

Expected output: either of the following two formats is fine. The only reason for padding with zeros is that variable-length 2D arrays are not supported.

[[629, 192, 835, 0, 0],
 [763, 707, 359, 9, 0],
 [804, 599, 0, 0, 0],
 [600, 396, 314, 0, 0]
]

[[0, 0, 629, 192, 835],
 [763, 707, 359, 9, 0],
 [0, 0, 804, 599, 0],
 [0, 600, 396, 314, 0]
]

4 Answers

4

Another possible solution, which works as follows:

  • np.arange creates the range of column indices of a.

  • A boolean mask m marks, for each row, the columns that fall within the inclusive range given by idx. np.newaxis turns the start and end columns of idx into column vectors so that they broadcast against the column indices.

  • np.where builds a_mask, a copy of a in which every element whose entry in m is False is replaced with 0.

  • np.argsort on the negated mask returns, for each row, a column order that puts the selected positions (False after negation) before the masked ones.

  • np.take_along_axis rearranges each row of a_mask according to those indices.

cols = np.arange(a.shape[1])
m = (cols >= idx[:, 0, np.newaxis]) & (cols <= idx[:, 1, np.newaxis])

a_mask = np.where(m, a, 0)
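# kind='stable' keeps the selected values in their original order for rows of any length
sort_idx = np.argsort(~m, axis=1, kind='stable')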
np.take_along_axis(a_mask, sort_idx, axis=1)

NB: a_mask already holds the second format from the question, with the values left at their original columns (this is essentially the approach followed by @mozway).


Output:

array([[629, 192, 835,   0,   0],
       [763, 707, 359,   9,   0],
       [804, 599,   0,   0,   0],
       [600, 396, 314,   0,   0]])

# a_mask
array([[  0,   0, 629, 192, 835],
       [763, 707, 359,   9,   0],
       [  0,   0, 804, 599,   0],
       [  0, 600, 396, 314,   0]])
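The steps above can be wrapped into a small helper for reuse. This is only a sketch: the function name left_pack and its fill parameter are made up for illustration, the array a is written out literally so the snippet runs on its own, and kind="stable" is passed to np.argsort on the assumption that the kept values must stay in their original order for rows of any length.

```python
import numpy as np

def left_pack(a, idx, fill=0):
    """Keep a[i, idx[i, 0]:idx[i, 1] + 1] per row, left-aligned and padded with `fill`."""
    cols = np.arange(a.shape[1])
    # m[i, j] is True when column j lies inside the inclusive range of row i
    m = (cols >= idx[:, 0, np.newaxis]) & (cols <= idx[:, 1, np.newaxis])
    a_mask = np.where(m, a, fill)
    # ~m is False (0) for kept columns, so they sort first;
    # a stable sort keeps them in their original order
    order = np.argsort(~m, axis=1, kind="stable")
    return np.take_along_axis(a_mask, order, axis=1)

a = np.array([[684, 559, 629, 192, 835],
              [763, 707, 359,   9, 723],
              [277, 754, 804, 599,  70],
              [472, 600, 396, 314, 705]])
idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])

packed = left_pack(a, idx)
print(packed)
```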

3

Use broadcasting and numpy.arange to compute a mask, then apply numpy.where to keep the values of a where the mask is True and use 0 everywhere else:

i = np.arange(a.shape[1])[None]

out = np.where((i >= idx[:, [0]]) & (i <= idx[:, [1]]), a, 0)

Output:

array([[  0,   0, 629, 192, 835],
       [763, 707, 359,   9,   0],
       [  0,   0, 804, 599,   0],
       [  0, 600, 396, 314,   0]])

Intermediates:

# i
array([[0, 1, 2, 3, 4]])

# (i >= idx[:, [0]]) & (i <= idx[:, [1]])
array([[False, False,  True,  True,  True],
       [ True,  True,  True,  True, False],
       [False, False,  True,  True, False],
       [False,  True,  True,  True, False]])
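To see why the comparison yields one mask row per row of a, it can help to look at the shapes involved. A minimal sketch using the idx from the question; the names start, stop and mask are only illustrative, and the column count 5 is hard-coded to match a:

```python
import numpy as np

idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])

i = np.arange(5)[None]   # shape (1, 5): one row of column numbers
start = idx[:, [0]]      # shape (4, 1): indexing with a list keeps the axis
stop = idx[:, [1]]       # shape (4, 1)

# (1, 5) compared against (4, 1) broadcasts to (4, 5)
mask = (i >= start) & (i <= stop)
print(i.shape, start.shape, mask.shape)
```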

If you want the first output, with the values on the left, the code is a bit longer. You can compute the target indices and use them to fill an array created with zeros_like:

# like above
i = np.arange(a.shape[1])[None] # array([[0, 1, 2, 3, 4]])
m = (i >= idx[:, [0]]) & (i <= idx[:, [1]])

# prepare output with 0s
out = np.zeros_like(a)

# get row indices
r, _ = np.where(m) # r: array([0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 3, 3])

# compute column indices
_, cnt = np.unique(r, return_counts=True) # cnt: array([3, 4, 2, 3])
c = np.arange(cnt.sum()) - np.repeat(np.r_[0, cnt[:-1]].cumsum(), cnt)
# c: array([0, 1, 2, 0, 1, 2, 3, 0, 1, 0, 1, 2])

# fill with valid values
out[r, c] = a[m]

Output:

array([[629, 192, 835,   0,   0],
       [763, 707, 359,   9,   0],
       [804, 599,   0,   0,   0],
       [600, 396, 314,   0,   0]])
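As a possible shortcut for the column indices, a running count of the True values in each row of the mask gives the target column directly. The sketch below is a variation, not the code from the answer above; it assumes the same a and idx as in the question (a is written out literally so the snippet runs on its own).

```python
import numpy as np

a = np.array([[684, 559, 629, 192, 835],
              [763, 707, 359,   9, 723],
              [277, 754, 804, 599,  70],
              [472, 600, 396, 314, 705]])
idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])

i = np.arange(a.shape[1])[None]
m = (i >= idx[:, [0]]) & (i <= idx[:, [1]])

# cumsum counts the True values seen so far in each row;
# subtracting 1 turns that count into a 0-based target column
c = m.cumsum(axis=1) - 1

out = np.zeros_like(a)
r = np.nonzero(m)[0]   # row index of every kept value
out[r, c[m]] = a[m]    # c[m] and a[m] are both read in row-major order
print(out)
```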

0

Explanation given as comments inside the code:

import numpy as np

# Input arrays
np.random.seed(0)
a = np.random.randint(0, 999, (4, 5))
idx = np.array([[2, 4],
                [0, 3],
                [2, 3],
                [1, 3]])

# Prepare the output array with zeros
output_padded_start = np.zeros_like(a)
output_padded_end = np.zeros_like(a)

# Slice and place the values
for i in range(a.shape[0]):
    start, end = idx[i]
    sliced_values = a[i, start:end + 1]
    output_padded_start[i, :len(sliced_values)] = sliced_values  # Values moved to the left, zeros at the end
    output_padded_end[i, start:end + 1] = sliced_values  # Values kept at their original columns, zeros elsewhere


print("Padded at the end:")
print(output_padded_start)
print("\nPadded at the start:")
print(output_padded_end)


# Output
Padded at the end:
[[629 192 835   0   0]
 [763 707 359   9   0]
 [804 599   0   0   0]
 [600 396 314   0   0]]

Padded at the start:
[[  0   0 629 192 835]
 [763 707 359   9   0]
 [  0   0 804 599   0]
 [  0 600 396 314   0]]

0

You could also try building the mask with np.isin and multiplying it with a:

v = np.arange(a.shape[1])
a*list(map(lambda x: np.isin(v, range(x[0],x[1]+1)), idx))

which gives

[[  0   0 629 192 835]
 [763 707 359   9   0]
 [  0   0 804 599   0]
 [  0 600 396 314   0]]
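The multiplication works because booleans behave as 0 and 1. One caveat, shown in the sketch below with hypothetical float data that is not part of the question: a NaN outside the wanted range survives the multiplication, whereas np.where replaces it outright.

```python
import numpy as np

# Hypothetical float data with a NaN outside the wanted range
b = np.array([[np.nan, 1.0, 2.0]])
mask = np.array([[False, True, True]])

by_product = b * mask            # NaN * 0 is still NaN
by_where = np.where(mask, b, 0)  # masked positions are replaced with 0

print(by_product)
print(by_where)
```

For integer arrays such as the a in the question, both forms give the same result.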
