4

I have a fairly large m × n NumPy matrix M filled with non-zero values and an array x of length m, where each entry gives the column index from which the elements of the corresponding row should be set to zero. For example, if n=5 and x[i]=3, then the i-th row of the matrix should be set to [M_i1, M_i2, M_i3, 0, 0].

If all entries of x had the same value k, I could simply use slicing with something like M[:,k:]=0, but I could not figure out an efficient way to do this with a different value for each row without looping over all rows and slicing each one.

I thought about creating a matrix that looks like [[1]*x[1] + [0]*(n-x[1]),...,[1]*x[m] + [0]*(n-x[m])] and using it for boolean indexing, but I also don't know how to create this without looping.

The non-vectorized solution looks like this:

for i in range(m):
    if x[i] < n:
        M[i,x[i]:] = 0

with example input

import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6]])
m, n = 2, 3
x = np.array([1, 2])

and output

array([[1, 0, 0],
       [4, 5, 0]])
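
For reference, the loop and the example input can be put together into one runnable snippet. This is only a sketch of the baseline that the answers try to vectorize; the function name zero_tail_loop is made up for illustration and is not part of NumPy.

```python
import numpy as np

def zero_tail_loop(M, x):
    """Zero out M[i, x[i]:] for every row i, in place (hypothetical helper)."""
    n = M.shape[1]
    for i in range(M.shape[0]):
        if x[i] < n:  # x[i] == n means the whole row is kept
            M[i, x[i]:] = 0
    return M

M = np.array([[1, 2, 3], [4, 5, 6]])
x = np.array([1, 2])
result = zero_tail_loop(M, x)
print(result)
```

Note that the loop modifies M in place, so pass a copy if the original matrix is still needed.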

Does anyone have a vectorized solution for this problem?

Thank you very much!

8
  • Hello, can you post the non-vectorized solution? Commented Apr 13, 2021 at 17:09
  • @ce.teuf added it Commented Apr 13, 2021 at 17:27
  • 1
    Please post sample input and output Commented Apr 13, 2021 at 17:34
  • So some x may be too large? And nothing is supposed to happen for those rows? Commented Apr 13, 2021 at 17:44
  • @hpaulj the maximal value of x is n, so it can only happen that it's too large by a value of 1. Commented Apr 13, 2021 at 17:47

2 Answers

1

You can use multi-dimensional boolean indexing:

M[x[:, None] <= np.arange(M.shape[1])] = 0

example:

M = np.array([[7, 8, 4, 2, 3, 9, 1, 8, 4, 3],
              [2, 1, 6, 1, 5, 2, 2, 2, 9, 2],
              [6, 1, 6, 8, 4, 3, 6, 9, 2, 6],
              [5, 4, 0, 8, 3, 0, 0, 1, 8, 7],
              [8, 7, 8, 8, 9, 2, 0, 8, 0, 2]])
x = np.array([4, 4, 0, 6, 2])

output:

array([[7, 8, 4, 2, 0, 0, 0, 0, 0, 0],
       [2, 1, 6, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [5, 4, 0, 8, 3, 0, 0, 0, 0, 0],
       [8, 7, 0, 0, 0, 0, 0, 0, 0, 0]])
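
To make the broadcasting step explicit, here is a minimal self-contained sketch of the same idea. The helper name zero_tail is hypothetical, and unlike the one-liner it returns a new array via np.where instead of assigning in place. The comparison of x with shape (m, 1) against the column indices with shape (n,) broadcasts to an (m, n) boolean mask that is True wherever the column index is at least x[i].

```python
import numpy as np

def zero_tail(M, x):
    """Return a copy of M with M[i, x[i]:] set to 0 (hypothetical helper)."""
    M = np.asarray(M)
    x = np.asarray(x)
    cols = np.arange(M.shape[1])   # column indices, shape (n,)
    mask = x[:, None] <= cols      # broadcasts to shape (m, n)
    return np.where(mask, 0, M)    # 0 where masked, original value elsewhere

M = [[7, 8, 4, 2, 3, 9, 1, 8, 4, 3],
     [2, 1, 6, 1, 5, 2, 2, 2, 9, 2],
     [6, 1, 6, 8, 4, 3, 6, 9, 2, 6],
     [5, 4, 0, 8, 3, 0, 0, 1, 8, 7],
     [8, 7, 8, 8, 9, 2, 0, 8, 0, 2]]
x = [4, 4, 0, 6, 2]
result = zero_tail(M, x)
print(result)
```

Because of the np.asarray calls, this version also accepts plain Python lists, which the indexing expression on its own does not.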

17 Comments

Set one of the elements of x to 10 or more. Does it still work?
I've tried this approach. It is indeed vectorized but way slower than the "non-vectorized" method using a for loop. The problem is that it implicitly generates an m by n boolean matrix, which is very inefficient.
@ShihaoXu. That's not very inefficient if you do it right. What did the timing look like? I'm getting that this is 1.5x faster even for a trivial array, and only gets better from there
@MadPhysicist I assumed the indexes are in bounds, but even if one is out of bounds, it should work. Should it not?
@ShihaoXu I would be surprised if it is slower than the loop, especially for large enough arrays. What method do you use to time it?
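
One way to check the claims in these comments on your own machine is a small harness like the one below. It is only a sketch: the array sizes, the random data, and the helper names zero_tail_loop and zero_tail_mask are assumptions made for illustration, and the timings will depend on m, n, and the hardware. It also sets one entry of x beyond n to see whether the two approaches still agree in that case.

```python
import timeit
import numpy as np

def zero_tail_loop(M, x):
    """Loop version from the question, applied to a copy of M."""
    M = M.copy()
    n = M.shape[1]
    for i in range(M.shape[0]):
        if x[i] < n:
            M[i, x[i]:] = 0
    return M

def zero_tail_mask(M, x):
    """Broadcast-mask version from the answer, applied to a copy of M."""
    M = M.copy()
    M[x[:, None] <= np.arange(M.shape[1])] = 0
    return M

rng = np.random.default_rng(0)
m, n = 1000, 200                        # hypothetical sizes, adjust as needed
M = rng.integers(1, 10, size=(m, n))    # non-zero entries
x = rng.integers(0, n + 1, size=m)      # values in [0, n]
x[0] = n + 5                            # one deliberately out-of-range entry

same = np.array_equal(zero_tail_loop(M, x), zero_tail_mask(M, x))
t_loop = timeit.timeit(lambda: zero_tail_loop(M, x), number=20)
t_mask = timeit.timeit(lambda: zero_tail_mask(M, x), number=20)
print(f"results agree: {same}")
print(f"loop: {t_loop:.4f} s, mask: {t_mask:.4f} s (20 runs each)")
```

Both helpers copy M on every call so that repeated timing runs start from the same data; the copy cost is included in both measurements.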
0

This looks like a mask-smearing exercise. In each row where x[row] < n, you mark the element at column x[row] and smear the mark to the right; rows with x[row] == n get no mark and are left unchanged:

mask = np.zeros(M.shape, bool)
mask[np.flatnonzero(x < n), x[x < n]] = True
M[np.cumsum(mask, axis=1, dtype=bool)] = 0
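
As a worked example of the smearing steps, the sketch below uses a small made-up 3 × 3 matrix whose last row has x equal to n. Note one swap: it uses np.logical_or.accumulate for the smearing step instead of np.cumsum(..., dtype=bool), since a running logical OR states the intent directly. The variable names in_range and smeared are mine.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
x = np.array([1, 2, 3])   # last row: x == n, so nothing should be zeroed
n = M.shape[1]

in_range = x < n                                    # rows that need zeroing
mask = np.zeros(M.shape, dtype=bool)
mask[np.flatnonzero(in_range), x[in_range]] = True  # one seed per such row

# Smear each seed to the right: once True, stay True for the rest of the row.
smeared = np.logical_or.accumulate(mask, axis=1)

M[smeared] = 0
print(M)
```

Rows filtered out by in_range never receive a seed, so their mask row stays all False and the row is untouched.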
