2

Is there any efficient numpy way to do the following: Assume I have some matix M of size R X C. Now assume I have another matrix E which is of shape R X a (where a is just some constant a < C), which contains row indices of M (and -1 for padding, i.e., every element of E is in {-1, 0, .., R-1}). For example,

M=array([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]])

E = array([[ 0,  1],
           [ 2, -1],
           [-1,  0]])

Now, given those matrices, I want to generate a third matrix P, where the i'th row of P will contain the sum of the following rows of M : E[i,:]. In the example, P will be,

P[0,:] = M[0,:] + M[1,:]
P[1,:] = M[2,:]
P[2,:] = M[0,:]

Yes, doing it with a loop is pretty straight forward and easy, I was wondering if there is any fancy numpy way to make it more efficient (assuming that I want to do it with large matrices, e.g., 200 X 200.

Thanks!

1
  • Could you have 2 or more -1 per row in E? Commented Jul 10, 2020 at 7:23

3 Answers 3

2

One way would be to sum with indexed on original array and then subtract out the summations caused by the last indexed ones by -1s -

out = M[E].sum(1) - M[-1]*(E==-1).sum(1)[:,None]

Another way would be pad zeros at the end of M, so that those -1 would index into those zeros and hence have no effect on the final sum after indexing -

M1 = np.vstack((M, np.zeros((1,M.shape[1]), dtype=M.dtype)))
out = M1[E].sum(1)

If there is exactly one or lesser -1 per row in E, we can optimize further -

out = M[E].sum(1)
m = (E==-1).any(1)
out[m] -= M[-1]

Another based on tensor-multiplication -

np.einsum('ij,kli->kj',M, (E[...,None]==np.arange(M.shape[1])))
Sign up to request clarification or add additional context in comments.

Comments

1

You could index M with E, and np.sum where the actual indices in E are greater or equal to 0. For that we have the where parameter:

np.sum(M[E], where=(E>=0)[...,None], axis=1)

array([[5, 7, 9],
       [7, 8, 9],
       [1, 2, 3]])

Where we have that:

M[E]
array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [7, 8, 9]],

       [[7, 8, 9],
        [1, 2, 3]]])

Is added on the rows:

(E>=0)[...,None]
array([[[ True],
        [ True]],

       [[ True],
        [False]],

       [[False],
        [ True]]])

Comments

1

Probably not the fastest but maybe educational: The operation you are describing can be thought of as matrix multiplication with a certain adjacency matrix:

from scipy import sparse

# construct adjacency matrix
indices = E[E!=-1]
indptr = np.concatenate([[0],np.count_nonzero(E!=-1,axis=1).cumsum()])
data = np.ones_like(indptr)
aux = sparse.csr_matrix((data,indices,indptr))

# multiply
aux*M
# array([[5, 7, 9],
#        [7, 8, 9],
#        [1, 2, 3]], dtype=int64)

2 Comments

Inspired by this matrix multiplication, got me thinking and ended up with a tensor multiplication one.
@DIvakar you one-upped me there regarding the sparsity of the auxiliary matrix ;-)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.