selective row sum matrix in numpy

Question

Is there any efficient numpy way to do the following: Assume I have some matix M of size R X C. Now assume I have another matrix E which is of shape R X a (where a is just some constant a < C), which contains row indices of M (and -1 for padding, i.e., every element of E is in {-1, 0, .., R-1}). For example,

M=array([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]])

E = array([[ 0,  1],
           [ 2, -1],
           [-1,  0]])

Now, given those matrices, I want to generate a third matrix P, where the i'th row of P will contain the sum of the following rows of M : E[i,:]. In the example, P will be,

P[0,:] = M[0,:] + M[1,:]
P[1,:] = M[2,:]
P[2,:] = M[0,:]

Yes, doing it with a loop is pretty straight forward and easy, I was wondering if there is any fancy numpy way to make it more efficient (assuming that I want to do it with large matrices, e.g., 200 X 200.

Thanks!

Could you have 2 or more -1 per row in E?

Divakar
– Divakar

2020-07-10 07:23:25 +00:00
Commented Jul 10, 2020 at 7:23 — Divakar
– Divakar, Commented Jul 10, 2020 at 7:23

Divakar · Accepted Answer · 2020-07-10 08:16:25Z

2

One way would be to sum with indexed on original array and then subtract out the summations caused by the last indexed ones by -1s -

out = M[E].sum(1) - M[-1]*(E==-1).sum(1)[:,None]

Another way would be pad zeros at the end of M, so that those -1 would index into those zeros and hence have no effect on the final sum after indexing -

M1 = np.vstack((M, np.zeros((1,M.shape[1]), dtype=M.dtype)))
out = M1[E].sum(1)

If there is exactly one or lesser -1 per row in E, we can optimize further -

out = M[E].sum(1)
m = (E==-1).any(1)
out[m] -= M[-1]

Another based on tensor-multiplication -

np.einsum('ij,kli->kj',M, (E[...,None]==np.arange(M.shape[1])))

edited Jul 10, 2020 at 8:16

answered Jul 10, 2020 at 7:09

Divakar

222k19 gold badges273 silver badges374 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

yatu · Accepted Answer · 2020-07-10 07:19:40Z

1

You could index M with E, and np.sum where the actual indices in E are greater or equal to 0. For that we have the where parameter:

np.sum(M[E], where=(E>=0)[...,None], axis=1)

array([[5, 7, 9],
       [7, 8, 9],
       [1, 2, 3]])

Where we have that:

M[E]
array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [7, 8, 9]],

       [[7, 8, 9],
        [1, 2, 3]]])

Is added on the rows:

(E>=0)[...,None]
array([[[ True],
        [ True]],

       [[ True],
        [False]],

       [[False],
        [ True]]])

edited Jul 10, 2020 at 7:19

answered Jul 10, 2020 at 7:08

yatu

88.7k12 gold badges93 silver badges148 bronze badges

Comments

Paul Panzer · Accepted Answer · 2020-07-10 07:28:29Z

1

Probably not the fastest but maybe educational: The operation you are describing can be thought of as matrix multiplication with a certain adjacency matrix:

from scipy import sparse

# construct adjacency matrix
indices = E[E!=-1]
indptr = np.concatenate([[0],np.count_nonzero(E!=-1,axis=1).cumsum()])
data = np.ones_like(indptr)
aux = sparse.csr_matrix((data,indices,indptr))

# multiply
aux*M
# array([[5, 7, 9],
#        [7, 8, 9],
#        [1, 2, 3]], dtype=int64)

answered Jul 10, 2020 at 7:28

Paul Panzer

53.3k3 gold badges60 silver badges103 bronze badges

2 Comments

Divakar Over a year ago

Inspired by this matrix multiplication, got me thinking and ended up with a tensor multiplication one.

Paul Panzer Over a year ago

@DIvakar you one-upped me there regarding the sparsity of the auxiliary matrix ;-)

Collectives™ on Stack Overflow

selective row sum matrix in numpy

3 Answers 3

Comments

Comments

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Comments

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related