4

This task is pretty trivial in NumPy like so

import numpy as np

a= np.array([[1,2,3,0,9],[3,2,6,2,7],[0,0,0,8,0],[1,0,0,0,3]])
a + a[1]

Output:

array([[ 4,  4,  9,  2, 16],
       [ 6,  4, 12,  4, 14],
       [ 3,  2,  6, 10,  7],
       [ 4,  2,  6,  2, 10]])

See how the vector dimensions are automatically broadcasted to each row of the matrix.

But when it comes to sparse matrices, there is a dimension mismatch error.

from scipy.sparse import *

a= csr_matrix([[1,2,3,0,9],[3,2,6,2,7],[0,0,0,8,0],[1,0,0,0,3]])
a + a[1]

Output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-32-74c48fe5106e> in <module>()
      2 
      3 a= csr_matrix([[1,2,3,0,9],[3,2,6,2,7],[0,0,0,8,0],[1,0,0,0,3]])
----> 4 a + a[1]

/opt/anaconda2/lib/python2.7/site-packages/scipy/sparse/compressed.pyc in __add__(self, other)
    337         elif isspmatrix(other):
    338             if (other.shape != self.shape):
--> 339                 raise ValueError("inconsistent shapes")
    340 
    341             return self._binopt(other,'_plus_')

ValueError: inconsistent shapes

There is a function for sparse multiplication, e.g., a.multiply(a[1]) for a * a[1] (which does its job perfectly), but I couldn't find one for addition.

I'm new to sparse matrices. Please help.

2 Answers 2

6

numpy style broadcasting has not been implemented for sparse matrices.

Multiplication, especially matrix multiplication, is well developed. In fact actions like row sum and selection of rows are implemented as matrix multiplications - e.g. M * <column vector of 1s>. Multiplication often results in a matrix that's as sparse if not more so.

Addition/subtraction is not well developed. It often results in a denser matrix. The extreme case is adding a scalar to all elements. Even in your example, the result is dense. Both a and a[1,:] have to be quite sparse to justify a pure sparse addition.

In [713]: a= np.array([[1,2,3,0,9],[3,2,6,2,7],[0,0,0,8,0],[1,0,0,0,3]])
In [714]: aM = sparse.csr_matrix(a)
In [715]: aM
Out[715]: 
<4x5 sparse matrix of type '<class 'numpy.int32'>'
    with 12 stored elements in Compressed Sparse Row format>

We can replicate the selected row by matrix multiplication - first the broadcasted dense approach:

In [719]: np.ones((4,1))*aM[1,:]
Out[719]: 
array([[ 3.,  2.,  6.,  2.,  7.],
       [ 3.,  2.,  6.,  2.,  7.],
       [ 3.,  2.,  6.,  2.,  7.],
       [ 3.,  2.,  6.,  2.,  7.]])
In [720]: np.ones((4,1))*aM[1,:]+aM    # dense matrix addition
Out[720]: 
matrix([[  4.,   4.,   9.,   2.,  16.],
        [  6.,   4.,  12.,   4.,  14.],
        [  3.,   2.,   6.,  10.,   7.],
        [  4.,   2.,   6.,   2.,  10.]])

sparse matrix multiplication:

In [721]: sparse.csr_matrix(np.ones((4,1)))*aM[1,:]
Out[721]: 
<4x5 sparse matrix of type '<class 'numpy.float64'>'
    with 20 stored elements in Compressed Sparse Row format>

sparse matrix addition:

In [722]: sparse.csr_matrix(np.ones((4,1)))*aM[1,:]+aM
Out[722]: 
<4x5 sparse matrix of type '<class 'numpy.float64'>'
    with 20 stored elements in Compressed Sparse Row format>
In [723]: _.A
Out[723]: 
array([[  4.,   4.,   9.,   2.,  16.],
       [  6.,   4.,  12.,   4.,  14.],
       [  3.,   2.,   6.,  10.,   7.],
       [  4.,   2.,   6.,   2.,  10.]])

This would be a better demonstration if aM and especially aM[1:] was sparse. I could also have specified the np.ones as int dtype to match aM. And making it a csc matrix would be more compact.

Sign up to request clarification or add additional context in comments.

3 Comments

Pretty useful. I always thought that statements like Line-722 would throw the same error. So, why didn't SciPy include an optimized sparse vector addition while it has a decent vector multiplication routine?
In [722] we are adding 2 matrices of the same shape. Elementwise addition is just more work than elementwise multiplication. c[i,j] will be nonzero if either a[i,j] or b[i,j] is nonzero when adding; when multiplying it can skip c[i,j] if either is zero.
Look up the docs for csr_matrix.multiply. Click on [source] to see the file where it and calculations are implemented, or atleast the python portion. The hard work is done in compiled code.
0

Try with:

from scipy.sparse import *

a= csr_matrix([[1,2,3,0,9],[3,2,6,2,7],[0,0,0,8,0],[1,0,0,0,3]])
a.todense()+a[1].todense()

it will be:

matrix([[ 4,  4,  9,  2, 16],
        [ 6,  4, 12,  4, 14],
        [ 3,  2,  6, 10,  7],
        [ 4,  2,  6,  2, 10]])

Update:

Make the addition matrix b with same dimension and full with a[1], then add them:

from scipy.sparse import *
import numpy as np
an_array=np.array([[1,2,3,0,9],[3,2,6,2,7],[0,0,0,8,0],[1,0,0,0,3]])
a = csr_matrix(an_array)
b = csr_matrix([an_array[1] for i in range(len(an_array))])
a+b

2 Comments

Converting them to a dense representation would be an obvious (good) solution. However, the above example is only an MWE; we only go for a sparse matrix when the structure is arbitrarily large with only a few elements. Converting them to a dense form for the single operation would defeat their purpose (in which case it would be better to stick to NumPy from the beginning).
maybe you can create another sparse matrix with same dimension and all value come from a[1], then you can sum them.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.