How to eliminate zeros in a sparse matrix (Python)?

Question

I need a sparse matrix (I'm using the Compressed Sparse Row (CSR) format from scipy.sparse) to do some computation. I have it in the form of a (data, (row, col)) tuple. Unfortunately, some of the rows and columns will be entirely zero, and I would like to get rid of them. Right now I have:

[In]:
     from scipy.sparse import csr_matrix
     aa = csr_matrix(((1,2,3), ((0,2,2), (0,1,2))))
     aa.todense()
[Out]:
     matrix([[1, 0, 0],
             [0, 0, 0],
             [0, 2, 3]], dtype=int64)

And I would like to have:

[Out]:
    matrix([[1, 0, 0],
            [0, 2, 3]], dtype=int64)

After using the method eliminate_zeros() on the object I get None:

[In]:
     aa2 = csr_matrix.eliminate_zeros(aa)
     type(aa2)
[Out]:
     <class 'NoneType'>

Why does that method turn it into None?

Is there any other way to get a sparse matrix (doesn't have to be CSR) and get rid of empty rows/columns easily?

I'm using Python 3.4.0.

Is it possible that the method operates in place and modified the original aa variable? Then it wouldn't return anything (i.e. None).
– Waleed Khan, Commented Jul 30, 2015 at 19:44
And it isn't what the OP wants to do. It removes unneeded 0s from the .data, but does not change dimensions.
– hpaulj, Commented Jul 30, 2015 at 21:37
eliminate_zeros does not change aa=sparse.csr_matrix(((1,2,3),((0,2,2),(0,1,2)))), since it doesn't have extra 0s.
– hpaulj, Commented Jul 30, 2015 at 21:44
What does "in-place method" mean? How should I use this function, @WaleedKhan?
– Firestar-Reimu, Commented Oct 13, 2024 at 14:52
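
To illustrate the "in place" point from the comments, here is a minimal sketch. It assumes the question's matrix, with one stored value overwritten by 0 (my own addition) so that there is an explicit zero for eliminate_zeros() to remove. The call modifies aa itself and returns None, so the result should not be assigned to a new variable.

```python
from scipy.sparse import csr_matrix

# The matrix from the question: (data, (row, col)) wrapped in one tuple.
aa = csr_matrix(((1, 2, 3), ((0, 2, 2), (0, 1, 2))))

# Overwrite one stored value with 0 so an explicit zero exists (illustration only).
aa.data[1] = 0

stored_before = aa.nnz         # number of stored entries, explicit zero included
result = aa.eliminate_zeros()  # works in place on aa and returns None
stored_after = aa.nnz          # the explicit zero is no longer stored

print(result)                       # None
print(stored_before, stored_after)  # one stored entry fewer after the call
print(aa.shape)                     # dimensions are unchanged
```

So the usage is simply aa.eliminate_zeros() on its own line, and then you keep working with aa. As hpaulj notes, this only cleans up the stored data; it does not remove rows or columns.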

Jaime · Accepted Answer · 2015-07-30 20:28:18Z

5

In CSR format it is relatively easy to get rid of the all-zero rows:

>>> import numpy as np
>>> import scipy.sparse as sps
>>> a = sps.csr_matrix([[1, 0, 0], [0, 0, 0], [0, 2, 3]])
>>> a.indptr
array([0, 1, 1, 3])
>>> mask = np.concatenate(([True], a.indptr[1:] != a.indptr[:-1]))
>>> mask  # 1st occurrence of unique a.indptr entries
array([ True,  True, False,  True], dtype=bool)
>>> sps.csr_matrix((a.data, a.indices, a.indptr[mask])).toarray()
array([[1, 0, 0],
       [0, 2, 3]])

You could then convert your sparse matrix to CSC format, where the exact same trick gets rid of the all-zero columns.
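
The row step followed by the CSC column step could be sketched roughly as below. The helper names (first_occurrences, drop_empty_rows_and_cols) are made up for this illustration, and the sketch assumes the matrix stores no explicit zeros. The shape is passed explicitly as a precaution so SciPy does not have to infer it from the indices.

```python
import numpy as np
import scipy.sparse as sps

def first_occurrences(indptr):
    # True at the first occurrence of each distinct indptr value;
    # a repeated value marks a row (CSR) or column (CSC) with no stored entries.
    return np.concatenate(([True], indptr[1:] != indptr[:-1]))

def drop_empty_rows_and_cols(m):
    """Hypothetical helper: drop empty rows in CSR form, then empty columns in CSC form."""
    m = sps.csr_matrix(m)
    keep = first_occurrences(m.indptr)
    m = sps.csr_matrix((m.data, m.indices, m.indptr[keep]),
                       shape=(int(keep.sum()) - 1, m.shape[1]))
    m = m.tocsc()  # same layout idea, but indptr now walks over columns
    keep = first_occurrences(m.indptr)
    return sps.csc_matrix((m.data, m.indices, m.indptr[keep]),
                          shape=(m.shape[0], int(keep.sum()) - 1))

a = sps.csr_matrix([[1, 0, 0], [0, 0, 0], [0, 0, 3]])
trimmed = drop_empty_rows_and_cols(a)
print(trimmed.toarray())
```

The result comes back in CSC format; call .tocsr() on it if CSR is needed afterwards.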

I am not sure how well it will perform, but the much more readable syntax:

>>> a[a.getnnz(axis=1) != 0][:, a.getnnz(axis=0) != 0].toarray()
array([[1, 0, 0],
       [0, 2, 3]])

also works.
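
If you also need to know which original rows and columns survived, the getnnz approach can be wrapped in a small function. This is only a sketch: trim_empty is a name invented here, and the 3x4 input is my own example. Note that getnnz counts stored entries, so explicitly stored zeros would count as non-empty.

```python
import numpy as np
import scipy.sparse as sps

def trim_empty(m):
    """Hypothetical helper: return (trimmed, kept_rows, kept_cols)."""
    m = sps.csr_matrix(m)  # accepts a dense array or another sparse format
    kept_rows = np.flatnonzero(m.getnnz(axis=1))  # rows with at least one stored entry
    kept_cols = np.flatnonzero(m.getnnz(axis=0))  # columns with at least one stored entry
    return m[kept_rows][:, kept_cols], kept_rows, kept_cols

a = sps.csr_matrix([[1, 0, 0, 0],
                    [0, 0, 0, 0],
                    [0, 2, 0, 3]])
trimmed, kept_rows, kept_cols = trim_empty(a)
print(trimmed.toarray())
print(kept_rows, kept_cols)  # indices into the original matrix
```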


1 Comment

hpaulj Over a year ago

Your first solution depends on the array being pruned. Depending on why there are 0s, it may need eliminate_zeros() to match the results of the second method.
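
A small sketch of why stored zeros matter for the indptr trick, assuming a matrix (my own example) whose middle row holds a single explicitly stored 0. The helper name nonempty_row_mask is hypothetical. Before eliminate_zeros() that row looks non-empty to indptr; afterwards it is recognised as empty.

```python
import numpy as np
import scipy.sparse as sps

# Row 1 holds one stored entry whose value is 0 (an explicit zero).
data = np.array([1, 0, 2, 3])
rows = np.array([0, 1, 2, 2])
cols = np.array([0, 1, 1, 2])
a = sps.csr_matrix((data, (rows, cols)), shape=(3, 3))

def nonempty_row_mask(m):
    # True for each row that has at least one stored entry.
    return m.indptr[1:] != m.indptr[:-1]

before = nonempty_row_mask(a)  # the stored 0 makes row 1 look non-empty
a.eliminate_zeros()            # in place: drops stored zeros from data/indices
after = nonempty_row_mask(a)   # row 1 now has no stored entries

print(before)
print(after)
```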
