
I need a sparse matrix (I'm using the Compressed Sparse Row (CSR) format from scipy.sparse) to do some computation. I have the data as a (data, (row, col)) tuple. Unfortunately, some of the rows and columns will be entirely zero, and I would like to get rid of them. Right now I have:

[In]:
     from scipy.sparse import csr_matrix
     aa = csr_matrix(((1,2,3), ((0,2,2), (0,1,2))))
     aa.todense()
[Out]:
     matrix([[1, 0, 0],
             [0, 0, 0],
             [0, 2, 3]], dtype=int64)

And I would like to have:

[Out]:
    matrix([[1, 0, 0],
            [0, 2, 3]], dtype=int64)

After using the method eliminate_zeros() on the object I get None:

[In]:
     aa2 = csr_matrix.eliminate_zeros(aa)
     type(aa2)
[Out]:
     <class 'NoneType'>

Why does that method turn it into None?

Is there any other way to get a sparse matrix (doesn't have to be CSR) and get rid of empty rows/columns easily?

I'm using Python 3.4.0.

  • Is it possible that the method operates in place and modified the original aa variable? Then it wouldn't return anything (i.e. None). Commented Jul 30, 2015 at 19:44
  • It's in-place. It says so in the documentation, in fact. Commented Jul 30, 2015 at 19:49
  • And it isn't what the OP wants to do. It removes unneeded 0s from the .data, but does not change dimensions. Commented Jul 30, 2015 at 21:37
  • eliminate_zeros does not change aa=sparse.csr_matrix(((1,2,3),((0,2,2),(0,1,2)))), since it doesn't have extra 0s. Commented Jul 30, 2015 at 21:44
  • What is meant by an in-place method? How should I use this function, @WaleedKhan? Commented Oct 13, 2024 at 14:52

1 Answer

In CSR format it is relatively easy to get rid of the all-zero rows:

>>> import numpy as np
>>> import scipy.sparse as sps
>>> a = sps.csr_matrix([[1, 0, 0], [0, 0, 0], [0, 2, 3]])
>>> a.indptr
array([0, 1, 1, 3])
>>> mask = np.concatenate(([True], a.indptr[1:] != a.indptr[:-1]))
>>> mask  # 1st occurrence of unique a.indptr entries
array([ True,  True, False,  True])
>>> sps.csr_matrix((a.data, a.indices, a.indptr[mask])).toarray()
array([[1, 0, 0],
       [0, 2, 3]])

You could then convert your sparse matrix to CSC format, and the exact same trick will get rid of the all-zero columns.
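
A minimal sketch of that column step, assuming a matrix that actually has an empty column (the example above does not). The helper name drop_empty_columns is hypothetical and not part of SciPy. The shape is passed explicitly here because, without it, SciPy infers the other dimension from the largest stored index, which could silently drop trailing empty rows:

```python
import numpy as np
import scipy.sparse as sps


def drop_empty_columns(m):
    """Hypothetical helper: drop columns that have no stored entries."""
    c = sps.csc_matrix(m)  # in CSC, indptr delimits columns instead of rows
    # Keep the leading 0 plus every indptr entry that closes a non-empty column.
    keep = np.concatenate(([True], c.indptr[1:] != c.indptr[:-1]))
    n_cols = int(keep.sum()) - 1
    return sps.csc_matrix(
        (c.data, c.indices, c.indptr[keep]), shape=(c.shape[0], n_cols)
    )


# Column 1 has no stored entries, so it should be removed.
b = sps.csr_matrix([[1, 0, 0], [0, 0, 0], [0, 0, 3]])
result = drop_empty_columns(b).toarray()
print(result)
```

The all-zero row in b is left alone; combining this with the row trick on the CSR form would remove both.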

I am not sure how well it will perform, but the much more readable syntax:

>>> a[a.getnnz(axis=1) != 0][:, a.getnnz(axis=0) != 0].toarray()
array([[1, 0, 0],
       [0, 2, 3]])

also works.


1 Comment

Your first solution depends on the array being pruned. Depending on why there are 0s, it may need eliminate_zeros() to match the results of the second method.
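
To illustrate the point about pruning, here is a small sketch assuming a matrix built with an explicitly stored 0 in row 1. The helper rows_kept_by_indptr is hypothetical and only counts the rows the indptr mask would keep. It also shows that eliminate_zeros() works in place and returns None, so its result should not be assigned:

```python
import numpy as np
import scipy.sparse as sps

# Row 1 holds an explicitly stored 0: zero in value, but present in storage.
a = sps.csr_matrix(([1, 0, 2, 3], ([0, 1, 2, 2], [0, 0, 1, 2])), shape=(3, 3))


def rows_kept_by_indptr(m):
    """Hypothetical helper: number of rows with at least one stored entry."""
    return int(np.count_nonzero(m.indptr[1:] != m.indptr[:-1]))


before = rows_kept_by_indptr(a)  # the stored 0 makes row 1 look non-empty
returned = a.eliminate_zeros()   # modifies a in place; nothing is returned
after = rows_kept_by_indptr(a)   # row 1 now has no stored entries

print(before, returned, after)
```

So calling a.eliminate_zeros() on its own line, before applying the indptr mask, is enough to make stored zeros stop counting as entries.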
