I want to convert a DataFrame to a sparse matrix using csr_matrix from the scipy library, but first I have to convert it to a SparseDataFrame. In previous versions of pandas I used pd.SparseDataFrame(df).to_coo() for this, but SparseDataFrame was removed in pandas 1.0.0. Does anyone know how to perform this conversion with the latest pandas API? I used this migration guide and tried various combinations, but I am still unable to get the desired result. Following the guide, when I run

csr_matrix(pd.DataFrame.sparse.from_spmatrix(df).to_coo())

I get this error

AttributeError: 'DataFrame' object has no attribute 'tocsc'

Can anyone help me solve this? I also found other posts, but they did not help in my case: link, link, link.

  • Unless it's obvious what's producing the error, always include the full traceback. Commented Aug 5, 2020 at 19:23

1 Answer

IIUC, and going by the third link you shared, you can convert your df to sparse data using pd.SparseDtype, like this:

df_sparsed = df.astype(pd.SparseDtype("float", np.nan))

You can read more about pd.SparseDtype here to choose the right parameters for your data, and then use the result in your command above like this:

csr_matrix(df_sparsed.sparse.to_coo()) # Note you need .sparse accessor to access .to_coo()
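On choosing the fill value: the second argument of pd.SparseDtype is the value that is left out of storage, so it decides how sparse the result really is. Here is a small sketch on a made-up frame (the column names and values are hypothetical, not from the question) comparing a fill value of 0.0 with np.nan via the .sparse.density accessor:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: mostly zeros, no missing values.
df = pd.DataFrame({"a": [0.0, 1.0, 0.0], "b": [2.0, 0.0, 0.0]})

# fill_value=0.0: zeros are the implicit value, only non-zeros are stored.
zero_fill = df.astype(pd.SparseDtype("float", 0.0))

# fill_value=np.nan: NaNs are the implicit value, so every zero is stored.
nan_fill = df.astype(pd.SparseDtype("float", np.nan))

# density is the fraction of entries that are stored explicitly.
print("density with fill_value=0.0:", zero_fill.sparse.density)
print("density with fill_value=nan:", nan_fill.sparse.density)
```

If your data is mostly zeros rather than mostly NaN, a fill value of 0 is probably what you want before handing the result to scipy.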

A simple one-liner would be:

csr_matrix(df.astype(pd.SparseDtype("float", np.nan)).sparse.to_coo())
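For completeness, here is a self-contained sketch of the whole round trip on a made-up frame (the data and the variable names are hypothetical). It uses a fill value of 0.0 instead of np.nan, on the assumption that your frame is mostly zeros; as far as I can tell, newer pandas releases also raise a ValueError in .sparse.to_coo() when the fill value is not 0, so 0 is the safer choice there. It also shows why the call in the question fails: pd.DataFrame.sparse.from_spmatrix goes in the opposite direction (scipy sparse matrix to DataFrame), so it expects a scipy matrix rather than a DataFrame.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Hypothetical example data standing in for the question's df.
df = pd.DataFrame(
    {"a": [0.0, 1.0, 0.0], "b": [2.0, 0.0, 0.0], "c": [0.0, 0.0, 3.0]}
)

# Step 1: make every column sparse, with 0.0 as the implicit value.
sparse_df = df.astype(pd.SparseDtype("float", 0.0))

# Step 2: DataFrame -> scipy COO matrix -> CSR matrix.
mat = csr_matrix(sparse_df.sparse.to_coo())

# Reverse direction: scipy sparse matrix -> DataFrame with sparse columns.
# This is what from_spmatrix is meant for.
round_trip = pd.DataFrame.sparse.from_spmatrix(mat, columns=df.columns)

print(mat.shape, mat.nnz)
print(round_trip.dtypes)
```

Note that index and column labels are not carried into the scipy matrix, so keep df.index and df.columns around if you need to map rows and columns back later.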