Consider A, an n by j matrix, and B, an m by j matrix, both SciPy sparse matrices, with m < n. Is there a way to find the indices of the rows of A that are identical to rows of B?

I have tried for loops and converting the matrices to NumPy arrays. Neither approach works in my case because the matrices are huge. Here is the link to the same question for NumPy arrays.

Edit:

An example of A, B, and the desired output:

>>> import numpy as np
>>> from scipy.sparse import csc_matrix

>>> row = np.array([0, 2, 2, 0, 1, 2])
>>> col = np.array([0, 0, 1, 2, 2, 2])
>>> data = np.array([1, 3, 3, 4, 5, 6])

>>> A = csc_matrix((data, (row, col)), shape=(5, 3))
>>> A.toarray()
array([[1, 0, 4],
       [0, 0, 5],
       [3, 3, 6],
       [0, 0, 0],
       [0, 0, 0]])

>>> row = np.array([0, 2, 2, 0, 1, 2])
>>> col = np.array([0, 0, 1, 2, 2, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> B = csc_matrix((data, (row, col)), shape=(4, 3))
>>> B.toarray()
array([[1, 0, 4],
       [0, 0, 5],
       [2, 3, 6], 
       [0, 0, 0]])

Desired output:

def some_function(A, B):
    # Some operations
    return indices

>>> some_function(A, B)
[0, 1, 3, 4]
  • No. scipy.sparse matrices are good for linear algebra kinds of things, like matrix multiplication. In fact they implement indexing with that kind of multiplication. They don't broadcast as in your link, and row by row iteration is slow. The best you can do is work with the indptr of the csr format directly. Commented Jan 30, 2023 at 17:29
  • Maybe you could ','.join each row into a string and hash it, keeping a dict that maps hash to row indexes. Collect the hashes of the two matrices into two sets and perform set intersection, then with the resulting set and the dict find out the indices of the identical rows. Commented Jan 30, 2023 at 17:35
  • Don't convert to a string. Ensure that the sparse classes are the same, form sets of tuples derived from the structure members, and perform set intersection. But none of this is reproducible with no data and no code. Commented Jan 30, 2023 at 19:17
  • Thank you for your comments. I've edited my question. Now, it has data and desired output. Commented Jan 31, 2023 at 0:47
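
The set-of-tuples idea from the comments could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for very large inputs: the helper names `row_keys` and `matching_row_indices` are hypothetical, and it assumes both matrices fit in memory in CSR form. Each row is reduced to a hashable key built from its stored column indices and values, so nothing is converted to a dense array.

```python
import numpy as np
from scipy.sparse import csc_matrix


def row_keys(M):
    """Return one hashable key per row, built from the CSR structure (hypothetical helper)."""
    M = M.tocsr(copy=True)   # work on a copy so the caller's matrix is untouched
    M.sum_duplicates()       # merge repeated (row, col) entries
    M.eliminate_zeros()      # drop explicitly stored zeros
    M.sort_indices()         # canonical column order within each row
    keys = []
    for start, stop in zip(M.indptr[:-1], M.indptr[1:]):
        cols = tuple(M.indices[start:stop].tolist())
        vals = tuple(M.data[start:stop].tolist())
        keys.append((cols, vals))
    return keys


def matching_row_indices(A, B):
    """Indices of rows of A that are identical to some row of B (hypothetical helper)."""
    b_keys = set(row_keys(B))
    return [i for i, key in enumerate(row_keys(A)) if key in b_keys]


# The example matrices from the question.
row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
A = csc_matrix((np.array([1, 3, 3, 4, 5, 6]), (row, col)), shape=(5, 3))
B = csc_matrix((np.array([1, 2, 3, 4, 5, 6]), (row, col)), shape=(4, 3))

print(matching_row_indices(A, B))  # [0, 1, 3, 4]
```

The loop over rows still runs in Python, so it may be slow for matrices with many rows, but the total work is proportional to the number of stored entries rather than to n times j.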
