I'm working with recommender systems, but I'm struggling with the access times of SciPy sparse matrices.
In this case I'm implementing TrustSVD, so I need a structure that is efficient to access by both rows and columns (CSR for rows, CSC for columns). I've thought about keeping both structures, using dictionaries, and so on, but every approach I've tried is too slow, especially compared with NumPy matrix operations.
for u, j in zip(*ratings.nonzero()):
    items_rated_by_u = ratings[u, :].nonzero()[1]
    users_who_rated_j = ratings[:, j].nonzero()[0]
    # More code...
Extra: Each loop iteration takes around 0.033 s, so one pass over 35,000 ratings takes about 19 min per SGD iteration, and with a minimum of 25 iterations that comes to about 8 h. Moreover, this is only the access time; if I include the factorization part, it would take around 2 days.
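
To make the "both structures" idea concrete, here is a minimal sketch of what I have in mind: keep one CSR copy and one CSC copy, and read their `indptr`/`indices` arrays directly instead of indexing the matrix inside the loop. The toy `ratings` data and the helper names `items_rated_by` and `users_who_rated` are made up for illustration. I have not measured how much this helps on the real data.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy ratings matrix (3 users x 3 items), for illustration only.
rows = np.array([0, 0, 1, 2, 2, 2])
cols = np.array([0, 2, 2, 0, 1, 2])
vals = np.array([5.0, 3.0, 4.0, 1.0, 2.0, 5.0])
ratings = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 3))

# Build both layouts once, outside the loop.
ratings_csr = ratings.tocsr()
ratings_csc = ratings.tocsc()
for m in (ratings_csr, ratings_csc):
    m.eliminate_zeros()  # drop stored zeros so the result matches .nonzero()
    m.sort_indices()     # keep the index arrays in ascending order


def items_rated_by(u):
    """Column indices of the stored entries in row u (view into the CSR arrays)."""
    start, end = ratings_csr.indptr[u], ratings_csr.indptr[u + 1]
    return ratings_csr.indices[start:end]


def users_who_rated(j):
    """Row indices of the stored entries in column j (view into the CSC arrays)."""
    start, end = ratings_csc.indptr[j], ratings_csc.indptr[j + 1]
    return ratings_csc.indices[start:end]


# Same loop as above, without slicing the sparse matrix per rating.
coo = ratings_csr.tocoo()
for u, j in zip(coo.row, coo.col):
    items_rated_by_u = items_rated_by(u)
    users_who_rated_j = users_who_rated(j)
    # More code...
```

The returned arrays are views into the matrices' internal buffers, so they should be treated as read-only.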