I have a large [numpy] vector X, and a comparison function f(x,y). I need to find all the pairs of elements of X for which f(X[i],X[j])<T for some threshold T. This works well:
    good_inds = {}
    for i in range(0,len(X)):
        for j in range(x+1,len(X)):
            score = f(X[i],X[j])
            if score<T:
                good_inds[x,y] = score
This actually builds a dictionary which is a representation of a sparse matrix. The problem is that it's rather slow, and I wish to parallelise this process. Please advise.
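One way to parallelise the loop is to hand each value of i to a worker process and merge the per-row dictionaries afterwards. The following is only a sketch: it assumes the x and y in the snippet were meant to be i and j, and the body of f is a placeholder because the question does not define it. The names row_pairs and good_pairs are made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def f(a, b):
    # Placeholder comparison function (an assumption; replace with the real f).
    return abs(a - b)


def row_pairs(i, X, T):
    """Return {(i, j): score} for every j > i with f(X[i], X[j]) < T."""
    found = {}
    for j in range(i + 1, len(X)):
        score = f(X[i], X[j])
        if score < T:
            found[i, j] = score
    return found


def good_pairs(X, T, workers=None, chunksize=64):
    """Split the outer loop over i across worker processes and merge the results."""
    good_inds = {}
    work = partial(row_pairs, X=X, T=T)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each task handles one row i; chunksize batches rows to cut overhead.
        for found in pool.map(work, range(len(X)), chunksize=chunksize):
            good_inds.update(found)
    return good_inds
```

Call good_pairs(X, T) from under an `if __name__ == "__main__":` guard, which is required on platforms that start workers by spawning a fresh interpreter. Note that X is sent to the workers with every chunk, and that early rows contain more pairs than late ones, so the load is uneven. This only pays off when f is expensive enough to outweigh the cost of moving data between processes.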
`x` and `y` are constants within the scope of this snippet, so why use a dictionary? Did you mean `x` --> `X[i]` and `y` --> `X[j]`?
It would help to know what `f` does, e.g. what sort of constraints can be exploited. Roland's answer is great if there's nothing more known about the problem, but you'd get much more relevant answers if you said that `X` and `Y` are both `numpy` arrays and `f` a simple algebraic expression.
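If f really is a simple algebraic expression, the double loop can often be replaced by array operations, which tends to help more than adding processes. The sketch below assumes, purely for illustration, that f(a, b) = |a - b|; the function name good_pairs_vectorized is hypothetical.

```python
import numpy as np


def good_pairs_vectorized(X, T):
    """Return index arrays i, j (with i < j) and scores where the score is below T."""
    X = np.asarray(X)
    i, j = np.triu_indices(len(X), k=1)  # every index pair with i < j
    scores = np.abs(X[i] - X[j])         # assumed form of f, evaluated for all pairs
    keep = scores < T
    return i[keep], j[keep], scores[keep]
```

The three returned arrays are already a coordinate-style sparse representation. If the dictionary form is still wanted, `dict(zip(zip(i.tolist(), j.tolist()), s.tolist()))` rebuilds it from the result `i, j, s`. This version materialises all n(n-1)/2 pairs at once, so for a very large X the index range would have to be processed in blocks.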