I am trying to find a way to speed my code up.
In short, I have a trained model that I use, roughly as shown below, to obtain predictions, sort them, and output a rank.
    import numpy as np

    def predict(self, feed_dict, truth):
        # feed_dict contains about 10K candidates to score
        # (fetch the op directly, not a one-element list, so pred stays 1-D)
        pred = self.sess.run(self.mdl.predict_op, feed_dict)
        pred = np.asarray(pred).ravel()
        # Sort the candidate indices by likelihood, highest score first
        sort = np.argsort(pred)[::-1]
        # Find the 1-based rank of the ground truth
        rank = np.where(sort == truth)[0][0] + 1
        return rank
However, this process is extremely slow, and I have about 10K test samples to rank. I believe a TensorFlow session doesn't work well with the standard multiprocessing libraries in Python, while multi-CPU/GPU support is only available for TensorFlow ops.
Is there an elegant way to speed this up via multiprocessing, or do I have to implement it as part of the computational graph in TF?
Thanks a lot!
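tf.nn.top_k(pred, k)[1], with k set to the number of candidates, is the same as your np.argsort line (the default k=1 returns only the single top index). If you turn everything into a TF graph you won't need multiprocessing: parallel session.run calls can be started from different Python threads in the same process. For the np.where step, there is tf.where in version 0.12 (tf.select in earlier versions).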
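To illustrate the threading suggestion, here is a minimal sketch, not a definitive implementation. It deliberately avoids importing TensorFlow: run_fn is a hypothetical stand-in for something like `lambda fd: self.sess.run(self.mdl.predict_op, fd)`, and the names rank_of_truth, rank_all and workers are made up for this example. Threads only help here if the scoring call releases the GIL while it runs, which session.run is generally understood to do.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def rank_of_truth(scores, truth):
    """Return the 1-based rank of index `truth` when scores are sorted descending."""
    scores = np.asarray(scores).ravel()
    order = np.argsort(scores)[::-1]  # candidate indices, best score first
    return int(np.where(order == truth)[0][0]) + 1


def rank_all(run_fn, feeds, truths, workers=4):
    """Score every test sample from a pool of threads and return the ranks in order.

    run_fn is an assumed callable that takes one feed and returns its scores,
    e.g. a thin wrapper around session.run on the predict op.
    """
    def one(pair):
        feed, truth = pair
        return rank_of_truth(run_fn(feed), truth)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so ranks[i] belongs to feeds[i]
        return list(pool.map(one, zip(feeds, truths)))
```

Usage would look something like `ranks = rank_all(lambda fd: sess.run(predict_op, fd), feed_dicts, truths)`, where feed_dicts and truths are your own lists of per-sample inputs and ground-truth indices.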