I have a function that computes features, and I then save those features to disk with np.save.

test_knn_feats = NNF.predict(X_test) 
np.save('data/knn_feats_%s_test.npy' % metric , test_knn_feats)

Inside the function, if n_jobs is greater than 1, the code below runs.

fest_feats =[]
pool = Pool(processes = self.n_jobs) 
for i in range(X.shape[0]):
    test_feats.append(pool.apply_async(self.get_features_for_one(X[i:i+1])))
pool.close()
pool.join()

return np.vstack(test_feats)

However, an error occurs:

TypeError                                 Traceback (most recent call last)
<ipython-input-96-4f707b7cd533> in <module>()
     12     print(test_knn_feats)
     13     # Dump the features to disk
---> 14     np.save('data/knn_feats_%s_test.npy' % metric , test_knn_feats)

/opt/conda/lib/python3.6/site-packages/numpy/lib/npyio.py in save(file, arr, allow_pickle, fix_imports)
    507         arr = np.asanyarray(arr)
    508         format.write_array(fid, arr, allow_pickle=allow_pickle,
--> 509                            pickle_kwargs=pickle_kwargs)
    510     finally:
    511         if own_fid:

/opt/conda/lib/python3.6/site-packages/numpy/lib/format.py in write_array(fp, array, version, allow_pickle, pickle_kwargs)
    574         if pickle_kwargs is None:
    575             pickle_kwargs = {}
--> 576         pickle.dump(array, fp, protocol=2, **pickle_kwargs)
    577     elif array.flags.f_contiguous and not array.flags.c_contiguous:
    578         if isfileobj(fp):

The function get_features_for_one returns an array built with np.hstack, as shown below.

...
knn_feats = np.hstack(return_list)
assert knn_feats.shape == (239,) or knn_feats.shape == (239, 1)
return knn_feats

*Update:

test_feats =[]      
pool = Pool(processes = self.n_jobs) 
for i in range(X.shape[0]):
    test_feats.append(pool.apply_async(self.get_features_for_one, (X[i:i+1],)))
test_feats= [res.get() for res in test_feats]        
pool.close()
pool.join()
return np.vstack(test_feats)
  • What is test_knn_feats.dtype before you save it? What does multithreading have to do with this? Are you saying the code works if run without multithreading? Commented Dec 24, 2017 at 4:43
  • When I print it, it displays: [[<multiprocessing.pool.ApplyResult object at 0x7f1bb03ae5c0>] [<multiprocessing.pool.ApplyResult object at 0x7f1bb02fe240>] [<multiprocessing.pool.ApplyResult object at 0x7f1bb02fe2b0>] ..., [<multiprocessing.pool.ApplyResult object at 0x7f1bb17b62b0>] [<multiprocessing.pool.ApplyResult object at 0x7f1bb17b63c8>] [<multiprocessing.pool.ApplyResult object at 0x7f1bb17b6518>]] Commented Dec 24, 2017 at 4:45
  • @JohnZwinck it is object dtype Commented Dec 24, 2017 at 4:46

1 Answer

There are two major bugs here:

test_feats =[] # you called it fest_feats, I assume a typo
pool = Pool(processes = self.n_jobs) 
for i in range(X.shape[0]):
    test_feats.append(pool.apply_async(self.get_features_for_one(X[i:i+1])))
    pool.close()
    pool.join()

return np.vstack(test_feats)
  1. First, you create a Pool. Then for each i you submit one job and then close & join the pool. You should only close & join the pool once, at the end, outside the loop.

  2. test_feats ends up being a list of "futures", not actual data. So vstack() on them makes no sense. You need to call get() on each future to get the result of get_features_for_one() and then pass that list to vstack(). For example np.vstack([res.get() for res in test_feats]).
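The two fixes above can be sketched as follows. This is a minimal illustration, not the asker's actual class: the module-level `get_features_for_one` and the helper `collect_features` are hypothetical stand-ins, and `ThreadPool` is used only so the sketch runs anywhere (it exposes the same `apply_async`/`get` interface as `multiprocessing.Pool`). Note also that the original line `pool.apply_async(self.get_features_for_one(X[i:i+1]))` calls the function immediately and hands its return value to `apply_async` as if it were the callable, which is consistent with the `'numpy.ndarray' object is not callable` error reported in the comments; the function and its arguments should be passed separately.

```python
import numpy as np
from multiprocessing.pool import ThreadPool  # same apply_async/get API as multiprocessing.Pool


def get_features_for_one(row):
    # Hypothetical stand-in for the asker's method: returns a 1-D feature vector.
    return np.hstack([row.sum(axis=1), row.mean(axis=1)])


def collect_features(pool, X):
    # Pass the callable and its argument tuple separately; do NOT call the function here.
    pending = [pool.apply_async(get_features_for_one, (X[i:i + 1],))
               for i in range(X.shape[0])]
    # Each entry is a result handle; .get() blocks until the worker's return value is ready.
    rows = [res.get() for res in pending]
    return np.vstack(rows)


X = np.arange(12, dtype=float).reshape(4, 3)
pool = ThreadPool(processes=2)
try:
    feats = collect_features(pool, X)
finally:
    # Close and join exactly once, after all jobs have been submitted and collected.
    pool.close()
    pool.join()

print(feats.shape, feats.dtype)
```

With a real `multiprocessing.Pool`, the worker function and its arguments must also be picklable, which is worth checking if a bound method is submitted.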

In short, your problem has nothing to do with the TypeError that you eventually receive from numpy.save(). Your problem is that your logic is completely broken and your data is not what you think it is.
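One way to confirm that the data is not what it appears to be is to inspect the dtype before saving. The sketch below uses a hypothetical worker `square` and a `ThreadPool` purely for illustration; it contrasts stacking the result handles with stacking the values retrieved through `get()`.

```python
import numpy as np
from multiprocessing.pool import ThreadPool


def square(x):
    # Hypothetical worker used only for this illustration.
    return np.array([x * x])


pool = ThreadPool(processes=2)
try:
    pending = [pool.apply_async(square, (i,)) for i in range(3)]
    # Stacking the handles themselves yields an object array, not numbers.
    stacked_handles = np.vstack(pending)
    # Stacking the retrieved results yields a numeric array.
    stacked_values = np.vstack([res.get() for res in pending])
finally:
    pool.close()
    pool.join()

print(stacked_handles.dtype)
print(stacked_values.dtype)
```

An object dtype at this point is a sign that np.save would have to pickle arbitrary Python objects rather than write plain numeric data.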


4 Comments

Ah, the first point is a copy-and-paste mistake on my part. I do close and join the pool outside the loop. I will try the second point.
Tried test_feats = [res.get() for res in test_feats] and got TypeError: 'numpy.ndarray' object is not callable. I will take some time to debug it.
@MervynLee: You're getting the new error because you're running code which is not what you've posted. Stop doing that.
I tried printing res.ready() and it returns True, but calling res.get() raises the TypeError.
