
I'm trying to do some multiprocessing in Python and have a problem with data types changing when I use Pool.starmap. See the snippet below:

import multiprocessing as mp
from itertools import repeat

# Placeholders: some_function is the worker, arr1 and arr2 are np.ndarray, df is a pd.DataFrame
with mp.Pool(3) as pool:
    pool.starmap(some_function, zip([arr1, arr2], repeat(df)))

Before I pass the np.array to the function, it looks something like this:

['some string', some int, some float, 'some string']

But after passing it, the contents have somehow been converted to:

['some string', 'some string', 'some string', 'some string']

Has anyone experienced a similar problem? Cheers

  • What is some function? That seems like a likely culprit. Commented Apr 7, 2021 at 13:56
  • The function just does some math with the array and the df. Works totally fine when calling it directly. Commented Apr 7, 2021 at 13:59
  • NumPy arrays are homogeneous; the numbers are converted to strings when the array is created, before any multiprocessing happens. np.array(['a',1,1.2]) --> array(['a', '1', '1.2'], dtype='<U3') Commented Apr 7, 2021 at 18:47
  • Does "Python - strings and integers in Numpy array" answer your question? Commented Apr 7, 2021 at 18:58
  • Thanks, you helped me a lot. Found a way to use dtype=int for the array instead of object and now it works! Commented Apr 9, 2021 at 8:48
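
For anyone landing here later, this is a minimal sketch of what the comments describe. It is not the asker's actual code. The values in `row` are made-up stand-ins for the mixed row in the question, and a pickle round trip is used as a stand-in for how multiprocessing hands arguments to worker processes. It suggests that the string conversion happens when the array is built, and that an array created with dtype=object comes back with its element types intact.

```python
import pickle

import numpy as np

# Hypothetical stand-in for the mixed row shown in the question.
row = ["abc", 1, 2.5, "xyz"]

# Default construction: NumPy picks a single common dtype for all elements,
# which for a str/int/float mix is a Unicode string dtype.
as_str = np.array(row)

# dtype=object stores references to the original Python objects instead.
as_obj = np.array(row, dtype=object)

# multiprocessing pickles arguments to send them to workers; a plain
# pickle round trip approximates that step without starting a pool.
restored = pickle.loads(pickle.dumps(as_obj))

print(as_str.dtype.kind)                     # 'U' means Unicode string
print([type(x).__name__ for x in as_str])    # element types after coercion
print([type(x).__name__ for x in restored])  # element types after the round trip
```

If the numeric fields are needed for arithmetic, keeping them in a separate numeric array (as the asker ended up doing) or in the DataFrame is likely to be simpler than working with an object array.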
