15

I am trying to run a parallel loop on a simple example.
What am I doing wrong?

from joblib import Parallel, delayed  
import multiprocessing

def processInput(i):  
        return i * i

if __name__ == '__main__':

    # what are your inputs, and what operation do you want to 
    # perform on each input. For example...
    inputs = range(1000000)      

    num_cores = multiprocessing.cpu_count()

    results = Parallel(n_jobs=4)(delayed(processInput)(i) for i in inputs) 

    print(results)

The problem with the code is that when executed under Windows environments in Python 3, it opens num_cores instances of python to execute the parallel jobs but only one is active. This should not be the case since the activity of the processor should be 100% instead of 14% (under i7 - 8 logic cores).

Why are the extra instances not doing anything?

2
  • Are you getting any error message ? It runs fine for me... . Indenting should be 4 spaces instead of one... Commented Nov 18, 2015 at 18:46
  • I have the same issue. The problem is that the code only runs on one core not on the n-cores. Commented Feb 2, 2016 at 10:36

2 Answers 2

20
+50

Continuing on your request to provide a working multiprocessing code, I suggest that you use Pool.map() (if the delayed functionality is not important), I'll give you an example, if you're using Python3 it's worth mentioning that you can use starmap(). Also worth mentioning that you can use map_sync()/starmap_async() if the order of the returned results does not have to correspond to the order of inputs.

import multiprocessing as mp

def processInput(i):
   return i * i

if __name__ == '__main__':
    # What are your inputs, and what operation do you want to
    # perform on each input. For example...
    inputs = range(1000000)
    # Removing the processes argument makes the code run on all available cores
    pool = mp.Pool(processes=4)
    results = pool.map(processInput, inputs)
    print(results)
Sign up to request clarification or add additional context in comments.

2 Comments

I love the simplicity of this, so i tried it. I get a TypeError: cannot serialize '_io.TextIOWrapper' object. My function is complex, and i don't have time to dive into it, just a comment about if you have a complex function, this may not work out of the box
Serialization is a major part of each multi-process program. To try and mitigate such issues I recommend examining your complex function and check which part of it really needs the multi-processing solution and try decoupling it from the complex function, this will ease serialization and might even render it unnecessary.
3

On Windows, the multiprocessing module uses the 'spawn' method to start up multiple python interpreter processes. This is relatively slow. Parallel tries to be smart about running the code. In particular, it tries to adjust batch sizes so a batch takes about half a second to execute. (See the batch_size argument at https://pythonhosted.org/joblib/parallel.html)

Your processInput() function runs so fast that Parallel determines that it is faster to run the jobs serially on one processor than to spin up multiple python interpreters and run the code in parallel.

If you want to force your example to run on multiple cores, try setting batch_size to 1000 or making processInput() more complicated so it takes longer to execute.

Edit: Working example on windows that shows multiple processes in use (I'm using windows 7):

from joblib import Parallel, delayed
from os import getpid

def modfib(n):
    # print the process id to see that multiple processes are used, and
    # re-used during the job.
    if n%400 == 0:
        print(getpid(), n)  

    # fibonacci sequence mod 1000000
    a,b = 0,1
    for i in range(n):
        a,b = b,(a+b)%1000000
    return b

if __name__ == "__main__":
    Parallel(n_jobs=-1, verbose=5)(delayed(modfib)(j) for j in range(1000, 4000))

1 Comment

Could you propose a code modification so as the task is effectively executed in parallel? Since the code above is given as an example of joblib use, there should be an example that actually works.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.