2

I am running a time-consuming program many times. I am lucky enough to have access to a cluster where I can request 504 processors, but its customer service is, let's say, slow, so I am turning to you, SO. I am using a very simple application, as follows:

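import multiprocessing

def function(data):
    # complicated_function_I_was_given is supplied externally and not shown here
    data = complicated_function_I_was_given(data)
    with open('unique_id', 'w') as f:
        f.write(data)

if __name__ == '__main__':
    # one worker process per requested processor
    pool = multiprocessing.Pool(504)
    pool.map(function, data_iterator)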

Now, although I can see the processes start (complicated_function_I_was_given writes a bunch of scrap files, but with unique names, so I am sure there is no clash), the whole run seems really slow. I expect some of the data in data_iterator to be processed immediately, although some will take days, yet after one day nothing has been produced. Could it be that multiprocessing.Pool() has a limit? Or that it doesn't distribute the processes over different nodes (I know each node has 12 cores)? I am using Python 2.6.5.

2 Answers

4

Or that it doesn't distribute the processes over different nodes (I know each node has 12 cores)? I am using Python 2.6.5.

I think this is your problem: unless your cluster architecture is very unusual and all the processors appear to be on the same logical machine, multiprocessing will only have access to the local cores. You probably need to use a different parallelisation library.

See also the answers to this question.
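To make the point about local cores concrete, here is a minimal sketch (not the asker's code). It assumes only the standard library; `local_pool_size` is a hypothetical helper name, and `math.sqrt` stands in for the real, expensive function. The idea is that `multiprocessing.cpu_count()` reports the logical CPUs of the machine the script runs on, so asking for 504 workers on a 12-core node just oversubscribes that one node.

```python
import math
import multiprocessing


def local_pool_size(requested):
    """Cap a requested worker count at the number of CPUs on this machine.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    return max(1, min(requested, multiprocessing.cpu_count()))


if __name__ == "__main__":
    # cpu_count() only sees the current machine, regardless of how many
    # processors the cluster scheduler has allocated elsewhere.
    workers = local_pool_size(504)
    print("workers on this node: %d" % workers)

    pool = multiprocessing.Pool(workers)
    try:
        # math.sqrt is a stand-in for the real work function.
        results = pool.map(math.sqrt, range(16))
    finally:
        pool.close()
        pool.join()
    print(results[:4])
```

If the printed worker count matches the cores of a single node, the pool is not reaching the rest of the cluster, and a library that works across machines is needed.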


1 Comment

Thanks for the link, I think you are right. I don't know how I could have missed that question! Now to play with mpi4py, then.
1

You might try scaling the work with one of Python's many parallel libraries; I've not heard of scaling work over so many processors with just multiprocessing.
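As one possible illustration, here is a rough sketch using mpi4py, which is only one of several options and is assumed to be installed on the cluster. Each MPI process takes a round-robin slice of the inputs. `shard` is a hypothetical helper, and the script falls back to a single process when mpi4py is not importable.

```python
def shard(items, rank, size):
    """Return the items that process `rank` out of `size` should handle.

    Hypothetical helper: plain round-robin slicing, no load balancing.
    """
    return items[rank::size]


try:
    from mpi4py import MPI  # third-party; assumed available on the cluster
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Fallback so the sketch still runs as a single process.
    rank, size = 0, 1


if __name__ == "__main__":
    inputs = list(range(20))  # placeholder for the real inputs
    mine = shard(inputs, rank, size)
    print("rank %d of %d handles %d items" % (rank, size, len(mine)))
```

A script like this would typically be started through the MPI launcher (for example `mpiexec -n 504 python script.py`), though the exact command depends on the cluster's scheduler. A static split like this does not rebalance work, so with tasks that range from seconds to days a master/worker scheme may fit better.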
