
Usually I use the following code, and it works fine when it does not matter in which order process_func handles a given parameter:

from multiprocessing import Pool

params = [1, 2, 3, 4, 5, ...]

def process_func(param):
    ...

pool = Pool(40)
pool.map(process_func, params)
pool.close()
pool.join()

In the example above we have processes of a single type, with a maximum of 40 running simultaneously. But imagine we have processes (parameters) of different types which should be executed simultaneously. For example, in my Selenium grid I have 40 Firefox and 40 Chrome browsers, and I have 5000 test cases: some of them prefer Chrome, some prefer Firefox, and for some it does not matter.

For example, let's say we have the following types:

  • type Firefox: maximum simultaneous number: 40
  • type Chrome: maximum simultaneous number: 40

In this case our pool will have a maximum of 80 simultaneous processes, but there is a strict rule: 40 of them must be Firefox and 40 must be Chrome.

It means that params won't be taken one after another. The pool must select values from the params list in a way that keeps the maximum number of each process type running.

How it is possible to achieve that?

    Is there a reason not to simply use two pools and two lists of inputs? Commented Oct 23, 2014 at 18:22

1 Answer


I would modify your process_func() to take one more parameter that tells it which "type" to be, and use two separate pools. Adding functools.partial will allow us to still use the pool's map interface:

from functools import partial
from multiprocessing import Pool

params = [1, 2, 3, 4, 5, ...]

def process_func(browser, param):
    # named "browser" rather than "type", which shadows a builtin
    if browser == 'Firefox':
        ...  # do Firefox stuff
    else:
        ...  # do Chrome stuff

chrome_pool = Pool(40)
fox_pool = Pool(40)

chrome_func = partial(process_func, 'Chrome')
fox_func = partial(process_func, 'Firefox')

# map_async() lets both pools work at the same time;
# a plain map() would block until the first pool had finished
chrome_result = chrome_pool.map_async(chrome_func, params)
fox_result = fox_pool.map_async(fox_func, params)

chrome_pool.close()
fox_pool.close()
chrome_pool.join()
fox_pool.join()

The functools.partial() function binds an argument to a specific value by returning a new callable that always supplies that argument. This approach allows you to limit each "type" (for lack of a better term) to 40 worker processes.
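To illustrate the binding itself, here is a minimal sketch; the toy string body of process_func is just a placeholder, not the actual test logic:

```python
from functools import partial

def process_func(browser, param):
    # toy body just to show the binding; real code would run a test
    return f"{browser} runs test {param}"

chrome_func = partial(process_func, 'Chrome')  # 'Chrome' is now always the first argument
print(chrome_func(7))  # → Chrome runs test 7
```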


8 Comments

Hm, wow, and those two pools will work simultaneously in one master process?
Hm... and how can I control tests that do not have a preferred type (browser)? In fact, they should use the less loaded browser type.
I'm not sure what you mean by "one master process", but this is almost the same as calling Pool(80). You're just enforcing a limit on the number of workers available to each function that is using a specific browser.
When you use map, there isn't much variance in the "loading" of a pool. Your indicated number of processes are spun off and given a queue. All of the jobs are submitted to that queue all at once. As each process finishes one job, it grabs the next from the queue. There isn't really any down time until the number of items left in the queue is less than the number of workers. To have the control you're looking for, you would have to use something like apply_async() and implement your own manual control over when jobs are submitted, and find a way to adjust it dynamically.
Also, even though you can launch 20 pools at a time, it doesn't mean you should. The "need" for that many pools suggest (to me) that the overall structure of your script may need some work. Once you have your code working, consider heading over to Code Review to get some other people's input on it.
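A rough sketch of that manual approach with apply_async(), assuming the "pick the less loaded pool" rule from the comments; run_test, dispatch, and the submitted-job counter are illustrative assumptions, not code from the answer:

```python
from multiprocessing import Pool

def run_test(browser, param):
    # placeholder for real test logic
    return (browser, param)

def dispatch(params, size=2):
    pools = {'Firefox': Pool(size), 'Chrome': Pool(size)}
    submitted = {'Firefox': 0, 'Chrome': 0}  # crude load measure: jobs submitted so far
    async_results = []
    for param in params:
        # send the "don't care" job to the browser with fewer jobs so far
        browser = min(submitted, key=submitted.get)
        submitted[browser] += 1
        async_results.append(pools[browser].apply_async(run_test, (browser, param)))
    for pool in pools.values():
        pool.close()
    results = [r.get() for r in async_results]
    for pool in pools.values():
        pool.join()
    return results
```

A real version would decrement the counter as jobs finish (e.g. via apply_async callbacks) instead of counting submissions, and would still route the browser-specific tests to their fixed pools.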
