
I have a lot of tasks (independent of each other, represented by some code in Python) that need to be executed. Their execution time varies. I also have limited resources so at most N tasks can be running at the same time. The goal is to finish executing the whole stack of tasks as fast as possible.

It seems that I am looking for some kind of manager that starts new tasks when the resource gets available and collects finished tasks.

  • Are there any already-made solutions or should I code it myself?
  • Are there any caveats that I should keep in mind?

2 Answers


As far as I can tell, your main would just become:

import multiprocessing
from time import sleep  # the worker from your example

POOL_SIZE = 4

def main():
    tasks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    with multiprocessing.Pool(POOL_SIZE) as pool:
        pool.map(sleep, tasks)

i.e. you've just reimplemented a pool, but less efficiently (Pool reuses processes where possible) and less safely; Pool goes to a lot of effort to clean up properly when exceptions occur.
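Since Python 3.2, the same pattern is also available through concurrent.futures.ProcessPoolExecutor, whose future-based API makes it easy to collect results as tasks finish. A minimal sketch, assuming a sleep-based worker and a short illustrative task list (neither taken from the question):

```python
import time
from concurrent.futures import ProcessPoolExecutor, as_completed

POOL_SIZE = 4

def sleep(seconds: int) -> int:
    time.sleep(seconds)
    return seconds

def main():
    tasks = [3, 1, 2]  # illustrative durations in seconds
    with ProcessPoolExecutor(max_workers=POOL_SIZE) as executor:
        futures = [executor.submit(sleep, t) for t in tasks]
        # as_completed yields futures in completion order, so results
        # from short tasks are available before long ones finish.
        for future in as_completed(futures):
            print(f'task {future.result()} finished')

if __name__ == '__main__':
    main()
```

Executor.map works just as well when you only need the results at the end; as_completed is handy when you want to react to each task as it completes.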


2 Comments

Well, that's embarrassing. Parallel programming is a big mental step - your code literally couldn't be simpler, and I still have a hard time imagining what's actually happening behind those one or two lines. Still, it helped me a lot, thank you.
@Jeyekomon The joy of open-source code is that it's all there if you want to find out :)

Here is a simple code snippet that should fit the requirements:

import multiprocessing
import time

POOL_SIZE = 4
STEP = 1


def sleep(seconds: int):
    time.sleep(seconds)


def main():
    tasks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    pool = [None] * POOL_SIZE

    while tasks or any(item is not None for item in pool):
        for i in range(len(pool)):
            if pool[i] is not None and not pool[i].is_alive():
                # Finished task. Join the process and clear the resource.
                pool[i].join()
                pool[i] = None

            if pool[i] is None:
                # Free resource. Start new task if any are left.
                if tasks:
                    task = tasks.pop(0)
                    pool[i] = multiprocessing.Process(target=sleep, args=(task,))
                    pool[i].start()

        time.sleep(STEP)


if __name__ == '__main__':
    main()

The manager has a tasks list of arbitrary length; here, for simplicity, the tasks are represented by integers that are passed as arguments to a sleep function. It also has a pool list of fixed size POOL_SIZE, initially filled with None, representing the available resource slots.

The manager periodically visits all currently running processes and checks whether they have finished. It also starts new processes when a slot becomes available. The whole cycle repeats until there are no tasks and no running processes left. The STEP value is there to save computing power - you generally don't need to poll the running processes every millisecond.

As for the caveats, the programming guidelines in the multiprocessing documentation are worth keeping in mind.
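One of those guidelines is worth singling out: on platforms that use the spawn start method (Windows, and macOS since Python 3.8), worker functions must be defined at module level so they can be pickled, and the entry point must sit behind an if __name__ == '__main__': guard. A minimal sketch of the safe layout (the work function here is a hypothetical worker, not code from the answer above):

```python
import multiprocessing
import time

# Must be a module-level function: the "spawn" start method pickles
# the callable and re-imports this module in each child process.
def work(seconds: int) -> int:
    time.sleep(seconds)
    return seconds

def main():
    with multiprocessing.Pool(2) as pool:
        print(pool.map(work, [0, 0, 0]))

# Without this guard, each spawned child would re-execute the module's
# top-level code and try to create its own pool of children.
if __name__ == '__main__':
    main()
```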

2 Comments

why aren't you using the suggested Pool().map API?
@SamMason I just started learning the multiprocessing module, and all those Pipe, Queue, Event, Barrier, Semaphore, Manager, ... classes are still a bit too much for me to take in. I managed to implement the functionality from scratch, but I was still interested in whether (and how) this can be done using the multiprocessing classes. That's why I posted this question.
