
I am currently writing a C++ MPI program to parallelize a certain computation. On rank 0, I have a priority queue of jobs. What I want to do is as follows: first, the highest-priority item is popped from the priority_queue and sent to all MPI ranks, which participate in the computation. Once the computation finishes, the next item is popped from the queue and we proceed similarly.

How can I solve this efficiently?

1 Answer


The manager-worker model (use of the older master-slave term is discouraged in modern style guides) is not simple to implement in MPI. You say you send the highest-priority item to all MPI ranks. Do they all compute the same result, or is the work distributed over them? The first case is wasteful; the second is not really a manager-worker model, and you might as well do it without the manager.

If you have a true manager-worker model, where each item is processed by a single worker, you should use non-blocking communication: the manager sends the top items, one each, to all available workers. It then calls MPI_Waitany to wait for any worker to finish; that worker gets the next item, and the manager waits again.

You need to come up with some convention for telling the workers that the queue is empty, because they cannot see the actual queue. (Unless you use one-sided communication, but it doesn't sound like you're ready for that.)


3 Comments

Thanks. All ranks need to participate in each task. Actually, each task consists of a large set of model parameters, and I want to distribute the computations among the MPI ranks. So the queue consists of set A1 -> set A2 -> ..., and each Ai is itself a set of parameters that is distributed among the workers.
"distribute the computations" is too vague. MPI is about distributed data. Do you distribute each "large set of model parameters"? As in: each set is worked on by all processes together? Doing MPI calls to communicate?
Yes, I distribute sets of model parameters to the MPI ranks. Actually, the code works now; however, I am not sure it is an ideal solution. Before running the actual computation, I divide the dataset of model parameters into equal subsets and pre-compute an estimate of the difficulty associated with each subset. I then put these subsets into a priority queue. At each iteration, the highest-priority item is popped and the model parameters in this subset are distributed to the MPI ranks.
