3

I'm running some small MPI jobs across nodes in a computer lab at my university. There's no queuing system installed, so I have to generate MPI hostfiles myself each time I want to run a job, then run them like so:

mpirun --hostfile mpi_hostfile -n 32 ./mpi_program

I use Open MPI, so right now my hostfiles look something like this:

localhost slots=4
hydra13 slots=4
hydra14 slots=4
hydra2 slots=4
hydra22 slots=4
hydra24 slots=4
hydra26 slots=4
hydra1 slots=4

My question is this: each of the nodes has an Intel® Core™ i7-3770 processor, which is quad-core, but also hyper-threaded. What's considered best practice for Open MPI hostfiles where hyperthreading is concerned? Should I list four or eight slots for each node?

Thanks.

2 Answers

1

It depends on your usage. You'll probably want to experiment with several configurations. If you are using MPI+OpenMP (I'm assuming you may have meant OpenMP, the threading library, rather than Open MPI, the MPI library, even though your question is tagged openmpi), the usual practice is to run one MPI process per node and one OpenMP thread per core. I'm not sure how hyperthreading weighs in here, but that's the usual practice.

If, on the other hand, you do mean Open MPI, the MPI library, and you are using only MPI processes, then the usual practice is one MPI process per core.
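As a rough illustration of the one-process-per-core convention applied to the hostfile in the question, here is a minimal shell sketch. The function name make_hostfile is hypothetical, and the value 4 assumes the four physical cores of the i7-3770 mentioned in the question.

```shell
# Hypothetical helper: print an Open MPI hostfile with the same slot count
# for every node. Usage: make_hostfile SLOTS HOST [HOST...]
make_hostfile() {
    slots="$1"
    shift
    for host in "$@"; do
        printf '%s slots=%s\n' "$host" "$slots"
    done
}

# One slot per physical core (4 on a quad-core i7-3770).
# Redirect the output to a file such as mpi_hostfile to use it with mpirun.
make_hostfile 4 localhost hydra13 hydra14 hydra2 hydra22 hydra24 hydra26 hydra1
```

Passing 8 instead of 4 would give one slot per hardware thread; whether that is faster is something to measure for the specific application.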

In the end, you'll need to test out your application with a range of setups and see which is fastest for your machines and your application. There is no silver bullet.
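One way to organise such a comparison is sketched below. It only prints the commands rather than running them, so it can be checked before use. The function name sweep_commands and the hostfile names mpi_hostfile.slots4 and mpi_hostfile.slots8 are assumptions for illustration; the node count of 8 matches the hostfile in the question.

```shell
# Hypothetical helper: print one timed mpirun command per candidate slot
# count. Usage: sweep_commands NODES SLOTS [SLOTS...]
# Each command assumes a hostfile named mpi_hostfile.slots<SLOTS> exists.
sweep_commands() {
    nodes="$1"
    shift
    for slots in "$@"; do
        # Total process count = nodes * slots per node.
        echo "time mpirun --hostfile mpi_hostfile.slots$slots -n $((nodes * slots)) ./mpi_program"
    done
}

# Compare 4 slots per node (physical cores) against 8 (hardware threads).
sweep_commands 8 4 8
```

Piping the output to sh would run the commands in sequence; repeating each run a few times helps separate real differences from noise.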

1 Comment

I believe the OP meant Open MPI when (s)he tagged the question with the openmpi tag.
0

You can pass the --use-hwthread-cpus command-line option to mpirun.

With this option, Open MPI treats each hardware thread provided by hyperthreading as a processor. Without it, Open MPI treats each CPU core as a processor, which is the default behavior.

For example, on a Xeon Phi (Knights Landing microarchitecture), each core has four hardware threads instead of two. Therefore, if you run Open MPI on a Xeon Phi with --use-hwthread-cpus, it will count four processors for each core.

When using this option, Open MPI refers to the threads provided by Hyper-Threading as "hardware threads". With this approach you will not oversubscribe, and if some Open MPI processes run on a virtual machine, Open MPI will use the correct number of threads assigned to that virtual machine.
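A sketch of how this might be combined with the hostfile from the question follows. It rests on an assumption drawn from my reading of the Open MPI 4.x mpirun documentation: an explicit slots=N entry in the hostfile sets the slot count directly, so the option only affects nodes whose slot count Open MPI detects itself. The helper name strip_slots and the file name mpi_hostfile.noslots are hypothetical, and newer Open MPI releases may spell the option differently.

```shell
# Hypothetical helper: remove explicit "slots=N" entries from a hostfile
# read on stdin, leaving only the host names.
strip_slots() {
    sed 's/[[:space:]]*slots=[0-9]*//'
}

# Example with two entries from the hostfile in the question.
printf 'localhost slots=4\nhydra13 slots=4\n' | strip_slots

# With the slot counts removed, the run could then look like this
# (assumption: 8 nodes with 8 hardware threads each, so 64 processes):
#   strip_slots < mpi_hostfile > mpi_hostfile.noslots
#   mpirun --use-hwthread-cpus --hostfile mpi_hostfile.noslots -n 64 ./mpi_program
```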
