error on running mpi job

Question

I'm trying to run an MPI job on a cluster with Torque and Open MPI 1.3.2 installed, and I always get the following error:

"mpirun was unable to launch the specified application as it could not find an executable: Executable: -p Node: compute-101-10.local while attempting to start process rank 0."

I'm using the following script to do the qsub:

#PBS -N mphello
#PBS -l walltime=0:00:30
#PBS -l nodes=compute-101-10+compute-101-15
cd $PBS_O_WORKDIR
mpirun -npersocket 1 -H compute-101-10,compute-101-15 /home/username/mpi_teste/mphello

Any idea why this happens? What I want is to run 1 process in each node (compute-101-10 and compute-101-15). What am I getting wrong here? I've already tried several combinations of the mpirun command, but either the program runs on only one node or it gives me the above error...

Thanks in advance!

Can you check that Open MPI 1.3.2 is actually what is configured on the nodes? The -npersocket option did not exist in Open MPI 1.2, and this is exactly what mpirun in Open MPI 1.2 would say if called with this option. Use mpirun --version to check.
– Dima Chubarov, Commented May 13, 2012 at 3:35
I'd post that as an answer then, so that you could close the question.
– Dima Chubarov, Commented May 14, 2012 at 14:04

Dima Chubarov · Accepted Answer · 2012-05-14 14:04:38Z

1

The -npersocket option did not exist in Open MPI 1.2.

The diagnostic that Open MPI reported,

mpirun was unable to launch the specified application as it could not find an executable: Executable: -p

is exactly what mpirun in Open MPI 1.2 would say if called with this option.

Running mpirun --version on the compute nodes will show which version of Open MPI is the default there.
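The version check can also be done from inside the job script, so that it runs on a compute node rather than on the login node. The following is only a sketch: the function names are made up for illustration, and it assumes the banner has the form "mpirun (Open MPI) x.y.z" and may be printed on stderr.

```shell
# Hypothetical helper: extract "major.minor" from a banner such as
# "mpirun (Open MPI) 1.2.8" (the banner format is an assumption).
ompi_major_minor() {
    printf '%s\n' "$1" |
        sed -n 's/.*(Open MPI) \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# Hypothetical helper: succeeds (exit 0) when the version is 1.2 or older,
# i.e. a release that, per the answer above, does not know -npersocket.
# Returns 2 when the banner cannot be parsed.
lacks_npersocket() {
    mm=$(ompi_major_minor "$1")
    [ -n "$mm" ] || return 2
    major=${mm%%.*}
    minor=${mm#*.}
    [ "$major" -lt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -le 2 ]; }
}

# Inside the job script: inspect whichever mpirun is first in PATH on the node.
if command -v mpirun >/dev/null 2>&1; then
    banner=$(mpirun --version 2>&1)
    if lacks_npersocket "$banner"; then
        echo "Open MPI 1.2 or older detected: -npersocket is unavailable" >&2
    fi
fi
```

If the warning appears in the job's error file, the mpirun picked up on the node is not the 1.3.2 installation, and either the PATH or the mpirun options need to change.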


2 Comments

Jeff Squyres Over a year ago

BTW, Open MPI 1.3.x is ancient. We just released Open MPI 1.6 yesterday. You should upgrade.

dx_mrt Over a year ago

I know it's ancient, but it's not up to me, it's up to the cluster admin :/

dx_mrt · Accepted Answer · 2012-05-15 17:03:11Z

0

The problem is that the -npersocket flag is supported by Open MPI 1.3.2 but not by Open MPI 1.2, and the cluster where I'm running my code only has Open MPI 1.2.

A possible workaround is to use the -loadbalance flag and list the nodes where I want the code to run with the -H flag (-H node1,node2,node3,...), like this:

mpirun -loadbalance -H node1,node2,...,nodep -np number_of_processes program_name

That way each node will run number_of_processes/p processes, where p is the number of nodes on which the processes run.
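For the case in the question (one process on each of two nodes), the arithmetic gives -np 2. Here is a small sketch that derives -np from the host list and prints the resulting command. The helper names are hypothetical, and the command is only printed, not executed.

```shell
# Hypothetical helper: count the entries in a comma-separated host list.
count_hosts() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Hypothetical helper: print an mpirun command line that asks for
# per_node processes on each listed host, using -loadbalance as above.
build_mpirun_cmd() {
    hosts=$1
    per_node=$2
    program=$3
    p=$(count_hosts "$hosts")      # p = number of nodes
    np=$((p * per_node))           # total processes = p * processes per node
    echo "mpirun -loadbalance -H $hosts -np $np $program"
}

build_mpirun_cmd compute-101-10,compute-101-15 1 /home/username/mpi_teste/mphello
```

This prints mpirun -loadbalance -H compute-101-10,compute-101-15 -np 2 /home/username/mpi_teste/mphello, which can then be pasted into the job script in place of the -npersocket line.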
