I am newbie in Linux and recently started working with our university super-computer and I need to install my program ( GAMESS Quantum Chemistry Software ) on my own allocated space. I have installed and ran it successfully under 'sockets' but actually I need to run it under 'mpi' ( otherwise there will be little advantage of using a super-computer ).
System Setting:
- OS: Linux64 , Redhat, intel
- MPI: impi
- compiler: ifort
- modules: slurm , intel/intel-15.0.1 , intel/impi-15.0.1
This software runs ' rungms ' and receives arguments as:
rungms [fileName][Version][CPU count ] ( for example: ./rungms Opt 00 4 )
Here is my bash file ( my feeling is this is the main culprit for my problem !):
#!/bin/bash
#Based off of Monte's Original Script for Torque:
#https://gist.github.com/mlunacek/6306340#file-matlab_example-pbs
#These are SBATCH directives specifying name of file, queue, the
#Quality of Service, wall time, Node Count, #of CPUS, and the
#destination output file (which appends node hostname and JobID)
#SBATCH -J OptMPI
#SBATCH --qos janus-debug
#SBATCH -t 00-00:10:00
#SBATCH -N2
#SBATCH --ntasks-per-node=1
#SBATCH -o output-OptMPI-%N-JobID-%j
#NOTE: This Module Will Be Replaced With Slurm Specific:
module load intel/impi-15.0.1
mpirun /projects/augenda/gamess/rungms Opt 00 2 > OptMPI.out
As I said before, the program is compiled for mpi ( and not 'sockets' ) .
My problem is when I run run sbatch Opt.sh , I receive this error:
- srun: error: PMK_KVS_Barrier duplicate request from task 1
- when I change -N number , sometimes I receive error saying (4 !=2 ).
- with odd number of -N I receive error saying it expects even number of processes.
What am I missing ?
Here is the code from our super-computer website as a bash file example
4 != 2are you sure you specify the right number of cpus also in therungmsparameter?