I am new to Linux and recently started working on our university supercomputer. I need to install my program (GAMESS, a quantum chemistry package) in my own allocated space. I have installed and run it successfully under 'sockets', but I actually need to run it under 'mpi' (otherwise there is little advantage to using a supercomputer).

System settings:

  • OS: 64-bit Linux (Red Hat), Intel
  • MPI: impi
  • Compiler: ifort
  • Modules: slurm, intel/intel-15.0.1, intel/impi-15.0.1

The software is started through the 'rungms' script, which takes its arguments as:

rungms [fileName] [Version] [CPU count] (for example: ./rungms Opt 00 4)

Here is my bash file (I suspect it is the main culprit):

#!/bin/bash

#Based off of Monte's Original Script for Torque:
#https://gist.github.com/mlunacek/6306340#file-matlab_example-pbs

#These are SBATCH directives specifying name of file, queue, the
#Quality of Service, wall time, Node Count, #of CPUS, and the
#destination output file (which appends node hostname and JobID)

#SBATCH -J OptMPI
#SBATCH --qos janus-debug
#SBATCH -t 00-00:10:00
#SBATCH -N2
#SBATCH --ntasks-per-node=1
#SBATCH -o output-OptMPI-%N-JobID-%j

#NOTE: This Module Will Be Replaced With Slurm Specific:
module load intel/impi-15.0.1

mpirun /projects/augenda/gamess/rungms Opt 00 2 > OptMPI.out

As I said before, the program is compiled for mpi (not 'sockets').

My problem is that when I run sbatch Opt.sh, I receive these errors:

  • srun: error: PMK_KVS_Barrier duplicate request from task 1
  • When I change the -N number, I sometimes receive an error saying (4 != 2).
  • With an odd number for -N, I receive an error saying it expects an even number of processes.

What am I missing?

Here is the code from our super-computer website as a bash file example

  • I would really recommend asking the supercomputer support (rc.colorado.edu/support) for advice. Too much information is missing for us, and my experience with SC support at other institutions has always been quite good. Commented Feb 28, 2015 at 10:30
  • @VladimirF Actually, that's what I did multiple times, but their answer is always: we don't offer help with software installation! I know that in some other states you can just walk in and ask them to install your software, but it never happens here! I don't know why they expect a chemist to be a professional Linux coder. That's why I desperately need help from here. Commented Feb 28, 2015 at 16:11
  • How do you run it? When you get 4 != 2, are you sure you also specify the right number of CPUs in the rungms parameter? Commented Feb 28, 2015 at 18:49
  • @VladimirF After loading the 3 modules (slurm, intel, impi) and requesting an allocation (salloc --qos janus-debug), I navigate to the folder containing Opt.sh and run sbatch Opt.sh. Commented Feb 28, 2015 at 20:08

1 Answer

The Slurm Workload Manager has a few ways of invoking an Intel MPI process. Likely, all you have to do is use srun rather than mpirun in your case. If errors are still present, refer here for alternative ways to invoke Intel MPI jobs; it's rather dependent on how the HPC admins configured the system.
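
As a minimal sketch of that suggestion, the batch script from the question could look like this with srun in place of mpirun. The SBATCH directives and the rungms path are copied from the question. The commented-out I_MPI_PMI_LIBRARY line uses a hypothetical path and is only relevant on sites where Intel MPI has to be pointed at Slurm's PMI library. The dry-run fallback is my own addition so the script can be checked on a machine without Slurm. Whether srun alone is enough depends on the site configuration, so treat this as a starting point rather than a guaranteed fix.

```shell
#!/bin/bash
#SBATCH -J OptMPI
#SBATCH --qos janus-debug
#SBATCH -t 00-00:10:00
#SBATCH -N 2
#SBATCH --ntasks-per-node=1
#SBATCH -o output-OptMPI-%N-JobID-%j

# Load Intel MPI as in the question; 'module' normally exists only on the cluster,
# so skip it quietly when the script is tried elsewhere.
if command -v module >/dev/null 2>&1; then
    module load intel/impi-15.0.1
fi

# Assumption: some sites need Intel MPI pointed at Slurm's PMI library before
# srun can start the ranks. The path below is hypothetical; ask the admins.
# export I_MPI_PMI_LIBRARY=/path/to/slurm/lib/libpmi.so

# Same command line as in the question, with srun replacing mpirun.
LAUNCH="srun /projects/augenda/gamess/rungms Opt 00 2"

if command -v srun >/dev/null 2>&1; then
    # On the cluster: let Slurm start the job and capture its output.
    $LAUNCH > OptMPI.out
else
    # Off the cluster: only show what would be executed.
    echo "dry run: $LAUNCH"
fi
```

Submit it as before with sbatch Opt.sh. The CPU count passed to rungms (2 here) is kept equal to the number of tasks requested through -N and --ntasks-per-node, which is the mismatch the comments under the question ask about.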
