I'm completely new to using HPCs and SLURM, so I'd really appreciate some guidance here.

I need to iteratively run a command that looks like this

kallisto quant -i '/home/myName/genomes/hSapien.idx' \
               -o "output-SRR3225412"                 \
                         "SRR3225412_1.fastq.gz"       \
                         "SRR3225412_2.fastq.gz"

where the SRR3225412 part will be different in each iteration.

The problem is, as I found out, I can't just append this to the end of an sbatch command

sbatch --nodes=1          \
       --ntasks-per-node=1 \
       --cpus-per-task=1    \
         kallisto quant -i '/home/myName/genomes/hSapien.idx' \
                        -o "output-SRR3225412"                 \
                                  "SRR3225412_1.fastq.gz"       \
                                  "SRR3225412_2.fastq.gz"

This command doesn't work. I get the error

sbatch: error: This does not look like a batch script.  The first
sbatch: error: line must start with #! followed by the path to an interpreter.
sbatch: error: For instance: #!/bin/sh

I wanted to ask, how do I run the sbatch command, specifying its run parameters, and also adding the command-line arguments for the kallisto program I'm trying to use? In the end I'd like to have something like

#!/bin/bash

for sample in ...
do
    sbatch --nodes=1          \
           --ntasks-per-node=1 \
           --cpus-per-task=1    \
             kallistoCommandOnSample --arg1 a1 \
                                     --arg2 a2 arg3 a3
done

1 Answer

The error sbatch: error: This does not look like a batch script. appears because sbatch expects a submission script. That is a batch script, typically a Bash script, in which comment lines starting with #SBATCH are interpreted by Slurm as options.

So the typical way of submitting a job is to create a file, let's name it submit.sh:

#! /bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1

kallisto quant -i '/home/myName/genomes/hSapien.idx' \
               -o "output-SRR3225412"                 \
                         "SRR3225412_1.fastq.gz"       \
                         "SRR3225412_2.fastq.gz"

and then submit it with

sbatch submit.sh

If you have multiple similar jobs to submit, use a job array: all tasks are submitted with a single sbatch call and can be monitored or cancelled under one job ID. The loop you want to create can be replaced with a single submission script looking like

#! /bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --array=0-9 # One task per sample: use 0 to (number of samples - 1), since Bash arrays are zero-based

SAMPLES=(...) # here put what you would loop over
CURRSAMPLE=${SAMPLES[$SLURM_ARRAY_TASK_ID]}
kallisto quant -i '/home/myName/genomes/hSapien.idx' \
               -o "output-${CURRSAMPLE}"              \
                         "${CURRSAMPLE}_1.fastq.gz"    \
                         "${CURRSAMPLE}_2.fastq.gz"
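
The mapping from array task ID to sample can be checked locally before submitting anything. The following is a minimal sketch, not a definitive implementation: the accessions SRR3225413 and SRR3225414 and the pick_sample helper are hypothetical and only there for illustration. Bash arrays are zero-based, so task ID 0 selects the first sample. Outside of Slurm, SLURM_ARRAY_TASK_ID is unset, so the sketch falls back to 0.

```shell
#!/bin/bash
# Hypothetical sample list; only SRR3225412 comes from the question.
SAMPLES=(SRR3225412 SRR3225413 SRR3225414)

# pick_sample is an illustrative helper (not part of Slurm):
# print the sample that belongs to a given array task ID.
pick_sample() {
    local task_id=$1
    echo "${SAMPLES[$task_id]}"
}

# Slurm sets SLURM_ARRAY_TASK_ID inside each array task;
# default to 0 so the script can also be run locally.
CURRSAMPLE=$(pick_sample "${SLURM_ARRAY_TASK_ID:-0}")
echo "output-${CURRSAMPLE}"

# Matching --array range when indices start at 0.
echo "--array=0-$(( ${#SAMPLES[@]} - 1 ))"
```

Running it with SLURM_ARRAY_TASK_ID set by hand to different values shows which sample each array task would process, and the last line prints the --array range that covers the whole list.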

As pointed out by @Carles Fenoy, if you do not want to use a submission script, you can use the --wrap parameter of sbatch:

sbatch --nodes=1          \
       --ntasks-per-node=1 \
       --cpus-per-task=1    \
       --wrap "kallisto quant -i '/home/myName/genomes/hSapien.idx' \
                              -o 'output-SRR3225412'                 \
                                        'SRR3225412_1.fastq.gz'       \
                                        'SRR3225412_2.fastq.gz'"
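
The loop from the question can be combined with --wrap. This is a sketch under stated assumptions: the second accession (SRR3225413), the kallisto_cmd helper, and the SBATCH_CMD variable are hypothetical names introduced here. SBATCH_CMD defaults to a dry run that only prints each submission, so the loop can be inspected before anything reaches the scheduler.

```shell
#!/bin/bash
# SBATCH_CMD is a hypothetical knob for this sketch: by default it only
# prints the submission (dry run); set SBATCH_CMD=sbatch to really submit.
SBATCH_CMD=${SBATCH_CMD:-echo sbatch}

# Illustrative helper: build the kallisto command line for one sample.
# The index path is the one used in the question.
kallisto_cmd() {
    local sample=$1
    echo "kallisto quant -i '/home/myName/genomes/hSapien.idx' -o 'output-${sample}' '${sample}_1.fastq.gz' '${sample}_2.fastq.gz'"
}

# Hypothetical sample list; only SRR3225412 comes from the question.
for sample in SRR3225412 SRR3225413; do
    # $SBATCH_CMD is left unquoted on purpose so "echo sbatch" splits into two words.
    $SBATCH_CMD --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 \
                --wrap "$(kallisto_cmd "$sample")"
done
```

Each iteration submits one independent job. For more than a handful of samples, the job array shown earlier is usually easier to manage than many separate submissions.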