I have written a for loop and parallelized it with &, limiting it to 3 jobs running at a time; the script is below. I reserve 32 cores and 256 GB of memory through the #BSUB directives, and the sample_pipe command I run inside the loop itself needs 32 cores and 256 GB.
I am getting memory failure errors on some jobs. I suspect this is because I reserve only 32 cores and 256 GB in total, yet try to run 3 jobs at a time inside that single reservation.
My question is: how do I parallelize this so that each of the 3 jobs gets the cores and memory it needs?
I submitted using the command bsub < example.sh
#!/bin/bash
#BSUB -J cnt_job # LSF job name
#BSUB -o cnt_job.%J.out # Name of the job output file
#BSUB -e cnt_job.%J.error # Name of the job error file
#BSUB -n 32 # 32 cores
#BSUB -M 262144 # 256 GB
#BSUB -R "span[hosts=1] rusage[mem=262144]"
n=0
maxjobs=3
for sample in $main ; do
    for run in $nested ; do
        sample_pipe count --id="$run" \
            --localcores=32 \
            --localmem=256 &
    done
    cd ..
    # limit jobs
    if (( ++n % maxjobs == 0 )) ; then
        wait    # wait until all have finished
        echo "$n wait"
    fi
done
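
One way to make the reservation and the workload consistent is to do the arithmetic explicitly. The numbers below are a sketch, assuming your cluster counts -M in MB (which matches your 262144 = 256 GB): either reserve three times the resources, or shrink each job to a third of the current reservation.

```shell
#!/bin/bash
# Sketch only -- verify units and limits with your cluster admins.
#
# Option A: reserve enough for all three concurrent jobs.
#BSUB -n 96                                   # 3 x 32 cores
#BSUB -M 786432                               # 3 x 256 GB, in MB
#BSUB -R "span[hosts=1] rusage[mem=786432]"
maxjobs=3
cores_per_job=$(( 96 / maxjobs ))             # 32 -> --localcores=32
mem_per_job=$(( 786432 / maxjobs / 1024 ))    # 256 -> --localmem=256 (GB)

# Option B: keep the 32-core / 256 GB reservation and shrink each job instead.
small_cores=$(( 32 / maxjobs ))               # 10 -> --localcores=10
small_mem=$(( 256 / maxjobs ))                # 85 -> --localmem=85 (GB)

echo "A: ${cores_per_job} cores, ${mem_per_job} GB per job"
echo "B: ${small_cores} cores, ${small_mem} GB per job"
```

With Option B each pipeline run gets far less memory, so it only works if sample_pipe can actually fit in 85 GB; otherwise Option A (or one submission per job, below in the answer's sense) is the safer route.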
for sample in $main would just run one iteration of the loop, so the script looks a bit confusing. You appear to be running under a batch/queueing system (that is what the #BSUB lines look like). Have you considered giving it one script per process, instead of looping over $main yourself, and letting the scheduler take care of the rest? That's basically the whole idea of such systems. Most have their own tools for looping over inputs, but you would have to talk to whoever is responsible for this cluster.
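
That one-submission-per-process idea could look like the sketch below: each sample_pipe run becomes its own LSF job with a full 32-core / 256 GB reservation, and the scheduler queues them. main and nested get placeholder values here, and the bsub call is echoed rather than executed so the loop can be inspected as a dry run.

```shell
#!/bin/bash
# Sketch: submit one LSF job per pipeline run instead of backgrounding them
# inside a single reservation. main/nested are stand-ins for the real lists.
main=${main:-sampleA}
nested=${nested:-run1 run2 run3}

for sample in $main; do
    for run in $nested; do
        # Remove the leading "echo" to actually submit.
        echo bsub -J "cnt_${run}" \
            -o "cnt_${run}.%J.out" -e "cnt_${run}.%J.error" \
            -n 32 -M 262144 \
            -R "span[hosts=1] rusage[mem=262144]" \
            sample_pipe count --id="$run" --localcores=32 --localmem=256
    done
done
```

LSF's own looping tool for this pattern is the job array (#BSUB -J "cnt_job[1-3]", with %I available to pick the input per element); whether arrays are enabled, and what per-job limits apply, is something the cluster admins can confirm.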