16

I have access to 10 cores on a shared machine, and I would like to actually use them. What I am used to doing on my own machine would be something like this:

for f in *.fa; do
  myProgram (options) "./$f" "./$f.tmp"
done

I have 10 files I'd like to do this on -- let's call them blah00.fa, blah01.fa, ... blah09.fa.

The problem with this approach is that myProgram only uses 1 core at a time, so on the multi-core machine I'd be using 1 core at a time, 10 times in a row, and I wouldn't be using the machine to its max capability.

How could I change my script so that it runs all 10 of my .fa files at the same time? I looked at Run a looped process in bash across multiple cores but I couldn't get the command from that to do what I wanted exactly.

8
  • 2
    You tried gnu parallel? What didn't work for you? Commented Mar 11, 2013 at 16:27
  • 1
    Have you tried using gnu parallel suggestion in that answer? Commented Mar 11, 2013 at 16:27
  • 4
    seq -f '%02g' 0 9 | parallel myProgram -opt1 -opt2 ./blah{}.fa ./blah{}.tmp Commented Mar 11, 2013 at 16:31
  • Yes, I tried using parallel. Problem: on the machine with many cores it's not installed and I have no sudo access so can't use parallel. :( Commented Mar 11, 2013 at 16:31
  • 4
    You don't need to be root to install it; download the source and run ./configure --prefix=${HOME}; make; make install to install it in your home directory. Commented Mar 11, 2013 at 16:33

3 Answers

13

You could use

for f in *.fa; do
    myProgram (options) "./$f" "./$f.tmp" &
done
wait

which would start all of your jobs in parallel, then wait until they all complete before moving on. If you have more jobs than cores, this still starts all of them and lets your OS scheduler worry about swapping processes in and out.
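A bare wait discards the exit statuses of the background jobs. If you need to know which inputs failed, one option is to record each PID from $! and wait on them one by one. This is a minimal sketch, not part of the original answer: fake_program is a hypothetical stand-in for myProgram, and the file names are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for myProgram: fails for any input whose name contains "bad".
fake_program() {
    sleep 0.1
    [[ $1 != *bad* ]]
}

pids=()
names=()
for f in blah00.fa blah01.fa bad02.fa; do
    fake_program "$f" &
    pids+=("$!")      # PID of the job just started
    names+=("$f")
done

failed=()
for i in "${!pids[@]}"; do
    # "wait PID" returns the exit status of that particular job
    if ! wait "${pids[i]}"; then
        failed+=("${names[i]}")
    fi
done

echo "failed: ${failed[*]}"
```

With the stand-in above, only bad02.fa ends up in the failed list.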

One modification is to start 10 jobs at a time

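count=0
for f in *.fa; do
    myProgram (options) "./$f" "./$f.tmp" &
    (( count++ ))
    # After every 10th job, wait for the whole batch to finish
    if (( count == 10 )); then
        wait
        count=0
    fi
done
wait  # wait for the final, possibly partial, batch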

but this is inferior to using parallel because you can't start new jobs as old ones finish, and you can't detect whether an older job has already finished before you manage to start 10 jobs. wait lets you wait on one particular process or on all background processes, but it doesn't tell you when any one of an arbitrary set of background processes completes.
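If parallel is unavailable, xargs may cover the "start a new job as soon as an old one finishes" case, provided your xargs supports the -P (maximum processes) and -0 options. Both are extensions found in GNU and BSD xargs, not POSIX, so check your system first. The sketch below is an alternative that is not part of the original answer; the echo inside sh -c is a hypothetical stand-in for the real myProgram call.

```bash
#!/usr/bin/env bash
# run_all FILE...  processes each file with at most 10 concurrent workers.
run_all() {
    # NUL-separate the names so spaces in file names are handled safely.
    printf '%s\0' "$@" |
        xargs -0 -n 1 -P 10 sh -c '
            # Stand-in for: myProgram (options) "./$1" "./$1.tmp"
            sleep 0.1
            echo "$1 -> $1.tmp"
        ' _
}

# Output order depends on which job finishes first, so sort it for display.
out=$(run_all blah00.fa blah01.fa blah02.fa | sort)
echo "$out"
```

The trailing _ fills $0 of the inline sh script, so each file name arrives as $1.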


1 Comment

Next version of Bash will have an option for that (wait -n). Currently you can do something like this but it's a bit racy due to some bugs that will also be fixed in the next version.
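For reference, wait -n shipped in Bash 4.3. A job pool built on it might look like the sketch below, assuming Bash 4.3 or newer. process_file is a hypothetical stand-in for the myProgram call, and run_pool is a helper name invented for this example.

```bash
#!/usr/bin/env bash
# Sketch of a job pool built on "wait -n" (needs bash 4.3 or newer).

# Hypothetical stand-in for: myProgram (options) "./$1" "./$1.tmp"
process_file() {
    sleep 0.1
    echo "done: $1"
}

# run_pool MAX FILE...  keeps at most MAX jobs running at once.
run_pool() {
    local max_jobs=$1 running=0 f
    shift
    for f in "$@"; do
        if (( running >= max_jobs )); then
            wait -n                      # block until any one job exits
            running=$(( running - 1 ))
        fi
        process_file "$f" &
        running=$(( running + 1 ))
    done
    wait                                 # let the remaining jobs finish
}

run_pool 2 blah00.fa blah01.fa blah02.fa blah03.fa blah04.fa
```

Unlike the batch-of-10 loop, a new job starts as soon as any running job exits, so a single slow file does not hold up the rest of its batch.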
8

With GNU Parallel you can do:

parallel myProgram (options) {} {.}.tmp ::: *.fa
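In that command, {} is replaced by the input file name and {.} by the file name with its extension removed, so blah00.fa is written to blah00.tmp rather than the blah00.fa.tmp used in the question; writing {}.tmp instead would keep the question's naming. By default GNU Parallel runs one job per CPU core, and -j 10 would cap it at 10 jobs. The pure-bash sketch below only illustrates the two expansions using parameter expansion; it does not call parallel, and the file names are taken from the question.

```bash
#!/usr/bin/env bash
# Pure-bash illustration of what GNU Parallel's replacement strings expand to.
for f in blah00.fa blah01.fa; do
    whole=$f            # what {} expands to
    no_ext=${f%.*}      # what {.} expands to (extension removed)
    echo "{}=$whole  {.}.tmp=$no_ext.tmp  {}.tmp=$whole.tmp"
done
```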

From: http://git.savannah.gnu.org/cgit/parallel.git/tree/README

= Full installation =

Full installation of GNU Parallel is as simple as:

./configure && make && make install

If you are not root you can add ~/bin to your path and install in ~/bin and ~/share:

./configure --prefix=$HOME && make && make install

Or if your system lacks 'make' you can simply copy src/parallel src/sem src/niceload src/sql to a dir in your path.

= Minimal installation =

If you just need parallel and do not have 'make' installed (maybe the system is old or Microsoft Windows):

wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
mv parallel sem dir-in-your-$PATH/bin/

Watch the intro videos to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1


0
# Block until fewer than $3 processes named $1 are running,
# then start one more instance in the background and return.
runParallel() {
    local cmd=$1 args=$2 number=$3 currNumber
    while true; do
        # Counts every process on the system whose command name is $cmd,
        # not only the ones started by this script
        currNumber=$(ps -e | grep -v "grep" | grep -c " $cmd\$")
        if [ "$currNumber" -lt "$number" ]; then
            break
        fi
        sleep 1
    done
    echo "run: $cmd $args"
    # $args is left unquoted on purpose so it splits into separate arguments
    $cmd $args &
}

loop=0
# Start 12 "sleep 10" commands, with at most five running at once
while [ "$loop" -ne 12 ]; do
    runParallel "sleep" 10 5
    loop=$((loop + 1))
done
wait  # let the last jobs finish before the script exits
