
I am trying to parallelize the rpw_gen_features task in the following bash script:

#!/bin/bash
maxjobs=8
jobcounter=0
MYDIR="/home/rasoul/workspace/world_db/journal/for-training"
DIR=$1
FILES=`find $MYDIR/${DIR}/${DIR}\_*.hpl -name *.hpl -type f -printf "%f\n" | sort -n -t _ -k 2` 
for f in $FILES; do
  fileToProcess=$MYDIR/${DIR}/$f
  # construct .pfl file name
  filebasename="${f%.*}"
  fileToCheck=$MYDIR/${DIR}/$filebasename.pfl
  # check if the .pfl file is already generated
  if [ ! -f $fileToCheck ];
  then        
    echo ../bin/rpw_gen_features -r $fileToProcess &
    jobcounter=jobcounter+1
  fi
  if [jobcounter -eq maxjobs]
    wait
    jobcounter=0
  fi
done

but it fails at runtime with this error:

line 20: syntax error near unexpected token `fi'

I'm not an expert in bash programming, so please feel free to comment on the whole code.


2 Answers


I am curious why you don't just use GNU Parallel:

MYDIR="/home/rasoul/workspace/world_db/journal/for-training"
DIR=$1
find "$MYDIR/$DIR/${DIR}"_*.hpl -name '*.hpl' -type f |
  parallel '[ ! -f {.}.pfl ] && echo ../bin/rpw_gen_features -r {}'

Or even:

MYDIR="/home/rasoul/workspace/world_db/journal/for-training"
parallel '[ ! -f {.}.pfl ] && echo ../bin/rpw_gen_features -r {}' ::: $MYDIR/$1/$1\_*.hpl

It is more readable, and because GNU Parallel runs one job per CPU core by default, it will automatically scale when you move from an 8-core to a 64-core machine.

Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.


You are missing a then, the spaces inside [ ], and the $ expansion (${}) around the variables:

  if [jobcounter -eq maxjobs]
    wait
    jobcounter=0
  fi

Should be

  if [ ${jobcounter} -eq ${maxjobs} ]; then
    wait
    jobcounter=0
  fi
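
To see the corrected test at work, here is a minimal sketch of the batching pattern. The job list, the small maxjobs value, and `sleep 0` (a stand-in for the real command) are placeholders chosen for illustration. The final `wait` is an addition that is not in the original script; it collects jobs left over from a partial last batch.

```shell
#!/bin/bash
# Sketch: start background jobs in batches of $maxjobs.
maxjobs=3
jobcounter=0
batches=0
for i in 1 2 3 4 5 6 7; do
  sleep 0 &                            # placeholder job, run in the background
  jobcounter=$((jobcounter + 1))       # arithmetic expansion, not string concatenation
  if [ "${jobcounter}" -eq "${maxjobs}" ]; then
    wait                               # block until every job in this batch has finished
    jobcounter=0
    batches=$((batches + 1))
  fi
done
wait                                   # pick up jobs from an incomplete final batch
echo "full batches: ${batches}, leftover jobs: ${jobcounter}"
```

With seven jobs and batches of three, the loop waits twice inside the `if` and leaves one job for the final `wait`.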

Further, double-check the rest of your script, as I can see other missing ${} expansions, for example:

jobcounter=jobcounter+1

Even if you use the variables correctly this still will not work:

jobcounter=${jobcounter}+1

This is plain string concatenation, so starting from jobcounter=0 it will yield:

0+1
0+1+1
0+1+1+1

And not the numbers you expect. You need to use:

jobcounter=`expr $jobcounter + 1`

With newer versions of Bash you should be able to do:

(( jobcounter++ ))
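
The difference between concatenation and arithmetic can be checked with a short sketch. The variable name `counter` is only illustrative. One caveat worth knowing: `(( expr ))` returns a non-zero exit status when the expression evaluates to 0, so `(( jobcounter++ ))` on a counter that is currently 0 reports failure, which matters if the script runs under `set -e`.

```shell
#!/bin/bash
# Plain assignment only concatenates strings; no arithmetic happens.
counter=0
counter=${counter}+1
echo "$counter"                  # prints 0+1

# Arithmetic expansion evaluates the expression as an integer.
jobcounter=0
jobcounter=$((jobcounter + 1))   # jobcounter is now 1
(( jobcounter++ ))               # jobcounter is now 2
echo "$jobcounter"               # prints 2
```

The `$(( ))` form is part of POSIX shell arithmetic, so it is a reasonable default when the `(( ))` exit-status behavior is unwanted.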

2 Comments

@Rasoul You were also missing spaces and variable declarations.
You are right! I just fixed it. What do you think about the whole code? Is this the right way to parallelize the task?
