Gilles 'SO- stop being evil'

Concurrency problems with GNU parallel

My scripts do not run correctly under GNU parallel.

I have a sub_script like so (all these are actually simplified versions):

#!/bin/bash
input=$1
# input is a date in YYYYMMDD format
mkdir -p "$input"

cd "$input"
filename="$input.txt"
echo 'line1' > "$filename"
echo "The date is: $input" >> "$filename"

Then I have a file multi.sh like so:

cd /home/me/scripts; ./sub_script 20141001
cd /home/me/scripts; ./sub_script 20141002
cd /home/me/scripts; ./sub_script 20141003
cd /home/me/scripts; ./sub_script 20141004
cd /home/me/scripts; ./sub_script 20141005
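
Since every line of multi.sh follows the same pattern, the file could also be generated rather than written by hand. This is only a sketch: gen_multi is a hypothetical helper name, and the path and script name are copied from the lines above.

```shell
#!/bin/bash
# gen_multi (hypothetical helper): print one command line per date argument,
# in the same form as the multi.sh lines shown above.
gen_multi() {
    local d
    for d in "$@"; do
        printf 'cd /home/me/scripts; ./sub_script %s\n' "$d"
    done
}

# Prints the five lines of multi.sh; redirect to a file to save them.
gen_multi 20141001 20141002 20141003 20141004 20141005
```

Redirecting the output (for example with > multi.sh) would recreate the file shown above.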

I am trying to use GNU parallel to execute all of these commands on multiple cores, using this command

parallel -j 3 --delay 1 < multi.sh

to run 3 jobs at a time. I added a 1-second delay between job starts (--delay 1) to prevent problems, but this does not help.

The problem is that the new directories end up containing the wrong files. I think this only happens when multi.sh has more lines than the number of jobs given by -j, and it only happens sporadically (it is not always reproducible). I can run the parallel command twice in a row and get different results. Sometimes I get 20141002.txt files in the 20141005 directory instead of the 20141005.txt files. Other times I get only the 20141002.txt files in the 20141005 directory.
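
Because the symptom is sporadic, it may help to check the result after each run instead of inspecting directories by hand. The following is a minimal sketch, not part of my actual scripts: check_dirs is a hypothetical helper that assumes each date directory should contain exactly one file named after the directory.

```shell
#!/bin/bash
# check_dirs (hypothetical helper): scan BASE for 8-digit date directories and
# report any file that is not <date>.txt, plus any directory missing its own file.
# Returns 0 if everything looks right, 1 otherwise.
check_dirs() {
    local base=$1 dir name f status=0
    for dir in "$base"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/; do
        [ -d "$dir" ] || continue          # glob matched nothing
        name=$(basename "$dir")
        for f in "$dir"*; do
            [ -e "$f" ] || continue        # empty directory
            if [ "$(basename "$f")" != "$name.txt" ]; then
                echo "unexpected: $f"
                status=1
            fi
        done
        if [ ! -f "$dir$name.txt" ]; then
            echo "missing: $dir$name.txt"
            status=1
        fi
    done
    return "$status"
}
```

Running check_dirs /home/me/scripts after each parallel run would list every misplaced or missing file, which makes it easier to see how often the problem occurs.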

Are there any suggestions on how I can fix this? GNU parallel is preferred, but I can try other commands as well.