
The heading is very generic, but the issue might not be.

I have a script that compiles some code with parameters read from an xls file. Based on the number of configurations in the xls, I have to compile certain files. I want to store the result of each compilation (stdout and stderr) in a text file whose name comes from the configuration.

I have been able to do all of this, but to speed things up I want to run all the compilations in parallel. Is there a way to do this?

Sample code:

import subprocess

for n in num_rows:  # num_rows stores all the rows read using the xlrd object
    parameters_list = [...]  # all the parameters read from the xls
    ...
    logfile = open(...)  # name is based on a name read from the xls

    # capture both stdout and stderr in the per-configuration logfile
    p = subprocess.Popen(parameters_list, stdout=logfile, stderr=logfile)
    p.wait()
    logfile.close()

Right now I have to wait for each process to finish before I can close its logfile.

My question might be long, but any help or leads are welcome.


2 Answers


You can do this using a multiprocessing.Pool:

import multiprocessing
import subprocess

def parse_row(n):
    parameters_list = [...]  # all the parameters read from the xls
    ...
    logfile = open(...)  # name is based on a name read from the xls
    p = subprocess.Popen(parameters_list, stderr=logfile)
    p.wait()
    logfile.close()

pool = multiprocessing.Pool()
pool.map_async(parse_row, num_rows)  # schedule one task per row
pool.close()  # no more tasks will be submitted
pool.join()   # block until every compilation has finished
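
If pickling the row data for worker processes is a hassle, a thread pool works just as well here, since each worker only blocks on a child compiler process while the real work happens outside the interpreter. A minimal sketch using concurrent.futures; build_one, row.parameters, row.logname, and rows are hypothetical stand-ins for the xls lookup:

import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_one(row):
    # row.parameters / row.logname are hypothetical: fill in from your xls rows
    with open(row.logname, "w") as logfile:
        # the worker thread only waits; the compiler runs as its own OS process
        return subprocess.call(row.parameters, stdout=logfile, stderr=logfile)

with ThreadPoolExecutor(max_workers=4) as pool:
    return_codes = list(pool.map(build_one, rows))  # rows: parsed xls rows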



Assuming your processes will all be writing to different logfiles, the answer is quite simple: the subprocess module already runs things in parallel. Just create a separate Popen object for each compilation and store them in a list:

processes = []
logfiles = []
for n in num_rows:  # num_rows stores all the rows read using the xlrd object
    parameters_list = [...]  # all the parameters read from the xls
    ...
    logfile = open(...)  # name is based on a name read from the xls

    p = subprocess.Popen(parameters_list, stderr=logfile)
    processes.append(p)
    logfiles.append(logfile)

# Outside the for loop, all the processes are now running in parallel.
# Wait for each of them to finish, then close its corresponding logfile.
for p, logfile in zip(processes, logfiles):
    p.wait()  # returns immediately if that process has already finished
    logfile.close()
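
One thing to watch (my addition, not part of the answer): this launches every compilation at once, which can oversubscribe the machine when the xls has many rows. A small sketch that caps the number of live processes; make_command and make_logname are hypothetical helpers for the xls lookup:

import os
import subprocess

MAX_LIVE = os.cpu_count() or 2  # cap the number of concurrent compilations
running = []                    # (process, logfile) pairs still in flight

for n in num_rows:
    logfile = open(make_logname(n), "w")                    # hypothetical helper
    p = subprocess.Popen(make_command(n), stderr=logfile)   # hypothetical helper
    running.append((p, logfile))

    if len(running) >= MAX_LIVE:  # throttle: wait for the oldest process
        oldest, f = running.pop(0)
        oldest.wait()
        f.close()

for p, f in running:  # drain whatever is still running
    p.wait()
    f.close()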

3 Comments

While subprocess is quite useful, for parallel processing the multiprocessing package works more nicely...
@xtofl - Absolutely. Upvoted your answer, since it's better than mine.
That's very kind :) I'm an objective bystander, though: the credit goes to ppperry.
