
How can I run the same Python program in parallel (I am thinking of 10 instances), with the only difference being two input arguments representing a time range? I need it for some data processing that would otherwise take too long to finish.

I know I can do it manually in the shell by starting 10 scripts one by one, but that does not seem to be the most "elegant" solution, and I would also like to define the arguments for each of those programs dynamically in the "main" Python program.

Is there a way to do it?

  • @Merlin IO. It has multiple select and insert statements from/into the database, which are executed in each cycle of a loop of about 1,500,000 iterations. Commented Jun 12, 2016 at 13:12
  • @Merlin All the selects happen from table A and all the inserts go into table B. Commented Jun 12, 2016 at 13:30
  • @Dennis Can you write the code in SQL? Don't go out-of-process to the database. Commented Jun 12, 2016 at 13:35
  • @Merlin I understand that it is better to keep it all in SQL (inside the database). Unfortunately, I haven't been able to figure out how to do that yet. Commented Jun 12, 2016 at 13:39
  • @Dennis You should delete this question, it's a SQL problem... Commented Jun 12, 2016 at 14:33

2 Answers


Enclose your script in a main function, like so:

def main(args):
    # unpack the two arguments (e.g. start and end of the time range)
    a, b = args
    # do something


if __name__ == '__main__':
    args = parse_arguments()  # placeholder: build args with argparse or sys.argv
    main(args)

Then you can use a second script, together with multiprocessing.Pool, to run the main function with different arguments.

from myscript import main
from multiprocessing import Pool

a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]

if __name__ == '__main__':
    with Pool(4) as pool:  # four parallel jobs
        results = pool.map(main, zip(a, b))

Comments

Thanks Max, looks clean and simple. Will try it out in a minute
How come you specified 5 pairs of arguments but there are only 4 parallel jobs to run? Am I getting something wrong?
The pool allows 4 parallel tasks, so only 4 jobs can run simultaneously. What will happen is that the first 4 jobs are executed, and as soon as the first one finishes, the fifth is started. If you have more than 4 cores, you can pass a higher number to Pool.
Ah, Got it :) Thank you :)
Hi! I have a similar problem, except that my main function takes 11 arguments. I have three such sets of 11 arguments that I wish to pass into the main function and then launch it on different cores. I wrote my script, but it is throwing an error saying that I am passing one argument instead of 11.

You can try the following in the shell, running each instance in the background with `&`:

python main.py "[1, 2, 3, 4, 5]" &
python main.py "[6, 7, 8, 9, 10]" &

Note the quotes: without them the shell splits the bracketed list into separate arguments.
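The same shell approach can be driven from a "main" Python program with dynamically built arguments, using subprocess to launch the instances in parallel. A sketch where an inline `-c` script stands in for the hypothetical main.py worker:

```python
import subprocess
import sys

# Each job gets its own (start, end) pair; the "-c" inline script
# stands in for the real main.py, which would receive the same argv.
ranges = [(1, 5), (6, 10)]
procs = [
    subprocess.Popen([sys.executable, "-c",
                      "import sys; print(sys.argv[1], sys.argv[2])",
                      str(start), str(end)],
                     stdout=subprocess.PIPE, text=True)
    for start, end in ranges
]

# Popen returns immediately, so all jobs run in parallel;
# communicate() then waits for each one to finish.
outputs = [p.communicate()[0].strip() for p in procs]
print(outputs)  # prints ['1 5', '6 10']
```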

