I have a Python program from which I spawn a sub-program to process some files without holding up the main program. I'm currently using bash for the sub-program, started with a command and two parameters like this:

result = os.system('sub-program.sh file.txt file.txt &')

That works fine, but I (eventually!) realised that I could use Python for the sub-program, which would be far preferable, so I have converted it. The simplest way of spawning it might be:

result = os.system('python3 sub-program.py file.txt file.txt &')

Some research has shown several more sophisticated alternatives, but I have the impression that the latest and most approved method is this one:

subprocess.Popen(["python3", "-u", "sub-program.py"])

Am I correct in thinking that that is the most appropriate way of doing it? Would anyone recommend a different method and why? Simple would be good as I'm a bit of a Python novice.

If this is the recommended method, I can probably work out what the "-u" does and how to add the parameters for myself.

Optional extras:

  • Send a message back from the sub-program to the main program.
  • Make the sub-program quit when the main program does.

3 Answers

Yes, using subprocess is the recommended way to go according to the documentation:

The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function.

However, subprocess.Popen may not be what you're looking for. Unlike os.system, it returns a Popen object that represents the subprocess, and you have to call wait() on that object if you want to wait for its completion, for example:

proc = subprocess.Popen(["python3", "-u", "sub-program.py"])

do_something()

res = proc.wait()

If you want to just run a program and wait for completion you should probably use subprocess.run (or maybe subprocess.call, subprocess.check_call or subprocess.check_output) instead.
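As a rough sketch of the difference (not taken from the question's code): the hypothetical helper spawn_child starts a child and returns at once, while run_child_and_wait uses subprocess.run and blocks until the child has finished. An inline -c script stands in for sub-program.py, and the file names are placeholders.

```python
import subprocess
import sys

# Hypothetical stand-in for sub-program.py: an inline script that just
# prints the parameters it was given.
CHILD_CODE = "import sys; print(sys.argv[1:])"


def spawn_child(params):
    """Start the child and return immediately with its Popen object."""
    # sys.executable is the interpreter running this script; "-u" turns off
    # buffering of the child's stdout and stderr.
    return subprocess.Popen([sys.executable, "-u", "-c", CHILD_CODE, *params])


def run_child_and_wait(params):
    """Run the child, block until it finishes, and return its exit code."""
    completed = subprocess.run([sys.executable, "-c", CHILD_CODE, *params])
    return completed.returncode


if __name__ == "__main__":
    proc = spawn_child(["file1.txt", "file2.txt"])
    # The main program is free to do other work here.
    # poll() returns None while the child is still running.
    print("still running:", proc.poll() is None)
    # Call wait() only when (and if) the exit code is actually needed.
    print("exit code:", proc.wait())
```

If the main program never needs the exit code, it can simply keep the Popen object around and check poll() from time to time.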

2 Comments

I don't want to wait, I want it to run independently so the main program can get on with something else.
Just using the code in my example above, it doesn't wait, which is what I want. The 'optional extras' I mentioned can be looked at another day...

Thanks skyking!

With

import subprocess

at the beginning of the main program, this does what I want:

with open('output.txt', 'w') as f:
    subprocess.Popen(['python3', 'spawned.py', parameter1, parameter2], stdout=f)

The first line opens a file to receive the output of the sub-program started in the second line. In the second line, the square brackets contain the command to run: the interpreter, the script name and the two parameters (parameter1 and parameter2 are variables holding strings). The parameters are available in the sub-program as sys.argv[1] and sys.argv[2]. After the list come the Popen options: stdout=f sends the sub-program's output to the text file opened above.
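For the "send a message back" extra from the question, one possible approach (a sketch, not something tested in the original program) is to connect the sub-program's output to a pipe instead of a file and read it line by line. The inline CHILD_CODE script and the function name run_and_collect are made up for illustration; the real sub-program would print its own progress messages.

```python
import subprocess
import sys

# Hypothetical child script, inlined so the sketch is self-contained.
# It reports each file name it was given on its own line.
CHILD_CODE = """
import sys
for name in sys.argv[1:]:
    print("done " + name, flush=True)
"""


def run_and_collect(filenames):
    """Start the child and return the list of lines it printed."""
    command = [sys.executable, "-u", "-c", CHILD_CODE, *filenames]
    messages = []
    # stdout=PIPE lets the parent read what the child prints;
    # text=True decodes the bytes into str.
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            messages.append(line.rstrip("\n"))
    # Leaving the with-block closes the pipe and waits for the child.
    return messages


if __name__ == "__main__":
    for message in run_and_collect(["a.txt", "b.txt"]):
        print("child says:", message)
```

Note that the loop blocks until the child closes its output, so a GUI program would probably need to do this reading somewhere other than its main event loop, for instance in a background thread.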

Is there any particular reason it has to be another program entirely? Why not just spawn another process which runs one of the functions defined within your script?

I suggest that you read up on multiprocessing. Python has a module just for that: https://docs.python.org/dev/library/multiprocessing.html

There you can find information on spawning new processes, communicating between them and synchronizing them.

Be warned, though, that if you really want to speed up your file processing you'll want to use processes instead of threads (because of the global interpreter lock in CPython, threads do not run Python code in parallel, so they won't speed up CPU-bound work).

Also check out this page: https://pymotw.com/2/multiprocessing/basics.html It has some code samples that will help you out a lot. Don't forget this guard in your script:

if __name__ == '__main__':

It is very important ;)
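A minimal sketch of what this could look like, assuming the file processing lives in a function of the main script: process_files and run_in_background are hypothetical names, and the "work" is only a placeholder. The worker sends a message back through a multiprocessing.Queue after each file, and daemon=True asks for the worker to be terminated when the main program exits.

```python
import multiprocessing


def process_files(filenames, queue):
    """Worker: handle each file, then report back to the main program."""
    for name in filenames:
        # Real file processing would go here (placeholder in this sketch).
        queue.put("done " + name)
    queue.put(None)  # Sentinel: tells the main program there is no more work.


def run_in_background(filenames):
    """Start the worker process and collect its messages."""
    queue = multiprocessing.Queue()
    worker = multiprocessing.Process(
        target=process_files, args=(filenames, queue), daemon=True
    )
    worker.start()
    messages = []
    while True:
        message = queue.get()  # Blocks until the worker sends something.
        if message is None:
            break
        messages.append(message)
    worker.join()
    return messages


if __name__ == "__main__":
    # The guard stops the worker process from re-running this part when it
    # imports the script (this happens on platforms that do not use fork).
    print(run_in_background(["a.txt", "b.txt"]))
```

In a real program the main process would not sit in the while loop. It would check the queue now and then (for example with queue.get_nowait()) while doing its other work.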

5 Comments

Thanks for your comments, but they mostly go over my head! (1) "Is there any particular reason...": I don't really understand this; I think you mean I can run something asynchronously within my main program. (2) "I suggest that you read up": I'm afraid I didn't even understand the first paragraph of that document. (3) "Be warned though": I didn't state that as a requirement, but I think I see what you mean.
Would I be right in thinking then that: (1) I don't need a separate program, there is a way of doing what I want within the main program. (3) I should use a process rather than a thread. What is this method called, and how do I ensure that it uses a process rather than a thread? If you could let me know, I will be able to direct my research rather than taking on learning the whole of all the different types of multiprocessing.
(1) Yes and (3) yes. I might have given you a very vague recommendation, so here is a resource that will hopefully help you: pymotw.com/2/multiprocessing/basics.html. There you will find some code samples that do pretty much what you want. Be warned though: multiprocessing and multithreading are pretty advanced topics.
Thanks, that looks like an introduction at a level I would understand with a bit of effort. The method in "answered Jan 22 at 21:21" works; the only thing missing is the ability to send messages back from the sub-process as tasks are completed. I imagine the multiprocessing you recommend would be able to do that, but I am using a Glade GUI, which obviously has a built-in loop that waits for user input. I wonder whether that would be capable of receiving those messages. Sounds like it is getting more and more complicated.
If the method works then you should mark it as an answer. (Check mark to the left)
