Please correct me if I am wrong here. The goal: to copy a file by spawning another process, so the actual copying does not "lock" the process that calls it.

import subprocess

cmd = ['cp', '/Users/username/Pictures/2Gb_ImageFile.tif', '/Volume/HugeNetworkDrive/VerySlow/Network/Connection/Destination.tif']

def copyWithSubprocess(cmd):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

copyWithSubprocess(cmd)
  • What do you mean by "locking" the process? This will still wait for the child process to terminate. (Edit: the question has been edited; this comment made more sense when the "communicate" call was still included) Commented Feb 27, 2014 at 20:20
  • What's the question? ... subprocess should not block until you call communicate or some other method that must wait ... Commented Feb 27, 2014 at 20:21
  • @JeremyRoman why would that wait for the subprocess to finish? (I'm almost positive it does not and continues executing immediately after spawning) Commented Feb 27, 2014 at 20:22
  • @JoranBeasley: Question was edited between when I commented and when you saw it. :) Commented Feb 27, 2014 at 20:22
  • You didn't state a problem. Is there a bug? An exception? Give us some more specifics. Commented Feb 27, 2014 at 21:55

2 Answers


Popen(cmd, stdout=PIPE, stderr=PIPE) won't "lock" your parent process.

cmd itself may stall if it generates enough output to fill the pipe buffers, because nothing is reading from the other end. If you want to discard the subprocess' output, use DEVNULL instead of PIPE:

import os
from subprocess import Popen, STDOUT

DEVNULL = open(os.devnull, 'wb')  # NOTE: subprocess.DEVNULL is already defined in Python 3.3+
p = Popen(cmd, stdout=DEVNULL, stderr=STDOUT)
# ...
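
On Python 3.3 and newer the same thing is shorter, because DEVNULL ships with the subprocess module:

from subprocess import Popen, DEVNULL, STDOUT

p = Popen(cmd, stdout=DEVNULL, stderr=STDOUT)
# ...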

If you want to process the output without blocking the main thread, there are several approaches: fcntl, select, named pipes with IOCP (on Windows), or threads. The last one is the most portable:

p = Popen(cmd, stdout=PIPE, stderr=PIPE, bufsize=-1)
bind(p.stdout, stdout_callback)
bind(p.stderr, stderr_callback)
# ...

where the bind() function is defined as:

from contextlib import closing
from functools import partial
from threading import Thread

def bind(pipe, callback, chunksize=8192):
    def consume():
        # read fixed-size chunks until EOF, handing each one to the callback
        with closing(pipe):
            for chunk in iter(partial(pipe.read, chunksize), b''):
                callback(chunk)
    t = Thread(target=consume)
    t.daemon = True  # don't let the reader thread keep the program alive
    t.start()
    return t
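
Here, stdout_callback and stderr_callback are functions you define yourself; bind() hands each chunk of output to them as it arrives. A minimal sketch that just reports how much data was received:

def stdout_callback(chunk):
    print("Got %d bytes on stdout" % len(chunk))

def stderr_callback(chunk):
    print("Got %d bytes on stderr" % len(chunk))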

You don't need an external process to copy a file in Python without blocking the main thread:

import shutil
from threading import Thread

Thread(target=shutil.copy, args=['source-file', 'destination']).start()

Python can release the GIL during I/O, so the copying happens concurrently (and in parallel) with the main thread.

You could compare it with a script that uses multiple processes:

import shutil
from multiprocessing import Process

Process(target=shutil.copy, args=['source-file', 'destination']).start()

If you want the copying to be cancelled when your program dies, set the thread_or_process.daemon attribute to True.
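
For example, using the same pattern as inside bind() above (a sketch; the file names are placeholders):

t = Thread(target=shutil.copy, args=['source-file', 'destination'])
t.daemon = True  # the copy is abandoned if the main program exits
t.start()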


5 Comments

The 'stdout_callback' and 'stderr_callback' variables are passed as arguments to the bind() function: bind(p.stdout, stdout_callback) and bind(p.stderr, stderr_callback). If possible, could you please clarify where those variables come from? Thanks in advance
@Sputnix: Those are your functions that process the subprocess' stdout/stderr. Do you see callback(chunk) in the code? For example: def stdout_callback(chunk): print("Got %d bytes on stdout" % len(chunk))
Please take a look at the code I posted at the bottom of this page... It is the code I'm trying to run. It gives me an error: "NameError: global name 'stdout_callback' is not defined"
@Sputnix: It is your function. You must define it "if you want to process the output". See the previous comment for how you could do that. If you don't know what you want to do with the subprocess' output, use the variant with DEVNULL.
I've got it now! Thanks again!

The easiest way to handle complicated asynchronous processes in Python is to use the multiprocessing library, which was designed specifically to support such tasks and has an interface that closely parallels that of the threading module. (Indeed, I have written code that can switch between multi-threading and multi-processing mostly by importing one library or the other, but this required fairly rigorous limits on which parts of the modules were used.)

[Edit: removed spurious advice about threading and made my opening assertion less bombastic]
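
For instance, here is a minimal sketch of copying several files in parallel with a multiprocessing.Pool (the file paths are hypothetical placeholders):

import shutil
from multiprocessing import Pool

def copy_one(job):
    src, dst = job
    shutil.copy(src, dst)  # each copy runs in a separate worker process

if __name__ == '__main__':
    jobs = [('a.tif', '/dest/a.tif'), ('b.tif', '/dest/b.tif')]
    pool = Pool()                            # one worker per CPU by default
    result = pool.map_async(copy_one, jobs)  # returns immediately
    # ... the main process is free to do other work here ...
    result.wait()   # block only when you actually need the copies finished
    pool.close()
    pool.join()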

4 Comments

It would be interesting to see how multiprocessing could be used with subprocess to copy multiple files. Would you post a simple example of multiprocessing used in conjunction with subprocess? I am posting simple code below that could be modified to make it multi-processing/multi-threaded. A simple example would be more than sufficient!
Popen is "asynchronous". It is the way to call external processes in Python.
It is one way to call asynchronous processes in Python. Given your reminder, I will remove the second part of my answer and modify the first. If you want to be able to communicate picklable Python objects to your subprocesses, the subprocess module just won't do it; it was designed to mimic the same sort of process interactions that the shell does. [Edited to add my understanding of the difference between subprocess and multiprocessing]
That should have been "if you want to communicate non-picklable objects"
