I am trying to use rsync with subprocess.call. Oddly, it works if I pass subprocess.call a string, but it won't work with a list.

calling sp.call with a string:

In [23]: sp.call("rsync -av content/ writings_raw/", shell=True)
sending incremental file list

sent 6236 bytes  received 22 bytes  12516.00 bytes/sec
total size is 324710  speedup is 51.89
Out[23]: 0

calling sp.call with a list:

In [24]: sp.call(["rsync", "-av", "content/", "writings_raw/"], shell=True)
rsync  version 3.0.9  protocol version 30
Copyright (C) 1996-2011 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 32-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.

rsync is a file transfer program capable of efficient remote update
via a fast differencing algorithm.

Usage: rsync [OPTION]... SRC [SRC]... DEST
  or   rsync [OPTION]... SRC [SRC]... [USER@]HOST:DEST
  or   rsync [OPTION]... SRC [SRC]... [USER@]HOST::DEST
  or   rsync [OPTION]... SRC [SRC]... rsync://[USER@]HOST[:PORT]/DEST
  or   rsync [OPTION]... [USER@]HOST:SRC [DEST]
  or   rsync [OPTION]... [USER@]HOST::SRC [DEST]
  or   rsync [OPTION]... rsync://[USER@]HOST[:PORT]/SRC [DEST]
The ':' usages connect via remote shell, while '::' & 'rsync://' usages connect
to an rsync daemon, and require SRC or DEST to start with a module name.

Options
 -v, --verbose               increase verbosity
 -q, --quiet                 suppress non-error messages
     --no-motd               suppress daemon-mode MOTD (see manpage caveat)
... snipped....
                             repeated: --filter='- .rsync-filter'
     --exclude=PATTERN       exclude files matching PATTERN
     --blocking-io           use blocking I/O for the remote shell
 -4, --ipv4                  prefer IPv4
 -6, --ipv6                  prefer IPv6
     --version               print version number
(-h) --help                  show this help (-h is --help only if used alone)
...snipped ...
rsync error: syntax or usage error (code 1) at main.c(1438) [client=3.0.9]
Out[24]: 1

What is wrong with how I use the list? How would you fix it? I need the list, because I would like to use variables. Of course I could use:

  sp.call("rsync -av "+Orig+" "+Dest, shell=True)    

But I would like to understand how subprocess understands lists vs. strings.

setting shell=False and a list:

In [36]: sp.call(['rsync', '-av', ORIG, DEST], shell=False)
sending incremental file list

sent 6253 bytes  received 23 bytes  12552.00 bytes/sec
total size is 324710  speedup is 51.74
Out[36]: 0

setting shell=False and a string:

In [38]: sp.call("rsync -av"+" "+ORIG+" "+DEST, shell=False)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-38-0d366d3ef8ce> in <module>()
----> 1 sp.call("rsync -av"+" "+ORIG+" "+DEST, shell=False)

/usr/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
    491     retcode = call(["ls", "-l"])
    492     """
--> 493     return Popen(*popenargs, **kwargs).wait()
    494 
    495 

/usr/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
    677                             p2cread, p2cwrite,
    678                             c2pread, c2pwrite,
--> 679                             errread, errwrite)
    680 
    681         if mswindows:

/usr/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
   1257                     if fd is not None:
   1258                         os.close(fd)
-> 1259                 raise child_exception
   1260 
   1261 

OSError: [Errno 2] No such file or directory
  • Does the behavior change when you set shell=False (which you should generally do if True is not explicitly required)? Commented Feb 27, 2013 at 10:28
  • @Jan-PhilipGehrcke, Yes, it changes. But that makes me even more confused, because with shell=False and a string it is exactly the opposite. So what is going on here? Commented Feb 27, 2013 at 10:32
  • It is easy. The shell wants a string, so give it one. If you don't use the shell, the exec*() family of system calls is used, which wants the parameters to be split, so give it a list. Commented Feb 27, 2013 at 10:45
  • Related Python issue: Don't use a list argument together with shell=True in subprocess' docs Commented Dec 28, 2014 at 12:51

1 Answer

subprocess's rules for handling the command argument are actually a bit complex.

Generally speaking, to run external commands, you should use shell=False and pass the arguments as a sequence. Use shell=True only if you need to use shell built-in commands or specific shell syntax; using shell=True correctly is platform-specific as detailed below.
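Since the goal is to use variables, here is a minimal sketch of building the argument list from them. The helper name rsync_argv and the example paths are hypothetical, chosen only for illustration; the point is that each list item becomes exactly one argument, so a path containing a space needs no quoting.

```python
def rsync_argv(orig, dest, options=("-av",)):
    """Build an argument list for rsync; each item becomes exactly one argv entry."""
    return ["rsync", *options, orig, dest]


# Hypothetical paths for illustration; note the space in the destination.
ORIG = "content/"
DEST = "writings raw/"

argv = rsync_argv(ORIG, DEST)
print(argv)
```

Passing the resulting list to subprocess.call(argv), with the default shell=False, would then run rsync with those arguments.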

From the docs:

args should be a sequence of program arguments or else a single string. By default, the program to execute is the first item in args if args is a sequence. If args is a string, the interpretation is platform-dependent and described below. See the shell and executable arguments for additional differences from the default behavior. Unless otherwise stated, it is recommended to pass args as a sequence.... If shell is True, it is recommended to pass args as a string rather than as a sequence.

With shell=False:

On Unix, if args is a string, the string is interpreted as the name or path of the program to execute. However, this can only be done if not passing arguments to the program.

On Windows, if args is a sequence, it will be converted to a string in a manner described in Converting an argument sequence to a string on Windows. This is because the underlying CreateProcess() operates on strings.

With shell=True:

On Unix with shell=True, the shell defaults to /bin/sh. If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself.

On Windows with shell=True, the COMSPEC environment variable specifies the default shell. The only time you need to specify shell=True on Windows is when the command you wish to execute is built into the shell (e.g. dir or copy). You do not need shell=True to run a batch file or console-based executable.

(all emphasis mine)


For completeness, here's what happens in each of your four examples on a UNIX system:

string with shell=True

subprocess.call("rsync -av a/ b/", shell=True) will invoke sh -c "rsync -av a/ b/", which executes the shell script rsync -av a/ b/; the shell will parse this as a call to rsync with arguments -av, a/, b/, so it works fine.

Note that if any argument contained a space or special shell character it would need to be manually escaped, making this a fragile approach.
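If you do need a string with shell=True, one way to handle the escaping is shlex.quote from the standard library. The sketch below assumes a Unix system with a POSIX sh, and uses printf as a stand-in for rsync so that it runs anywhere; the helper name shell_command is hypothetical.

```python
import shlex
import subprocess


def shell_command(program, *args):
    """Join program and args into one string, quoting each part for a POSIX shell."""
    return " ".join(shlex.quote(part) for part in (program,) + args)


# printf stands in for rsync here. The arguments contain a space and a ';',
# both of which the shell would otherwise interpret.
cmd = shell_command("printf", r"%s\n", "my dir/", "b; echo oops")
output = subprocess.check_output(cmd, shell=True, text=True)
print(output, end="")
```

Each quoted argument reaches the program as a single word, so the "; echo oops" part is printed as text rather than run as a second command.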

list with shell=True

subprocess.call(["rsync", "-av", "a/", "b/"], shell=True) will invoke sh -c "rsync" -av a/ b/, which executes the shell script rsync, setting $0 to -av, $1 to a/, and $2 to b/. This shell script just invokes rsync with no arguments (ignoring $0, $1, $2), which is why you get a screenful of help text.

One way to make this work would be subprocess.call(['rsync "$@"', "rsync", "-av", "a/", "b/"], shell=True). This invokes a shell script that passes its arguments through to rsync. Note the extra dummy rsync argument, which is needed to fill $0, because the expansion of "$@" starts at $1. This is far from ideal, which is why a sequence is very rarely used with shell=True.
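To make the $0 / $1 assignment visible, here is a small sketch that uses printf instead of rsync (my substitution, so it runs without rsync installed). It assumes a Unix system where shell=True runs /bin/sh.

```python
import subprocess

# With shell=True and a list on Unix, the first item is the script text and
# the remaining items become $0, $1, $2, ... inside that script.
out = subprocess.check_output(
    ['printf "%s\\n" "$0" "$@"', "zero", "one", "two words"],
    shell=True,
    text=True,
)
print(out, end="")
```

The first extra item lands in $0 and only the later ones are part of "$@", which is why the workaround above needs a dummy argument in that position.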

string with shell=False

subprocess.call("rsync -av a/ b/") will attempt to execute a program whose name is the entire string rsync -av a/ b/. Since no such program exists, you get an OSError ([Errno 2] No such file or directory) from subprocess. There is no way to provide any arguments to the program when using a string with shell=False.
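If you already have the command as a single string, shlex.split from the standard library can turn it into a list using shell-like quoting rules. The sketch below is only illustrative: the command string is hypothetical, and the failing call assumes that no program with that exact name exists on your system.

```python
import shlex
import subprocess

command = "rsync -av 'my dir/' b/"  # hypothetical command line

error = None
try:
    # Without a shell, the whole string is taken as the program name.
    subprocess.call(command)
except OSError as exc:
    error = exc
print("string failed with:", type(error).__name__)

# Split the string into a list that is suitable for shell=False.
argv = shlex.split(command)
print(argv)
```

The resulting list can then be passed to subprocess.call as in the next case.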

list with shell=False

subprocess.call(["rsync", "-av", "a/", "b/"]) invokes the rsync binary on your $PATH, passing rsync as argv[0], -av as argv[1], a/ as argv[2] and b/ as argv[3]. No escaping of arguments is needed as they are passed straight through to the execve system call.
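To check that the list really is passed through untouched, here is a minimal sketch that swaps rsync for a child Python process which prints the arguments it received. The stand-in is my assumption so that the example runs without rsync installed; it also assumes Python 3.7+ for text=True.

```python
import subprocess
import sys

# Stand-in for rsync: a child Python process that prints its arguments.
CHILD = "import sys; print(sys.argv[1:])"


def run_and_capture(args):
    """Run the stand-in child without a shell and return what it printed."""
    full_argv = [sys.executable, "-c", CHILD] + list(args)
    return subprocess.check_output(full_argv, text=True).strip()


# The space in 'my dir/' needs no escaping, because no shell ever sees it.
result = run_and_capture(["-av", "my dir/", "b/"])
print(result)
```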

5 Comments

Recommended is not "mandatory". It is either a bug in the code or a documentation bug. Do you happen to know which? From a quick look at the source code, it isn't obvious.
Recommended means "for the least surprising result". You can do the non-recommended thing - the code won't stop you - and the results will be as documented in the snippets I quoted.
To me using shell=True and passing list instead of string looks totally broken, unless the list has only one element (i.e. no arguments).
@Davide, it's not broken at all -- the list is appended to the argument list of ['sh', '-c'], which parses the first element as code to run, and later elements as arguments to that code ($0, $1, etc.). It works exactly as it's documented to work.
That's quixotic enough that I would not mind at all if Python warned or even threw an error in this scenario. If you really want to pass arguments to the shell, do it explicitly with sh -c 'rsync "$@"' _ "-av" "content/" "writings_raw/" or similar.
