Running a Unicode batch file in Windows 7, Python 2.x

Question

I have run smack into a problem with subprocess.call() when running a batch file with Unicode characters in the path name. This fails in 2.6 and 2.7 but works perfectly in 3.2. Was it really just a bug that lasted all the way until py3k?

# -*- coding: utf-8 -*-
import subprocess

o = u"C:\\temp\\test.bat"        # "control" case: ASCII-only path
q = u"C:\\temp\\こんにちは.bat"   # path with non-ASCII characters

ho = open(o, 'r')
hq = open(q, 'r')               # so we can open q

ho.close()
hq.close()

subprocess.call(o)              # batch runs
subprocess.call(q)              # nothing from here on down runs
subprocess.call(q, shell=True)
subprocess.call(q.encode('utf8'), shell=True)
subprocess.call(q.encode('mbcs'), shell=True)  # this was suggested elsewhere for older Windows

BTW there are a number of near-duplicates, but I believe this is slightly different from all of the ones I've looked at.
– jambox, Commented Mar 8, 2012 at 12:11
Possible duplicate of Unicode filename to python subprocess.call()
– Ferdinand Beyer, Commented Mar 8, 2012 at 12:21
How is this question any different? The subprocess module has trouble with unicode strings in version 2.x. Since 3.0, all strings are Unicode and the problem went away.
– Ferdinand Beyer, Commented Mar 8, 2012 at 12:23
OK, you're right, it seems like quite a famous bug. Maybe I just couldn't bring myself to believe it!
– jambox, Commented Mar 8, 2012 at 12:34

Burhan Khalid · Accepted Answer · 2012-03-08 12:24:23Z


Filenames are passed to and returned from APIs as (Unicode) strings. This can present platform-specific problems because on some platforms filenames are arbitrary byte strings. (On the other hand, on Windows filenames are natively stored as Unicode.) As a work-around, most APIs (e.g. open() and many functions in the os module) that take filenames accept bytes objects as well as strings, and a few APIs have a way to ask for a bytes return value. Thus, os.listdir() returns a list of bytes instances if the argument is a bytes instance, and os.getcwdb() returns the current working directory as a bytes instance. Note that when os.listdir() returns a list of strings, filenames that cannot be decoded properly are omitted rather than raising UnicodeError.

From the "What's New In Python 3.0" page.
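As a rough illustration of the behaviour the quoted passage describes, here is a minimal sketch for Python 3 (not 2.x). It shows os.listdir() returning str names for a str argument and bytes names for a bytes argument, plus os.getcwdb(). The helper name list_both_ways and the temporary file are made up for the example, and the sketch assumes the filesystem encoding can represent the file name.

```python
import os
import tempfile


def list_both_ways(directory):
    """Return (str_names, bytes_names) for the entries in `directory`."""
    # str argument in -> list of str out
    str_names = sorted(os.listdir(directory))
    # bytes argument in -> list of bytes out
    bytes_names = sorted(os.listdir(os.fsencode(directory)))
    return str_names, bytes_names


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file, named like the batch file in the question.
    name = "こんにちは.bat"
    with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
        fh.write("@echo off\n")

    str_names, bytes_names = list_both_ways(tmp)
    print(str_names)
    print(bytes_names)

# The current working directory is available in both forms as well.
print(type(os.getcwd()), type(os.getcwdb()))
```

This only demonstrates the filename handling mentioned in the quote. It does not call subprocess, so it says nothing directly about how the batch file is launched.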


jambox Over a year ago

Thanks. I should have looked in the 3.0 release notes, but I couldn't find any reference to this in the 2.7 docs.
