>>> import sys
>>> sys.getfilesystemencoding()
'UTF-8'

How do I change that? I know how to change the default system encoding.

>>> reload(sys)
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('ascii')

But there is no sys.setfilesystemencoding.

  • Note that there was a sys.setfilesystemencoding function and also a PYTHONFSENCODING environment variable in early versions of Python 3.x. They were problematic and were removed; Python now uses the locale encoding as the filesystem encoding. See "Painful History of the Filesystem Encoding" on Victor Stinner's blog. Commented Feb 14, 2022 at 2:59
  • Also note that, per peps.python.org/pep-0686: "in the Python 3.15 timeframe we intend to make UTF-8 the default for text IO regardless of environment." Commented Feb 1, 2024 at 2:13

2 Answers


There are two ways to change it:

  1. (Linux-only) Export LC_CTYPE=en_US.UTF-8 before launching Python:
$ LC_CTYPE=C python -c 'import sys; print(sys.getfilesystemencoding())'
ANSI_X3.4-1968
$ LC_CTYPE=en_US.UTF-8 python -c 'import sys; print(sys.getfilesystemencoding())'
UTF-8

(Note that LANG serves as the default value for LC_CTYPE if LC_CTYPE is not set, while LC_ALL overrides both LC_CTYPE and LANG.)

  2. Monkeypatching:
import sys
sys.getfilesystemencoding = lambda: 'UTF-8'

Both methods let functions like os.stat accept unicode strings in Python 2.x. Otherwise, those functions raise an exception when they encounter non-ASCII characters in the filename.
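To see the effect of the environment variable from inside Python, here is a minimal sketch that launches a child interpreter with a chosen LC_CTYPE and reports the filesystem encoding it picks up. The helper name fs_encoding_under is hypothetical, not part of any library. It assumes a POSIX system; on Python 3.7 and later, the C locale may still report utf-8 because of locale coercion and UTF-8 mode, so the output can differ from the Python 2 session shown above.

```python
import os
import subprocess
import sys


def fs_encoding_under(lc_ctype):
    """Run a child interpreter with LC_CTYPE set; return its filesystem encoding.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    env.pop("LC_ALL", None)  # LC_ALL would override LC_CTYPE, so drop it
    env["LC_CTYPE"] = lc_ctype
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import sys; print(sys.getfilesystemencoding())"],
        env=env,
    )
    return out.decode("ascii").strip()


if __name__ == "__main__":
    for value in ("C", "en_US.UTF-8"):
        print(value, "->", fs_encoding_under(value))
```

The encoding is fixed when the interpreter starts, which is why the sketch spawns a new process rather than changing os.environ in the running one.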

Update: For variant (1), the locale has to be available (listed by locale -a) for this setting to have the desired effect.
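One way to check availability without shelling out to locale -a is to ask the C library whether it accepts the name. The sketch below does that with locale.setlocale; the function name locale_is_available is hypothetical, and the result depends on the C library (some, such as musl, accept almost any name), so treat it as a rough check rather than a guarantee.

```python
import locale


def locale_is_available(name):
    """Return True if setlocale accepts `name` for LC_CTYPE.

    Hypothetical helper; restores the previous setting before returning.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    for name in ("C", "en_US.UTF-8"):
        print(name, "available:", locale_is_available(name))
```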


16 Comments

@sureshvv What is your OS?
Ubuntu 16.04. Had to add LANG=en_US.UTF8 to /etc/environment and reboot.
@sureshvv A reboot is definitely overkill in this situation, but I'm glad that you've resolved the issue anyway. Did you launch Python directly from the command line or as a system service?
Only from the command line. The change I made did not become effective until reboot.
@sureshvv That's not surprising for /etc/environment, but export LANG=en_US.UTF8 takes effect immediately.

The file system encoding is, in many cases, an inherent property of the operating system. It cannot be changed. If, for some reason, you need to create files with names encoded differently than the filesystem encoding implies, don't use Unicode strings for filenames. (Or, if you're using Python 3, use a bytes object instead of a string.)
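As a minimal sketch of the bytes approach in Python 3: the example below creates a file whose name is Latin-1 encoded (and therefore not valid UTF-8) by passing bytes paths throughout. The function name create_with_raw_name and the file name are made up for illustration. It assumes a Linux filesystem that stores names as opaque bytes; macOS and Windows may reject such a name.

```python
import os
import tempfile


def create_with_raw_name(raw_name):
    """Create a file named by raw bytes in a temp dir; return the dir listing.

    Hypothetical helper; assumes the filesystem accepts arbitrary byte names.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)            # directory path as bytes
        path = os.path.join(tmp_bytes, raw_name)
        with open(path, "wb") as f:             # bytes path: no encoding step
            f.write(b"hello")
        return os.listdir(tmp_bytes)            # bytes in, bytes names out


if __name__ == "__main__":
    # 0xE9 is "e with acute accent" in Latin-1 and is invalid as UTF-8 here.
    print(create_with_raw_name(b"caf\xe9.txt"))
```

Because every path is a bytes object, Python passes the name to the operating system unchanged and never consults the filesystem encoding.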

See the documentation for details. In particular, note that, on Windows systems, the file system is natively Unicode, so no conversion is actually taking place, and, consequently, it's impossible to use an alternative filesystem encoding.
