
I am curious about how the Python source code sets the value of Py_FileSystemDefaultEncoding, and I have run into something strange.

The Python documentation for sys.getfilesystemencoding() says:

On Unix, the encoding is the user’s preference according to the result of nl_langinfo(CODESET), or None if the nl_langinfo(CODESET) failed.

I use Python 2.7.6:

```
>>> import sys
>>> sys.getfilesystemencoding()
'UTF-8'
>>> import locale
>>> locale.nl_langinfo(locale.CODESET)
'ANSI_X3.4-1968'
```
Here is the question: why does sys.getfilesystemencoding() return a different value from locale.nl_langinfo(locale.CODESET), when the documentation says that the file system encoding is derived from nl_langinfo(CODESET)?

Here is the locale command output in my terminal:

LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=zh_CN.UTF-8
LC_TIME=zh_CN.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=zh_CN.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=zh_CN.UTF-8
LC_NAME=zh_CN.UTF-8
LC_ADDRESS=zh_CN.UTF-8
LC_TELEPHONE=zh_CN.UTF-8
LC_MEASUREMENT=zh_CN.UTF-8
LC_IDENTIFICATION=zh_CN.UTF-8
LC_ALL=

1 Answer


Summary: sys.getfilesystemencoding() behaves as documented. The confusion is due to the difference between setlocale(LC_CTYPE, "") (user's preference) and the default C locale.


The script always starts with the default C locale:

>>> import locale
>>> locale.nl_langinfo(locale.CODESET)
'ANSI_X3.4-1968'

But getfilesystemencoding() uses the user's locale:

>>> import sys
>>> sys.getfilesystemencoding()
'UTF-8'
>>> locale.setlocale(locale.LC_CTYPE, '')
'en_US.UTF-8'
>>> locale.nl_langinfo(locale.CODESET)
'UTF-8'
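
The session above changes the process locale permanently. As a minimal sketch (the helper name user_preferred_codeset is hypothetical, and this assumes a Unix system where locale.nl_langinfo is available), the same lookup can be done by switching LC_CTYPE to the user's locale only temporarily and then restoring the previous setting:

```python
import locale


def user_preferred_codeset():
    """Return nl_langinfo(CODESET) as seen under the user's locale.

    Hypothetical helper for illustration; the previous LC_CTYPE
    setting is restored before returning.
    """
    # Calling setlocale() without a locale argument only queries the setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        try:
            # "" selects the locale from the environment variables.
            locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            # The environment names a locale that is not installed;
            # fall through and report the codeset of the current locale.
            pass
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)
```

Calling user_preferred_codeset() in a fresh Python 2.7 interpreter and comparing the result with sys.getfilesystemencoding() should show the two agreeing, while a plain locale.nl_langinfo(locale.CODESET) call still reports the C locale's codeset.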

An empty string as the locale name selects a locale based on the user's environment variables (LC_ALL, LC_CTYPE, LANG):

$ LC_CTYPE=C python -c 'import sys; print(sys.getfilesystemencoding())'
ANSI_X3.4-1968
$ LC_CTYPE=C.UTF-8 python -c 'import sys; print(sys.getfilesystemencoding())'
UTF-8
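
The commands above override LC_CTYPE only. Which variable wins follows the usual POSIX precedence: a non-empty LC_ALL, then the category variable (LC_CTYPE here), then LANG. The sketch below only illustrates that lookup order; the function name effective_lc_ctype is hypothetical, and the fallback to "C" when nothing is set is an assumption (POSIX leaves that default implementation-defined):

```python
import os


def effective_lc_ctype(env=os.environ):
    """Return (variable, value) that decides LC_CTYPE for setlocale(LC_CTYPE, "").

    Illustrative only: it mirrors the documented lookup order and does
    not check whether the named locale is actually installed.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset and empty variables are both skipped
            return var, value
    # Assumption: nothing is set, so the C locale is used.
    return None, "C"
```

With the environment from the question (LC_ALL empty, LC_CTYPE unset, LANG=en_US.UTF-8) this picks LANG, which matches the en_US.UTF-8 value that the locale command reports for LC_CTYPE.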

Where can I find the source code that sets Py_FileSystemDefaultEncoding?

There are two places in the source code for Python 2.7:


Can you give me some advice on how to search for keywords in the Python source code?

To find these places:

  • clone Python 2.7 source code:

    $ hg clone https://hg.python.org/cpython && cd cpython
    $ hg update 2.7
    
  • search for the Py_FileSystemDefaultEncoding *= regex in your editor, e.g.:

    $ make TAGS # to create tags table
    

    in Emacs: M-x tags-search RET Py_FileSystemDefaultEncoding *= RET and M-, to continue the search.
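
If no tags-aware editor is at hand, the same regex search can be sketched in a few lines of Python. This is only an alternative to the Emacs workflow above, not part of it; the function name find_assignments and the choice to look only at .c and .h files are assumptions made for the example:

```python
import os
import re

# The name followed by optional spaces and "=", the same regex as above.
PATTERN = re.compile(r"Py_FileSystemDefaultEncoding *=")


def find_assignments(root, extensions=(".c", ".h")):
    """Return (path, line number, line text) for every match under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Read bytes and decode as latin-1 so unusual bytes in a
            # source file cannot raise a decoding error.
            with open(path, "rb") as source:
                for lineno, raw in enumerate(source, 1):
                    text = raw.decode("latin-1")
                    if PATTERN.search(text):
                        hits.append((path, lineno, text.rstrip()))
    return hits
```

Run from the root of the checkout, find_assignments(".") returns the matching lines with their file names and line numbers, which can then be opened in any editor.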


4 Comments

Can you please tell me where I can find the source code that sets Py_FileSystemDefaultEncoding?
@andy: I've added summary and links to the source code for Python 2.7
I see it, that's very good. Can you give me some advice on how to search for keywords in the Python source code?
@andy: I've added one possible way to find where Py_FileSystemDefaultEncoding is defined
