0

When I decode a UTF-8 byte string and redirect the printed result to a file in a Windows command prompt, the file ends up ANSI-encoded.

>python --version
Python 3.13.0
>python -c "print(b'\xc3\x96'.decode('utf-8'))" > test.txt

When I open test.txt in Notepad++, it reports the encoding as ANSI. If I run the same command in MSYS2 (using Python 3.11.6), the resulting file is UTF-8, as expected. Why is the encoding wrong when using the Windows command prompt?

1
  • When you redirect the output of a command to a file, the text is written using an OS-specific encoding. Perhaps you would like to pass a stream into print, so that you print to a file whose encoding you dictate. Commented May 16 at 15:45
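The suggestion in this comment could be sketched roughly as follows: open the file yourself with an explicit encoding and hand it to print via its file argument. The helper name write_utf8 and the temporary file location are illustrative assumptions, not part of the original question.

```python
import os
import tempfile


def write_utf8(path, text):
    """Print text to path as UTF-8, regardless of the console or locale encoding."""
    # newline="\n" keeps the line ending identical on every platform.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        print(text, file=f)


# Hypothetical target file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "test.txt")
    write_utf8(target, b"\xc3\x96".decode("utf-8"))
    with open(target, "rb") as f:
        data = f.read()

print(data)
```

Because the encoding is fixed in the open call, the bytes in the file no longer depend on how the shell redirects stdout.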

2 Answers

2

When you call .decode(), you get a Unicode string (code points with no encoding attached). It's no different than writing:

python -c "print('Ö')" > test.txt

print then encodes that Unicode string when writing it to stdout, and the encoding it uses is OS-dependent.

For example, on Windows, when stdout is redirected to a file, Python uses the default "ANSI" encoding of that localized Windows version (Windows-1252 on US and Western European Windows versions).
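To see the difference for yourself, a small sketch like the one below prints the encodings Python reports and compares the bytes that the same code point produces in UTF-8 and in Windows-1252. The variable names are illustrative; the reported stdout and locale encodings will vary by machine and by whether output is redirected.

```python
import locale
import sys

# The decoded string is a single code point, U+00D6 (Ö).
text = b"\xc3\x96".decode("utf-8")

# The same code point maps to different bytes depending on the encoding.
utf8_bytes = text.encode("utf-8")
cp1252_bytes = text.encode("cp1252")

print("stdout encoding:", sys.stdout.encoding)
print("locale encoding:", locale.getpreferredencoding(False))
print("code point:", hex(ord(text)))
print("as UTF-8:", utf8_bytes)
print("as Windows-1252:", cp1252_bytes)
```

A file holding the single byte 0xD6 is what Notepad++ labels as ANSI, while the two-byte sequence 0xC3 0x96 is the UTF-8 form.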

Python's UTF-8 Mode overrides this. Enable it with either the -X utf8 command-line option or by setting the environment variable PYTHONUTF8=1:

python -X utf8 -c "print('Ö')" > test.txt

The environment variable PYTHONIOENCODING can also be used to directly override the encoding of stdin/stdout/stderr when redirecting Python I/O.
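If changing the command line or environment is not convenient, another option (not mentioned in the answer above, so treat it as an assumption about what fits your case) is to switch the stream's encoding from inside the script with TextIOWrapper.reconfigure, available since Python 3.7. The sketch below uses an in-memory stream that starts out as cp1252 to stand in for a redirected stdout; the helper name encoded_output is hypothetical.

```python
import io


def encoded_output(text, encoding):
    """Return the bytes print() produces after the stream is reconfigured."""
    raw = io.BytesIO()
    # Stand-in for a redirected stdout that defaults to the "ANSI" code page.
    stream = io.TextIOWrapper(raw, encoding="cp1252", newline="\n")
    # In a real script this would be: sys.stdout.reconfigure(encoding="utf-8")
    stream.reconfigure(encoding=encoding)
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


print(encoded_output("\u00d6", "utf-8"))
print(encoded_output("\u00d6", "cp1252"))
```

Calling reconfigure before the first write keeps the whole output in one encoding.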


1 Comment

Since I'm building an exe file from the script, I solved the problem by adding the option 'X utf8' to the .spec file.
0

Set PYTHONIOENCODING before running the command:

set PYTHONIOENCODING=utf-8
python -c "print(b'\xc3\x96'.decode('utf-8'))" > test.txt
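The effect of the variable can be checked from Python as well. This is a minimal sketch that launches a child interpreter with PYTHONIOENCODING set and captures the bytes it writes to a pipe; the variable names are illustrative.

```python
import os
import subprocess
import sys

# Copy the current environment and force UTF-8 for the child's stdin/stdout/stderr.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# The child's source is pure ASCII; the escape produces U+00D6 at runtime.
result = subprocess.run(
    [sys.executable, "-c", "print('\\u00d6')"],
    env=env,
    capture_output=True,
    check=True,
)

print(result.stdout)
```

The captured output begins with the two UTF-8 bytes 0xC3 0x96, followed by the platform's line ending.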
