I'm writing a script that needs to print the character sequence "Qä" to the terminal. My terminal uses UTF-8 encoding. The file has # -*- coding: utf-8 -*- at the top, which I believe is unnecessary in Python 3 (UTF-8 is already the default source encoding), but I added it in case it made a difference. In the code, I have something like

print("...Qä...")

This does not produce Qä. Instead it produces Q▒.

I then tried

qa = "Qä".encode('utf-8')
print(f"...{qa}...")

This also does not produce Qä. It produces b'Q\xc3\xa4'.

I also tried

qa = u"Qä"
print(f"...{qa}...")

This also produces Q▒.

However, I know that Python 3 can open files that contain UTF-8 and use the contents properly, so I created a file called qa.txt, pasted Qä into it, and then used

with open("qa.txt") as qa_file:
    qa = qa_file.read().strip()
print(f"...{qa}...")

This works. However, it's beyond dumb that I have to create this file in order to print this string. How can I put this text into my code as a string literal?

This question is NOT a duplicate of a question asking about Python 2.7; I am not using Python 2.7.

  • @Barmar: That dupe target was specifically about Python 2. This is a Python 3 question. Commented Jul 27, 2023 at 3:26
  • I suspect this is actually a terminal emulator issue. Your first code works for me in a Mac Terminal window. Commented Jul 27, 2023 at 3:28
  • Are sys.stdout.encoding and sys.getdefaultencoding() both "utf-8"? Commented Jul 27, 2023 at 3:31
  • Windows 10. I'm using Git Bash for the console, it has an options menu where you can set the encoding, and I've confirmed that it is set to UTF-8. A new file with just print("ä") also doesn't work. Commented Jul 27, 2023 at 3:36
  • sys.stdout.reconfigure(encoding='utf-8') might help. docs.python.org/3/library/io.html#io.TextIOWrapper.reconfigure I'm not sure what's causing this situation in the first place, though. Commented Jul 27, 2023 at 3:43

1 Answer

You're using Git Bash on Windows. On Windows, unless stdio is connected to a standard Windows console (which I don't think Git Bash counts as), Python defaults the standard streams to the locale encoding, which in your case is 'cp1252'. Your terminal is set to expect UTF-8, not CP1252, so the single byte that CP1252 produces for ä (0xE4) is not valid UTF-8 and the terminal shows ▒ instead. You can reconfigure the standard output stream to UTF-8 with

import sys
sys.stdout.reconfigure(encoding='utf-8')

and similarly for stdin and stderr, or you can set the PYTHONIOENCODING environment variable to utf-8 before running Python to change the default stdin/stdout/stderr encodings.
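To make the mismatch and the fix concrete, here is a minimal sketch. The helper name force_utf8_stdio is made up for this example and is not part of Python; it assumes Python 3.7 or newer, where io.TextIOWrapper.reconfigure() is available, and it assumes nothing has been read from stdin yet.

```python
import io
import sys


def force_utf8_stdio():
    """Switch stdin, stdout, and stderr to UTF-8 where possible (hypothetical helper)."""
    for stream in (sys.stdin, sys.stdout, sys.stderr):
        # reconfigure() is a TextIOWrapper method (Python 3.7+); streams that
        # were replaced by some other object (or are None) are left alone.
        if isinstance(stream, io.TextIOWrapper):
            stream.reconfigure(encoding="utf-8")


force_utf8_stdio()

text = "Qä"
# The same string becomes different bytes under each encoding.
print(text.encode("cp1252"))  # b'Q\xe4'
print(text.encode("utf-8"))   # b'Q\xc3\xa4'
# A UTF-8 decoder cannot make sense of the lone CP1252 byte 0xE4.
print(ascii(b"Q\xe4".decode("utf-8", errors="replace")))  # 'Q\ufffd'
print(f"...{text}...")
```

If you would rather not touch the script, the environment-variable route from Git Bash looks like PYTHONIOENCODING=utf-8 python script.py, where script.py stands in for your own file name.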
