Python 2 and Python 3 handle strings and bytes differently, so printing a hex-escaped string that contains non-ASCII characters produces different output in Python 3 than in Python 2.

Why does this happen, and how can I print in Python 3 the way Python 2 does? (With ASCII characters or UTF-8 it works fine if you decode the byte string.)

Python3:

$ python3 -c 'print("\x41\xb3\xde\x41\x42\x43\xad\xde")' |xxd -p
41c2b3c39e414243c2adc39e0a

Python2:

$ python2 -c 'print "\x41\xb3\xde\x41\x42\x43\xad\xde"' |xxd -p
41b3de414243adde0a

The trailing \x0a is the newline that print adds.

How can I print "\xb3" in Python 3? It outputs "\xc2\xb3" instead of just "\xb3".

$ python3 -c 'print("\xb3")' |xxd
00000000: c2b3 0a                                  ...
$ python2 -c 'print "\xb3"' |xxd
00000000: b30a                                     ..
  • Can you please clarify what exactly you want to achieve? Do you want to write raw bytes without any (internal and external) encoding? Commented Feb 18, 2021 at 11:00
  • Yes, I want to print with Python 3 as Python 2 does. With Python 3 I cannot print "\xb3"; Python 3 prints "\xc2\xb3", and I just want "\xb3". I have added this example to the question. Commented Feb 18, 2021 at 11:04
  • Not to be too pedantic, but are you aware that "\xb3" in Python 2 and Python 3 are not the same? The closest equivalent to the Python 2 value in Python 3 is b"\xb3". So the questions "how do I write "\xb3" to stdout in Python 3 as in Python 2" and "how do I write the byte value 0xb3 to stdout in Python 3" are two different things. In short, do you already have the string literal and want to write that, or do you have the byte value and want to write that? Commented Feb 18, 2021 at 11:14
  • Yes, I know that in Python 3 it is b"\xb3", and that is the question: how can I print b"\xb3"? If I do bytes.fromhex("41b3de414243adde0a") I get b'A\xb3\xdeABC\xad\xde\n', which is OK, but when I print it, "\xc2\xb3" is added. That is the question. You are not being pedantic, do not worry. We are on the Internet. Commented Feb 18, 2021 at 11:20
  • More info: to print it I have tried utf-8 with ignore and replace. I have also tried latin_1. Before asking here I looked at docs.python.org/3/library/codecs.html Commented Feb 18, 2021 at 11:23

2 Answers

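The underlying problem is that in Python 3, str holds Unicode text (code points) rather than bytes, and print only handles str, so it always applies some encoding on output. With a UTF-8 stdout, the code point U+00B3 is encoded as the two bytes c2 b3.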
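As a small illustration of where the extra c2 byte comes from, the sketch below encodes the same one-character str with two codecs. The variable names are only for illustration.

```python
# In Python 3, "\xb3" is a str containing one code point, U+00B3.
text = "\xb3"

# UTF-8 needs two bytes for any code point in the range U+0080..U+07FF.
utf8_bytes = text.encode("utf-8")

# Latin-1 maps code points U+0000..U+00FF to the single byte of the same value.
latin1_bytes = text.encode("latin-1")

print(utf8_bytes.hex())    # c2b3
print(latin1_bytes.hex())  # b3
```

This matches the xxd output in the question: print encodes the str with the stdout encoding (UTF-8 there), so 0xb3 never reaches the pipe as a single byte.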

To write binary data, directly write bytes to the underlying binary pipe of stdout:

python3 -c 'import sys; sys.stdout.buffer.write(b"\x41\xb3\xde\x41\x42\x43\xad\xde")' | xxd -p
41b3de414243adde

Note that the final 0a is missing because .write adds no newline. Manually add it if it is desired.
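A minimal sketch of the same idea as a reusable helper, with the newline made optional. The name write_bytes and its parameters are hypothetical and not part of any library; the stream argument exists only so the behaviour can be checked against an in-memory buffer.

```python
import io
import sys


def write_bytes(data: bytes, stream=None, newline: bool = False) -> None:
    """Write raw bytes to a binary stream, bypassing any text encoding.

    If no stream is given, the binary buffer underneath sys.stdout is used.
    """
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(data)
    if newline:
        # print would add this for us; .write does not.
        stream.write(b"\n")
    stream.flush()


# Demonstrate against an in-memory binary stream.
demo = io.BytesIO()
write_bytes(bytes.fromhex("41b3de414243adde"), stream=demo, newline=True)
print(demo.getvalue().hex())  # 41b3de414243adde0a
```

Calling write_bytes(data, newline=True) without a stream sends the bytes to stdout. If the program also uses print, calling sys.stdout.flush() before writing to the buffer keeps the output in order, since the text layer has its own buffer.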

In case the data already exists as a string, the latin1 encoding can be used to get equivalent bytes:

python3 -c '
import sys
sys.stdout.buffer.write("\x41\xb3\xde\x41\x42\x43\xad\xde".encode("latin1"))' | xxd -p
41b3de414243adde
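The reason latin1 works here is that it maps each code point from U+0000 to U+00FF to the byte with the same value. The sketch below checks that round trip for all 256 byte values and shows the limit of the approach: a str containing anything above U+00FF cannot be encoded this way. The variable names are illustrative.

```python
# Every byte value 0x00..0xff survives a latin-1 decode/encode round trip.
all_bytes = bytes(range(256))
as_text = all_bytes.decode("latin-1")
round_tripped = as_text.encode("latin-1")
print(round_tripped == all_bytes)  # True

# Code points above U+00FF have no latin-1 byte, so encoding fails.
error = None
try:
    "\u20ac".encode("latin-1")  # U+20AC, the euro sign
except UnicodeEncodeError as exc:
    error = exc
print(type(error).__name__)  # UnicodeEncodeError
```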

I'm not sure if this will help in your case but you can use sys.stdout.buffer:

Note

To write or read binary data from/to the standard streams, use the underlying binary buffer object. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').

$ python3 -c 'import sys; sys.stdout.buffer.write(b"\x41\xb3\xde\x41\x42\x43\xad\xde")' | hexdump -C
00000000  41 b3 de 41 42 43 ad de                           |A..ABC..|
00000008

Also note that the output no longer ends with the newline character that print would have added.

1 Comment

Sure. That's the solution.
