Convert bytes data inside a string to a true bytes object [duplicate]

Question

In Python 3, I have a string like the following:

mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"

This string was read from a file and it is the bytes representation of some text. To be clear, this is a unicode string, not a bytes object.

I need to transform mystr into a bytes object like the following:

mybytes = b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00"

Notice that the translation should be literal. I don't want to encode the string.

Running .encode('utf-8') will escape the \.

If I manually copy and paste the content into a bytes literal, everything works. What I couldn't find anywhere is how to convert it without copying and pasting.

bytes(bytearray(ord(i) for i in mystr)) seems to work ... though I feel like there should be a better way. Maybe the better way is to figure out how to not end up in this situation in the first place? :-)
– mgilson, Commented Jul 13, 2016 at 0:39
@mgilson thanks! I was thinking about that but this is what I have. Reading the file in 'rb' gives me "\\x00\\x00...", which is not what I want. While looking for something unrelated, I found the solution I posted below.
– Leo Uieda, Commented Jul 13, 2016 at 0:42
I ended up deleting my answer because it didn't really work. There were some extra characters being printed in the middle that I hadn't noticed before.
– Leo Uieda, Commented Jul 13, 2016 at 0:53
"Running .encode('utf-8') will escape the \." No, it won't. There isn't a \ to escape in the string shown here. If the file actually contains backslashes, lowercase xs, etc., then that is a separate problem; and you will see the backslashes escaped if you view a repr of the string, even without changing anything. However, .encode('utf-8') will corrupt the data (assuming each Unicode code point is intended to represent one byte) by prepending a 0xc2 byte before the 0x80, and 0xc3 before the 0xc0.
– Karl Knechtel, Commented Aug 5, 2022 at 2:49
I'm not sure what this question was intended to be, but it's one of these duplicates for sure.
– Karl Knechtel, Commented Aug 5, 2022 at 4:36
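
To make the points raised in the comments concrete, here is a small sketch (not part of the original thread) that compares the per-character ord() conversion with .encode('utf-8') on the string from the question. The names expected, via_ord, via_utf8, escaped, and unescaped are illustrative. The last part assumes the file really holds literal backslash escapes, as the 'rb' comment hints; that is only a guess about the asker's data.

```python
mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
expected = b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00"

# Per-character conversion from the first comment:
# each code point (all below 256 here) becomes one byte.
via_ord = bytes(bytearray(ord(ch) for ch in mystr))
assert via_ord == expected

# UTF-8 encodes code points U+0080..U+07FF as two bytes,
# so 0x80 becomes c2 80 and 0xc0 becomes c3 80.
via_utf8 = mystr.encode("utf-8")
assert via_utf8 == b"\x00\x00\x01\x01\xc2\x80\x02\xc3\x80\x02\x00"
assert len(via_utf8) == len(expected) + 2

# Hypothetical case: the text contains literal backslash escapes
# (a backslash, an "x", two hex digits) rather than the characters themselves.
escaped = r"\x00\x01\x80\xc0"
unescaped = escaped.encode("ascii").decode("unicode_escape")
assert [ord(ch) for ch in unescaped] == [0x00, 0x01, 0x80, 0xC0]
```

If the escaped case applies, the unescaped string still needs a one-code-point-per-byte conversion afterwards to become a bytes object.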

Paul Cornelius · Accepted Answer · 2016-07-13 01:17:29Z

2

mystr.encode("latin-1") is what you want.
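
A minimal sketch of why this works, assuming every character in mystr has a code point below 256: Python's latin-1 codec maps code points 0 to 255 onto the byte with the same value, so the conversion is literal. The helper name str_to_bytes is hypothetical and only wraps the call from the answer.

```python
def str_to_bytes(text: str) -> bytes:
    """Map each code point in 0..255 to the byte with the same value."""
    return text.encode("latin-1")


mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
mybytes = str_to_bytes(mystr)

# The result matches the bytes literal from the question.
assert mybytes == b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00"

# Decoding with the same codec gives back the original string.
assert mybytes.decode("latin-1") == mystr

# Characters above U+00FF have no latin-1 byte, so encoding fails loudly
# instead of silently changing the data.
try:
    str_to_bytes("\u0100")
    raised = False
except UnicodeEncodeError:
    raised = True
assert raised
```

The failure on code points above 255 is useful here: it signals that the string was not byte data stored one byte per character in the first place.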
