In Python 3, I have a string like the following:
mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
This string was read from a file and it is the bytes representation of some text. To be clear, this is a unicode string, not a bytes object.
I need to transform mystr into a bytes object like the following:
mybytes = b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
Notice that the translation should be literal. I don't want to encode the string.
Running .encode('utf-8') will escape the \.
If I manually copy and paste the content into a bytes literal, then everything works. What I couldn't find anywhere is how to convert it without copy+paste.
bytes(bytearray(ord(i) for i in mystr)) seems to work ... though I feel like there should be a better way. Maybe the better way is to figure out how to not end up in this situation in the first place? :-)
'rb' gives b"\\x00\\x00...", which is not what I want. Looking for something unrelated I found the solution I posted below.
"Running .encode('utf-8') will escape the \." No, it won't. There isn't a \ to escape in the string shown here. If the file actually contains backslashes, lowercase xs, etc., then that is a separate problem; and you will see the backslashes escaped if you view a repr of the string, even without changing anything. However, .encode('utf-8') will corrupt the data (assuming each Unicode code point is intended to represent one byte): it encodes the code point 0x80 as the two bytes 0xc2 0x80, and 0xc0 as 0xc3 0x80.
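To make the points in these comments concrete, here is a minimal sketch. It assumes every code point in mystr is at most 255 and is meant to stand for exactly one byte. The variable names (expected, via_ord, via_latin1, via_utf8, escaped, via_unescape) are made up for illustration, and the last case only applies if the file really contains literal backslash sequences, which the question does not confirm.

```python
mystr = "\x00\x00\x01\x01\x80\x02\xc0\x02\x00"
expected = b"\x00\x00\x01\x01\x80\x02\xc0\x02\x00"

# The approach from the first comment: one byte per code point.
# bytes() raises ValueError if any code point is above 255.
via_ord = bytes(ord(ch) for ch in mystr)

# Latin-1 maps code points 0-255 to the identical byte values,
# so this gives the same literal translation.
via_latin1 = mystr.encode("latin-1")

# UTF-8 uses two bytes for every code point from 0x80 to 0x7ff,
# so the result is longer and no longer matches.
via_utf8 = mystr.encode("utf-8")

# Separate problem: the string holds literal backslashes and hex digits
# (assumed here for illustration). Decode the escapes first, then
# translate one-to-one with Latin-1.
escaped = "\\x00\\x00\\x01\\x01\\x80\\x02\\xc0\\x02\\x00"
via_unescape = (
    escaped.encode("latin-1").decode("unicode_escape").encode("latin-1")
)

print(via_ord == expected, via_latin1 == expected, via_unescape == expected)
print(via_utf8)
```

If a code point above 255 can appear, both one-to-one approaches raise an exception instead of silently producing wrong bytes, which is probably the behavior you want here.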