I am trying to pack a char to bytes in Python using the struct module, but I can't get 4 bytes out of it when packing the char with this code:
import struct

def charToHex(s):
    # check if the input is a (unicode) str
    if isinstance(s, str):
        print(struct.pack('<c', 'a'.encode(encoding='utf-8')))
        return '{:02x}'.format(struct.unpack('<I', struct.pack('<c', s.encode('utf-8')))[0])
    # check if the input is already bytes
    elif isinstance(s, bytes):
        return '{:02x}'.format(struct.unpack('<I', struct.pack('<c', s))[0])
    else:
        raise Exception()
Can anyone explain to me why this won't work? I am just trying to convert the Unicode char to 4 bytes and unpack it, but it won't even pack correctly.
The c format is char in the C sense of a single byte, not the Python sense of a Unicode code point. Since the UTF-8 encoding of a Unicode character is anywhere from 1 to 4 bytes, you can't pack it as a c. You'd have to do something silly like pad it out to 4 bytes and pack that as 4c (at which point it's a lot simpler to use UTF-32 instead of UTF-8).
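A minimal sketch of the UTF-32 route, in case it helps. The name char_to_hex, the choice of little-endian 'utf-32-le' (to match the '<I' format in the question), and the 8-digit zero padding are my assumptions, not something the question specifies:

```python
import struct

def char_to_hex(ch):
    """Return the code point of a single character as 8 hex digits."""
    # Accept UTF-8 encoded bytes as well, mirroring the question's two branches.
    if isinstance(ch, bytes):
        ch = ch.decode('utf-8')
    if not isinstance(ch, str) or len(ch) != 1:
        raise ValueError('expected exactly one character')
    # 'utf-32-le' always yields exactly 4 bytes per code point and no BOM,
    # so the result can be unpacked as one little-endian unsigned int.
    packed = ch.encode('utf-32-le')
    (code_point,) = struct.unpack('<I', packed)
    return '{:08x}'.format(code_point)

# The 'c' format only ever produces a single byte:
one_byte = struct.pack('<c', b'a')   # length 1, too short for '<I'

# A multi-byte UTF-8 sequence cannot be packed as 'c' at all:
try:
    struct.pack('<c', '€'.encode('utf-8'))
    multi_byte_packed = True
except struct.error:
    multi_byte_packed = False
```

With this, char_to_hex('a') gives '00000061' and char_to_hex('€') gives '000020ac'. The original code fails at the unpack step because '<I' needs a 4-byte buffer and 'c' only ever supplies one byte.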