
I am given strings structured like this: "\x0C\x00Maximilianus\xf4\x01", and I would like to dynamically extract the first two and the last two bytes and convert them to decimal numbers. The encoding for this should be UTF-8, little-endian, unsigned.

"\x0C\x00" equals 12

"\xf4\x01" equals 500

I have not been able to find a function that does this. Replacing "\x" in the string doesn't work either, because I cannot manipulate the escape characters.

Any thoughts?

  • Do you really want decimals? Or ints? Commented Mar 7, 2020 at 22:13
  • And you say you get strings but you're also talking about encoding, which doesn't really make sense. Commented Mar 7, 2020 at 22:15
  • How did you create this structure? If you know how it was created, then you know how to convert it back. For example, if you used struct.pack() to create it, then use struct.unpack() to convert it back. Commented Mar 7, 2020 at 22:17
  • BTW: the string Maximilianus has 12 chars, so "\x0C\x00" may be the length of the string, and this may be some scheme for sending data over a network. Commented Mar 7, 2020 at 22:26
  • print(struct.unpack('hh', b"\x0C\x00\xf4\x01")) gives (12, 500) Commented Mar 7, 2020 at 22:27
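The 'hh' format in the last comment relies on the machine's native byte order and reads signed values. A minimal sketch of the same call with the byte order and signedness spelled out, assuming the values really are unsigned little-endian as the question states (the variable names are only for illustration):

```python
import struct

raw = b"\x0C\x00\xf4\x01"

# "<" forces little-endian with standard sizes; "H" is an unsigned 2-byte integer.
first, last = struct.unpack("<HH", raw)
print(first, last)  # 12 500

# The same bytes read as big-endian give different numbers.
print(struct.unpack(">HH", raw))  # (3072, 62465)
```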

2 Answers


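You can use the struct module to get the numbers.

In the Format Characters table of the struct documentation you can see that "h" converts a 2-byte integer. Note that "h" is signed; "H" is the unsigned variant.
You can prefix the format with "<", as in "<h", to make sure it uses little-endian byte order.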

import struct

# convert to bytes
data = "\x0C\x00Maximilianus\xf4\x01".encode('latin1')

# get unsigned short integer ("<" = little-endian, "H" = unsigned 2 bytes,
# because the question asks for unsigned values)
number = struct.unpack('<H', data[:2])[0]
print('number:', number)

# skip number
data = data[2:]

# get string
#text = struct.unpack(f'{number}s', data[:number])[0] # use `number` to create `"12s"`
#print('text:', text.decode())
print('text:', data[:number].decode('latin1'))

# skip string
data = data[number:]

# get unsigned short integer
number = struct.unpack('<H', data[:2])[0]
print('number:', number)

BTW: this looks similar to MessagePack, so there may be a dedicated module for it, but I don't know of one.
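The step-by-step slicing can also be folded into two struct calls. This is only a sketch: it assumes the layout is a 2-byte unsigned length, then that many bytes of text, then a 2-byte unsigned value, all little-endian, which matches the sample but is not confirmed by the question. The function name parse_record is hypothetical.

```python
import struct

def parse_record(data: bytes):
    """Parse <uint16 length><text of that length><uint16 value>, little-endian."""
    # Read the 2-byte length prefix at offset 0.
    (length,) = struct.unpack_from("<H", data, 0)
    # Build a format for the rest: `length` raw bytes followed by one uint16.
    text, value = struct.unpack_from(f"<{length}sH", data, 2)
    return length, text.decode("latin1"), value

data = "\x0C\x00Maximilianus\xf4\x01".encode("latin1")
print(parse_record(data))  # (12, 'Maximilianus', 500)
```

With "<" the struct module adds no alignment padding, so the format covers exactly the bytes that follow the length prefix.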


2 Comments

data = "\x0C\x00Maximilianus\xf4\x01".encode('latin1') this was the line I was looking for. Then I can do int.from_bytes(data[:2],byteorder="little") to get the first part and int.from_bytes(data[-2:],byteorder="little") to get the last part.
You could post your comment as an answer; it may be useful for other users.

So here is my final solution with the help from furas:

data = "\x0C\x00Maximilianus\xf4\x01".encode('latin1')
name_len = int.from_bytes(data[:2],byteorder="little")
ending = int.from_bytes(data[-2:],byteorder="little")

print(name_len) # --> 12
print(ending) # --> 500
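The reason 'latin1' works here is that it maps each code point from 0 to 255 to the single byte with the same value, whereas UTF-8 encodes "\xf4" as two bytes. A short comparison, using the sample string from the question (the variable names are only for illustration):

```python
text = "\x0C\x00Maximilianus\xf4\x01"

as_latin1 = text.encode("latin1")
as_utf8 = text.encode("utf-8")

# latin1 keeps one byte per character; UTF-8 needs two bytes for "\xf4".
print(len(text), len(as_latin1), len(as_utf8))  # 16 16 17
print(as_latin1[-2:])  # b'\xf4\x01'
print(as_utf8[-3:])    # b'\xc3\xb4\x01'

# Only the latin1 bytes give the expected trailing value.
print(int.from_bytes(as_latin1[-2:], byteorder="little"))  # 500
print(int.from_bytes(as_utf8[-2:], byteorder="little"))    # 436
```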
