0

I encoded and decoded a bunch of coefficients (related to my previous question). The process is based on RLE where a bunch of coefficients are encoded and the runtime encoding is only focused on zeros. To cut it short this was the original array:

[200, -145, 0, 0, 0, 0, 51, 0, 0, 0, 0, 0, 0, 0, 0, -34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Encoded into binary data that looks like this:

['000011001000', '11001>101101111<', '000010001110110011', '00010000111>1011110<', '00011000110011101', '000100011']

To avoid binary numbers that look like -10010001 (-145), I manually performed twos complement(since I couldn't find a built in way) on negative numbers. The results for the numbers in this case (-145, -34) were (101101111, 1011110). To avoid confusion, I marked them in the array above for the purpouse of this question.

This was padded to be divisible by 8(the last element had 0's inserted into the beginning), divided into bytes and written into a file.

When I read the file, I decoded most things successfully and the number of coefficients is identical to the starting one. The problem arose with the negative values:

[200, 367, 0, 0, 0, 0, 51, 0, 0, 0, 0, 0, 0, 0, 0, 94, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Instead of -145 i got 367 and instead of -34 i got 94.

Is there any built-in way (or any kind of way) to convert bitstrings into signed values? I feel like this would fix my issue. I haven't been able to find a way and I'm stuck now.

1 Answer 1

0

For unsigned numbers the word size is not important because leading zeroes have no meaning there. For instance 5=101=0101=00101=0...0101. However, for two's complement the word size makes a difference because the first bit indicates negative numbers. For instance, -3=101 != 0101=5. If you don't know what the first bit is, you cannot tell whether the number is negative or not.

It seems like your encoding uses a variable word width. Since you already can decode the numbers, you already know how wide each word is.

# these variables should be set by your decoder
# in this case we read -145 encoded as 101101111
width = 9
word = 367

# add this to your decoder to fix the sign
firstBit = word >> (width - 1)
if (firstBit == 1):
  leadingOnes = (-1 << width)
  word = leadingOnes | word

The same could be done without the branch and in a single statement, but I think this could be slower on average for CPython and certainly less readable.

word |= -(word >> (width - 1)) << width

Of course you have to make sure that non-negative numbers are encoded with a leading 0 so that you can tell them apart from the negative ones.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.