I'm kind of at a loss. I want to extract up to 64 bits (into an unsigned long long), with a given bit offset and bit length, from a string that comes from the network.

The string can be of arbitrary length, so I need to be sure to access it only bytewise. (This also means I can't use the _bextr_u32 intrinsic.) I can't use the std::bitset class because it doesn't allow extracting more than one bit at an offset, and it only allows extracting a predefined number of bits.

So I already calculate the byteoffset (within the string) and bitoffset (within the starting byte).

m_nByteOffset = nBitOffset / 8;
m_nBitOffset = nBitOffset % 8;

Now I can get the starting address

const char* sSource = str.c_str()+m_nByteOffset;

And the bitmask

unsigned long long nMask = 0xFFFFFFFFFFFFFFFFULL >> (64-nBitLen);
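
The offset and mask arithmetic above can be sketched as two small helpers. This is only an illustration: `BitPos`, `SplitBitOffset` and `LowMask` are hypothetical names, not part of the original code. One detail worth guarding is nBitLen == 0, because shifting a 64-bit value by 64 is undefined behavior in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: byte index plus bit index inside that byte.
struct BitPos {
    std::size_t nByteOffset;
    unsigned nBitOffset;
};

// Splits an absolute bit offset into (byte offset, bit offset within the byte).
inline BitPos SplitBitOffset(std::size_t nBitOffset) {
    return { nBitOffset / 8, static_cast<unsigned>(nBitOffset % 8) };
}

// Mask with the low nBitLen bits set, valid for nBitLen in 0..64.
inline std::uint64_t LowMask(unsigned nBitLen) {
    assert(nBitLen <= 64);
    if (nBitLen == 0) return 0;  // avoids the undefined shift by 64
    return 0xFFFFFFFFFFFFFFFFULL >> (64 - nBitLen);
}
```

For example, an absolute bit offset of 13 splits into byte offset 1 and bit offset 5.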

But now I just can't figure out how to extract up to 64 bits from this, as there are no 128-bit integers available.

unsigned long long nResult = ((*(unsigned long long*)sSource) >> m_nBitOffset) & nMask;

This only works for up to (64 - bitoffset) bits. How can I extend it to really work for 64 bits independently of the bit offset? Also, since this is not a bytewise access, it could cause a memory read access violation.

So I'm really looking for a bytewise solution to this problem that works for up to 64 bits (preferably C or intrinsics).

Update: After searching and testing a lot I will probably use this function from RakNet: https://github.com/OculusVR/RakNet/blob/master/Source/BitStream.cpp#L551

2 Answers

To do it byte-wise, just read the string (which, by the way, is better interpreted as a sequence of uint8_t rather than char) one byte at a time, updating your result by shifting it left by 8 and ORing it with the current byte. The only complications are the first and last bytes, each of which may be only partly covered by the field. For the first byte, simply use a bit mask to keep the bits you need, and for the last byte, shift it down by the amount needed. Here is the code:

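const uint8_t* sSource = reinterpret_cast<const uint8_t*>(str.c_str() + m_nByteOffset);

uint64_t result = 0;
const uint8_t FULL_MASK = 0xFF;

// Leading partial byte: mask off the bits above the start of the field
// (bit offset 0 is the most significant bit of the byte).
if (m_nBitOffset) {
    result = *sSource & (FULL_MASK >> m_nBitOffset);
    if (nBitLen <= 8 - m_nBitOffset) {
        // The whole field lies inside this byte: drop the unused low bits.
        return result >> (8 - m_nBitOffset - nBitLen);
    }
    nBitLen -= (8 - m_nBitOffset);
    ++sSource;
}

// Whole bytes in the middle, most significant byte first.
while (nBitLen > 8) {
    result <<= 8;
    result |= *sSource;
    nBitLen -= 8;
    ++sSource;
}

// Trailing byte: take its top nBitLen bits (1 to 8 of them).
if (nBitLen) {
    result <<= nBitLen;
    result |= (*sSource >> (8 - nBitLen));
}

return result;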
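
The same MSB-first walk can also be packaged as a function that checks the string length before touching any byte. This is a minimal sketch, not a tested library routine: the name `ExtractBitsMsbFirst` and its signature are hypothetical, and it assumes bit 0 is the most significant bit of the first byte (network order).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical wrapper: reads nBitLen (0..64) bits starting at absolute bit
// nBitOffset, where bit 0 is the most significant bit of str[0].
// Returns false if the field does not fit inside the string.
inline bool ExtractBitsMsbFirst(const std::string& str, std::size_t nBitOffset,
                                unsigned nBitLen, std::uint64_t& nResult) {
    if (nBitLen > 64) return false;
    std::size_t nByte = nBitOffset / 8;
    unsigned nBit = static_cast<unsigned>(nBitOffset % 8);
    if (nByte > str.size()) return false;
    // Number of bytes the field touches, compared against what is left.
    if ((nBit + nBitLen + 7) / 8 > str.size() - nByte) return false;

    const auto* p = reinterpret_cast<const std::uint8_t*>(str.data());
    std::uint64_t r = 0;
    unsigned nLeft = nBitLen;
    while (nLeft > 0) {
        unsigned nAvail = 8 - nBit;                        // unread bits in this byte
        unsigned nTake = nLeft < nAvail ? nLeft : nAvail;  // bits to consume now
        // Drop the already-skipped high bits, then keep the top nTake of the rest.
        unsigned nChunk = (p[nByte] & (0xFFu >> nBit)) >> (nAvail - nTake);
        r = (r << nTake) | nChunk;
        nLeft -= nTake;
        nBit = 0;
        ++nByte;
    }
    nResult = r;
    return true;
}
```

With the bytes 0x12 0x34 0x56, a bit offset of 4 and a bit length of 12 yield 0x234. Each shift is at most 8 bits, so no step needs more than 64 bits even when the field straddles nine bytes.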

18 Comments

I like this division of the problem; this will work. I'll do some tests later today.
A static_cast should work, since const char* should be compatible with const uint8_t*. That would give you some safety against typos.
This might compile to some pretty bad asm with gcc, e.g. it might actually shift and load one byte at a time. :/ In asm, loading up to 64 bits that might not be byte-aligned should be doable with a byte load, an unaligned 64-bit load, and a couple of shifts. (And an OR if you don't use a 2-register shift like x86's shrd to take a window of the concatenation of 2 regs.) Branches are optional, to skip the byte load if one 64-bit load can include the entire desired bitstring. If you're lucky, though, you'll get asm like that from this source.
Hm, this code didn't work: it puts the bytes in the wrong order, so the endianness is destroyed.
uint64_t nMask = 0xFFFFFFFFFFFFFFFFULL;
uint64_t nPattern = 0xFFFEFDFCFBFAF9F8ULL;
uint64_t nPattern2 = 0xFAAAAAAAAAAAAAFFULL;
EXPECT_EQ( nPattern, ExtractField( 0, 64, (uint8_t*)&nPattern) );
EXPECT_EQ( nMask >> 1, ExtractField( 1, 63, (uint8_t*)&nMask) );
EXPECT_EQ( nPattern >> 1, ExtractField( 1, 63, (uint8_t*)&nPattern) );
EXPECT_EQ( nPattern2 >> 1, ExtractField( 1, 63, (uint8_t*)&nPattern2) );
Well OK, but you didn't specify the endianness of your input (str.c_str()) and I assumed it was network byte order. I guess you're saying the data is little-endian instead?
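
If the data really is little-endian with bit 0 being the least significant bit of the first byte, as the test cases in the comment above suggest, the bytewise walk changes direction: each chunk is placed above the bits already collected instead of below them. A minimal sketch under that assumption follows. `ExtractBitsLsbFirst` is a hypothetical name with its own parameter order (it is not the asker's ExtractField), and the caller is assumed to have checked that the buffer is long enough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical LSB-first variant: bit 0 is the least significant bit of
// pData[0], and later bytes are more significant (little-endian layout).
// The caller must guarantee that all touched bytes are readable.
inline std::uint64_t ExtractBitsLsbFirst(const std::uint8_t* pData,
                                         std::size_t nBitOffset,
                                         unsigned nBitLen) {
    assert(nBitLen <= 64);
    const std::uint8_t* p = pData + nBitOffset / 8;
    unsigned nBit = static_cast<unsigned>(nBitOffset % 8);
    std::uint64_t r = 0;
    unsigned nDone = 0;  // bits already placed in r
    while (nDone < nBitLen) {
        unsigned nAvail = 8 - nBit;  // unread bits in this byte
        unsigned nTake = (nBitLen - nDone) < nAvail ? (nBitLen - nDone) : nAvail;
        // Skip the low nBit bits, then keep the next nTake bits.
        std::uint64_t nChunk = (static_cast<unsigned>(*p) >> nBit) & ((1u << nTake) - 1u);
        r |= nChunk << nDone;  // nDone is always below 64 here
        nDone += nTake;
        nBit = 0;
        ++p;
    }
    return r;
}
```

Because the bytes are read one at a time, the result does not depend on the host's endianness; only the layout of the input buffer matters.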

This is how I would do it in modern C++ style. The bit length is determined by the size of the buffer extractedBits: instead of using an unsigned long long, you could also use any other data type (or even array type) with the desired size.

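// Needs <algorithm> for std::transform and <climits> for CHAR_BIT.
// Note: the second input range starts one byte later, so this reads
// sizeof(extractedBits) + 1 bytes of str in total.
unsigned long long extractedBits;
char* extractedString = reinterpret_cast<char*>(&extractedBits);
std::transform(str.begin() + m_nByteOffset,
               str.begin() + m_nByteOffset + sizeof(extractedBits),
               str.begin() + m_nByteOffset + 1,
               extractedString,
               [=](char c, char d)
               {
                   // Remaining bits of the current byte, moved up by the bit offset.
                   char bitsFromC = (c << m_nBitOffset);
                   // Top bits of the next byte fill the vacated low bits.
                   char bitsFromD =
                       (static_cast<unsigned char>(d) >> (CHAR_BIT - m_nBitOffset));
                   return bitsFromC | bitsFromD;
               });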

4 Comments

Hm, I tried this code as well, but even basic test cases fail. Also, no bit length is used.
@Hendrik Updated! Now it works and I explained how to change the bit length.
I thought the OP's bitlength didn't have to be a power of 2, or even a multiple of 8, so e.g. the result could be 45 bits, stored in the low 45 of a 64bit integer (zero-extended to fill the upper 19 bits with 0).
Exactly, that's why it's missing. Anyway, I have now found the function I need in RakNet's bitstream processor.
