15

I'm working on a project that includes WebSockets, and data between the server (Node.js) and the client (Chrome) is sent using a custom (very simple) format for data exchange I set up.

I'm sending data in pieces of 3 bits because I'm sending items which all have 8 possibilities. The data format looks like this:

            0          1
bit index   01234567 8901...
item        aaabbbcc cddd...

Currently, I'm parsing the items out of the bytes like this:

var itemA = bytes[0] >> 5;
var itemB = (bytes[0] >> 2) & 7;
var itemC = (bytes[0] & 3) << 1 | bytes[1] >> 7;
var itemD = (bytes[1] >> 4) & 7;

Personally, this feels as being too sophisticated. The problem is that it's only complex because I'm getting data in bytes, which are a multiple of 8. To parse out items of 3 bits I have to bit-shift, doing AND operations, and because 8 is not divisible by 3 I sometimes even have to combine parts of two bytes like for itemC.

It would be much more effective to read this data as groups of 3 bits instead of groups of 8 bits.

What I've come up with is converting all bytes into bits to a string using .toString(2), then using .substring to get a substring with length 3, and converting back to a number with parseInt(bitString, 2), but I guess that's not the way to do it, since string manipulation is slow and I'm actually not doing anything string-related.

Is it possible to read bits in groups of e.g. 3 instead of parsing them from bytes? Or is there a more efficient way to read bits out of bytes?

6 Answers 6

8

The binary AND and bit shifting operations are the fastest way of doing this. They translate well into machine code instructions. The only way to further speed this up is by sacrificing bandwidth for speed by for example simply not using more than 3 bits per byte, but judging from your question you've probably already considered and rejected that tradeoff.

Sign up to request clarification or add additional context in comments.

4 Comments

Do you imply there is no real means of getting rid of the multiple of 8 that one always obtains with bytes?
That's how computers work, with bytes (8 bits) and multiples of that (16, 32, 64 bits) so even if a higher level language would allow you to work with anything else, in the end that's what these instructions need to be translated to.
I'd always believed that bitwise operations were lame in JS, because all JS was supposed to know about is 64bit float, and all else is software emulated. Have the passing years made this old truth wrong ?
JS has had ArrayBuffer and Uint8Array and family since ES2015 (the year you asked). So yes, it's not what it used to be.
3
function byte2bits(a)
{
    var tmp = "";
    for(var i = 128; i >= 1; i /= 2)
        tmp += a&i?'1':'0';
    return tmp;
}
function split2Bits(a, n)
{
    var buff = "";
    var b = [];
    for(var i = 0; i < a.length; i++)
    {
        buff += byte2bits(a[i]);
        while(buff.length >= n)
        {
            b.push(buff.substr(0, n));
            buff = buff.substr(n);
        }
    }
    return [b, buff];
}
var a, b, r;
a = [227, 142];
[b, r] = split2Bits(a, 3);
//b = ["111", "000", "111", "000", "111"];
//r = '0'; //rest of bits

Comments

1

if endian-ness is taken care, you can access it as an int or a long int array. There is another possibily of not using bit 3 and bit 7

Comments

1

We can get value we need by getting appropriate 16-bits integer and then bitshift it.

It is clear, that to get i-th value we should get 16-bits integer with offset in bytes that fits (bits * (i + 1) - 16)/8 <= offset <= (bits * i)/8.

Lets take M=bits*i/8, so we have M + bits/8 - 2<= offset <= M. Then we get minimal offset as ceil(M + bits/8 - 2) and calculate position of i-th value in the 16-bit integer by offsets. I have just wrote the following function

function getDataFromStream(buffer, bitsPerValue, endianness) {
    var valuesCount = Math.floor(buffer.length * 8 / bitsPerValue);
    var ret = new Buffer(valuesCount);

    if (valuesCount > 0) {
        for (var i = 0; i < valuesCount; i++) {
            var offsetMin = Math.ceil(bitsPerValue * i / 8. + bitsPerValue / 8. - 2);
            if (offsetMin < 0) {
                offsetMin = 0;
            }
            if(endianness == 'BE')
                var wordWithValue = buffer.readUInt16BE(offsetMin, true);
            else
                var wordWithValue = buffer.readUInt16LE(offsetMin, true); 
            var offsetInWord = bitsPerValue * i - offsetMin * 8;
            var leftInWord = 16 - bitsPerValue - offsetInWord;

            // then get value in the word by shifting and then remove other bits by "%"
            ret[i] = (wordWithValue >> (endianness == 'BE' ? leftInWord : offsetInWord ))  % Math.pow(2, bitsPerValue);
        }
    }
    return ret;
}

And the following example to read 8 5-bit values off the Buffer with 5 bytes length.

// buffer with 5 bytes
var xx = new Buffer(5);
xx[0] = 255;
xx[1] = 255;
xx[2] = 255;
xx[3] = 255;
xx[4] = 250;

// get data, 5bits per value.
var yy = getDataFromStream(xx, 5, 'BE');
console.log('got buffer with length='+ yy.length);
for(i = 0; i < yy.length; i++){
    console.log(i+'-'+yy[i]);
}

When I launch node test.js I got

got buffer with length=8
0-31
1-31
2-31
3-31
4-31
5-31
6-31
7-26

4 Comments

I have problems getting this to work with values larger than 255. Could you possibly give an update?
@Geert-Jan but one byte is 0-255 only. What do you want from my code? =)
I was looking for some generic encode/decode functions to bitpack values of arbitrary length. E.g.: sometimes I would have an array of values of which the max value would be 220. Other times it would be 1045. To encode the latter array I would need 11 bits per item (and thus > 8) . Not really in scope of the original question though ;)
It was six years ago, I look at this piece of my code just as you do :) You should just add converting into bytes.
0

If you have a number of fixed length (ie you can always guarantee it'll be 2 bytes), you can read the bits like this:

// convert to binary
const num = 256;
const numAsBinaryString = num.toString(2);
const leastSignificantByteAsBinaryString = numAsBinaryString.substr(8);

const [
    eighthBit,
    seventhBit,
    sixthBit,
    fifthBit,
    fourthBit,
    thirdBit,
    secondBit,
    firstBit,
] = leastSignificantByteAsBinaryString.join('');

Comments

0

The shortest way I know is this:

function isBitSet(value, bit) {
    return !!(bit === 1 ? value & 1 : value & Math.pow(2,bit));
}

This is assuming value is a "big-endian" byte (the left side of the number is written in memory before the right side). If it isn't, you'll need to convert it first.

This function should also work for any bit-size number (16, 32, 64...).

HOW IT WORKS: Because each bit, starting from the right most, represents an equivalent "decimal" value equals to 2 (representing 2 symbols in the Binary system) to the power of the bit position (1-based, excluding the first which is just 1) - if that bit is set, ANDing it with the original value will always return a non-zero value - the result of Math.pow (which is also truthy), and 0 (falsy) if it is not.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.