Below is a Base64 image-encoding function that I got from Philippe Tenenhaus (http://www.philten.com/us-xmlhttprequest-image/).

I find it very confusing, but I'd love to understand it.

I think I understand the bitwise & and | operators, and moving bits to a different position with << and >>.

I'm especially confused by these lines: ((byte1 & 3) << 4) | (byte2 >> 4); ((byte2 & 15) << 2) | (byte3 >> 6);

Why is it still using byte1 for enc2, and byte2 for enc3? And what is the purpose of enc4 = byte3 & 63; ...

Could someone explain this function?

function base64Encode(inputStr)
{
    var b64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
    var outputStr = "";
    var i = 0;

    while (i < inputStr.length)
    {
        //all three "& 0xff" added below are there to fix a known bug
        //with bytes returned by xhr.responseText
        var byte1 = inputStr.charCodeAt(i++) & 0xff;
        var byte2 = inputStr.charCodeAt(i++) & 0xff;
        var byte3 = inputStr.charCodeAt(i++) & 0xff;

        var enc1 = byte1 >> 2;
        var enc2 = ((byte1 & 3) << 4) | (byte2 >> 4);

        var enc3, enc4;
        if (isNaN(byte2))
        {
            enc3 = enc4 = 64;
        }
        else
        {
            enc3 = ((byte2 & 15) << 2) | (byte3 >> 6);
            if (isNaN(byte3))
            {
                enc4 = 64;
            }
            else
            {
                enc4 = byte3 & 63;
            }
        }

        outputStr += b64.charAt(enc1) + b64.charAt(enc2) + b64.charAt(enc3) + b64.charAt(enc4);
    }

    return outputStr;
}

1 Answer

It probably helps to understand what Base64 encoding does: it takes 24 bits at a time, as three 8-bit bytes, and regroups them into four 6-bit values, each of which selects one of 64 characters (http://en.wikipedia.org/wiki/Base64).

So enc1 is the first 6 bits of the output, which are the first 6 bits of the first byte (byte1 >> 2).

enc2 is the next 6 bits: the last 2 bits of the first byte followed by the first 4 bits of the second byte. That is why byte1 is used again. The bitwise AND byte1 & 3 keeps only the last 2 bits of the first byte:

XXXXXXXX & 00000011 = 000000XX

The result is then shifted left by 4 bits, to make room for the 4 bits that come from the second byte:

000000XX << 4 = 00XX0000.

The expression byte2 >> 4 shifts right by 4 bits, isolating the first 4 bits of the second byte (Y marks the bits that are kept, Z the bits that are dropped):

YYYYZZZZ >> 4 = 0000YYYY

So ((byte1 & 3) << 4) | (byte2 >> 4) combines the two results with a bitwise OR:

00XX0000 | 0000YYYY = 00XXYYYY
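To make the bit patterns above concrete, here is a small worked sketch of the enc2 step. It assumes the input begins with the two characters "Ma", chosen only for illustration; the intermediate variable names (lowTwo, shifted, highFour) are hypothetical and do not appear in the original function.

```javascript
// Worked example for enc2, assuming the input starts with "Ma".
var byte1 = "M".charCodeAt(0); // 77 = 01001101
var byte2 = "a".charCodeAt(0); // 97 = 01100001

var lowTwo = byte1 & 3;        // 01001101 & 00000011 = 00000001
var shifted = lowTwo << 4;     // 00000001 << 4       = 00010000
var highFour = byte2 >> 4;     // 01100001 >> 4       = 00000110
var enc2 = shifted | highFour; // 00010000 | 00000110 = 00010110 = 22

var b64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
console.log(enc2, b64.charAt(enc2)); // 22 W
```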

enc3 is built the same way: the last 4 bits of the second byte (byte2 & 15, shifted left by 2) followed by the first 2 bits of the third byte (byte3 >> 6).

enc4 is the last 6 bits of the third byte (byte3 & 63, where 63 is 00111111 in binary).
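The four extractions can be put side by side in one helper. This is only a sketch: splitTriplet is a hypothetical name, and the sample bytes are the character codes of "Man". It may also help with the question of why the shifts are 4 and 2: each left shift leaves exactly as many empty low bits as the next byte contributes.

```javascript
// Split three 8-bit values into four 6-bit values (sketch; helper name is hypothetical).
function splitTriplet(byte1, byte2, byte3) {
    return [
        byte1 >> 2,                         // top 6 bits of byte1
        ((byte1 & 3) << 4) | (byte2 >> 4),  // low 2 bits of byte1 + top 4 bits of byte2
        ((byte2 & 15) << 2) | (byte3 >> 6), // low 4 bits of byte2 + top 2 bits of byte3
        byte3 & 63                          // low 6 bits of byte3
    ];
}

var b64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
var sextets = splitTriplet(77, 97, 110); // character codes of "M", "a", "n"
var encoded = sextets.map(function (n) { return b64.charAt(n); }).join("");
console.log(sextets, encoded); // [ 19, 22, 5, 46 ] TWFu
```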

charCodeAt returns a UTF-16 code unit, which is a 16-bit value, so the & 0xff masks assume that the relevant information is only in the low 8 bits. This assumption makes me wonder whether there is still a bug in the code: for any character whose code is above 255, the information in the high 8 bits is lost.
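A short sketch of what the mask keeps and discards. The second case is an assumption about the "known bug", not something confirmed by the question: when binary data is read through xhr.responseText with the charset overridden to x-user-defined, bytes 0x80 to 0xFF are commonly reported as code units 0xF780 to 0xF7FF, and masking with 0xff would recover the original byte.

```javascript
// Case 1: a real character above 255 loses its high byte.
var euro = "\u20AC".charCodeAt(0); // 0x20AC
var maskedEuro = euro & 0xff;      // 0xAC, the high byte 0x20 is gone

// Case 2 (assumption): a byte 0xFF that arrived as the code unit 0xF7FF.
var privateUse = 0xF7FF;
var recovered = privateUse & 0xff; // 0xFF, the original byte value

console.log(euro.toString(16), maskedEuro.toString(16)); // 20ac ac
console.log(recovered); // 255
```

If the first case can occur in the input, the mask silently corrupts the data; if only the second case occurs, the mask is what makes the encoder work.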

4 Comments

Great! I understand almost all of it. Why '(byte1 & 3) << 4' and '(byte2 & 15) << 2'? I don't understand why the shifts are 4 and 2.
OK, I see! (byte1 & 3) has to become the first 2 bits of the 6-bit value, so it is moved to a higher position: shifted left 4 places, i.e. x2x2x2x2. But about 'all three "& 0xff" added below are there to fix a known bug': what is the bug? Please tell me if I'm right: the function only reads the lowest octet of each character code (0xFF = 00000000000000000000000011111111), and reading more than that causes a bug.
I am not 100% sure what the bug was, but if I were to guess, it would be some sort of type conversion bug. It looks like the mask ensures that only one byte is kept, since 0xFF = 11111111 in binary.
The bug is that if the string length is not a multiple of 3, the missing bytes become 0 (charCodeAt returns NaN past the end of the string, and NaN & 0xff is 0), so the isNaN test is always false. A simple fix is to check whether i is still less than the string length: var byte2 = (i < inputStr.length) ? inputStr.charCodeAt(i++) & 0xff : undefined; var byte3 = (i < inputStr.length) ? inputStr.charCodeAt(i++) & 0xff : undefined;
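
For reference, here is a sketch of the function with the fix from the last comment applied. The name base64EncodeFixed is hypothetical, and the padding checks are written as explicit comparisons with undefined instead of isNaN; otherwise the logic follows the original. It still assumes that every character code fits in the low 8 bits.

```javascript
// Sketch: the original encoder with the length check from the comment applied.
function base64EncodeFixed(inputStr) {
    var b64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
    var outputStr = "";
    var i = 0;

    while (i < inputStr.length) {
        var byte1 = inputStr.charCodeAt(i++) & 0xff;
        // Leave missing bytes undefined instead of letting NaN & 0xff turn them into 0.
        var byte2 = (i < inputStr.length) ? inputStr.charCodeAt(i++) & 0xff : undefined;
        var byte3 = (i < inputStr.length) ? inputStr.charCodeAt(i++) & 0xff : undefined;

        var enc1 = byte1 >> 2;
        var enc2 = ((byte1 & 3) << 4) | (byte2 >> 4); // undefined >> 4 evaluates to 0
        var enc3 = 64; // index 64 is the "=" padding character
        var enc4 = 64;

        if (byte2 !== undefined) {
            enc3 = ((byte2 & 15) << 2) | (byte3 >> 6); // undefined >> 6 evaluates to 0
            if (byte3 !== undefined) {
                enc4 = byte3 & 63;
            }
        }

        outputStr += b64.charAt(enc1) + b64.charAt(enc2) + b64.charAt(enc3) + b64.charAt(enc4);
    }

    return outputStr;
}

console.log(base64EncodeFixed("Man")); // TWFu
console.log(base64EncodeFixed("Ma"));  // TWE=
console.log(base64EncodeFixed("M"));   // TQ==
```

With the original code the last call would produce "TQAA" instead, because the two missing bytes are encoded as zeros rather than as padding.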
