I'm noticing a discrepancy between a JavaScript function run in Node and the same JavaScript run in a BigQuery UDF.
I am running the following in BigQuery:
CREATE TEMP FUNCTION testHash(md5Bytes BYTES)
RETURNS BYTES
LANGUAGE js AS """
md5Bytes[6] &= 0x0f;
md5Bytes[6] |= 0x30;
md5Bytes[8] &= 0x3f;
md5Bytes[8] |= 0x80;
return md5Bytes
""";
SELECT TO_HEX(testHash(MD5("test_phrase")));
and the output is cb5012e39277d48ef0b5c88bded48591 (this is incorrect).
Running the same code in Node gives cb5012e39277348eb0b5c88bded48591 (the expected value). Notice that two of the characters differ.
I've narrowed the issue down to BigQuery not actually applying the bitwise operators: skipping these operators in Node produces the same incorrect output as BigQuery:
md5Bytes[6] &= 0x0f;
md5Bytes[6] |= 0x30;
md5Bytes[8] &= 0x3f;
md5Bytes[8] |= 0x80;
Any ideas why the bitwise operators are not being applied to the md5Bytes input to the UDF?
md5Bytes[6] &= 0x0f doesn't change md5Bytes at all and you get the same result as input. I guess.
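
One possible explanation, assuming the UDF receives the BYTES argument as a base64-encoded string rather than a mutable byte array: JavaScript strings are immutable, so an indexed assignment on one has no effect. The Node sketch below reproduces both outputs from the question under that assumption. The names applyBits and patchHex are hypothetical helpers made up for illustration, and the input hex is the unmodified MD5 value reported in the question.

```javascript
// Unmodified MD5 of "test_phrase" as reported in the question.
const inputHex = "cb5012e39277d48ef0b5c88bded48591";

// The four operations from the question, applied to any indexable value.
function applyBits(bytes) {
  try {
    bytes[6] &= 0x0f;
    bytes[6] |= 0x30;
    bytes[8] &= 0x3f;
    bytes[8] |= 0x80;
  } catch (e) {
    // In strict mode, assigning to a string index throws a TypeError;
    // in sloppy mode it is silently ignored. Either way the string is unchanged.
  }
  return bytes;
}

// Case 1: a Buffer is mutable, so the operations take effect.
const fixed = applyBits(Buffer.from(inputHex, "hex")).toString("hex");

// Case 2 (assumption): the value arrives as a base64 string.
// Strings are immutable, so the same statements change nothing.
const asBase64 = Buffer.from(inputHex, "hex").toString("base64");
const unchanged = Buffer.from(applyBits(asBase64), "base64").toString("hex");

// Workaround sketch that only uses string operations: patch bytes 6 and 8
// of a hex string and build a new string instead of mutating the input.
function patchHex(hex) {
  const b6 = (parseInt(hex.slice(12, 14), 16) & 0x0f) | 0x30;
  const b8 = (parseInt(hex.slice(16, 18), 16) & 0x3f) | 0x80;
  const toHex = (n) => n.toString(16).padStart(2, "0");
  return hex.slice(0, 12) + toHex(b6) + hex.slice(14, 16) + toHex(b8) + hex.slice(18);
}

console.log(fixed);              // cb5012e39277348eb0b5c88bded48591
console.log(unchanged);          // cb5012e39277d48ef0b5c88bded48591
console.log(patchHex(inputHex)); // cb5012e39277348eb0b5c88bded48591
```

If the assumption holds, something like patchHex could be adapted into a UDF that takes TO_HEX(MD5(...)) as a STRING and returns a STRING, with FROM_HEX applied to the result in SQL. I have not run that variant in BigQuery.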