I'm using the checksum function in SQL Server 2008 R2, and I would like to get the same int values in a C# app.
Is there any equivalent method in C# that returns the same values as the SQL checksum function?
I'm using the checksum function in SQL Server 2008 R2, and I would like to get the same int values in a C# app.
Is there any equivalent method in C# that returns the same values as the SQL checksum function?
On SQL Server Forum, at this page, it's stated:
The built-in CHECKUM function in SQL Server is built on a series of 4 bit left rotational xor operations. See this post for more explanation.
I was able to port the BINARY_CHECKSUM to c# and it seems to be working... I'll be looking at the plain CHECKSUM later...
private int SQLBinaryChecksum(string text)
{
long sum = 0;
byte overflow;
for (int i = 0; i < text.Length; i++)
{
sum = (long)((16 * sum) ^ Convert.ToUInt32(text[i]));
overflow = (byte)(sum / 4294967296);
sum = sum - overflow * 4294967296;
sum = sum ^ overflow;
}
if (sum > 2147483647)
sum = sum - 4294967296;
else if (sum >= 32768 && sum <= 65535)
sum = sum - 65536;
else if (sum >= 128 && sum <= 255)
sum = sum - 256;
return (int)sum;
}
CHECKSUM docs don't disclose how it computes the hash. If you want a hash you can use in T-SQL and C#, pick from the algorithms supported in HashBytes
The T-SQL documentation does not specify what algorithm is used by checksum() outside of this:
CHECKSUM computes a hash value, called the checksum, over its list of arguments. The hash value is intended for use in building hash indexes. If the arguments to CHECKSUM are columns, and an index is built over the computed CHECKSUM value, the result is a hash index. This can be used for equality searches over the columns.
It's unlikely to compute an MD5 hash, since its return value (the computed hash) is a 32-bit integer; an MD5 hash is 128 bits in length.
In case you need to do a checksum on a GUID, change dna2's answer to this:
private int SQLBinaryChecksum(byte[] text)
With a byte array, the value from SQL will match the value from C#. To test:
var a = Guid.Parse("DEAA5789-6B51-4EED-B370-36F347A0E8E4").ToByteArray();
Console.WriteLine(SQLBinaryChecksum(a));
vs SQL:
select BINARY_CHECKSUM(CONVERT(uniqueidentifier,'DEAA5789-6B51-4EED-B370-36F347A0E8E4'))
both answers will be -1897092103.
@Dan's implementation of BinaryChecksum can be greatly simplified down in c# down to
int SqlBinaryChecksum(string text)
{
uint accumulator = 0;
for (int i = 0; i < text.Length; i++)
{
var leftRotate4bit = (accumulator << 4) | (accumulator >> -4);
accumulator = leftRotate4bit ^ text[i];
}
return (int)accumulator;
}
This also makes it clearer what the algorithm is doing. For each character, a 4 bit circular shift then an xor with character's byte
Based on other answers and comments, I made a version with some fixes for:
C# >= 11 (note the Unsigned right-shift operator >>>)
[DbFunction("BINARY_CHECKSUM", IsBuiltIn = true, IsNullable = false)] // server-evaluation BINARY_CHECKSUM(text)
public static int BinaryChecksum(string text)
{
// client-evaluation implementation (same result as BINARY_CHECKSUM(text))
if (text is null)
{
return int.MaxValue;
}
var result = 0;
foreach (var c in text.TrimEnd(' ', ' ').Take(255))
{
var circularShift4BitsToLeft = (result << 4) | (result >>> 28);
result = circularShift4BitsToLeft ^ c;
}
return result;
}
C# < 11
[DbFunction("BINARY_CHECKSUM", IsBuiltIn = true, IsNullable = false)] // server-evaluation BINARY_CHECKSUM(text)
public static int BinaryChecksum(string text)
{
// client-evaluation implementation (same result as BINARY_CHECKSUM(text))
if (text is null)
{
return int.MaxValue;
}
var result = 0u;
foreach (var c in text.TrimEnd(' ', ' ').Take(255))
{
var circularShift4BitsToLeft = (result << 4) | (result >> 28);
result = circularShift4BitsToLeft ^ c;
}
return unchecked((int)result);
}