0

In my application, I save the content of URLs into a specific database table. To minimize duplication, I want to compute a checksum for each piece of content. What is the best SQL Server data type for storing checksums, and what is the fastest way to compute checksums for the (HTML) content of URLs?

2 Answers

3

SHA1 could be used to calculate the checksum. The result is a byte array, which could be stored either as a hex string or in a binary (blob) column in SQL Server, but I think for practical reasons a string would be more convenient.
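As a rough illustration of the idea above, here is a minimal Python sketch that computes a SHA1 digest of page content in the application and exposes it both as raw bytes and as a hex string. The function name `content_checksum` and the choice of UTF-8 encoding are assumptions made for the example, not something from the question.

```python
import hashlib

def content_checksum(html):
    """Return the SHA-1 digest of the page content as (raw bytes, hex string)."""
    # Hash the encoded bytes rather than the str so the result is stable;
    # UTF-8 is an assumption here, use the encoding the page was fetched with.
    digest = hashlib.sha1(html.encode("utf-8")).digest()
    return digest, digest.hex()

raw, hex_text = content_checksum("<html><body>Hello</body></html>")
print(len(raw), len(hex_text))  # 20 raw bytes, 40 hex characters
```

The raw form is half the size of the hex form, while the hex form is easier to read in query results and logs.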


2

You can use a built-in SQL Server function, HashBytes, to compute any of these: MD2, MD4, MD5, SHA, or SHA1.

Examples:

SELECT HashBytes('MD5', 'http://www.cnn.com')

That returns a varbinary value: 0xC50252F4F24784B5D368926DF781EDE9

SELECT CONVERT(VARCHAR(32),HashBytes('MD5', 'http://www.cnn.com'),2)

That returns a varchar value: C50252F4F24784B5D368926DF781EDE9

Now all you have to do is pick whether you want varchar or varbinary and use that type for your column.
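Whichever type you pick, the digest length is fixed per algorithm, so the column width can be fixed as well. The Python sketch below only checks those lengths from the application side. The helper name `digest_widths` is hypothetical, and the mapping to column types (for example BINARY(16) or CHAR(32) for MD5, BINARY(20) or CHAR(40) for SHA1) is a suggestion rather than part of the answer.

```python
import hashlib

def digest_widths(data):
    """Map algorithm name to (raw byte length, hex character length)."""
    widths = {}
    for name, hasher in (("MD5", hashlib.md5), ("SHA1", hashlib.sha1)):
        digest = hasher(data).digest()
        # The hex form always needs two characters per raw byte.
        widths[name] = (len(digest), len(digest.hex()))
    return widths

print(digest_widths(b"http://www.cnn.com"))
# {'MD5': (16, 32), 'SHA1': (20, 40)}
```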

See Generating a MD2, MD4, MD5, SHA, or SHA1 hash by using HashBytes

1 Comment

OK, this is a good approach, but there is a limitation: the maximum input length is 8000 bytes.
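One possible way around that limit is to compute the checksum in the application before inserting the row, since application-side hash libraries accept input of any length. The Python sketch below is only an illustration: the function name `stream_checksum` and the chunk size are arbitrary choices, and SHA1 is used to match the first answer.

```python
import hashlib
import io

def stream_checksum(stream, chunk_size=8000):
    """SHA-1 hex digest of a binary stream, read in chunks of chunk_size bytes."""
    hasher = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest()

page = b"<p>x</p>" * 5000  # 40000 bytes, well over 8000
print(stream_checksum(io.BytesIO(page)) == hashlib.sha1(page).hexdigest())  # True
```

Feeding the data in chunks gives the same digest as hashing it in one call, so large pages do not need to be held in a single buffer.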
