SQL Server hash algorithms

Question

If my input length is less than the hash output length, are there any hashing algorithms that can guarantee no collisions?

I know that a one-way hash can have collisions across multiple inputs because hashing is lossy, especially since the input is often larger than the output, but does that still apply when the input is smaller than the output?

Must have been a copy-paste fail. Regardless, it looks like a suitable answer was given :)
– Xedni, commented Jan 12, 2015 at 21:48

usr · Accepted Answer · 2015-01-09 22:04:13Z


Use a symmetric block cipher with a randomly chosen static key. Encryption can never produce a duplicate because that would prevent unambiguous decryption.

This scheme will force a certain output length, which is a multiple of the cipher block size. If you can make use of a variable-length output, you can use a stream cipher as well.
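To illustrate why a keyed, invertible mapping cannot collide, here is a minimal sketch in Python. It is a toy 64-bit Feistel network built on SHA-256, not a vetted cipher and not what SQL Server provides; in practice you would use a real block cipher such as AES. The names `pad_to_block`, `encrypt_block` and `decrypt_block` are hypothetical, and the padding scheme (a `0x80` marker followed by zeros) is an assumption chosen so that inputs of different lengths still map to different blocks.

```python
import hashlib


def pad_to_block(data: bytes) -> int:
    """Pad an input of at most 7 bytes to one 8-byte block, unambiguously."""
    if len(data) > 7:
        raise ValueError("this sketch only handles inputs of up to 7 bytes")
    # 0x80 marks the end of the data, so b"\x01" and b"\x01\x00" differ.
    padded = data + b"\x80" + b"\x00" * (7 - len(data))
    return int.from_bytes(padded, "big")


def _round(half: int, key: bytes, i: int) -> int:
    # Round function: a Feistel network is invertible for any
    # deterministic round function, so a truncated SHA-256 is enough here.
    material = key + bytes([i]) + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(material).digest()[:4], "big")


def encrypt_block(value: int, key: bytes, rounds: int = 4) -> int:
    """Map a 64-bit integer to a 64-bit integer, one-to-one for a fixed key."""
    left, right = value >> 32, value & 0xFFFFFFFF
    for i in range(rounds):
        left, right = right, left ^ _round(right, key, i)
    return (left << 32) | right


def decrypt_block(value: int, key: bytes, rounds: int = 4) -> int:
    """Undo encrypt_block by running the rounds in reverse order."""
    left, right = value >> 32, value & 0xFFFFFFFF
    for i in reversed(range(rounds)):
        left, right = right ^ _round(left, key, i), left
    return (left << 32) | right
```

Because `decrypt_block` recovers every input exactly, two different inputs can never share an output under the same key. The trade-off is the one noted above: the output is always a full block, and anyone holding the key can reverse the mapping.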


Cryptopone · Accepted Answer · 2015-01-09 22:09:53Z

Your question sounds like you're looking for a perfect hash function. The problem with perfect hash functions is they tend to be tailored towards a specific set of data.

The following assumes you're not trying to hide, secure or encrypt the data...

To think of it another way, the easiest way to "generate" a perfect hash function that will accept your inputs is to map the data you want to store to a table and associate those inputs with a surrogate primary key. You then create a unique constraint for the column (or columns) to ensure the input you're mapping only maps to a single surrogate value.

The surrogate key could be int, bigint or a guid. It all depends on how many rows you're looking to store.
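The mapping-table idea can be sketched as follows. This uses Python's built-in `sqlite3` module as a stand-in for SQL Server, so the DDL and the `INSERT OR IGNORE` statement are SQLite syntax, not T-SQL. The table name `input_map`, the column names and the helper `surrogate_for` are all hypothetical.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE input_map ("
    " id INTEGER PRIMARY KEY,"             # surrogate key, assigned by SQLite
    " input_value TEXT NOT NULL UNIQUE)"   # unique constraint on the input
)


def surrogate_for(value: str) -> int:
    """Return the surrogate key for value, inserting a row on first sight."""
    # The UNIQUE constraint makes a repeated insert a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO input_map (input_value) VALUES (?)", (value,)
    )
    row = conn.execute(
        "SELECT id FROM input_map WHERE input_value = ?", (value,)
    ).fetchone()
    return row[0]
```

Calling `surrogate_for("alice")` twice returns the same id, while a different input gets a different id, so the table behaves like a collision-free "hash" for exactly the inputs it has seen.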

AaronLS · Accepted Answer · 2015-01-09 23:09:03Z

If your input lengths are known to be small, such as 32 bits, then you could actually enumerate all possible inputs and check the resulting hashes for collisions. That's only 4,294,967,296 (2^32) possible inputs, and it shouldn't take too terribly long to enumerate all of them. Essentially you'd be building a full lookup table, much like a rainbow table, to test for collisions.

If any security relies on this, though, one issue is that an attacker who knows your input lengths are constrained can easily perform the same enumeration to build a table that maps hashes back to the original values. "Attacker" is a pretty terrible term here, though, because I have no context on how you are using these hashes or whether you are concerned about them being reversed.
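A rough sketch of that enumeration, in Python with `hashlib`. SHA-256 is an assumption here, since the question does not name an algorithm, and the helpers `truncated_hash` and `find_collision` are hypothetical. The input width is a parameter so the idea can be tried on something much smaller than 32 bits; a full 2^32 run would need far more memory than a plain dictionary comfortably allows.

```python
import hashlib


def truncated_hash(value: int, input_bytes: int, out_bytes: int) -> bytes:
    """SHA-256 of a fixed-width integer, cut down to out_bytes bytes."""
    digest = hashlib.sha256(value.to_bytes(input_bytes, "big")).digest()
    return digest[:out_bytes]


def find_collision(input_bits: int, out_bytes: int = 32):
    """Enumerate every input of input_bits bits and look for a collision.

    Returns (collision, table). collision is the first colliding pair of
    inputs, or None if every input hashed to a distinct value. table maps
    each digest seen so far back to the input that produced it.
    """
    input_bytes = (input_bits + 7) // 8
    table = {}
    for value in range(2 ** input_bits):
        digest = truncated_hash(value, input_bytes, out_bytes)
        if digest in table:
            return (table[digest], value), table
        table[digest] = value
    return None, table
```

When no collision is found, `table` is also exactly the reverse map described above: anyone who knows the input range can build it and look up the original value for a given hash.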
