14

I have a few classes with heterogeneous keys (int and string), and I want to work with them through a common interface. Simply converting the ints to strings is easy, but it will obviously cause performance issues. Other options I see are to box them as "object", which also doesn't seem ideal, or to somehow generate unique integers from the strings (there will be no joins between the former "string" keys and the "int" keys, so they only need to be unique within the "string" domain). The question is: how?

1
  • Why would converting int to string obviously cause performance issues? I think it is faster than string.GetHashCode(). Commented Jul 14, 2019 at 14:47

3 Answers

19

Just use string.GetHashCode(), which returns an int for a string with a very low collision probability.


11 Comments

"Very low" isn't enough, and I don't want to build a collision-resolution mechanism; it seems like overhead here.
@user1437713 - What you want is not possible. Any algorithm like this carries a risk of collisions; even GUIDs are not actually unique.
Ramhound, depends on the version.
There could be millions, and the ratio of "string key" objects to "integer key" objects is 1:10, which is why I don't want to convert int to string.
@Ted: This is true only for .Net Core, which came out way after this post was written. In .Net Framework you always get the same hash code.
5

Be wary of string.GetHashCode().

The .NET documentation (https://msdn.microsoft.com/en-us/library/system.string.gethashcode(v=vs.110).aspx) states:

The hash code itself is not guaranteed to be stable. Hash codes for identical strings can differ across versions of the .NET Framework and across platforms (such as 32-bit and 64-bit) for a single version of the .NET Framework. In some cases, they can even differ by application domain.
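If a value that stays the same across runs and platforms is needed, one option is to compute the hash yourself. Below is a minimal sketch using 32-bit FNV-1a over the UTF-8 bytes of the string. This is a swapped-in technique, not what string.GetHashCode() does internally, and the StableHash and Fnv1a32 names are made up for illustration. It is deterministic, but like any 32-bit hash it can still collide.

```csharp
using System;
using System.Text;

// Hypothetical helper: a deterministic 32-bit hash for strings.
static class StableHash
{
    // 32-bit FNV-1a computed over the UTF-8 encoding of the input.
    public static int Fnv1a32(string s)
    {
        const uint offsetBasis = 2166136261; // standard FNV-1a 32-bit offset basis
        const uint prime = 16777619;         // standard FNV 32-bit prime

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(s))
        {
            hash ^= b;
            hash = unchecked(hash * prime);  // wrap around on overflow
        }
        // Reinterpret the 32 bits as a signed int; the result may be negative.
        return unchecked((int)hash);
    }
}
```

The same input always produces the same int here, so the value can be stored or compared across processes, which is not something string.GetHashCode() promises.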

1 Comment

Almost even worse: GetHashCode will generate a different value for each execution of the application. It's easy to test: just write a console app with something like Console.WriteLine("".GetHashCode().ToString()); and run the application twice. The output will be different each time.
3

As @tudor pointed out, GetHashCode is the supported way of producing a hash code from strings (and other objects). Unfortunately, there is no way to do such a transformation so that each integer represents a unique string, unless you put severe restrictions on the set of strings.

For example, if your strings are short enough (2 Unicode (UTF-16) or 4 ASCII characters), then there is an obvious one-to-one mapping; the same holds if your set of strings is limited and known in advance.
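Both restricted cases can be sketched in a few lines. The code below is only an illustration: the UniqueKeys class and its Pack, Unpack, and BuildIds methods are hypothetical names, and Pack assumes the strings never contain the '\0' character so that zero padding stays unambiguous.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helpers for the two restricted cases.
static class UniqueKeys
{
    // Case 1: strings of at most two UTF-16 code units fit into 32 bits.
    // Assumption: the string contains no '\0', which is used as padding.
    public static int Pack(string s)
    {
        if (s.Length > 2) throw new ArgumentException("At most 2 characters.", nameof(s));
        if (s.IndexOf('\0') >= 0) throw new ArgumentException("'\\0' is not allowed.", nameof(s));
        int hi = s.Length > 0 ? s[0] : 0;
        int lo = s.Length > 1 ? s[1] : 0;
        return (hi << 16) | lo;
    }

    // Inverse of Pack: recovers the original string from the packed int.
    public static string Unpack(int key)
    {
        char hi = unchecked((char)((uint)key >> 16));
        char lo = unchecked((char)(key & 0xFFFF));
        if (hi == '\0') return "";
        if (lo == '\0') return hi.ToString();
        return new string(new[] { hi, lo });
    }

    // Case 2: the set of strings is known in advance, so each distinct
    // string simply gets the next sequential integer.
    public static Dictionary<string, int> BuildIds(IEnumerable<string> knownKeys)
    {
        var ids = new Dictionary<string, int>(StringComparer.Ordinal);
        foreach (string key in knownKeys)
        {
            if (!ids.ContainsKey(key)) ids.Add(key, ids.Count);
        }
        return ids;
    }
}
```

For instance, UniqueKeys.Pack("ab") yields 0x00610062 and Unpack turns it back into "ab", while BuildIds(new[] { "red", "green", "red" }) assigns 0 to "red" and 1 to "green". Neither approach can collide, but both depend on the stated restrictions holding for every key.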

Some reading on the subject: the underlying problem is the pigeonhole principle, which guarantees collisions. Due to the birthday paradox, collisions are very likely to happen even on reasonably small sets.
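To put rough numbers on the birthday effect, here is the usual approximation, assuming the hash behaves like a uniformly random 32-bit value (an idealization; a real hash function may do worse). With n distinct strings and N = 2^32 possible values:

```latex
p(n) \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right), \qquad N = 2^{32}

% 50% collision probability is reached at about
n_{1/2} \;\approx\; \sqrt{2 \ln 2 \cdot N} \;\approx\; 1.1774 \cdot 2^{16} \;\approx\; 77{,}000

% for one million keys the exponent is already large:
\frac{n^2}{2N} \;\approx\; \frac{10^{12}}{2^{33}} \;\approx\; 116
\quad\Longrightarrow\quad p(10^6) \approx 1
```

So under this assumption a collision is more likely than not at roughly 77,000 keys, and practically certain at the "millions" of keys mentioned in the comments.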
