3

In most cases, HashSet has lookup complexity O(1). I understand that this is because objects are kept in buckets corresponding to hashcodes of the object.

When lookup is done, it directly goes to the bucket and finds (using equals if many objects are present in same bucket) the element.

I always wonder, how it directly goes to the required bucket? Which algorithm is used for bucket lookup? Does that add nothing to total lookup time?

5
  • The word is 'algorithm'. It is derived from a proper name. Abbreviation not permissible, indeed meaningless. Commented Feb 25, 2015 at 7:11
  • will take care from next time :) Commented Feb 25, 2015 at 7:13
  • Basically, buckets[o.hashCode() % buckets.length]. Commented Feb 25, 2015 at 7:13
  • 2
    its basically accessing a index i from array. so no look up time to reach index i. i is calulated from hashcode%array.length ; and hashcode is being calculated from has function. although it takes some amount of constant time and O(1) signifies that it doesen't depends on the size of elements. Commented Feb 25, 2015 at 7:17
  • each answer giving some more insight!! I can't accept them all :( Commented Feb 25, 2015 at 7:21

3 Answers 3

4

I always wonder, how it directly goes to the required bucket?

The hash code is treated and used as an index in to an array.

The index is determined by hash & (array.length - 1) because the length of the Java HashMap's internal array is always a power of 2. (This a cheaper computation of hash % array.length.)

Each "bucket" is actually a linked list (and now, possibly a tree) where entries with colliding hashes are grouped. If there are collisions, then a linear search through the bucket is performed.

Does that add nothing to total lookup time?

It incurs the cost of a few loads from memory.

Sign up to request clarification or add additional context in comments.

1 Comment

HashMap#hash also mangles the hash to distribute the bits better
2

Often, the algorithm is simply

hash = hashFunction(key)
index = hash % arraySize

See the wikipedia article on Hash Table for details.

Comments

1

From memory: the HashSet is actually backed by a HashMap and the basic look up process is:

  • Get the key
  • hash it (hashcode())
  • hashcode % the number of buckets
  • Go to that bucket and evaluate equals()

For a Set there would only be unique elements. I would suggest reading the source for HashSet and it should be able to answer your queries.

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#HashMap.containsKey%28java.lang.Object%29

Also note that the Java 8 code has been updated and this explanation covers pre Java 8 codebase. I have not examined in detail the Java 8 implementation except to figure out that it is different.

2 Comments

It is not true that in a set there can be only one element per bucket. If you have a clash (two different elements with the same hash), they would end up in the same bucket. In the source code you linked you can even see that every bucket has a linked list of Entry elements.
You are right, what I meant was that there would be only unique keys in the set and no duplicates. Updated. Thanks.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.