Algorithm used for bucket lookup for hashcodes [duplicate]

Question

In most cases, HashSet has lookup complexity O(1). I understand that this is because objects are kept in buckets corresponding to hashcodes of the object.

When lookup is done, it directly goes to the bucket and finds (using equals if many objects are present in same bucket) the element.

I always wonder, how it directly goes to the required bucket? Which algorithm is used for bucket lookup? Does that add nothing to total lookup time?

The word is 'algorithm'. It is derived from a proper name. Abbreviation not permissible, indeed meaningless. — user207421
– user207421, Commented Feb 25, 2015 at 7:11
its basically accessing a index i from array. so no look up time to reach index i. i is calulated from hashcode%array.length ; and hashcode is being calculated from has function. although it takes some amount of constant time and O(1) signifies that it doesen't depends on the size of elements. — anurag gupta
– anurag gupta, Commented Feb 25, 2015 at 7:17
each answer giving some more insight!! I can't accept them all :( — Saurabh Batra
– Saurabh Batra, Commented Feb 25, 2015 at 7:21

Radiodef · Accepted Answer · 2015-02-25 07:25:26Z

4

I always wonder, how it directly goes to the required bucket?

The hash code is treated and used as an index in to an array.

The index is determined by hash & (array.length - 1) because the length of the Java HashMap's internal array is always a power of 2. (This a cheaper computation of hash % array.length.)

Each "bucket" is actually a linked list (and now, possibly a tree) where entries with colliding hashes are grouped. If there are collisions, then a linear search through the bucket is performed.

Does that add nothing to total lookup time?

It incurs the cost of a few loads from memory.

edited Feb 25, 2015 at 7:25

answered Feb 25, 2015 at 7:19

Radiodef

38k14 gold badges95 silver badges127 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Niklas B. Over a year ago

HashMap#hash also mangles the hash to distribute the bits better

NamshubWriter · Accepted Answer · 2015-02-25 07:15:51Z

2

Often, the algorithm is simply

hash = hashFunction(key)
index = hash % arraySize

See the wikipedia article on Hash Table for details.

answered Feb 25, 2015 at 7:15

NamshubWriter

24.4k2 gold badges44 silver badges60 bronze badges

Comments

Khanna111 · Accepted Answer · 2015-02-25 07:35:37Z

1

From memory: the HashSet is actually backed by a HashMap and the basic look up process is:

Get the key
hash it (hashcode())
hashcode % the number of buckets
Go to that bucket and evaluate equals()

For a Set there would only be unique elements. I would suggest reading the source for HashSet and it should be able to answer your queries.

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#HashMap.containsKey%28java.lang.Object%29

Also note that the Java 8 code has been updated and this explanation covers pre Java 8 codebase. I have not examined in detail the Java 8 implementation except to figure out that it is different.

edited Feb 25, 2015 at 7:35

answered Feb 25, 2015 at 7:17

Khanna111

3,9612 gold badges26 silver badges30 bronze badges

2 Comments

Vincent van der Weele Over a year ago

It is not true that in a set there can be only one element per bucket. If you have a clash (two different elements with the same hash), they would end up in the same bucket. In the source code you linked you can even see that every bucket has a linked list of Entry elements.

Khanna111 Over a year ago

You are right, what I meant was that there would be only unique keys in the set and no duplicates. Updated. Thanks.

Collectives™ on Stack Overflow

Algorithm used for bucket lookup for hashcodes [duplicate]

3 Answers 3

1 Comment

Comments

2 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

1 Comment

Comments

2 Comments

Linked

Related