Understanding how Java HashMap works

Ask Question

Asked 5 years, 8 months ago

Modified 5 years, 8 months ago

Viewed 275 times

I know following thing about Java HashMap put():

Calling put() starts by inserting key-values into HashMap.
Initially we have linked list per bin, but once we have more than `TREEFY_THRESHOLD entries in a bin, bin contents are rearranged from linked list into balanced trees.

Similarly, get() should work as follows:

If the bin in which the key hashes to is in the form of linked list, traverse it to check if the input key exists in the list or not. If it does then return its value.
If the bin is in the form balanced tree, search the key in the tree and return its value

This is very high level abstract understanding after reading some articles, but I am unable to see how exactly this is happening in code. I tried to google it a bit and found these good articles: 1, 2, 3. But they target HashMap from java 7 (which does not involve balanced tree logic) and not java 8 (which involves balanced tree logic). Actually I am not struggling with balanced tree logic, but with the rest. Java 7 implementation was quite easy, but not the java 8 implementation. The whole code is quite involved and difficult to understand. It will be great if someone explain important methods of HashMap. If not, below are some specific doubts:

The Java 8 HashMap.get() ^source:

1.     public V get(Object key) {
2.         Node<K,V> e;
3.         return (e = getNode(hash(key), key)) == null ? null : e.value;
4.     }
5. 
6.     /**
7.      * Implements Map.get and related methods
8.      *
9.      * @param hash hash for key
10.      * @param key the key
11.      * @return the node, or null if none
12.      */
13.     final Node<K,V> getNode(int hash, Object key) {
14.         Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
15.         if ((tab = table) != null && (n = tab.length) > 0 &&
16.             (first = tab[(n - 1) & hash]) != null) {
17.             if (first.hash == hash && // always check first node
18.                 ((k = first.key) == key || (key != null && key.equals(k))))
19.                 return first;
20.             if ((e = first.next) != null) {
21.                 if (first instanceof TreeNode)
22.                     return ((TreeNode<K,V>)first).getTreeNode(hash, key);
23.                 do {
24.                     if (e.hash == hash &&
25.                         ((k = e.key) == key || (key != null && key.equals(k))))
26.                         return e;
27.                 } while ((e = e.next) != null);
28.             }
29.         }
30.         return null;
31.     }

(a) Why to always check first node explicitly on line 17?
(b) What does tab[(n - 1) & hash] do on line 16?
(c) May I will need to understance hash()^source also:

static final int hash(Object key) {
   int h;
   return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Java 8 HashMap.put() ^source:

1.     public V put(K key, V value) {
2.         return putVal(hash(key), key, value, false, true);
3.     }
4. 
5.     final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
6.                    boolean evict) {
7.         Node<K,V>[] tab; Node<K,V> p; int n, i;
8.         if ((tab = table) == null || (n = tab.length) == 0)
9.             n = (tab = resize()).length;
10.         if ((p = tab[i = (n - 1) & hash]) == null)
11.             tab[i] = newNode(hash, key, value, null);
12.         else {
13.             Node<K,V> e; K k;
14.             if (p.hash == h ash &&
15.                 ((k = p.key) == key || (key != null && key.equals(k))))
16.                 e = p;
17.             else if (p instanceof TreeNode)
18.                 e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
19.             else {
20.                 for (int binCount = 0; ; ++binCount) {
21.                     if ((e = p.next) == null) {
22.                         p.next = newNode(hash, key, value, null);
23.                         if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
24.                             treeifyBin(tab, hash);
25.                         break;
26.                     }
27.                     if (e.hash == hash &&
28.                         ((k = e.key) == key || (key != null && key.equals(k))))
29.                         break;
30.                     p = e;
31.                 }
32.             }
33.             if (e != null) { // existing mapping for key
34.                 V oldValue = e.value;
35.                 if (!onlyIfAbsent || oldValue == null)
36.                     e.value = value;
37.                 afterNodeAccess(e);
38.                 return oldValue;
39.             }
40.         }
41.         ++modCount;
42.         if (++size > threshold)
43.             resize();
44.         afterNodeInsertion(evict);
45.         return null;
46.     }

(a) I understand putTreeVal() in line 18 inserts key-value in balanced tree. But I dont find logic to put key-value in the linked list as in line 27 of getNode(), which involves traversing till end of the linked list.
(b) What does if on lines 10 and 14 do?
(c) I also dont get full logic completely clear. Can you provide comments for each important line. In fact I want to know if there is any documentation / source code repo having comment for on each important line explaining what it is trying to understand, because there are many tricky complex methods in this class like, resize(), tableSizeFor(int)

asked Mar 21, 2020 at 19:10

MsA

3,0296 gold badges31 silver badges71 bronze badges

3

If someone does answer this, it is going to be a looong answer. If you on the other hand ask one thing at a time, you will find out that the majority of your questions are duplicates. Here are some, 1, 2, 3.

Eugene
– Eugene

2020-03-21 19:16:45 +00:00
Commented Mar 21, 2020 at 19:16
2

"This is very high level abstract understanding after reading some articles" ... No. No, not high level at all. What you are asking about deals with a low level understanding of the Java Library's HashMap implementation. Maybe this is important to you because you are interested in using details of the implementation in some code of your own. However, for most users of the Collections Framework, these details are hardly very important.

scottb
– scottb

2020-03-21 19:19:11 +00:00
Commented Mar 21, 2020 at 19:19
I do agree with you on one question, though. I do find the resize interesting too, but I still have no answer to that

Eugene
– Eugene

2020-03-21 19:25:01 +00:00
Commented Mar 21, 2020 at 19:25
2

One of the best ways to understand code is to write unit test scenarios, debug and step through that code.

Andrew S
– Andrew S

2020-03-21 19:50:07 +00:00
Commented Mar 21, 2020 at 19:50
1

a when you are looking for a traversal, you are looking for a loop. There is only one loop, so it's actually easy to find. b line 10: if there is no entry at the array position, create one, line 14: if the node found at the array position has the right hashcode (shortcut) and the key matches, this is the target entry. c we surely don't do that for you. This is not a code review site. The previous questions were already stretching, but this is clearly off topic.

Holger
– Holger

2020-03-24 08:36:54 +00:00
Commented Mar 24, 2020 at 8:36

| Show 2 more comments

0 Your Answer

Sign up or log in

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.

Collectives™ on Stack Overflow

Understanding how Java HashMap works

0

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

0

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Your Answer

Sign up or log in

Post as a guest

Linked