7

I was going through an interview question on JavaRevisited and I'm having difficulty understanding this question :

What’s wrong with using a HashMap in a multithreaded environment? When get() method go into an infinite loop?

In my opinion, it's not a problem to use HashMap inside a multi-threaded environment, as long as our application is not accessing/reading threads which are modifying the created HashMap, rather than simply accessing the HashMap.

So, as I see it, there's not a problem as long as in the application we are just accessing the HashMap in a multi-threaded environment.

Please let me know if my understanding is correct.

2
  • I think you have a typo in your question. "as our application is creating threads which are modifying the created HashMap, rather than modifying the HashMap". I suspect the first "modifying" should be "accessing" or "reading". Commented Jun 15, 2012 at 12:51
  • Yes , thanks for pointing it out , our APplication doesn't create Threads explicitly . we just access the Data inside the HashMap . Commented Jun 15, 2012 at 12:57

4 Answers 4

24

What’s wrong using HashMap in multithreaded environment? When get() method go to infinite loop?

It is a bug to have multiple threads use a non-synchronized collection (really any mutable class) in an unprotected manner. Certain if each thread had their own HashMap instance then this is not an issue. It is a problem if multiple threads are adding to the same HashMap instance without it being synchronized. Even if just 1 thread is modifying a HashMap and other threads are reading from that same map without synchronization, you will run into problems.

If you need to use the same hash table object in multiple threads then you should consider using ConcurrentHashMap, wrapping each of the accesses to the HashMap in a synchronized {} block, or making use of the Collections.synchronizedMap(new HashMap<...>()) construct.

Chances are that the get() goes to an infinite loop because one of the threads has only a partially updated view of the HashMap in memory and there must be some sort of object reference loop. That's the peril of using an unsynchronized collection with multiple threads.

So in my understanding, it's not a problem as long as in the application we are just accessing the HashMap in a multi-threaded environment?

If by "accessing" you mean "reading", then this is true with qualifications. You must make sure:

  • All of the updates to the HashMap are completed before the threads are instantiated and the thread that creates the map also forks the threads
  • The threads are only using the HashMap in read-only mode – either get() or iteration without remove
  • There are no threads updating the map

If any of these conditions are not true then you will need to use a synchronized map instead.

Sign up to request clarification or add additional context in comments.

1 Comment

I also have similar question where I am using HashMap in multithreaded environment but I figured out I can use ConcurrentHashMap but it looks like there might be a thread safety issue in my code. I wanted to see what is the best way by which I can fix those in my code?
4

We know that HashMap is a non-synchronized collection whereas its synchronized counter-part is HashTable. So, when you are accessing the collection in a multithreaded environment and all threads are accessing a single instance of collection, then it's safer to use HashTable for various obvious reasons e.g. to avoid dirty reads and to maintain data consistency. In the worst case, this multithreaded environment can result in an infinite loop as well.

Yes, it is true. HashMap.get() can cause an infinite loop. Let us see how??

If you look at the source code HashMap.get(Object key) method, it looks like this:

 public Object get(Object key) {
    Object k = maskNull(key);
    int hash = hash(k);
    int i = indexFor(hash, table.length);
    Entry e = table[i];
    while (true) {
        if (e == null)
            return e;
        if (e.hash == hash &amp;&amp; eq(k, e.key))
            return e.value;
        e = e.next;
    }
}

while(true){...} can always be a victim of an infinite loop at runtime in a multithreaded environment, IF, somehow e.next can point to itself. This will result in an infinite loop. But, how e.next will point to itself?

This can happen in void transfer(Entry[] newTable) method, which is invoked at the time the HashMap resizing is done.

    do {
        Entry next = e.next;
        int i = indexFor(e.hash, newCapacity);
        e.next = newTable[i];
    newTable[i] = e;
        e
= next;
} while (e != null);

This piece of code is prone to produce the above condition if resizing happens and at the same time, other threads tried to modify the map instance.

The only way to avoid this scenario is to use synchronization in code, or better, use the synchronized collection.

Comments

3

This is a classical question. ArrayList and HashMap are not synchronized, while Vector and HashTable are. You should therefore use HashTable unless you are very careful defining mutexes yourself.

In other words, the methods in e.g. HashTable will ensure that no other thread is working with the HashTable at any given time. If you use a HashMap, you'd have to do that manually by ensuring that you synchronize on HashMap before you call the method.

Update: checkout @Gray's comment. It looks like wrapping HashMap with Collections.synchronizedMap(new HashMap()) is the way to go now.

EDIT: other posters have answered way better than I did. My answer, however, generated an interesting discussion on the use of the soon to be deprecated Vector, Stack, Hashtable and Dictionary classes, so I'm leaving the question here, as a head to the comments below. Thanks guys!

10 Comments

HashTable is all but deprecated and should not be used. If you need a concurrent HashMap then you should use ConcurrentHashMap or Collections.synchronizedMap(new HashMap()).
@Gray Oh, that's interesting, I had no idea. Do you have a reference to that? With HashTable not being marked as deprecated, this came as a surprise. Thanks!
@Gray And since we are at it, is vector also deprecated?
@Gary Or use ConcurrentHashMap which can be safely iterated without a CME.
They are "deprecated" since the introduction of the Concurrent package, which introduces thread-safe collections with a finer granularity than Vector or HashTable (which achieve thread-safety through a kind of big-lock on the whole data structure).
|
1

I guess that they meant to access to a shared copy of HashMap. Shared mutable state.

Since it is not synchronized every thread will grab its copy from the main memory, modify, and overwrite it.

HashMap with one entry <n, 1>

thread 1 grab the copy

thread 2 grab the copy

thread 1 modify <n, 2>

thread 2 modify <n, 3>

thread 1 is done, and stores the copy in the main memory

now memory is <n, 2>

thread 2 is done and stores the copy

now memory is <n, 3>

The state thread 1 is lost

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.