7

We've recently had a discussion at my work about whether we need to use ConcurrentHashMap or if we can simply use regular HashMap, in our multithreaded environment. The argument for HashMaps are two: it is faster then the ConcurrentHashMap, so we should use it if possible. And ConcurrentModificationException apparently only appears as you iterate over the Map as it is modified, so "if we only PUT and GET from the map, what is the problem with the regular HashMap?" was the arguments.

I thought that concurrent PUT actions or concurrent PUT and READ could lead to exceptions, so I put together a test to show this. The test is simple; create 10 threads, each which writes the same 1000 key-value pairs into the map again-and-again for 5 seconds, then print the resulting map.

The results were quite confusing actually:

Length:1299
Errors recorded: 0

I thought each key-value pair was unique in a HashMap, but looking through the map, I can find multiple Key-Value pairs that are identical. I expected either some kind of exception or corrupted keys or values, but I did not expect this. How does this occur?

Here's the code I used, for reference:

public class ConcurrentErrorTest
{
    static final long runtime = 5000;
    static final AtomicInteger errCount = new AtomicInteger();
    static final int count = 10;

    public static void main(String[] args) throws InterruptedException
    {
        List<Thread> threads = new LinkedList<>();
        final Map<String, Integer> map = getMap();

        for (int i = 0; i < count; i++)
        {
            Thread t = getThread(map);
            threads.add(t);
            t.start();
        }

        for (int i = 0; i < count; i++)
        {
            threads.get(i).join(runtime + 1000);
        }

        for (String s : map.keySet())
        {
            System.out.println(s + " " + map.get(s));
        }
        System.out.println("Length:" + map.size());
        System.out.println("Errors recorded: " + errCount.get());
    }

    private static Map<String, Integer> getMap()
    {
        Map<String, Integer> map = new HashMap<>();
        return map;
    }

    private static Map<String, Integer> getConcMap()
    {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        return map;
    }

    private static Thread getThread(final Map<String, Integer> map)
    {
        return new Thread(new Runnable() {
            @Override
            public void run()
            {
                long start = System.currentTimeMillis();
                long now = start;
                while (now - start < runtime)
                {
                    try
                    {
                        for (int i = 0; i < 1000; i++)
                            map.put("i=" + i, i);
                        now = System.currentTimeMillis();
                    }
                    catch (Exception e)
                    {
                        System.out.println("P - Error occured: " + e.toString());
                        errCount.incrementAndGet();
                    }
                }
            }
        });
    }
}
17
  • 7
    Your test just showed you how HashMap isn't thread-safe. Commented Nov 25, 2016 at 12:48
  • 2
    You don't get exceptions if you use from multiple threads without proper synchronization. You get a broken data structure with no guarantee that the invariants still hold Commented Nov 25, 2016 at 12:48
  • 1
    Use ConcurrentHashMap Commented Nov 25, 2016 at 12:50
  • 1
    My recommendation: use ConcurrentHashMap, because you can afford a little performance hit over broken code because you think you understand concurrency. Then use the time you've saved to actually learn concurrency (the whole team!). Commented Nov 25, 2016 at 12:50
  • 2
    Here's another cautionary tale about HashMap and naive assumptions about thread safety. It's fairly old and the implementation of the class has changed since, but the moral of the story is still relevant Commented Nov 25, 2016 at 12:52

2 Answers 2

12

What you're faced with seems to be a TOCTTOU class problem. (Yes, this kind of bug happens so often, it's got its own name. :))

When you insert an entry into a map, at least the following two things need to happen:

  1. Check whether the key already exists.
  2. If the check returned true, update the existing entry, if it didn't, add a new one.

If these two don't happen atomically (as they would in a correctly synchronized map implementation), then several threads can come to the conclusion that the key doesn't exist yet in step 1, but by the time they reach step 2, that isn't true any more. So multiple threads will happily insert an entry with the same key.

Please note that this isn't the only problem that can happen, and depending on the implementation and your luck with visibility, you can get all kinds of different and unexpected failures.

Sign up to request clarification or add additional context in comments.

2 Comments

More pronounceably known as "check-then-act" bug. Often happens with lock-free code.
Thank you. This answer, as well as the blog post you linked earlier helped a lot to get my head around this. I remember having read about this in school a few years ago.
0

In multi thread environment, you should always use CuncurrentHashMap, if you are going to perform any operation except get.

Most of the time you won't get an exception, but definitely get the corrupt data because of the thread local copy value.

Every thread has its own copy of the Map data when performing the put operation and when they check for key existence, multiple threads found it false and they enter the data.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.