Is it OK to modify items in an ArrayList from multiple threads, if those threads never modify the same item?

Question

A bit of (simplified) context.

Let's say I have an ArrayList<ContentStub> where ContentStub is:

public class ContentStub {
    ContentType contentType;
    Object content;
}

And I have multiple implementations of classes that "inflate" stubs for each ContentType, e.g.

public class TypeAStubInflater {

    public void inflate(List<ContentStub> contentStubs) {
        contentStubs.forEach(stub -> {
            if (stub.contentType == ContentType.TYPE_A) {
                stub.content = someService.getContent();
            }
        });
    }
}

The idea is that TypeAStubInflater, running in one thread, only modifies items of ContentType.TYPE_A, TypeBStubInflater, running in another, only modifies items of ContentType.TYPE_B, and so on. Each instance's inflate() method, however, modifies items in the same contentStubs List, in parallel.

However:

No thread ever changes the size of the ArrayList
No thread ever attempts to modify a value that's being modified by another thread
No thread ever attempts to read a value written by another thread

Given all this, it seems that no additional measures to ensure thread-safety are necessary. From a (very) quick look at the ArrayList implementation, it seems that there is no risk of a ConcurrentModificationException. However, that doesn't mean that something else can't go wrong. Am I missing something, or is this safe to do?

ConcurrentModificationException is thrown when you modify the state of a list (e.g. by adding or removing elements, which changes its size), but in your code you modify the state of the elements placed in the list, so that has nothing to do with the list itself.
– Pshemo, Commented Aug 20, 2020 at 9:09
That is my feeling as well - I wonder if there is something else bad about doing what I propose though. — kenny_k
– kenny_k, Commented Aug 20, 2020 at 9:15
Not really, it repeats what I believe I already said in the body of the question (ConcurrentModificationException not being a problem). I was hoping for a more authoritative answer (i.e. with links to documentation/source), but I realize that proving something is not a problem is probably an impossible task.
– kenny_k, Commented Aug 21, 2020 at 8:18
There’s the fundamental point of Java, that these objects are not “in the … List”, but the list has references to these objects. There can be an arbitrary number of other references to these objects. That all doesn’t matter. The variable, you’re modifying, is stub.content of distinct objects. So there’s no problem with the writes, however, writing values that no-one ever reads would be pointless. There must be reads. And there must be a reason why these objects are in a list (i.e. there is code iterating over it). But if these things do not interact, they shouldn’t be in the same object. — Holger
– Holger, Commented Aug 25, 2020 at 16:05

Isfirs · Accepted Answer · 2020-08-20 11:30:24Z

In general, that will work, because you are not modifying the state of the List itself (a structural change while an iterator is active would throw a ConcurrentModificationException). You are only modifying objects that the list refers to, which is fine from the list's POV.
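To illustrate the distinction, here is a minimal single-threaded sketch. The Stub class and the method names are hypothetical stand-ins, not the asker's code. It assumes an ArrayList, whose iterators detect structural changes on a best-effort basis; writing to a field of an element is not a structural change.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {

    // Hypothetical stand-in for the question's ContentStub.
    static class Stub {
        Object content;
    }

    static List<Stub> newStubs() {
        List<Stub> stubs = new ArrayList<>();
        stubs.add(new Stub());
        stubs.add(new Stub());
        return stubs;
    }

    // Writes a field of each element while iterating; the list structure is untouched.
    static boolean mutateElementsThrows(List<Stub> stubs) {
        try {
            for (Stub stub : stubs) {
                stub.content = "inflated";
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Adds to the list while iterating over it; this is a structural modification.
    static boolean mutateListThrows(List<Stub> stubs) {
        try {
            for (Stub stub : stubs) {
                stubs.add(new Stub());
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mutateElementsThrows(newStubs())); // prints "false"
        System.out.println(mutateListThrows(newStubs()));     // prints "true"
    }
}
```

Note that this only covers the list's own bookkeeping. It says nothing about whether other threads will see the new field values.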

I would recommend splitting your list up into a Map<ContentType, List<ContentStub>> and then starting threads with those specific lists.

You could convert your list to a map with this:

Map<ContentType, List<ContentStub>> typeToStubMap = stubs.stream().collect(Collectors.groupingBy(stub -> stub.contentType));
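As a runnable sketch of the grouping step, the following uses Collectors.groupingBy, which yields one List per ContentType (Collectors.toMap with an identity value would instead throw an IllegalStateException as soon as two stubs share a type). The enum values, the ContentStub constructor, and the helper names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupStubs {

    enum ContentType { TYPE_A, TYPE_B }

    static class ContentStub {
        final ContentType contentType;
        Object content;

        ContentStub(ContentType contentType) {
            this.contentType = contentType;
        }
    }

    // Groups stubs by type. The grouped lists hold the same ContentStub instances
    // as the original list, and the original list itself is left unchanged.
    static Map<ContentType, List<ContentStub>> groupByType(List<ContentStub> stubs) {
        return stubs.stream().collect(Collectors.groupingBy(stub -> stub.contentType));
    }

    // Hypothetical sample data: A, B, A.
    static List<ContentStub> sampleStubs() {
        List<ContentStub> stubs = new ArrayList<>();
        stubs.add(new ContentStub(ContentType.TYPE_A));
        stubs.add(new ContentStub(ContentType.TYPE_B));
        stubs.add(new ContentStub(ContentType.TYPE_A));
        return stubs;
    }

    public static void main(String[] args) {
        Map<ContentType, List<ContentStub>> byType = groupByType(sampleStubs());
        System.out.println(byType.get(ContentType.TYPE_A).size()); // prints "2"
        System.out.println(byType.get(ContentType.TYPE_B).size()); // prints "1"
    }
}
```

Because the groups share the element objects with the original list, a thread that inflates a group is also filling in the stubs that the original, ordered list refers to.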

If your List is not that big (<1000 entries), I would even recommend not using any threading and just using a plain indexed for loop to iterate, or even .forEach if the two extra integers are no concern.

5 Comments

akuzminykh Over a year ago

ThreadLocal might interest you.

kenny_k Over a year ago

Appreciate the advice - the reasons for the implementation being how it is - (1) The service calls are network I/O and the performance speedup from parallelization here is dramatic and (2) The order of the list is vital, and there is enough complex business logic in this code to make additional data structure manipulation highly undesirable.

kenny_k Over a year ago

Not sure how ThreadLocal would help here - I want all the threads to populate a shared data structure, not to have their own copies of it.

akuzminykh Over a year ago

@kenny_k You've said that "No thread ever attempts to read a value written by another thread" and "No thread ever attempts to modify a value that's being modified by another thread". So there is nothing shared. Could you clarify that?

Isfirs Over a year ago

Sounds very interesting. Sadly, I won't be able to look into it myself :)

Emmef · Accepted Answer · 2020-10-21 17:41:10Z

Let's assume that thread A writes TYPE_A content and thread B writes TYPE_B content. The List contentStubs is only used to obtain instances of ContentStub: read-access only. So from the perspective of A, B and contentStubs, there is no problem. However, the updates made by threads A and B are not guaranteed to ever be seen by another thread; e.g. another thread C may conclude that stub.content == null for all elements in the list.

The reason for this is the Java Memory Model. If you don't use constructs like locks, synchronization, volatile and atomic variables, the memory model gives no guarantee if and when modifications of an object by one thread are visible for another thread. To make this a little more practical, let's have an example.

Imagine that a thread A executes the following code:

    stub.content = someService.getContent(); // happens to be element[17]

List element 17 is a reference to a ContentStub object on the global heap. The VM is allowed to make a thread-private copy of that object. All subsequent accesses through that reference in thread A use the copy. The VM is free to decide if and when to update the original object on the global heap.

Now imagine a thread C that executes the following code:

    ContentStub stub = contentStubs.get(17);

The VM will likely do the same trick with a private copy in thread C.

If thread C already accessed the object before thread A updated it, thread C will likely use the – not updated – copy and ignore the global original for a long time. But even if thread C accesses the object for the first time after thread A updated it, there is no guarantee that the changes in the private copy of thread A already ended up in the global heap.

In short: without a lock, synchronization, or some other happens-before relationship (such as joining the writer threads), thread C has no guarantee of ever seeing anything other than null in each stub.content.

The reason for this memory model is performance. On modern hardware, there is a trade-off between performance and consistency across all CPUs/cores. If the memory model of a modern language requires consistency, that is very hard to guarantee on all hardware and it will likely impact performance too much. Modern languages therefore embrace low consistency and offer the developer explicit constructs to enforce it when needed. In combination with instruction reordering by both compilers and processors, that makes old-fashioned linear reasoning about your program code … interesting.
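One way to get the required guarantee without locking each stub is to let the reading thread wait for the writers through a Future. This is a minimal sketch, assuming the inflaters run as tasks on an ExecutorService; the class name, the inflater helper, and the placeholder content are hypothetical. The java.util.concurrent documentation states that actions taken by a task happen-before actions that follow a successful Future.get() for that task, so the caller sees every write once get() has returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelInflate {

    enum ContentType { TYPE_A, TYPE_B }

    static class ContentStub {
        final ContentType contentType;
        Object content;

        ContentStub(ContentType contentType) {
            this.contentType = contentType;
        }
    }

    // Hypothetical inflater task: fills in every stub of one type.
    static Callable<Void> inflater(List<ContentStub> stubs, ContentType type) {
        return () -> {
            for (ContentStub stub : stubs) {
                if (stub.contentType == type) {
                    // Placeholder for someService.getContent().
                    stub.content = "content for " + type;
                }
            }
            return null;
        };
    }

    // Runs one inflater per type in parallel and returns once all have finished.
    static List<ContentStub> inflateAll(List<ContentStub> stubs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            tasks.add(inflater(stubs, ContentType.TYPE_A));
            tasks.add(inflater(stubs, ContentType.TYPE_B));
            // invokeAll blocks until every task has completed.
            for (Future<Void> done : pool.invokeAll(tasks)) {
                // get() rethrows task failures and makes the task's writes visible here.
                done.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("inflation failed", e);
        } finally {
            pool.shutdown();
        }
        return stubs;
    }

    public static void main(String[] args) {
        List<ContentStub> stubs = new ArrayList<>();
        stubs.add(new ContentStub(ContentType.TYPE_A));
        stubs.add(new ContentStub(ContentType.TYPE_B));
        for (ContentStub stub : inflateAll(stubs)) {
            System.out.println(stub.content);
        }
    }
}
```

The list keeps its order and no stub is written by more than one task, so the only synchronization point is the hand-off at get(). Code that reads the stubs before that point, or from a thread that never waits on the futures, gets no such guarantee.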
