4

I want to write a simple thread-safe arraylist which supports:

add(), remove(int i), insert(int i), update(int i), and get(int i)

One simple implementation is to add lock to the internal data structure(an object array for example), but it is not good enough because only one thread could access the list at a time.

Therefore my initial plan is to add lock to each data slot so that different threads could have access to elements in different indexes at the same time. The data structure will look like this:

class MyArrayList {
    Lock listlock;
    Lock[] locks;
    Object[] array;
}

The locking should work as follows if there is no need to do resize():

  • for get(int i), a thread needs to acquire locks[i].
  • for insert(int i), a thread needs to acquire all locks[j] for j >= i, and listlock.
  • for remove(int i), a thread needs to acquire all locks[j] for j >= i, and listlock.
  • for add(), a thread needs to acquire listlock.
  • for insert(), a thread needs to acquire locks[i].

My questions are:

  • How to handle the locks when resizing while more objects are adding and I need to create a new larger array to hold all objects. It is annoying because some other threads may also wait for the locks to be released,
  • Any better suggestions to implement such thread-safe arraylist?
5
  • What about List list = Collections.synchronizedList(new ArrayList());? Commented Sep 26, 2017 at 22:53
  • That exists, it is called Vector. Commented Sep 26, 2017 at 22:53
  • 1
    I assume this is an exercise and not an easy one as there are built-in concurrent collections in the JDK. Commented Sep 26, 2017 at 23:03
  • @DwB thanks for the info. Actually, I found that, from source code of Vector or synchronizedList, synchronized keyword is used to ensure thread safety. Take get() for example, thread A wants to call get(0), while thread B wants to call get(1), using synchronized will only allow one thread call get() function even though two threads are trying to read data from different location. What I am trying to do is to do this simultaneously if threads are accessing different slots. Commented Sep 27, 2017 at 4:24
  • That is in no way reasonable. You can not, reasonably, limit the scale of synchronization on the list because there are multiple way to access a given element in the list (iterator, enumerator, access-by-index) and growing the list by one element can cause a complete rebuild of the array (i.e. allocate new array, copy old elements, dump old array). Commented Sep 27, 2017 at 14:33

1 Answer 1

0

A simple approach would be to just use a read-write lock ([Reentrant]ReadWriteLock), so many threads could read concurrently, but once someone gets the write lock, nobody else can access the list.

Or you could do something somewhat similar to your idea: one read-write lock for each slot + a global ("structural") read-write lock + a variable to keep track of the j >= i cases. So:

  • To access (read or write) anything, a thread needs at least the global read lock.
  • Only threads trying to make structural changes (the ones that change the size) get the global write lock, but only to set an int modifyingFrom variable indicating all positions from there on are "locked" (the j >= i cases). After setting modifyingFrom, you downgrade (see docs) from write to read lock, letting others access the list.
  • Any thread trying to do anything that isn't a structural change, once holding the global read lock, checks if what it wants to do conflicts with the current value of modifyingFrom. If there's a conflict, sleep until the thread who has set modifyingFrom finishes and notifies everybody who is waiting. This check must be synchronized (just use synchronized (obj) on some object) so the structure-changing thread doesn't happen to obj.notify() before the conflicting thread calls obj.wait() and sleeps forever (holding the global read lock!). :(
  • You should either have a boolean structuralChangeHappening = false or set modifyingFrom to some x > <list size> when no structural changes are happening (then you can just check that i < modifyingFrom to get() or update()). A thread finishing a structural change sets modifyingFrom back to this value and here's where it has to synchronize to notify waiting threads.
  • A thread wanting to make a structural change when one is already happening will wait because it needs the global write lock and at least one thread has the global read lock. In fact, it will wait until nobody is accessing the list at all.
  • A thread allocating a new (bigger... or smaller, if you had a trimToSize() or something) array would hold the global write lock during the entire operation.

I was tempted to think the global read-write lock wasn't really necessary, but the last two points justify it.

Some example cases:

  • Some threads trying to get(i) (each with it's i, unique or not): each one would get the global read lock, then the ith read lock, then read the position, and nobody would wait at all.
  • The same case with additional threads trying to update([index =] i, element): if there are no equal is, nobody will wait. Otherwise, only the thread writing or the threads reading the conflicting position will wait.
  • A thread t starts an insert([index =] 5, element), and other threads try to get(i): Once t has set modifyingFrom = 5 and released the global write lock, all threads reading get the global read lock, then check modifyingFrom. Those with i < modifyingFrom just get the (read) lock of the slot; the others wait until the insert(5) finishes and notifies, then get the lock of the slot.
  • A thread starts an add() and needs to allocate a new array: Once it gets the global write lock, nobody else can do anything until it has finished.
  • The size of the list is 7, a thread t_a calls add(element) and another thread t_g calls get([index =] 7):
    • If t_a happens to get the global write lock first, it sets modifyingFrom = 7, and once it has released the lock, t_g gets the global read lock, sees that index (= 7) >= modifyingFrom and sleeps until t_a finishes and notifies it.
    • If t_g gets the global read lock first, it checks that 7 < modifyingFrom (modifyingFrom > <list size> (== 7), 4th point before the examples), then throws an exception because 7 >= <list size> after releasing the lock! Then t_a is able to get the global write lock and proceeds normally.

Remembering that accesses to modifyingFrom must be synchronized.

You said you want only that five operations, but if you wanted an iterator, it could check if something changed by other means (not the iterator itself), like standard classes do.

Now, I don't know under which conditions exactly this would be better than other approaches. Also, consider that you may need more restrictions in a real application, because this should ensure only consistency: if you try to read and write the same position, the read can happen before or after the write. Maybe it would make sense to have methods like tryUpdate(int, E), that only does something if no conflicting structural changes are happening when the method is called, or tryUpdate(int, E, Predicate<ArrayList>), which only does its work if the list is in a state that satisfies the predicate (which should be defined carefully not to cause deadlocks).

Please let me know if I missed something. There may be lots of corner cases. :)

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.