
I have been asked to implement fine-grained locking on a hash list. I have done this using synchronized, but the question tells me to use Lock instead.

I have created the hash list of buckets in the constructor:

    private LinkedList<E>[] data;
    private Lock[] lock;
    private Lock lockR = new ReentrantLock();
    
    // The constructors ensure that the data and lock arrays are the same size
    @SuppressWarnings("unchecked")
    public ConcurrentHashList(int n){
        if(n > 1000) {
            data = (LinkedList<E>[])(new LinkedList[n/10]);
            lock = new Lock[n/10];
        }
        else {
            data = (LinkedList<E>[])(new LinkedList[100]);
            lock = new Lock[100];
        }
        
        for(int j = 0; j < data.length;j++) {
            data[j] = new LinkedList<E>();
            lock[j] = new ReentrantLock();// Adding a lock to each bucket index
        }
    }
            

The original method, which used a single coarse-grained lock:

    public void add(E x){
        if(x != null){
            lockR.lock(); // Coarse-grained: one lock guards every bucket
            try{
                int index = hashC(x);
                if(!data[index].contains(x))
                    data[index].add(x);
            }finally{ lockR.unlock(); }
        }
    }

Using synchronized to grab a handle on the bucket's lock object, so that multiple threads can work on different indexes concurrently:

    public void add(E x){
        if(x != null){
            int index = hashC(x);
            synchronized (lock[index]) { // Getting a handle on this bucket before adding
                if(!data[index].contains(x))
                    data[index].add(x);
            }
        }
    }

I do not know how to implement this using Lock, though. I cannot lock a single element in an array, only the whole method, which makes it coarse-grained rather than fine-grained.

Using an array of ReentrantLock

    public void add(E x){
        if(x != null){
            int index = hashC(x);
            lock[index].lock(); // Getting a handle on this bucket before adding
            try {
                if(!data[index].contains(x))
                    data[index].add(x);
            }finally { lock[index].unlock(); }
        }
    }

The hash function

    private int hashC(E x){
        int k = x.hashCode();
        // k % data.length lies in (-data.length, data.length), so abs is safe here
        int h = Math.abs(k % data.length);
        return h;
    }
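
For reference, an equivalent way to get a non-negative bucket index is Math.floorMod (available since Java 8), which folds the remainder and the sign fix into one call; a minimal sketch:

    private int hashC(E x){
        // floorMod always returns a value in [0, data.length) for a positive divisor
        return Math.floorMod(x.hashCode(), data.length);
    }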

1 Answer


Presumably, hashC() is a function that is highly likely to produce unique numbers. As in, you have no guarantee that the hashes are unique, but the incidence of non-unique hashes is extremely low: for a data structure with a few million entries you'd have a literal handful of collisions, and any given collision involves only two or maybe three objects sharing the same hash, not thousands.

Also, assumption: the hash for a given object is constant. hashC(x) will produce the same value no matter how many times you call it, assuming you provide the same x.

Then, you get some fun conclusions:

  1. The 'bucket' (the LinkedList instance found at array slot hashC(x) in data) that your object should go into is always the same - you know which one it should be based solely on the result of hashC.
  2. Calculating hashC does not require a lock of any sort. It has no side effects whatsoever.
  3. Thus, knowing which bucket you need for a given operation on a single value (Be it add, remove, or check-if-in-collection) can be done without locking anything.

Now, once you know which bucket you need to look at / mutate, okay, now locking is involved.

So, just have one lock for each bucket. Not a List<Object> locks[]; - that would be a whole list's worth of locks per bucket. A plain Object[] locks is all you need, or ReentrantLock[] locks if you prefer lock/unlock over synchronized (locks[bucketIdx]) { ... }.
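
For illustration, a minimal sketch of the other single-value operations using that ReentrantLock[] pattern (the method names remove and contains are assumptions for the sketch, mirroring typical collection APIs rather than anything given in the question):

    public boolean remove(E x){
        if(x == null) return false;
        int index = hashC(x);   // no lock needed just to find the bucket
        lock[index].lock();     // fine-grained: only this one bucket is blocked
        try {
            return data[index].remove(x);
        } finally { lock[index].unlock(); }
    }

    public boolean contains(E x){
        if(x == null) return false;
        int index = hashC(x);
        lock[index].lock();
        try {
            return data[index].contains(x);
        } finally { lock[index].unlock(); }
    }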

This is effectively fine-grained: after all, the odds that one operation needs to twiddle its thumbs because another thread is operating on a different object are very low; it would require the two different objects to have a colliding hash, which is possible but extremely rare - as per assumption #1.

NB: Note that lockR can therefore go away entirely; you don't need it, unless you want to build into your code the ability to completely re-design its bucket structure. For example, 1000 buckets feels a bit meh if you end up with a billion objects. I don't think 'rebucket everything' is part of the task here, though.
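
For completeness, a minimal sketch of what thread-safe rebucketing could look like, assuming lockR is swapped for a ReentrantReadWriteLock from java.util.concurrent.locks (the field name rwLock and the method rebucket are hypothetical): ordinary operations take the shared read lock plus their bucket lock, so adds on different buckets still run concurrently, while a resize takes the write lock exclusively.

    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();

    public void add(E x){
        if(x != null){
            rwLock.readLock().lock();      // shared: concurrent adds are still allowed
            try {
                int index = hashC(x);
                lock[index].lock();        // fine-grained: one bucket at a time
                try {
                    if(!data[index].contains(x))
                        data[index].add(x);
                } finally { lock[index].unlock(); }
            } finally { rwLock.readLock().unlock(); }
        }
    }

    @SuppressWarnings("unchecked")
    private void rebucket(int newSize){
        rwLock.writeLock().lock();         // exclusive: no bucket operation is mid-flight
        try {
            LinkedList<E>[] oldData = data;
            data = (LinkedList<E>[]) new LinkedList[newSize];
            lock = new Lock[newSize];
            for(int j = 0; j < newSize; j++){
                data[j] = new LinkedList<E>();
                lock[j] = new ReentrantLock();
            }
            for(LinkedList<E> bucket : oldData)
                for(E e : bucket)
                    data[hashC(e)].add(e); // hashC now uses the new data.length
        } finally { rwLock.writeLock().unlock(); }
    }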


1 Comment

Thanks for the information; you presumed correctly about hashC(). I have updated the question to the current state of my code. I think I have implemented it in one of the ways you described, using ReentrantLock[]? I thought the same about hashC() not needing to lock or synchronize, as this will be done in the main functioning methods anyway.
