
I have been asked to implement fine-grained locking on a hash list. I have done this using synchronized, but the question tells me to use Lock instead.

I have created the hash list of buckets in the constructor:

    private LinkedList<E>[] data;
    private Lock[] lock;
    private Lock lockR = new ReentrantLock();
    
    // The constructors ensure that the data and lock arrays are the same size
    @SuppressWarnings("unchecked")
    public ConcurrentHashList(int n){
        if(n > 1000) {
            data = (LinkedList<E>[])(new LinkedList[n/10]);
            lock = new Lock[n/10];
        }
        else {
            data = (LinkedList<E>[])(new LinkedList[100]);
            lock = new Lock[100];
        }
        
        for(int j = 0; j < data.length;j++) {
            data[j] = new LinkedList<E>();
            lock[j] = new ReentrantLock();// Adding a lock to each bucket index
        }
    }
            

The original method, which used a single coarse-grained lock:

    public void add(E x){
        if(x != null){
            lockR.lock(); // Coarse-grained: one lock guards every bucket
            try{
                int index = hashC(x);
                if(!data[index].contains(x))
                    data[index].add(x);
            }finally{ lockR.unlock(); }
        }
    }

Using synchronized to grab a handle on the bucket's lock object, so that multiple threads can work on different indexes concurrently:

    public void add(E x){
        if(x != null){
            int index = hashC(x);
            synchronized (lock[index]) { // Getting a handle on this bucket before adding
                if(!data[index].contains(x))
                    data[index].add(x);
            }
        }
    }

I do not know how to implement this using Lock, though. I cannot lock a single element in an array, only the whole method, which makes it coarse-grained rather than fine-grained.

Using an array of ReentrantLock

    public void add(E x){
        if(x != null){
            int index = hashC(x);
            lock[index].lock(); // Getting a handle on this bucket before adding
            try {
                if(!data[index].contains(x))
                    data[index].add(x);
            }finally { lock[index].unlock(); }
        }
    }

The hash function

    private int hashC(E x){
        int k = x.hashCode();
        // k % data.length lies in (-data.length, data.length), so abs is safe here
        int h = Math.abs(k % data.length);
        return h;
    }
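
For reference, an equivalent way to get a non-negative bucket index is Math.floorMod (available since Java 8), which folds the remainder and the sign fix into one call; a minimal sketch:

    private int hashC(E x){
        // floorMod always returns a value in [0, data.length) for a positive divisor
        return Math.floorMod(x.hashCode(), data.length);
    }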

1 Answer


Presumably, hashC() is a function that is highly likely to produce unique numbers. As in, you have no guarantee that the hashes are unique, but the incidence of non-unique hashes is extremely low: for a data structure with a few million entries you'd have a literal handful of collisions, and any given collision involves only two or maybe three objects sharing the same hash, not thousands.

Also, assumption: the hash for a given object is constant. hashC(x) will produce the same value no matter how many times you call it, assuming you provide the same x.

Then, you get some fun conclusions:

  1. The 'bucket' (the LinkedList instance found at array slot hashC(x) in data) that your object should go into is always the same - you know which one it should be based solely on the result of hashC.
  2. Calculating hashC does not require a lock of any sort. It has no side effects whatsoever.
  3. Thus, knowing which bucket you need for a given operation on a single value (Be it add, remove, or check-if-in-collection) can be done without locking anything.

Now, once you know which bucket you need to look at / mutate, okay, now locking is involved.

So, just have one lock for each bucket. Not a List<Object> locks[]; - that would be a whole list's worth of locks per bucket. A plain Object[] locks is all you need, or ReentrantLock[] locks if you prefer lock/unlock over synchronized (locks[bucketIdx]) { ... }.
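
For illustration, a minimal sketch of the other single-value operations using that ReentrantLock[] pattern (the method names remove and contains are assumptions for the sketch, mirroring typical collection APIs rather than anything given in the question):

    public boolean remove(E x){
        if(x == null) return false;
        int index = hashC(x);   // no lock needed just to find the bucket
        lock[index].lock();     // fine-grained: only this one bucket is blocked
        try {
            return data[index].remove(x);
        } finally { lock[index].unlock(); }
    }

    public boolean contains(E x){
        if(x == null) return false;
        int index = hashC(x);
        lock[index].lock();
        try {
            return data[index].contains(x);
        } finally { lock[index].unlock(); }
    }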

This is effectively fine-grained: after all, the odds that one operation needs to twiddle its thumbs because another thread is operating on a different object are very low; it would require the two different objects to have a colliding hash, which is possible but extremely rare - as per assumption #1.

NB: Note that lockR can therefore go away entirely; you don't need it, unless you want to build into your code the ability to completely re-design its bucket structure. For example, 1000 buckets feels a bit meh if you end up with a billion objects. I don't think 'rebucket everything' is part of the task here, though.
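
For completeness, a minimal sketch of what thread-safe rebucketing could look like, assuming lockR is swapped for a ReentrantReadWriteLock from java.util.concurrent.locks (the field name rwLock and the method rebucket are hypothetical): ordinary operations take the shared read lock plus their bucket lock, so adds on different buckets still run concurrently, while a resize takes the write lock exclusively.

    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();

    public void add(E x){
        if(x != null){
            rwLock.readLock().lock();      // shared: concurrent adds are still allowed
            try {
                int index = hashC(x);
                lock[index].lock();        // fine-grained: one bucket at a time
                try {
                    if(!data[index].contains(x))
                        data[index].add(x);
                } finally { lock[index].unlock(); }
            } finally { rwLock.readLock().unlock(); }
        }
    }

    @SuppressWarnings("unchecked")
    private void rebucket(int newSize){
        rwLock.writeLock().lock();         // exclusive: no bucket operation is mid-flight
        try {
            LinkedList<E>[] oldData = data;
            data = (LinkedList<E>[]) new LinkedList[newSize];
            lock = new Lock[newSize];
            for(int j = 0; j < newSize; j++){
                data[j] = new LinkedList<E>();
                lock[j] = new ReentrantLock();
            }
            for(LinkedList<E> bucket : oldData)
                for(E e : bucket)
                    data[hashC(e)].add(e); // hashC now uses the new data.length
        } finally { rwLock.writeLock().unlock(); }
    }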


1 Comment

Thanks for the information; you presumed correctly about hashC(). I have updated the question to the current state of my code. I think I have implemented it in one of the ways you described, using ReentrantLock[]? I thought the same about hashC() not needing to lock or synchronize, as this will be done in the main functioning methods anyway.
