
There is a new project that I am planning to start in a few days, and I would like to get some review of my design.

There is old legacy code that uses a Hashtable as an in-memory database. One thread consumes an XML feed from files and sockets and populates this Hashtable, another thread does validation and updates, and a third thread persists the validated data to the database if validation succeeds.

Since performance suffers during the update step (the other two threads finish quickly and end up waiting for the validation thread to complete), I am planning to prototype a solution that uses a ConcurrentHashMap and more than one validation thread. I am still in the prototyping stage, but I would like some feedback on whether I am going in the right direction. Thank you in advance.

1 Answer


I don't think a ConcurrentHashMap is going to help. I assume that you create a number of entries in the hash table and, upon validation, store them in the database. The problem is that your persistence thread has to wait for validation to complete.

If all entries in the hash table are interrelated and the validator must check all of them, there is not much you can do but wait.

However, if you can break validation down into smaller chunks (the easiest case is when entries are not related at all), then you could either parallelize validation with multiple threads or use a producer/consumer pattern to store the data. That is, once the validator completes a chunk, it posts it to a queue, and the persistence thread reads from the queue and stores that chunk, as in the sketch below.
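A rough sketch of that validator-to-persister hand-off might look like this. Record, validate() and persist() are hypothetical placeholders for your own types and logic, not anything from your existing code:

    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical record type standing in for whatever you build from the feed.
    class Record {
        String id;
        String payload;
    }

    public class ValidationPipeline {

        // Bounded queue so the validator blocks instead of outrunning persistence.
        private final BlockingQueue<List<Record>> validatedChunks =
                new ArrayBlockingQueue<>(100);

        // Validator thread: validate a chunk, then post it for persistence.
        void validateAndPost(List<Record> chunk) throws InterruptedException {
            if (validate(chunk)) {
                validatedChunks.put(chunk);              // blocks if the queue is full
            }
        }

        // Persistence thread: take validated chunks as they become available.
        void persistLoop() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted()) {
                List<Record> chunk = validatedChunks.take();   // blocks until a chunk arrives
                persist(chunk);
            }
        }

        private boolean validate(List<Record> chunk) { return true; /* your validation / DB checks */ }
        private void persist(List<Record> chunk)     { /* your JDBC / ORM write */ }
    }

The bounded queue gives you natural backpressure: if persistence falls behind, the validator blocks on put() instead of piling up unpersisted chunks in memory.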

Still, if all entries must be checked, you can persist them in chunks and roll back if validation fails.


3 Comments

Each entry in the Hashtable is a collection that does not need to be validated against other entries in the Hashtable; they are standalone. Validating each record requires making some DB calls, so each record's validation takes time.
So producer/consumer will work just fine. In fact, it may even work across all three tiers: one pool of threads can read data from the queue and validate it, and another pool of threads can read the validated entries and save them in the DB. Configure those thread pools to match the rate at which the reader thread produces entries.
Thank you. I was just realizing this as well -- I don't think we will need to store this in a hash map at all. Can I use an ArrayBlockingQueue to implement the producer/consumer? Once the data is read from the source (once my data is constructed), I will post it to an ArrayBlockingQueue, and then the consumer threads can validate and persist the validated data. Would this work?
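Putting the comments together, a minimal sketch of that reader-to-ArrayBlockingQueue-to-consumer-pool arrangement could look like the following. FeedPipeline, Record, publish(), isValid() and saveToDb() are made-up names for illustration, and the capacity and thread count are arbitrary values you would tune:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class FeedPipeline {

        private static final int QUEUE_CAPACITY = 1_000;
        private static final int CONSUMER_THREADS = 4;   // tune to keep up with the reader

        private final BlockingQueue<Record> queue = new ArrayBlockingQueue<>(QUEUE_CAPACITY);
        private final ExecutorService consumers = Executors.newFixedThreadPool(CONSUMER_THREADS);

        // Producer side: called by the thread that reads the XML feed.
        public void publish(Record record) throws InterruptedException {
            queue.put(record);                            // blocks if consumers fall behind
        }

        // Start the consumer pool: each worker takes a record, validates it, persists it.
        public void start() {
            for (int i = 0; i < CONSUMER_THREADS; i++) {
                consumers.submit(() -> {
                    try {
                        while (!Thread.currentThread().isInterrupted()) {
                            Record record = queue.take();
                            if (isValid(record)) {        // may involve DB lookups
                                saveToDb(record);
                            }
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();   // exit the worker loop
                    }
                });
            }
        }

        private boolean isValid(Record record) { return true; /* validation DB calls */ }
        private void saveToDb(Record record)   { /* persistence */ }

        static class Record { /* fields built from the XML feed */ }
    }

Because the records are standalone, the consumers need no shared hash map; the bounded ArrayBlockingQueue alone coordinates the reader and the validator/persister pool.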
