6

I have a data structure which consists of 1,000 array elements, each array element is a smaller array of 8 ints:

std::array<std::array<int, 8>, 1000>

The data structure contains two "pointers", which track the largest and smallest populated array elements (within the "outer", 1000-element array). So for example they might be:

min = 247 
max = 842

How can I read and write to this data structure from multiple threads? I am worried about race conditions between pushing/popping elements and maintaining the two "pointers". My basic mode of operation is:

// Pop element from current index
// Calculate new index
// Write element to new index
// Update min and max "pointers"
9
  • 4
    How exactly do you pop from a std::array? Commented Oct 5, 2015 at 11:26
  • How often do you access it? A global lock might be enogh. Commented Oct 5, 2015 at 11:30
  • @nwp you remove the value and blank-out the array element...... not terribly difficult. Commented Oct 5, 2015 at 11:31
  • @KarolyHorvath this is going to be accessed regularly, I was hoping to avoid a lock but this increase in throughput could outweigh the cost of the lock. Commented Oct 5, 2015 at 11:32
  • If you lock mostly for read then protect the data with a "global" rw-lock. Commented Oct 5, 2015 at 11:36

1 Answer 1

1

You are correct that your current algorithm is not thread safe, there are a number of places where contention could occur.

This is impossible to optimize without more information though. You need to know where the slow-down is happening before you can improve it - and for that you need metrics. Profile your code and find out what bits are actually taking the time, because you can only gain by parallelizing those bits and even then you may find that it's actually memory or something else that is the limiting factor, not CPU.

The simplest approach will then be to just lock the entire structure for the full process. This will only work if the threads are doing a lot of other processing as well, if not you will actually lose performance compared to single threading.

After that you can consider having a separate lock for different sections of the data structure. You will need to properly analyse what you are using when and where and work out what would be useful to split. For example you might have chunks of the sub arrays with each chunk having its own lock.

Be careful of deadlocks in this situation though, you might have a thread claim 32 then want 79 while another thread already has 79 and then wants 32. Make sure you always claim locks in the same order.

The fastest option (if possible) may even be to give each thread it's own copy of the data structure, each processes 1/N of the work and then merge the results at the end. This way no synchronization is needed at all during processing.

But again it all comes back to the metrics and profiling. This is not a simple problem.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.