Multithreaded Atomic Store/Load of multiple values in C++

Question

Suppose I have a structure and class in C++:

struct Vec {
   double x;
   double y;
   double z;
};

class VecTracker {
   Vec latest_vec;
   std::atomic<double> highest_x;
   std::atomic<double> highest_y;
   std::atomic<double> highest_z;

public:
   //updates highest_x, highest_y, highest_z atomically
   void push_vec(const Vec& v);
   double get_high_x() const;
   double get_high_y() const;
   double get_high_z() const;
   //returns Vec consisting of snapshot of highest_x, highest_y, highest_z
   Vec get_highs() const;
};

I'll have R reader threads and one writer thread. The writer thread will update zero or more of the highest_* members. If the reader thread calls get_highs() I need all the writes from the current invocation of the writer thread's push_vec() function to be visible to the reader thread before the reader thread reads highest_x, highest_y, etc. to produce a vector.

Now, I know that if Vec is sufficiently small, I could just use a std::atomic<Vec>. Problem is, if it's too big, native CPU instructions for these stores and loads can't be used. Is there any way to use std::atomic_thread_fence to guarantee that multiple atomic writes are committed by the writer thread before the reader thread picks them up? That is, a guarantee that all writes by the writer thread are committed before a reader thread sees any of them? Or does std::atomic_thread_fence only provide reordering guarantees within a thread? Currently, just using .store(std::memory_order_release) for each member doesn't seem to guarantee that all three stores happen before any reads.

Obviously, I could use a lock here, but ideally I want to find a way to make this data structure lockfree.

I know that I could put highest_x, highest_y, and highest_z in a single struct and allocate two copies of it on the heap, swapping pointers atomically after each write. Is this the only way to do it?

SergeyA · Accepted Answer · 2017-01-12 17:07:16Z

4

The devil is here: //updates highest_x, highest_y, highest_z atomically. How do you guarantee that they are, indeed, atomic? Since 3 doubles (24B) do not fit into 16B (the largest atomic operation I know of on the x86_64 platform), the only way to ensure this would be to use a mutex.
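A quick way to check the size argument on your own toolchain is sketched below. It assumes C++17 (for is_always_lock_free); the names kVecBytes, kWidestNativeAtomic, kFitsInNativeAtomic and kVecAtomicAlwaysLockFree are made up for illustration, and the 16-byte limit is simply the x86_64 figure quoted above, not something the code detects.

```cpp
#include <atomic>
#include <cstddef>

struct Vec {
    double x;
    double y;
    double z;
};

// Size of the payload that would have to be stored/loaded as one unit.
constexpr std::size_t kVecBytes = sizeof(Vec);

// Assumption: 16 bytes is the widest native atomic operation (x86_64, CMPXCHG16B).
constexpr std::size_t kWidestNativeAtomic = 16;

// True only if the whole Vec could be covered by one native atomic operation.
constexpr bool kFitsInNativeAtomic = kVecBytes <= kWidestNativeAtomic;

// What the standard library itself reports for std::atomic<Vec> (C++17).
// The value is implementation-defined, so check it rather than assume it.
constexpr bool kVecAtomicAlwaysLockFree = std::atomic<Vec>::is_always_lock_free;
```

If kVecAtomicAlwaysLockFree comes out false, std::atomic<Vec> may still compile, but the library is free to implement it with an internal lock.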

Your problem is not with the fence. By issuing the fence instruction, you guarantee that all previous updates are visible afterwards. What you can't guarantee, though, is that some of them are not visible before that point. As a result, a reader can see the more recent value for one of the vector components together with older values for the others.
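To make that gap concrete, here is a minimal sketch that replays one possible interleaving on a single thread, so the outcome is deterministic. The Highs struct and the functions read_highs and torn_snapshot_demo are hypothetical names, and the values 0.0 and 1.0 are arbitrary; in a real program the reader would run on another thread and the interleaving would only happen occasionally.

```cpp
#include <atomic>

struct Vec {
    double x;
    double y;
    double z;
};

// Three independently atomic members, as in the question.
struct Highs {
    std::atomic<double> x{0.0};
    std::atomic<double> y{0.0};
    std::atomic<double> z{0.0};
};

// Reader side: three separate acquire loads. Each load is atomic,
// but nothing ties the three of them together.
Vec read_highs(const Highs& h) {
    return Vec{h.x.load(std::memory_order_acquire),
               h.y.load(std::memory_order_acquire),
               h.z.load(std::memory_order_acquire)};
}

// Replays the case where the reader runs between the writer's first
// and second store of one logical update from (0,0,0) to (1,1,1).
Vec torn_snapshot_demo() {
    Highs h;
    h.x.store(1.0, std::memory_order_release);  // writer: first store
    Vec seen = read_highs(h);                   // reader runs here
    h.y.store(1.0, std::memory_order_release);  // writer: second store
    h.z.store(1.0, std::memory_order_release);  // writer: third store
    return seen;  // new x, old y and z: a mixed snapshot
}
```

No fence placed after the third store can undo the fact that the first store was already visible when the reader ran.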

To solve your issue, you should either go with a mutex (they are quite efficient when uncontended) or, if you are allergic to mutexes, the pointer-swap solution you described yourself.
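For the mutex route, a minimal sketch could look like the following. It assumes that "highest" means a running per-component maximum and that the maxima start at the lowest representable double; neither detail is spelled out in the question. LockedVecTracker is a hypothetical name, not the asker's class.

```cpp
#include <algorithm>
#include <limits>
#include <mutex>

struct Vec {
    double x;
    double y;
    double z;
};

class LockedVecTracker {
public:
    // Writer: raise each stored maximum to cover v, all under one lock,
    // so readers never observe a partially applied update.
    void push_vec(const Vec& v) {
        std::lock_guard<std::mutex> guard(mutex_);
        highs_.x = std::max(highs_.x, v.x);
        highs_.y = std::max(highs_.y, v.y);
        highs_.z = std::max(highs_.z, v.z);
    }

    // Reader: copy all three maxima under the same lock.
    Vec get_highs() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return highs_;
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    // Assumption: start from the lowest double so any pushed value wins.
    Vec highs_{std::numeric_limits<double>::lowest(),
               std::numeric_limits<double>::lowest(),
               std::numeric_limits<double>::lowest()};
};
```

The critical sections here are a handful of loads and stores, so the lock is held only briefly. Whether that is acceptable under heavy reader contention is something to measure rather than guess.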

edited Jan 12, 2017 at 17:07

answered Jan 4, 2017 at 20:44

SergeyA


3 Comments

alfalfasprout Over a year ago

Aha. Yeah, I assumed that only the native atomic operations could actually guarantee atomic behavior. It looks like the pointer swap is the way to go then. Can't really use a Mutex or spinlock here since there will be very heavy contention and I'd like to guarantee progress.

Erik Nyström Over a year ago

I searched, but could not find any reference to a CMPXCHG32B instruction. If such an instruction actually exists, it would be sufficient in this case, as 3 doubles only amount to 24B on most systems today. Did you perhaps confuse it with CMPXCHG16B? In that case, your argument makes much more sense.

SergeyA Over a year ago

@ErikNyström, 100% yes! Thanks for spotting.
