Understanding of C++'s std::atomic<T> and compare-and-swap [duplicate]

Question

My understanding is that compare-and-swap is supported directly by hardware, e.g. the CMPXCHG instruction on x86. I have two points of confusion:

Is it correct that C++'s std::atomic does not "implement" atomicity itself, but instead relies on the atomic instructions of the CPU?
What if an architecture has no compare-and-swap instruction? For a compiler on that platform to be standard-compliant, does it have to find some other (probably much more computationally expensive) way to implement std::atomic without compare-and-swap?

Compare-and-swap can be implemented on top of LL/SC (en.wikipedia.org/wiki/Load-link/store-conditional), along with other atomic RMW operations on a single object. LL/SC is why compare_exchange_weak exists.
– Peter Cordes, Commented Sep 28, 2023 at 3:45
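
To make the comment above concrete, here is a minimal sketch of the usual retry loop around compare_exchange_weak. The helper name fetch_multiply is hypothetical (std::atomic has no such member); it only illustrates how an arbitrary read-modify-write can be built from CAS, and why a "weak" CAS that may fail spuriously is still useful inside a loop.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (std::atomic has no fetch_multiply): atomically replaces
// target with target * factor and returns the value it held before.
int fetch_multiply(std::atomic<int>& target, int factor) {
    int expected = target.load(std::memory_order_relaxed);
    // compare_exchange_weak is allowed to fail spuriously, which is what lets it
    // map onto a single LL/SC attempt. On any failure it writes the value it
    // actually found into `expected`, so the loop retries with fresh data.
    while (!target.compare_exchange_weak(expected, expected * factor)) {
    }
    return expected;
}

// Single-threaded usage check; the retry loop is what keeps concurrent callers safe.
void demo_fetch_multiply() {
    std::atomic<int> value{3};
    int old = fetch_multiply(value, 5);
    assert(old == 3);
    assert(value.load() == 15);
}
```

On x86 the weak and strong forms typically compile to the same lock cmpxchg; the distinction matters mainly on LL/SC machines, where the strong form needs an extra inner loop.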

user17732522 · Accepted Answer · 2023-09-28 04:10:06Z

Specializations of std::atomic are not generally required to be lock-free.

On a platform that cannot perform the required operations on a type X atomically, the C++ implementation can still implement std::atomic<X> with the help of a mutex. While the lock on the mutex is held, the compare and the swap can simply be done as several ordinary instructions that make no atomicity or ordering guarantees of their own.
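
As a rough sketch of that idea, the class below emulates compare-and-swap with a std::mutex. LockedAtomic is a hypothetical name, not how any particular standard library does it, and it compares with operator== for brevity, whereas the real std::atomic compares object representations.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical lock-based stand-in for std::atomic<T>; illustration only.
template <typename T>
class LockedAtomic {
public:
    explicit LockedAtomic(T initial) : value_(initial) {}

    T load() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }

    void store(T desired) {
        std::lock_guard<std::mutex> guard(mutex_);
        value_ = desired;
    }

    // Mirrors the contract of compare_exchange_strong: on success the stored
    // value becomes `desired`; on failure `expected` receives the stored value.
    bool compare_exchange_strong(T& expected, T desired) {
        std::lock_guard<std::mutex> guard(mutex_);
        // Plain, non-atomic compare and assignment: the mutex provides the
        // atomicity and ordering.
        if (value_ == expected) {
            value_ = desired;
            return true;
        }
        expected = value_;
        return false;
    }

private:
    mutable std::mutex mutex_;
    T value_;
};

void demo_locked_atomic() {
    LockedAtomic<int> a(1);
    int expected = 2;
    bool first = a.compare_exchange_strong(expected, 10);   // fails: stored value is 1
    assert(!first && expected == 1);
    bool second = a.compare_exchange_strong(expected, 10);  // succeeds with the updated expected
    assert(second && a.load() == 10);
}
```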

To test whether a specialization of std::atomic is lock-free, use the compile-time constant std::atomic<X>::is_always_lock_free (C++17) or the weaker, per-object run-time check std::atomic<X>::is_lock_free().
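
A small sketch of both checks follows, assuming a C++17 compiler. The struct Big and the function names are made up for illustration, and the results are platform-dependent, so no particular output is claimed. The one relationship the standard does guarantee is that if is_always_lock_free is true, is_lock_free() is true for every object of that type.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical 64-byte type, chosen only because it is likely too wide for a
// single atomic instruction on common hardware.
struct Big { char bytes[64]; };

// Compile-time answers: true only if every object of the specialization is lock-free.
constexpr bool int_always_lock_free = std::atomic<int>::is_always_lock_free;
constexpr bool big_always_lock_free = std::atomic<Big>::is_always_lock_free;

// Run-time answer for one particular object.
bool int_object_lock_free() {
    std::atomic<int> value{0};
    return value.is_lock_free();
}

void print_lock_freedom() {
    std::printf("atomic<int> always lock-free: %s\n", int_always_lock_free ? "yes" : "no");
    std::printf("atomic<Big> always lock-free: %s\n", big_always_lock_free ? "yes" : "no");
    std::printf("this atomic<int> lock-free:   %s\n", int_object_lock_free() ? "yes" : "no");
}

void demo_lock_freedom() {
    // "Always lock-free" implies "this object is lock-free", never the reverse.
    assert(!int_always_lock_free || int_object_lock_free());
}
```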

The only type that is required to provide lock-free atomic operations on a conforming C++ implementation is std::atomic_flag. It has only two states and provides fewer operations than std::atomic, so it can be fully implemented by an atomic exchange of a byte (plus a pure load in C++20 and later), which the platform needs to provide.

A std::atomic_flag is sufficient to implement a lock, and a lock is sufficient to implement every std::atomic specialization, though not in a lock-free way.
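
For illustration, a minimal spinlock built only from std::atomic_flag might look like the sketch below. SpinLock and GuardedCounter are hypothetical names, and a production lock would normally back off or block instead of spinning in a tight loop.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical spinlock that uses nothing but std::atomic_flag.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous value,
        // so the loop exits only for the caller that found the flag clear.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait until the current holder calls unlock()
        }
    }

    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical shared counter: a plain int whose updates are serialized by the lock.
struct GuardedCounter {
    SpinLock lock;
    int value = 0;

    int increment() {
        std::lock_guard<SpinLock> guard(lock);  // SpinLock provides lock()/unlock()
        return ++value;
    }
};

void demo_spin_lock() {
    SpinLock lock;
    assert(lock.try_lock());   // a free lock is acquired
    assert(!lock.try_lock());  // already held, so a second attempt fails
    lock.unlock();
    assert(lock.try_lock());   // available again after unlock
    lock.unlock();
}
```

Any thread that calls increment() while another thread holds the lock waits in the loop, so it is never lock-free in the technical sense, which matches the point made above.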

The above requirements can be satisfied with the help of the OS scheduler by only ever letting one C++ thread run at a time, effectively using a single physical thread. But for actual concurrent multi-threading, the hardware needs to provide the mechanisms mentioned above. Depending on which atomic operations the CPU's instruction set supports for which operand sizes, each std::atomic specialization is implemented either lock-free using these operations or with a locking mechanism.

edited Sep 28, 2023 at 4:10

answered Sep 28, 2023 at 3:42

user17732522

15 Comments

D.J. Elkind Over a year ago

Do you mean that, whether with std::atomic or a mutex, hardware support is almost always needed for multi-thread synchronization? If hardware support does not exist, is it almost impossible for the language itself to achieve multi-thread synchronization?

Peter Cordes Over a year ago

Worth pointing out that the only operations supported on std::atomic_flag are ones that can be done with an atomic exchange of a byte or word. (C++20 adds a pure load function.) I think I've read that some actual old hardware had swap as the only atomic RMW. This is insufficient to implement lock-free .fetch_add() or CAS, but can implement locks. (@D.J.Elkind)

Peter Cordes Over a year ago

@D.J.Elkind: On a uniprocessor system, the OS could provide the required atomicity by letting you ask it not to context-switch to another thread (of this process or at all) until after you're done. Or by having atomic ops as system calls, which the OS perhaps implements by disabling interrupts. (With only one CPU, the only way for other threads to run is if this core switches to another thread between the instructions of a multi-step operation, unlike on multi-core systems, where another thread can be running simultaneously.)

D.J. Elkind Over a year ago

@user17732522 For simplicity, let's just say on x86: do std::atomic and std::mutex rely on two different synchronization mechanisms? Is std::atomic generally considered faster because the synchronization mechanism it uses is faster than the one used by std::mutex?

D.J. Elkind Over a year ago

@PeterCordes If I understand you correctly, you mean that if my computer has only one CPU (and one hardware thread), it is possible to achieve multi-thread synchronization without hardware support (as all "threads" are logical ones). But with a multi-core CPU, it is very unlikely (if not completely impossible) to implement multi-thread synchronization without hardware support. Is this what you mean?
