Specializations of std::atomic are not generally required to be lock-free.
On a platform that cannot perform the required operations on a type X atomically, the C++ implementation can still implement std::atomic<X> with the help of a mutex. While the lock is held, an operation such as compare-and-swap can be carried out as several ordinary instructions that do not need to make any atomicity/ordering guarantees by themselves.
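As a rough sketch of that idea (not how any particular standard library actually does it), a mutex-protected wrapper could look like the following. The name LockedAtomic and its interface are hypothetical and chosen only for illustration. It also compares with operator== for brevity, whereas std::atomic compares object representations.

```cpp
#include <mutex>

// Hypothetical illustration of a lock-based "atomic"; real standard
// libraries use their own internal mechanisms.
template <typename T>
class LockedAtomic {
public:
    explicit LockedAtomic(T initial) : value_(initial) {}

    T load() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }

    void store(T desired) {
        std::lock_guard<std::mutex> guard(mutex_);
        value_ = desired;
    }

    // Compare-and-swap as plain, non-atomic steps; the mutex makes the
    // whole sequence appear indivisible to other threads.
    bool compare_exchange(T& expected, T desired) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (value_ == expected) {
            value_ = desired;
            return true;
        }
        expected = value_;  // report the value actually seen
        return false;
    }

private:
    mutable std::mutex mutex_;
    T value_;
};
```

The operations are still atomic with respect to each other, but a thread that is suspended while holding the mutex blocks all others, which is exactly what "not lock-free" means.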
To test whether a specialization of std::atomic is lock-free, use the compile-time constant std::atomic<X>::is_always_lock_free (since C++17) or the weaker run-time check std::atomic<X>::is_lock_free().
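A minimal sketch of both checks, assuming a C++17 compiler. The struct Big and the helper function name are made up for this example, and the results depend on the platform.

```cpp
#include <atomic>

// Hypothetical type, larger than what most CPUs can handle atomically.
struct Big { char bytes[64]; };

// Compile-time answers: true only if the specialization is lock-free on
// every object of that type on this target. Usable in static_assert.
constexpr bool int_always_lock_free = std::atomic<int>::is_always_lock_free;
constexpr bool big_always_lock_free = std::atomic<Big>::is_always_lock_free;

// Run-time answer for one particular object.
bool int_is_lock_free_now() {
    std::atomic<int> value{0};
    return value.is_lock_free();
}
```

If is_always_lock_free is true, is_lock_free() returns true for every object of that type. The reverse does not hold, because lock-freedom may only be known at run time.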
The only type that is required to provide lock-free atomic operations on a conforming C++ implementation is std::atomic_flag. It has only two states and provides fewer operations than std::atomic, so it can be fully implemented by an atomic exchange of a byte (plus a pure load in C++20 and later), which the platform needs to provide.
A std::atomic_flag is sufficient to implement a lock, and a lock is sufficient to implement all std::atomic specializations, although not in a lock-free way.
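To illustrate, here is a sketch of a spinlock built only on std::atomic_flag, together with a counter that uses it in place of a lock-free atomic. The class names SpinLock and LockedCounter are hypothetical, and a real implementation would likely back off or yield instead of spinning tightly.

```cpp
#include <atomic>

// Hypothetical spinlock using nothing but std::atomic_flag.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous
        // value; keep trying until the previous value was "clear".
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical counter whose read-modify-write is made indivisible by the
// spinlock rather than by a hardware atomic instruction.
class LockedCounter {
public:
    int fetch_add(int delta) {
        lock_.lock();
        int old = value_;
        value_ += delta;
        lock_.unlock();
        return old;
    }

private:
    SpinLock lock_;
    int value_ = 0;
};
```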
The above requirements can be satisfied with the help of the OS scheduler by only ever letting one C++ thread run at a time, effectively using only a single physical thread. But for actual concurrent multi-threading the hardware needs to provide the above-mentioned mechanisms. Depending on which atomic operations the CPU/instruction set supports for which operand sizes, the std::atomic specializations will either be implemented as lock-free using these operations, or fall back to a locking mechanism.
compare_exchange_weak exists.