
I am reading that the volatile keyword is not suitable for thread synchronisation and in fact it is not needed for these purposes at all.

While I understand that using this keyword is not sufficient, I fail to understand why it is completely unnecessary.

For example, assume we have two threads, thread A that only reads from a shared variable and thread B that only writes to a shared variable. Proper synchronisation by e.g. pthreads mutexes is enforced.

IIUC, without the volatile keyword, the compiler may look at the code of thread A and say: “The variable doesn’t appear to be modified here, but we have lots of reads; let’s read it only once, cache the value and optimise away all subsequent reads.” Also it may look at the code of thread B and say: “We have lots of writes to this variable here, but no reads; so, the written values are not needed and thus let’s optimise away all writes.”

Both optimisations would be incorrect. And both would be prevented by volatile. So, I would likely come to the conclusion that while volatile is not enough to synchronise threads, it is still necessary for any variable shared between threads. (Note: I have now read that volatile is not actually required to prevent write elision, so I am out of ideas as to how such incorrect optimisations are prevented.)

I understand that I am wrong here. But why?

  • The compiler sees both the reads and the writes, so it will not optimize the variable away. You seem to be mixing compile-time visibility with run-time access. Commented Feb 11, 2016 at 17:34
  • @user3629249: Nope, that's not what's happening. Those two pieces of code could be in entirely different compilation units. Commented Feb 11, 2016 at 17:36
  • In layman's terms: the compiler is free to rearrange or cache reads and writes (as long as there's no difference in the observable behaviour). The synchronization primitives have some "magic" properties that create barriers, and the compiler is not allowed to move reads and writes through these barriers... (it's a little more complex than that, so for a full explanation, wait for a proper answer). Commented Feb 11, 2016 at 17:42
  • @user3629249: This is all irrelevant; it's not how it works. I just tried to give an example so you can get a grasp of why it's a bad idea. Commented Feb 11, 2016 at 17:50
  • If all the work occurs within mutex-protected blocks, then choosing to read once and cache, or to write only the final value, is perfectly legitimate. Why would this be incorrect? If the mutex is held, no one else is reading or modifying the variables until it is released, so the variable can't be changed by another thread (making read optimizations acceptable) and can't be read by another thread, so the other thread would only see the "before" or "after" state, and it doesn't matter whether intermediate values are written as long as the final value is correct. Commented Feb 11, 2016 at 18:02

4 Answers


For example, assume we have two threads, thread A that only reads from a shared variable and thread B that only writes to a shared variable. Proper synchronisation by e.g. pthreads mutexes is enforced.

IIUC, without the volatile keyword, the compiler may look at the code of thread A and say: “The variable doesn’t appear to be modified here, but we have lots of reads; let’s read it only once, cache the value and optimise away all subsequent reads.” Also it may look at the code of thread B and say: “We have lots of writes to this variable here, but no reads; so, the written values are not needed and thus let’s optimise away all writes.”

Like most thread synchronization primitives, pthreads mutex operations have explicitly defined memory visibility semantics.

Either the platform supports pthreads or it doesn't. If it supports pthreads, it supports pthreads mutexes. Either those optimizations are safe or they aren't. If they're safe, there's no problem. If they're unsafe, then any platform that makes them doesn't support pthreads mutexes.

For example, you say "The variable doesn’t appear to be modified here", but it does -- another thread could modify it there. Unless the compiler can prove its optimization can't break any conforming program, it can't make it. And a conforming program can modify the variable in another thread. Either the compiler supports POSIX threads or it doesn't.

As it happens, most of this happens automatically on most platforms. The compiler is just prevented from having any idea what the mutex operations do internally. Anything another thread could do, the mutex operations themselves could do. So the compiler has to "synchronize" memory before entering and exiting those functions anyway. It can't, for example, keep a value in a register across the call to pthread_mutex_lock because for all it knows, pthread_mutex_lock accesses that value in memory. Alternatively, if the compiler has special knowledge about the mutex functions, that would include knowing about the invalidity of caching values accessible to other threads across those calls.

A platform that required volatile for this would be pretty much unusable. You'd need versions of every function or class for the specific cases where an object might be made visible to, or was made visible from, another thread. In many cases, you'd pretty much have to make everything volatile, and not being able to cache values in registers is a performance non-starter.

As you've probably heard many times, volatile's semantics as specified in the C language just do not mix usefully with threads. Not only is it not sufficient, it disables many perfectly safe and nearly essential optimizations.


18 Comments

No offence, but you're addressing him like he's stupid. His question is a wise one; he just doesn't know about those visibility semantics... on which you don't elaborate at all. Focus on what's important.
@KarolyHorvath He knows volatile isn't sufficient. He's asking why it's necessary to tell the compiler not to optimize in code that he knows tells the compiler not to optimize some other way. It's really sufficient to point out that you don't have to tell it twice. It listens the first time. I don't think it's helpful to get bogged down in platform-specific implementation details.
@KarolyHorvath The high-level view is that the platform supports POSIX threads and thus does whatever's necessary to make this work. You don't have to tell the compiler twice, it listens.
@DavidSchwartz All that is because most thread functions aren't treated as normal functions, but provide certain memory synchronization and visibility guarantees — which isn't necessarily obvious to everyone. (e.g. the list at pubs.opengroup.org/onlinepubs/9699919799/basedefs/… for pthreads)
That's what I was also trying to tell, but failed. Lol.

Shortening the answer already given: you do not need to use volatile with mutexes, for a simple reason:

  • If the compiler knows what the mutex operations are (by recognizing the pthread_* functions, or because you used std::mutex), it knows exactly how to handle accesses around them with respect to optimization (this is even required for std::mutex).
  • If the compiler does not recognize them, the pthread_* functions are completely opaque to it, and no optimization involving any sort of non-local-duration object can cross an opaque function call.

1 Comment

I'd very much love to know what's wrong with my answer.

Making the answer even shorter: not using either a mutex or a semaphore is a bug. As soon as thread B releases the mutex (and thread A acquires it), any value of the shared variable that thread B held in a register is guaranteed to have been written to cache or memory, which prevents a race condition when thread A runs and reads this variable.

The implementation to guarantee this is architecture/compiler dependent.

Comments


The keyword volatile tells the compiler to treat any write or read of the variable as an "observable side-effect." That is all it does. Observable side-effects of course must not be optimized away, and must appear to the outside world as occurring in the order the program indicates; the compiler may not re-order observable side-effects with respect to each other. The compiler is, however, free to reorder them with respect to non-observables. Therefore, volatile is only appropriate for accessing memory-mapped hardware, Unix-style signal handlers, and the like. For inter-thread communication, use std::atomic or higher-level synchronization objects like mutex, condition_variable, and promise/future.

Comments
