
I am reading that the volatile keyword is not suitable for thread synchronisation and in fact it is not needed for these purposes at all.

While I understand that using this keyword is not sufficient, I fail to understand why it is completely unnecessary.

For example, assume we have two threads, thread A that only reads from a shared variable and thread B that only writes to a shared variable. Proper synchronisation by e.g. pthreads mutexes is enforced.

IIUC, without the volatile keyword, the compiler may look at the code of thread A and say: “The variable doesn’t appear to be modified here, but we have lots of reads; let’s read it only once, cache the value and optimise away all subsequent reads.” Also it may look at the code of thread B and say: “We have lots of writes to this variable here, but no reads; so, the written values are not needed and thus let’s optimise away all writes.”

Both optimisations would be incorrect. And both would be prevented by volatile. So, I would likely come to the conclusion that while volatile is not enough to synchronise threads, it is still necessary for any variable shared between threads. (Note: I have now read that volatile is not actually required to prevent write elision, so I am out of ideas as to how such incorrect optimisations are prevented.)

I understand that I am wrong here. But why?

  • The compiler sees both the reads and the writes, so it will not optimize the variable away. You seem to be mixing compile-time visibility with run-time access. Commented Feb 11, 2016 at 17:34
  • @user3629249: Nope, that's not what's happening. Those two pieces of code could be in entirely different compilation units. Commented Feb 11, 2016 at 17:36
  • In layman's terms: the compiler is free to rearrange or cache reads and writes (as long as there's no difference in the observable behaviour). The synchronization primitives have some "magic" properties that create barriers, and the compiler is not allowed to move reads and writes through these barriers... (it's a little more complex than that, so for a full explanation, wait for a proper answer). Commented Feb 11, 2016 at 17:42
  • @user3629249: This is all irrelevant; it's not how it works. I just tried to give an example so you can get a grasp of why it's a bad idea. Commented Feb 11, 2016 at 17:50
  • If all the work occurs within mutex-protected blocks, then choosing to read once and cache, or to write only the final value, is perfectly legitimate. Why would this be incorrect? If the mutex is held, no one else is reading or modifying the variables until it is released, so the variable can't be changed by another thread (making read optimizations acceptable) and can't be read by another thread, so the other thread would only see the "before" or "after" state, and it doesn't matter whether intermediate values are written as long as the final value is correct. Commented Feb 11, 2016 at 18:02

4 Answers


For example, assume we have two threads, thread A that only reads from a shared variable and thread B that only writes to a shared variable. Proper synchronisation by e.g. pthreads mutexes is enforced.

IIUC, without the volatile keyword, the compiler may look at the code of thread A and say: “The variable doesn’t appear to be modified here, but we have lots of reads; let’s read it only once, cache the value and optimise away all subsequent reads.” Also it may look at the code of thread B and say: “We have lots of writes to this variable here, but no reads; so, the written values are not needed and thus let’s optimise away all writes.”

Like most thread synchronization primitives, pthreads mutex operations have explicitly defined memory visibility semantics.

Either the platform supports pthreads or it doesn't. If it supports pthreads, it supports pthreads mutexes. Either those optimizations are safe or they aren't. If they're safe, there's no problem. If they're unsafe, then any platform that makes them doesn't support pthreads mutexes.

For example, you say "The variable doesn’t appear to be modified here", but it does -- another thread could modify it there. Unless the compiler can prove its optimization can't break any conforming program, it can't make it. And a conforming program can modify the variable in another thread. Either the compiler supports POSIX threads or it doesn't.

As it happens, most of this happens automatically on most platforms. The compiler is just prevented from having any idea what the mutex operations do internally. Anything another thread could do, the mutex operations themselves could do. So the compiler has to "synchronize" memory before entering and exiting those functions anyway. It can't, for example, keep a value in a register across the call to pthread_mutex_lock because for all it knows, pthread_mutex_lock accesses that value in memory. Alternatively, if the compiler has special knowledge about the mutex functions, that would include knowing about the invalidity of caching values accessible to other threads across those calls.

A platform that required volatile for this would be pretty much unusable. You'd need versions of every function or class for the specific cases where an object might be made visible to, or was made visible from, another thread. In many cases, you'd pretty much have to make everything volatile, and not being able to cache values in registers is a performance non-starter.

As you've probably heard many times, volatile's semantics as specified in the C language just do not mix usefully with threads. Not only is it not sufficient, it disables many perfectly safe and nearly essential optimizations.


18 Comments

No offence, but you're addressing him like he's stupid. His question is a wise one; he just doesn't know about those visibility semantics... on which you don't elaborate at all. Focus on what's important.
@KarolyHorvath He knows volatile isn't sufficient. He's asking why it's necessary to tell the compiler not to optimize in code that he knows tells the compiler not to optimize some other way. It's really sufficient to point out that you don't have to tell it twice. It listens the first time. I don't think it's helpful to get bogged down in platform-specific implementation details.
@KarolyHorvath The high-level view is that the platform supports POSIX threads and thus does whatever's necessary to make this work. You don't have to tell the compiler twice, it listens.
@DavidSchwartz All that is because most thread functions aren't treated as normal functions, but provide certain memory synchronization and visibility guarantees — which isn't necessarily obvious to everyone. (e.g. the list at pubs.opengroup.org/onlinepubs/9699919799/basedefs/… for pthreads)
That's what I was also trying to tell, but failed. Lol.

Shortening the answer already given: you do not need to use volatile with mutexes, for a simple reason:

  • If the compiler knows what the mutex operations are (by recognizing the pthread_* functions, or because you used std::mutex), it knows exactly how to handle accesses around them with respect to optimization (this is even required for std::mutex).
  • If the compiler does not recognize them, the pthread_* functions are completely opaque to it, and no optimization involving any sort of non-local-duration object can cross an opaque function call.

1 Comment

I'd very much love to know what's wrong with my answer.

Making the answer even shorter: not using either a mutex or a semaphore is a bug. As soon as thread B releases the mutex (and thread A acquires it), any value of the shared variable that thread B held in a register is guaranteed to have been written to cache or memory, which prevents a race condition when thread A runs and reads this variable.

The implementation to guarantee this is architecture/compiler dependent.

Comments


The keyword volatile tells the compiler to treat any write or read of the variable as an "observable side-effect." That is all it does. Observable side-effects of course must not be optimized away, and must appear to the outside world as occurring in the order the program indicates; the compiler may not re-order observable side-effects with respect to each other. The compiler is, however, free to reorder them with respect to non-observables. Therefore, volatile is only appropriate for accessing memory-mapped hardware, Unix-style signal handlers, and the like. For inter-thread communication, use std::atomic or higher-level synchronization objects like mutex, condition_variable, and promise/future.

Comments
