

Volatiles Are Miscompiled, and What to Do about It [pdf] - emkemp
http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

======
kenrose
Looking at the generated code for their watchdog example and their workaround
of forcing a function evaluation, it looks like the common cause of this bug
amongst all of the compilers is the optimizer not respecting volatile. It's
easy to understand how this could be the case.
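
For readers without the paper open, here is a minimal sketch of the shape of the watchdog code and of the function-call workaround. Everything named here is an assumption for illustration: `WATCHDOG` is an ordinary global standing in for a memory-mapped register, and `vol_read_u32` / `vol_write_u32` are hypothetical helper names, not necessarily the ones the paper uses.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped watchdog register (assumption: on real
   hardware this would be a fixed address taken from the datasheet). */
volatile uint32_t WATCHDOG = 0;

/* Hypothetical helpers that route every volatile access through a
   function call. For the workaround to be effective they should be
   opaque to the optimizer, e.g. defined in a separate translation unit. */
uint32_t vol_read_u32(volatile uint32_t *p) { return *p; }
void vol_write_u32(volatile uint32_t *p, uint32_t v) { *p = v; }

/* Direct form: the abstract machine performs one load and one store,
   so a correct compiler must emit both. */
void reset_watchdog_direct(void) {
    WATCHDOG = WATCHDOG;
}

/* Workaround form: the same load and store, but behind function calls. */
void reset_watchdog_wrapped(void) {
    vol_write_u32(&WATCHDOG, vol_read_u32(&WATCHDOG));
}
```

Comparing the disassembly of the two functions at your usual -O level is a quick way to see whether the load and the store both survive.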

Imagine you work on a production quality C compiler. The pedestrian pieces
(e.g., lexer, parser, AST) have been stable forever. The piece that you're
very likely going to be working on is the optimizer (or maybe the code
generator) to make use of new techniques or new instructions. While you're
going about your business, you're probably not thinking about that arcane
corner of the C language spec that discusses volatile. What's more, when you
finally complete your feature, all of the compiler's test suites pass because
there's insufficient coverage for volatile.

The biggest contribution of this paper, besides the fact that it identified
this issue across various compilers, is the notion of access summary testing
and advocating for it to be included as part of the test suite for C
compilers.

~~~
jrockway
At least gcc's generated code has no possibility of working at all, and the
system will reboot in a loop the first time it's powered up. So you're only 30
seconds away from finding that bug. (Of course, once you notice that the
watchdog is being triggered it's going to be several hours of debugging before
you realize that your compiler just optimized out the check.)

~~~
kenrose
Hours of debugging? Days even!

Really, how often when you encounter a bug do you think it's a compiler bug?
Never. It can't be. Compiler writers are infallible.

You'll first think it's your program, or maybe you misunderstood how volatile
works, so you'll read the spec again. You'll write a poodle to isolate the
problem. That will reboot each time too. Then you'll think it's some odd race
condition related to volatile. But you're just doing a load and a store. The
reboot happens every time, OK, that's promising. Then maybe, MAYBE, if you're
awesome, you'll think to look at the generated assembly. And when you realize
you have a no-op, you'll start to think if you maybe inadvertently specified
something wrong in your -O parameters. Because how could the compiler be
wrong? It's never wrong.

Code generation bugs are the worst.

~~~
cnvogel
I think it's a safe bet to _start_ debugging with the notion of a correctly
working compiler...

But then, I've found bugs in compilers and standard-libraries for embedded a
few times already, they are much less battle-tested than your regular
x86_64-linux-gnu-gcc. So at some point, I normally switch into "trust no one"
mode and start reading disassembler outputs in the vicinity of the crash-site
;-)

~~~
mansr
That's when you start finding bugs in the disassembler.

~~~
cnvogel
...and in your hex-editor? ;-)

------
DannyBee
This paper misstates the proper behavior of volatile to start.

In particular, it says "For every read from or write to a volatile variable
that would be performed by a straightforward interpreter for C, exactly one
load from or store to the memory location(s) allocated to the variable must be
performed."

This is wrong. It later kind of gets it right for C, explaining about sequence
points, but it entirely misses that implementations are free to combine and
eliminate multiple volatile accesses within the same sequence point.

Now certainly, most of what was reported were genuine bugs (and John reports a
lot of correctness issues). But it does/did nobody favors to start with an
incorrect definition.

~~~
cnvogel
I think if you replace "straightforward interpreter for C" with the "abstract
state machine" in the standard (I'm looking at ISO/IEC 9899:1999 right now),
at least the first half of the sentence you quote is pretty much what the
standard says:

❝An object that has volatile-qualified type may be modified in ways unknown to
the implementation or have other unknown side effects. Therefore any
expression referring to such an object shall be evaluated strictly according
to the rules of the abstract machine...❞ (§6.7.3)

For combining multiple volatile accesses within the same statement, I think I
cannot find an answer in the standard.
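
One way to sidestep the question in practice is to give each volatile access its own statement, so that a sequence point separates them. The sketch below assumes a made-up register name, `STATUS`, backed by an ordinary global so it compiles on a hosted system.

```c
#include <stdint.h>

/* Stand-in for a device status register (hypothetical name). */
volatile uint32_t STATUS = 0;

/* The contested form would be:
 *     return STATUS + STATUS;
 * Both reads sit inside one expression and are not sequenced relative
 * to each other, so the language does not pin down their order. */

/* Unambiguous form: each read is a separate full expression, so the
   abstract machine performs exactly two loads, in this order. */
uint32_t read_status_twice(void) {
    uint32_t first = STATUS;
    uint32_t second = STATUS;
    return first + second;
}
```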

------
zurn
Related: [https://www.kernel.org/doc/Documentation/volatile-considered...](https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt)

Sounds like "volatile" variables don't really provide good semantics for most
uses even without considering compiler bugs, so it's better to just use
explicit load and store macros or functions.

~~~
acqq
That is the reason the compilers mostly didn't care: the semantics of the
language "volatile" is seldom what is needed "in the real world programs."

~~~
avian
As the paper notes in the introduction, "volatile" is heavily used in embedded
software where synchronization primitives like kernel's spinlocks aren't
readily available.

The "buffer_ready" in the paper is a very good example that I have seen many
times in the real world. If anyone can share the "better solution" that avoids
"volatile" (and works on a bare-bone ARM microcontroller for example), I would
love to see it.

~~~
acqq
I haven't programmed ARM myself and there are different ARMs, but as far as I
understand (inspired by post from cnvogel), on most CPUs, even non-ARM ones,
you need to use intrinsics like these which exist on ARM:

Memory barrier instructions:

[http://infocenter.arm.com/help/topic/com.arm.doc.faqs/ka1404...](http://infocenter.arm.com/help/topic/com.arm.doc.faqs/ka14041.html)

GCC example:

[http://stackoverflow.com/questions/6751605/data-memory-barri...](http://stackoverflow.com/questions/6751605/data-memory-barrier-dmb-in-cmsis-libraries-for-cortex-m3s)

As far as I understand, volatile doesn't give you the needed "barrier"
semantics (what is not allowed to happen before or after, on the deep hardware
level). If the code with "volatile" works, it can be just an accident.

~~~
zAy0LfpBZLC8mAC
I don't know the details for ARM, but generally, microcontroller type CPUs
don't do any reordering, so no need to prevent it with barriers. And if you do
use volatile correctly (and the compiler is not broken), you don't need
compiler barriers either. You also might not need any CPU barriers when the
cache controller is configured to treat certain regions of the address space
as uncacheable (which might be used for MMIO regions).

~~~
mansr
Modern high-performance ARM cores such as the Cortex-A15 do plenty of
reordering. Even when using strongly ordered memory mappings, you still need
barriers to prevent reordering between normal and strongly ordered accesses.
No amount of volatile will help with this.

~~~
zAy0LfpBZLC8mAC
The parent's parent was talking about microcontroller class ARMs, though.
Also, you don't necessarily have to order "normal" with regards to strongly
ordered accesses. If it's only about ordering accesses to MMIO, it doesn't
matter when some arithmetic result gets stored to RAM relative to those
accesses, all that matters is that the hardware sees the register reads and
writes in the right order. Ordering only matters for stuff that is shared in
some way, for private data, the illusion of naive serial execution is
guaranteed anyhow.

~~~
mansr
A common scenario is filling a data buffer in normal memory (cached or non-
cached doesn't matter on ARM) before initiating a DMA operation by writing to
a device register. In this situation, a barrier is required to prevent any of
the normal writes being reordered around the DMA initiation which would then
see stale data in the buffer.
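
A rough sketch of that scenario, under stated assumptions: `dma_buf` and `DMA_START` are hypothetical stand-ins (an ordinary array and an ordinary volatile global) so the code compiles on a hosted system, and the C11 fence is only a portable placeholder. On a real ARM part you would use the platform's own barrier (for example the CMSIS `__DMB()` or `__DSB()` intrinsics), since a C11 thread fence may not be strong enough to order writes as seen by a device.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_BUF_LEN 16

/* Hypothetical stand-ins: a real driver would use a DMA-capable buffer
   and a device register at an address from the datasheet. */
uint8_t dma_buf[DMA_BUF_LEN];
volatile uint32_t DMA_START = 0;

void start_dma(const uint8_t *src, size_t len) {
    if (len > DMA_BUF_LEN) {
        len = DMA_BUF_LEN;              /* clamp to the buffer size */
    }
    for (size_t i = 0; i < len; i++) {
        dma_buf[i] = src[i];            /* normal (non-volatile) writes */
    }
    /* Barrier: keep the buffer writes ahead of the register write.
       Placeholder for the platform's real DMA-visible barrier. */
    atomic_thread_fence(memory_order_seq_cst);
    DMA_START = (uint32_t)len;          /* volatile write that kicks off DMA */
}
```

Note that `volatile` on `DMA_START` alone says nothing about the ordering of the plain writes to `dma_buf`; the fence is what carries that requirement.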

Discussions about barriers (or volatile) are only meaningful in the context of
related accesses, so explicitly mentioning this isn't really necessary.

~~~
zAy0LfpBZLC8mAC
I don't really get what you are trying to say. Sure, there are cases where you
need barriers, both the CPU and the compiler kind, nobody denied that. Still,
in low-end stuff, PIO and cache-free in-order execution still are well and
alive, and volatile can be perfectly sufficient under such circumstances.

~~~
colin_mccabe
_I don't really get what you are trying to say._

volatile is an antipattern. Don't use it. Don't encourage other people to use
it. It is not a thing which should be used, by you. To use it would be wrong,
because not using it is correct.

Use atomic instructions.
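
For the "buffer_ready" pattern discussed upthread, a version using C11 atomics might look like the sketch below. It assumes a toolchain that ships `<stdatomic.h>` for the target, which is not a given on every bare-metal setup; the function names and `BUF_SIZE` are illustrative, not taken from the paper.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_SIZE 8

char buffer[BUF_SIZE];
atomic_bool buffer_ready;   /* static storage: starts out false */

/* Writer: fill the buffer, then publish it. The release store keeps the
   buffer writes from being moved after the flag update. */
void buffer_init(void) {
    for (size_t i = 0; i < BUF_SIZE; i++) {
        buffer[i] = 0;
    }
    atomic_store_explicit(&buffer_ready, true, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
   reader that sees true also sees the initialized buffer. */
bool buffer_is_ready(void) {
    return atomic_load_explicit(&buffer_ready, memory_order_acquire);
}
```

Unlike a plain `volatile int` flag, the release/acquire pair states the ordering requirement between the flag and the non-volatile buffer explicitly.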

------
infogulch
Quote from intro:

"Although the symptoms of this compiler bug—spurious periodic reboots due to
failure to reset the watchdog timer—may be relatively benign, the situation
could be worse, for example, if the hardware register were used to lower
control rods, cancel a missile launch, or open the pod bay doors."

So _that's_ what the problem was with those pod bay doors.

------
rjzzleep
so, is this kind of quickcheck for c code generators?

i'm a little surprised at how much worse clang fared.

~~~
lstamour
Note the article was written in 2008 and clang was first released in 2007.
"LLVM 2.2 was released on February 11, 2008, and LLVM r53339 is a snapshot of
the source code from July 9, 2008." They later in that paragraph describe the
improvements in clang over such a short time.

Also, it seems there are quite a few compiler bugs being found, even today.
This looks like a very productive field of study, though the same could likely
be said for software correctness in general.
[http://blog.regehr.org/archives/1061](http://blog.regehr.org/archives/1061)

This particular paper's software (its modern equivalent) is at
[https://github.com/csmith-project/voltest](https://github.com/csmith-project/voltest)

