
Who's afraid of a big bad optimizing compiler? - signa11
https://lwn.net/SubscriberLink/793253/6ff74ecfb804c410/
======
alltakendamned
Especially for security-critical operations, it is often a good idea to
disassemble the relevant sections of the binary and ensure that things are
handled appropriately. Optimizations can result in security controls not
working as expected.

Edit: More background:

[https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Bewar...](https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations)

[https://people.eecs.berkeley.edu/~dawnsong/papers/The%20Corr...](https://people.eecs.berkeley.edu/~dawnsong/papers/The%20Correctness-Security%20Gap%20in%20Compiler%20Optimization_may%202015.pdf)

~~~
chrisseaton
> Optimizations can result in security controls not working as expected.

Most likely the security controls weren't correct in the first place, which is
pretty bad for a security control.

~~~
zimbatm
Or C doesn't provide the necessary semantics.

For example, trying to zero a page before freeing it (because it contains
sensitive information) is surprisingly hard. Optimizing compilers will
conclude that, since the memory is never read again, the zeroes don't need to
be written at all. Look at all the contortions libsodium goes through to
coerce all the compilers into doing the right thing :)
[https://github.com/jedisct1/libsodium/blob/a26467874a512de30...](https://github.com/jedisct1/libsodium/blob/a26467874a512de304f6e7e25d98af549963de9f/src/libsodium/sodium/utils.c#L97-L142)
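
To make the problem concrete, here is a minimal sketch of one common portable fallback: writing the zeroes through a `volatile` pointer so the stores are not treated as dead. The name `secure_zero` is hypothetical and not taken from libsodium; where the platform offers `explicit_bzero` or C11 Annex K `memset_s`, those are usually the better choice.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a libsodium function): zero `len` bytes at `buf`.
 * Each store goes through a volatile lvalue, which mainstream compilers
 * treat as an observable side effect and therefore do not remove.
 * Assumption: the compiler honors volatile accesses to an object that was
 * not itself declared volatile, which holds in practice for GCC and Clang.
 */
static void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;

    while (len--) {
        *p++ = 0;
    }
}
```

A caller would invoke `secure_zero(key, sizeof key)` just before the buffer goes out of scope or is freed. The byte-at-a-time loop is slower than `memset`, which is part of why real libraries prefer platform primitives when they exist.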

~~~
chrisseaton
Why don't people write things like that in assembly and call that from C?

~~~
nsajko
BoringSSL used to use assembly until they figured out this trick of scaring
the compiler away by just putting an assembly directive in the function:

[https://boringssl.googlesource.com/boringssl/+/refs/heads/ma...](https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/mem.c#156)

[https://boringssl.googlesource.com/boringssl/+/ad1907fe73334...](https://boringssl.googlesource.com/boringssl/+/ad1907fe73334d6c696c8539646c21b11178f20f%5E!/#F0)
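
Roughly, the trick looks like the sketch below. This is an illustration for GCC and Clang only, since it relies on their extended asm syntax; the function name `cleanse_sketch` is made up here and the code is not copied from BoringSSL.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical name; GCC/Clang only because of the extended asm syntax.
 * The asm statement emits no instructions. It takes `ptr` as an input and
 * declares a "memory" clobber, so the compiler has to assume the zeroed
 * bytes may be read by the asm and cannot drop the memset as a dead store.
 */
static void cleanse_sketch(void *ptr, size_t len)
{
    memset(ptr, 0, len);
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
}
```

The appeal is that the fast library `memset` is still used. The guarantee rests on how these compilers currently treat asm statements, not on anything in the C standard.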

~~~
pjmlp
Which is the kind of clever thing that works until it doesn't, common in
many C codebases.

~~~
nsajko
I am pretty sure the code is entirely correct when meant for GCC or Clang.
Standards-noncompliance (because of inline assembly) is another issue, of
course; but one could argue that GCC and Clang _are_ the de facto standard.

~~~
jdsully
The problem would be someone working on the GCC optimizer deciding, “Wouldn’t
it be great if we could do all our fancy optimizations on functions with
inline assembly?”

~~~
nsajko
Search for "volatile" in
[http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Extended-Asm.htm...](http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Extended-Asm.html)

~~~
jdsully
Volatile only says that the clobbered memory is not listed. Nothing would stop
the compiler from statically analyzing the assembly to make smarter decisions
if it chose to do so. This would be trivial with an empty asm block.

However, because inline assembly is non-standard, even that is not truly
guaranteed. You're relying on the good graces of the optimizers that have broken
all the other code paths with their legalistic approach.

~~~
nsajko
> statically analyzing the assembly

I must concede this notion troubles me now that you mention it. It is indeed
possible that GCC or Clang could decide that explicit memset-like stuff should
be done some other way. BoringSSL could deal with it: they are not a stable
API for one thing, and Google employs GCC, LLVM and Clang developers anyway,
so it would be easier for BoringSSL than for the rest of the world to fix the
compilers changing stuff...

------
Gupie
Is the do_something_quickly example actually buggy?

    while (!need_to_stop) /* BUGGY!!! */
        do_something_quickly();

Can the optimiser really unroll the loop and only check the need_to_stop
variable once!?

    
    
    if (!need_to_stop)
        for (;;) {
            do_something_quickly();
            do_something_quickly();
            do_something_quickly();
            do_something_quickly();
            /* ... */
        }
    

Is this the same for C++, or does the new memory model prevent it?

~~~
nybble41
> Can the optimiser really unroll the loop and only check the need_to_stop
> variable once!?

Yes. The compiler can see that need_to_stop is not changed by any of the code
inside the loop, and since it wasn't qualified with "volatile" or using
explicit atomic operations the compiler is free to assume that it won't be
changed asynchronously. That makes need_to_stop a constant for the duration of
the loop, so it _should_ have the same effect (given those assumptions)
whether the variable is checked once or before each call to
do_something_quickly(). Checking once is considerably more efficient.
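
For illustration, here is a sketch of forcing a fresh read on every iteration with a volatile access. `READ_ONCE_SKETCH` is a made-up, simplified stand-in loosely modeled on the Linux kernel's `READ_ONCE()`, and it uses the GCC/Clang `__typeof__` extension. To keep the example single-threaded, the placeholder `do_something_quickly()` sets the flag itself after five calls; in real code some other context would set it.

```c
#include <stdbool.h>

/* Simplified, hypothetical macro: read `x` through a volatile lvalue so the
 * compiler must perform the load each time the expression is evaluated. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

static bool need_to_stop;     /* in a real program, set from elsewhere */
static unsigned long calls;   /* counts loop iterations for the demo */

static void do_something_quickly(void)
{
    /* Placeholder work; requests a stop after 5 calls so the demo ends. */
    if (++calls == 5)
        need_to_stop = true;
}

static void run_loop(void)
{
    /* The flag is re-read before every call instead of being hoisted. */
    while (!READ_ONCE_SKETCH(need_to_stop))
        do_something_quickly();
}
```

A volatile access only stops the compiler from caching or eliding the load. It provides no atomicity or ordering guarantees between threads in standard C.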

~~~
nsajko
Are you sure atomics can help in this case?

EDIT: also, volatile is a bad idea, too, unless you are doing something low-
level like a kernel or hacking on memory mapped IO pins. It does not
synchronize accesses between threads.

~~~
plorkyeran
C11 atomics include a memory barrier, so making need_to_stop atomic would
indeed suffice.
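
A minimal sketch of that approach with `<stdatomic.h>`, assuming default sequentially consistent operations. The names `stop_flag`, `request_stop`, `do_work_once` and `worker_loop` are hypothetical. To stay single-threaded and deterministic, the placeholder work function requests the stop itself after three iterations; normally another thread would call `request_stop()`.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_flag;    /* static storage, so it starts out false */
static unsigned long work_done;  /* counts iterations for the demo */

/* In a real program this would be called from another thread. */
static void request_stop(void)
{
    atomic_store(&stop_flag, true);
}

static void do_work_once(void)
{
    /* Placeholder work; asks for a stop after 3 iterations. */
    if (++work_done == 3)
        request_stop();
}

static void worker_loop(void)
{
    /* atomic_load is an explicit atomic read performed on each iteration. */
    while (!atomic_load(&stop_flag))
        do_work_once();
}
```

If the flag carries no other data with it, `atomic_load_explicit` with `memory_order_relaxed` would likely be enough, but the default ordering is the simpler starting point.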

------
gray_-_wolf
I feel like posting this link here goes against

> The "subscriber link" mechanism allows an LWN.net subscriber to generate a
> special URL for a subscription-only article. That URL can then be given to
> others, who will be able to access the article regardless of whether they
> are subscribed. This feature is made available as a service to LWN
> subscribers, and in the hope that they will use it to spread the word about
> their favorite LWN articles.

> If this feature is abused, it will hurt LWN's subscription revenues and
> defeat the whole point. Subscriber links may go away if that comes about.

but hey...

~~~
Hello71
It is accepted to post occasional links to articles on news sites, just not
to post them constantly. This has been the case for many years:
[https://hn.algolia.com/?query=lwn.net%2Fsubscriberlink](https://hn.algolia.com/?query=lwn.net%2Fsubscriberlink)

------
magicalhippo
The article comes across as a bit confused. As far as I can see, these are all
examples of plain concurrency-unsafe code rather than anything to do with the
optimizer.

Sure, the optimizer is usually great at exposing buggy code that relies on
undefined behavior, but there's nothing concurrency-specific to that.

As such, I think the article would have been better had it been framed from the
point of view that it feels like it wants to take, but doesn't: a beginner's
guide to the things you can rely on in non-concurrent code but not in
concurrent code.

There's more to that topic than what the optimizer does.

