
Memcpy (and friends) with NULL pointers - runesoerensen
https://www.imperialviolet.org/2016/06/26/nonnull.html
======
nkurz
Here's another example of this fine feature:

    
    
      #include <stdio.h>
      #include <string.h>
      #include <stdlib.h>
      #define LENGTH 128
    
      int main(int argc, char **argv) {
          char *string = NULL;
          int length = 0;
          if (argc > 1) {
              string = argv[1];
              length = strlen(string);
              if (length >= LENGTH) exit(1);
          }
    
          char buffer[LENGTH];
          memcpy(buffer, string, length);
          buffer[length] = 0;
    
          if (string == NULL) {
              printf("String is null, so cancel the launch.\n");
          } else {
              printf("String is not null, so launch the missiles!\n");
          }
    
          printf("string: %s\n", string);  // undefined for null but works in practice
    
          #if SEGFAULT_ON_NULL
          printf("%s\n", string);          // segfaults on null when bare "%s\n"
          #endif
    
          return 0;
      }
    
      nate@skylake:~/src$ clang-3.8 -Wall -O3 null_check.c -o null_check
      nate@skylake:~/src$ null_check
      String is null, so cancel the launch.
      string: (null)
    
      nate@skylake:~/src$ icc-17 -Wall -O3 null_check.c -o null_check
      nate@skylake:~/src$ null_check
      String is null, so cancel the launch.
      string: (null)
    
      nate@skylake:~/src$ gcc-5 -Wall -O3 null_check.c -o null_check
      nate@skylake:~/src$ null_check
      String is not null, so launch the missiles!
      string: (null)
    

It appears that Intel's ICC and Clang still haven't caught up with GCC's
optimizations. Ouch if you were depending on that optimization to get the
performance you need!

But before picking on GCC too much, consider that all three of those compilers
segfault on _printf( "string: "); printf("%s\n", string)_ when string is NULL,
despite having no problem with _printf( "string: %s\n", string)_ as a single
statement. Can you see why using two separate statements would cause a
segfault? If not, see here for a hint:
[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25609](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25609)
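
For the memcpy() half of the example, one defensive pattern is to never reach memcpy() at all when the length is zero, so the compiler has no call from which to infer that the pointer is non-null. A minimal sketch, assuming a helper named copy_bytes() (hypothetical, not something from the linked article or any library):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies n bytes from src to dst, but skips the
   memcpy() call entirely when n == 0, so a NULL src or dst with a zero
   length is never handed to memcpy(). */
void *copy_bytes(void *dst, const void *src, size_t n) {
    if (n == 0) {
        return dst;                      /* nothing to copy */
    }
    assert(dst != NULL && src != NULL);  /* a non-empty copy needs real pointers */
    return memcpy(dst, src, n);
}
```

Swapping this in for the memcpy() call in the program above should leave the later "string == NULL" test intact, since on the NULL path no function with non-null arguments has been called on "string".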

------
Ace17
Very interesting. In the same vein, I once stumbled upon the following
optimizer behaviour:

    int array[8];

    // later
    array[i]

The optimizer deduces from the access "array[i]" that "i >= 0 && i < 8"
(leading it to optimize away conditions like "i > 10").

Of course, if optimization leads to semantic differences, this means the
program being compiled was wrong/ambiguous in the first place. However, now,
to debug your programs, you must know how optimizers work.
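
A sketch of the ordering that avoids this: do the range check before the access, so the check is what makes the access valid rather than something the access lets the optimizer discard. The names SLOTS and read_slot() are made up for illustration.

```c
#define SLOTS 8

/* Hypothetical helper: stores array[i] in *out and returns 1 when i is in
   range; returns 0 and leaves *out untouched otherwise. The bounds check
   comes before the access, so it cannot be deduced away from it. */
int read_slot(const int array[SLOTS], int i, int *out) {
    if (i < 0 || i >= SLOTS) {
        return 0;
    }
    *out = array[i];
    return 1;
}
```

If the access came first and the "i >= SLOTS" test second, the compiler would be entitled to assume the test is always false.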

~~~
krylon
A couple of years back, I worked at a company where I helped maintain a
Win32 application written in C, using (Open)Watcom as the compiler.

A few years before I started working there, the other programmers had been
bitten by a bug in Watcom's optimizer that _sometimes_ would wrongly optimize
away comparisons involving floating point numbers.

Consequently, all code was compiled with optimizations disabled completely. (I
tried to debate, but then I compiled the project with most optimizations
enabled and did a benchmark - the difference in performance was minimal, so I
stopped bothering).

------
ninjabeans
Can't the compiler warn about this?

~~~
nkurz
It's usually argued that it would be too hard for the compiler to avoid false
positives from templates and macro expansion. I don't like this argument,
since distinguishing between "generated" code and "explicit" code isn't that
hard. Also, the warning mechanism doesn't need to be perfect to be beneficial:
it's generally better to catch some of the security flaws than to catch none.
The one tool I've found that does catch many of these errors is "stack":
[https://github.com/xiw/stack](https://github.com/xiw/stack)

It operates by identifying "unstable" code. Essentially, it uses Clang to
optimize the code twice, once with optimizations on and once with
optimizations off. Then it checks to see if any "basic blocks" have been
removed. Its main problem is that it's difficult to set up. You have to
locally compile a specific older version of LLVM with a particular set of
compile flags.

But once you have it up, it catches a lot of non-obvious errors and doesn't
have many false positives. Unfortunately, in the sample code I gave above,
"stack" doesn't detect any errors, because Clang doesn't optimize out
the "if (string)" statement like GCC does. And it doesn't catch the switch
from printf() to puts() because it's a change of which function is called
rather than a change in control flow.
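
Since nothing warns about the printf() to puts() switch, the only approach I'd rely on is never passing a possibly-null pointer to "%s" in the first place. A minimal sketch, assuming two helpers named printable() and print_line() (both hypothetical):

```c
#include <stdio.h>

/* Hypothetical helper: returns s itself, or a literal placeholder when s
   is NULL, so "%s" never receives a null pointer. */
const char *printable(const char *s) {
    return s != NULL ? s : "(null)";
}

/* Prints s followed by a newline. Even if the compiler rewrites this
   printf() as puts(), the argument is guaranteed to be non-null. */
void print_line(const char *s) {
    printf("%s\n", printable(s));
}
```

This makes the "(null)" output explicit instead of depending on what a particular libc's printf() happens to do.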

------
ChoHag
If the compiler is optimising code away, it's optimising away its _copy_ of
code that the diligent programmer has already written, i.e. the implied
assertion that the check is not performed at all is not true. The check is not
performed _twice_.

Of course programmers are not always diligent but perhaps in that case a
language which unconditionally demands diligence is not the right choice.

> It also adds a very subtle, exceptional case to several very common
> functions, burdening programmers.

Dealing with the burden of subtle and/or exceptional cases is the price of
using a low-level language. If the price is unacceptable, don't buy the
product.

