Kees is one of the few developers in the world doing the boots-on-the-ground rough-and-tumble soul-withering grunt work of actually making the Linux kernel more secure. It's not glorious work, but it's perhaps the most important work anyone's doing in the kernel these days. The world runs on the Linux kernel, and from a security standpoint, it's really been a mess (see: Dmitry's Syzkaller talk). Too many contributors are drive-by scattershot "my boss told me to upstream this stuff" types who don't really care about the incremental suckage they inject into the kernel. Bravo to Kees for doing what he's doing.
There's a misleading statement in the "undefined behavior" slide. Allow me to nitpick, since this subject is so full of misunderstandings and confusion.
"What are the contents of uninitialized variables?
... whatever was in memory before now!"
This may be true or false - as the slide itself says, "with undefined behavior, anything is possible!".
Besides, the subject of accessing uninitialized variables is more nuanced than "undefined behavior". Among other things, the effects depend on the variable's type ("unsigned char" has no trap representations):
It's also important to note that C++ has different rules on this. For example, the extract_int function in the last link is valid C, but not valid C++ (in C++ you'd use std::memcpy to achieve the same thing in a valid way).
Also note that the ISO C++ working group is trying to reduce the amount of UB, while the ISO C working group doesn't have any ongoing papers in this direction.
There is a genuine, non-pedantic difference between a variable having an arbitrary value and reading it causing undefined behaviour. For example, say you don't care what value a variable has so long as it's even:
#include <stdio.h>

void foo(void) {
    int i;                    /* never initialized */
    i -= i % 2;               /* reads i: undefined behaviour */
    printf("%d %d\n", i, i);
}
The numbers printed could be odd, because the compiler is allowed to do anything it likes. It could even print two different numbers out!
If you still think this is all pedantry that can't happen in practice, here's an example where compilers are known to do strange things when the behaviour is undefined:
int bar(int param) {
    int uninit;
    if (param == 0) {
        uninit = 0;
    }
    baz(param); // Some other function
    return uninit;
}
In this snippet of code, you might think that you're safe so long as you don't look at the return value of bar() when param is non-zero. But in fact the optimiser may conclude that param must be zero (because anything else would be undefined behaviour), so baz() may always be called with a parameter of zero even if bar() was called with a different argument. A problem very similar to this was discussed as a source of possible vulnerabilities in the Linux kernel [1] (although I don't know if any actual vulnerabilities were found).
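For comparison, here is a minimal sketch of the same shape with the undefined read removed. The baz() body, the last_baz_arg variable and the -1 sentinel are all made up for illustration; the only point is that once every path assigns the result, the optimiser no longer has any licence to assume param is zero.

```c
/* Hypothetical stand-in for "some other function": it records its
 * argument so a caller can check what it was really called with. */
static int last_baz_arg;

static void baz(int param)
{
    last_baz_arg = param;
}

/* Same shape as bar(), but the result is assigned on every path
 * before it is read, so no path relies on an uninitialized value. */
static int bar_defined(int param)
{
    int result = -1;    /* assumed sentinel for the param != 0 case */

    if (param == 0) {
        result = 0;
    }
    baz(param);         /* must now see the caller's real argument */
    return result;
}
```

With this version, bar_defined(7) has to pass 7 to baz() and return the sentinel, whatever the optimisation level.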
Yes - snprintf() returns the length of the string that would have been written had the output buffer been large enough, not the number of bytes actually stored. If you accumulate strings into a buffer and use that return value to advance the offset and shrink the remaining size, the offset can run past the end of the buffer and bad things will happen.
scnprintf() returns the number of bytes actually written (not counting the terminating NUL).
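To make the difference concrete, here is a rough userspace sketch. scnprintf() itself is a kernel function, so my_scnprintf() below is a hypothetical stand-in built on vsnprintf(), and append_twice() is just an illustrative helper; treating a negative vsnprintf() result as "nothing written" is my assumption, not something taken from the kernel source.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's scnprintf(): returns
 * the number of characters actually stored in buf, not counting the NUL. */
static int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;                 /* no room, nothing written */

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;                 /* assumption: error means nothing written */
    if ((size_t)n >= size)
        return (int)(size - 1);   /* output was clipped to the buffer */
    return n;
}

/* Appends a and then b to buf, advancing the offset by what was really
 * stored each time, so buf + off never points past the end of buf. */
static size_t append_twice(char *buf, size_t size, const char *a, const char *b)
{
    size_t off = 0;

    off += (size_t)my_scnprintf(buf + off, size - off, "%s", a);
    off += (size_t)my_scnprintf(buf + off, size - off, "%s", b);
    return off;
}
```

With an 8-byte buffer, appending "hello" and "world" this way leaves the offset at 7. Using the raw snprintf() return value instead would advance it to 10, and size - off would then wrap around to a huge size_t on the next call.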