

Checking PVS-Studio with Clang - AndreyKarpov
http://www.viva64.com/en/b/0270/

======
ridiculous_fish
The author's attitude towards UB is wrongheaded and dangerous. He argues:

> The Get() function can initialize the variables A and B. Whether or not it
> has done so is marked in the variables getA and getB. Regardless of whether
> or not the variables A and B are initialized, their values are copied into
> TmpA and TmpB correspondingly....Formally, as far as I understand, undefined
> behavior occurs. In practice, however, just some garbage will be copied.

This "in practice" is assuming a naive compiler that dutifully outputs a copy
instruction. But the reality is compilers _exploit_ undefined behavior for
optimization opportunities.

A likely outcome here is that the compiler will see UB if getA is false, and
therefore optimize getA to just always be true. Not what the author intended
at all!

Chris Lattner explains it beautifully: [http://blog.llvm.org/2011/05/what-every-c-programmer-should-...](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)

~~~
to3m
I post this guy's somewhat-related rant in response to any post I see in
support of the modern approach to undefined behaviour:
[http://robertoconcerto.blogspot.co.uk/2010/10/strict-aliasin...](http://robertoconcerto.blogspot.co.uk/2010/10/strict-aliasing.html)

(So, no matter which side of this argument you prefer, don't worry: you are
not alone.)

~~~
userbinator
This is another one on the same side: [http://blog.metaobject.com/2014/04/cc-osmartass.html](http://blog.metaobject.com/2014/04/cc-osmartass.html)

I should point out the C standard explicitly says that "Possible undefined
behavior ranges from ignoring the situation completely with unpredictable
results, to _behaving during translation or program execution in a documented
manner characteristic of the environment_ (with or without the issuance of a
diagnostic message), to terminating a translation or execution (with the
issuance of a diagnostic message)." The second choice is what programmers
expect.

UB should not be taken literally as "you can do anything". To take this to an
extreme, a compiler that deletes any source files that contain UB could be
compliant, but would also be one that no one ever wants.

Much of the utility of C comes from its use as a "portable high-level
assembler", meaning that its constructs are supposed to intuitively map almost
directly to machine instructions. Aggressively optimising compilers that
exploit UB violate this, so they are not in the spirit of the language: UB is
either unintentional (a bug), in which case the compiler should issue a
warning; or it is intentional, in which case the compiler should translate it
with the intention the programmer had in mind. Making assumptions that "UB
will never occur" and using that for optimisation is "wrongheaded and
dangerous"; C's reputation for performance and efficiency comes from its
simple low-level nature, not from overly aggressive compiler optimisation. Any
optimising by the compiler should be at the instruction-selection level, and
higher-level optimisation done with assumptions that are "characteristic of
the environment".

It's also worth mentioning that the "UB = optimisation" advocates seem to be
mostly academics with an intense interest in replacing C with "safer"
languages... and there's no better way to further their own agenda than to
"legally" disparage C and make it a worse language than it should be.

~~~
MaulingMonkey
_" UB should not be taken literally as "you can do anything". To take this to
an extreme, a compiler that deletes any source files that contain UB could be
compliant, but would also be one that no one ever wants."_

While I would agree it should not be taken as "you _should_ do anything", I
disagree with the notion that it should not be taken as "you _can_ do
anything".

For the situations I use C and C++ in:

1) I'm simply not willing to pay the price to catch all potentially dangerous
UB at build time. My build times are bad enough as is without forcing a full
static analysis pass during every iteration, and those still have holes and
false positives. The laws of physics prevent us from paying for some currently
unimplemented O(N^crazy) checks that would certainly catch some bugs... and
O(N^3) is bad enough, from what I hear.

2) I'm simply not willing to pay the price to mitigate all potentially
dangerous UB at run time. I already spend more time optimizing than I'm happy
with - making the compiler do less means I'll have to do more, which has an
opportunity cost: I'll have less time to hunt down non-UB dangerous code. As
such, I'm happy to have ASLR and NX bits, but I won't be turning off my
optimizer nor enabling the more paranoid compiler-generated bounds checking.

If this means a malicious attacker manages "rm -f /home/user/project/all-my-
undefined-code.c" through a buffer overflow rather than unsanitized input
strings, so be it.

 _" It's also worth mentioning that the "UB = optimisation" advocates seem to
be mostly academics with an intense interest in replacing C with "safer"
languages... and there's no better way to further their own agenda than to
"legally" disparage C and make it a worse language than it should be."_

I have an interest in replacing C with "safer" languages - specifically the
kind that can better do safety and speed simultaneously. I feel that neither C
nor C++ can be twisted much further towards those goals efficiently, without
fundamentally redesigning those languages to the point where they're... well,
no longer C nor C++.

~~~
mpweiher
"I'm simply not willing to pay the price to catch all potentially dangerous UB
at build time. "

Huh? The point is the opposite: that UB is apparently explicitly (or maybe
implicitly) recognized and _exploited_ for nonsensical "optimizations". So
it's not that the compiler should do extra work to catch dangerous UB. It's
that it shouldn't do this extra work to then do unexpected stupid stuff.
Either not doing the extra work at all or doing something sensible would be
OK.

~~~
MaulingMonkey
_" that UB is apparently explicitly (or maybe implicitly) recognized"_

 _Potential_ UB is identified, but that's absolutely not the same as
identifying _certain_ or even _likely_ UB.

For the cases where it's certain or likely as determinable by the compiler:
Yes, please turn these into errors and warnings. As I've said, "I would agree
it should not be taken as "you should do anything"". But that's the only place
I address _certain_ UB; the majority of my post talks about _potential_ UB in
general.

I want the optimizer to have free rein to let "anything" happen, by ignoring
scenarios which would cause UB, if those scenarios are sufficiently unlikely.
Even if that means occasionally backfiring as "unexpected stupid stuff" when
statistics catch up to me. I suspect I have a higher tolerance for weirdness
in the name of performance than the topic article's author.

The majority of my post (including what you quote) then centers on justifying
this higher tolerance.

------
wolfgke
According to [http://stackoverflow.com/questions/6793262/why-dereferencing...](http://stackoverflow.com/questions/6793262/why-dereferencing-a-null-pointer-is-undefined-behaviour), dereferencing a null pointer is undefined behaviour. Thus the example in the unit test where this occurs is IMHO a serious bug.

~~~
tokenrove
Note that in this case, what's actually happening is that a null _reference_
is being created. While this is very ugly (C++ references should never be
null), if the reference is never used, as the author asserts, no null
dereference ever happens (otherwise there would almost inevitably be a crash).
Behind the scenes it's just a null pointer being passed to a function that
apparently ignores it.

~~~
sharth
But either way, the code is not well-formed. A null reference should never
exist in a well-formed program.

[http://stackoverflow.com/questions/4364536/c-null-reference](http://stackoverflow.com/questions/4364536/c-null-reference)

