
Both true and false: a Zen moment with C - Niten
http://markshroyer.com/2012/06/c-both-true-and-false/
======
haberman
It's crazy to me that we as C programmers are flying blind about when we
invoke undefined behavior. There is no blinking red light that warns us that
something needs to be fixed, only strange and unexplainable behavior, or in
some cases things just work fine because the undefined behavior happens to
follow our expectations on some combination of platform and compiler.

I keep wishing that there was a Valgrind-like tool that could detect and
report undefined behavior. It would have to be a dynamic, runtime tool, since
many/most cases of undefined behavior cannot be robustly detected statically. But
it would need to have more information at runtime about the original C program
than Valgrind has; the assembly alone does not contain enough information to
know if the source C program is invoking undefined behavior.

I feel certain that if this tool existed, we would find scores of undefined
behavior in all but the most conscientious C programs.

~~~
Argorak
Clang has -fcatch-undefined-behavior, "Turn on runtime code generation to
check for undefined behavior."
(<http://clang.llvm.org/docs/UsersManual.html>)

Afaik, GCC has some flags for specific types of undefined behavior.

~~~
haberman
That flag catches four specific instances of undefined behavior, a tiny speck
in a sea of all possible ways that undefined behavior can be invoked.

~~~
gregholmberg
If you are building your project with Clang, and find that your code causes
undefined behavior not caught by any existing in-compiler test, and you can
reproduce it, you might want to ...

\- sign up for the developers' list at
<http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev> ,

\- skim over <http://clang.llvm.org/hacking.html> ,

\- build from 'trunk' by carefully following a few simple steps in
<http://clang.llvm.org/get_started.html> ,

\- wade in and reproduce your bug ,

... and then share your findings.

Together we can build better tools. Anyone with patience and interest can
help.

~~~
haberman
I think we may have our wires crossed. I'm concerned about undefined behavior
in my own programs, not undefined behavior in Clang itself. Someone mentioned
Clang as a tool that can help detect UB in other programs, but I mentioned
that these checks are far from exhaustive.

~~~
gregholmberg
Sorry, the fault is mine alone. I used an ambiguous description above, now
edited.

The only point I wanted to make was that a collective effort might help cut
down on ambiguity in expression that the compiler does not identify. I think
we could use "more warnings".

------
Suncho
It's so much fun to explore why certain instances of undefined behavior do
what they do. I never thought of this as being a possible effect of using
uninitialized variables.

Logically, you're doing two things. You're negating the bool and then you're
testing that negated value. This is what's expressed in the code generated
from the non-optimized version. In the optimized version, the compiler has
optimized this into a single jne instruction. I can understand why they
wouldn't want to do it for non-optimized builds.

It sounds like gcc always uses xor for negating bools. If all you're doing is
negating a bool, it's just a single instruction that involves no branching. It
could be the fastest way.

I wonder what the fastest way to negate (boolean-wise) a value other than a
bool would be. Maybe you just do a test and then set the value based on a xor
of the zero flag.

~~~
pbsd
> I wonder what the fastest way to negate (boolean-wise) a value other than a
> bool would be. Maybe you just do a test and then set the value based on a
> xor of the zero flag.

Probably "NOT reg". Exists in most CPU architectures, and is generally smaller
(in variable-size instruction sets) than "XOR reg, immed".

~~~
drv
NOT, at least in x86, is bitwise, so it won't do the right thing for values
other than 0 and ~0 (-1).

~~~
pbsd
Oh, I misunderstood. I thought the point was to make it bitwise. In that
case, it is very architecture-dependent. In x86, you can use TEST coupled with
SETcc or CMOV for microarchitectures where branch mispredictions are very
costly (P4 comes to mind). If not, and the result is not random, you're better
off with a branch.

Another trick is to use the carry flag, i.e., NEG reg; MOV reg, 0; ADC reg, 0.
This works for all 386-compatible processors and is reasonably fast.
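
To make the distinction in this subthread concrete, here is a minimal C sketch comparing logical negation with the two bitwise candidates that were mentioned (NOT and XOR with 1). The function names are made up for illustration, and how a compiler lowers `x == 0` to machine code is up to the compiler; nothing here claims a particular instruction sequence.

```c
#include <assert.h>

/* Logical negation: 1 when x is zero, 0 for every other value. */
static int logical_not(int x)
{
    return x == 0;
}

/* Bitwise NOT: flips every bit of x. */
static int bitwise_not(int x)
{
    return ~x;
}

/* XOR with 1: flips only the lowest bit of x. */
static int xor_one(int x)
{
    return x ^ 1;
}

/* Hypothetical self-check showing where the three operations disagree. */
static void demo_negation(void)
{
    /* For 0 and 1, XOR with 1 agrees with logical negation. */
    assert(logical_not(0) == 1 && xor_one(0) == 1);
    assert(logical_not(1) == 0 && xor_one(1) == 0);

    /* For any other value, only logical_not yields a "false" result. */
    assert(logical_not(5) == 0);
    assert(xor_one(5) == 4);        /* still nonzero, so still "true" */
    assert(bitwise_not(5) != 0);    /* still nonzero, so still "true" */

    /* Bitwise NOT only maps to zero from the all-ones pattern. */
    assert(bitwise_not(~0) == 0);
}
```

Calling `demo_negation()` from a `main` runs the checks; all of them should pass on an ordinary two's-complement platform.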

------
moondowner
Unrelated to the discussed topic in the post, but it reminded me of The
Codeless Code, an absolute gem. <http://thecodelesscode.com/contents>

~~~
ctdonath
Therein, quite related: <http://thecodelesscode.com/case/20>

~~~
akkartik
_“The sage and the fool go to their graves alike in this respect: both believe
the sage to be a fool. Where, then, may wisdom be found?”_

------
bediger4000
Super cool, but I was hoping for a "bluff combinator" like C value. See: "How
to circumvent church numerals", Mayer Goldberg and Mads Torgersen, Nordic
Journal of Computing, Volume: 9, Issue: 1, 2002. Sometimes I can find a PDF,
but right now, I can't.

~~~
gwillen
Assuming the 'bluff combinator' is like what I think it is, you can do similar
things in C++ by creating classes with overloaded operators. E.g. overload
operator (bool) to return true, and overload operator ! to also return true.

------
defrost
Interestingly, . . . it's a pointless exercise delving into ASM to "see what
happens" when you have undefined behaviour in C code. A slight flag change and
the only code produced could equally well just be a single NOP.

~~~
simias
Still, gcc is a pretty serious piece of software (no more starting nethack
these days), so while the behaviour is undefined it's still interesting to see
why the compiler ended up generating code that behaves so weirdly.

If you need to find a use for it, it might make it easier to spot undefined
behaviour later and find its cause ("hey, this evaluates both to true and
false, it must be an uninitialized bool!").

So it's still interesting and might not be completely pointless.

~~~
defrost
LLVM is a more interesting piece of software; gcc is kind of crusty and warty.

Still, I'm glad you're interested in the language, if you're interested in
undefined behaviour then you should carefully and closely read the latest free
draft standard (N1570) and perhaps implement a small C compiler of your own (
Appel : Modern Compiler Implementation ).

There are many C compilers, they are free to do whatever they want when
behaviour is undefined and in many cases you'll unfortunately not see anything
odd at all, not until you port your code, change your flags or upgrade the
compiler and then you'll have a seething mass of "well this stuff worked
when we tested it" errors.

Strapping yourself into a vehicle and driving off a cliff teaches you a little
about the dangers of accidents and gives you a reason to drive safely, it
doesn't particularly teach much about the practice of safe driving.

The example code drove off the road, what happened next really isn't all that
exciting.

<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf>

<http://www.cs.princeton.edu/~appel/modern/c/>

~~~
dllthomas
Many serious pieces of software are crusty and warty.

------
JoeAltmaier
I think the compiler is just wrong. In C, zero is false and non-zero is true.
Testing just the bottom bit was wrong.

Wikipedia, Boolean in C:

    if, while, for, etc., treat any non-zero value as true

~~~
eridius
You're misreading what's going on. The if test is indeed testing for a non-
zero value. The "problem" is the negation only flipped the LSB in the
register, thus leaving behind a non-zero value that passed the second if test.
This is perfectly legitimate, since the value is undefined, and if it wasn't
the compiler would have already ensured it only contained 0 or 1.

~~~
JoeAltmaier
I read all that. If false is zero and true is anything else, it's incorrect
behavior to assume anything about the value - a test for zero or non-zero is
the only legitimate test. I may have passed in 55 as a value for 'true' since
any non-zero value is supposed to be true (not false anyway).

~~~
eridius
You still don't understand.

The if test is correct. It's using non-zero as "truth". That's what you're
talking about, and there's absolutely no question that it's doing the right
thing here.

The negation is not using non-zero for truth, because it's a negation of a
bool, and bools are restricted to only being 0 and 1. However since this
particular bool is undefined, the in-memory representation happens to not meet
that restriction. But _that doesn't matter_. The value is undefined. The
negation could, quite legally, cause demons to fly out your nose. The fact
that you've observed, prior to the negation, that the value appears to be true
has no meaning. Undefined values remain undefined even after observation and
after manipulation.
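
A well-defined way to see the mechanism described here is to model it with an ordinary byte instead of an actual uninitialized bool (reading one would itself be undefined). The sketch below is an assumption-laden illustration: the garbage value 2 and the helper names are hypothetical, and the XOR-with-1 negation is taken from the discussion of gcc's output in this thread, not from any guarantee about what a compiler emits.

```c
#include <assert.h>

/* Negation as it is valid for a real bool: flip the lowest bit.
   Correct only when the byte holds 0 or 1. */
static unsigned char negate_as_bool(unsigned char raw)
{
    return (unsigned char)(raw ^ 1u);
}

/* The if test: any nonzero byte counts as true. */
static int test_nonzero(unsigned char raw)
{
    return raw != 0;
}

/* Hypothetical self-check reproducing the "both true and false" effect. */
static void demo_both_true_and_false(void)
{
    unsigned char p = 2;  /* stand-in for stack garbage in the bool's byte */
    unsigned char not_p = negate_as_bool(p);  /* 2 ^ 1 == 3 */

    assert(test_nonzero(p));      /* the "p is true" branch is taken */
    assert(test_nonzero(not_p));  /* the "p is false" branch is taken too */

    /* With a valid representation the contradiction disappears. */
    assert(test_nonzero(negate_as_bool(1)) == 0);
    assert(test_nonzero(negate_as_bool(0)) == 1);
}
```

The if test is right and the negation is right; the only thing wrong is the byte that was fed in, which matches the point that the value was never a valid bool to begin with.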

~~~
JoeAltmaier
Ah. So it's the bool type itself that's broken. If an int type was used to
store logical operations, the contradictory result is impossible (both true
and not true).

~~~
lmm
Nothing is broken. The compiler would be within its rights to do the same
thing to an int (or more realistically a char or similar type that's shorter
than the registers it's stored in).

~~~
JoeAltmaier
Yes, sure, but why design a language that way? Who is it helping? No language
implementer WOULD do that to an int. My development group avoids bool like the
plague, using int instead - it has well-defined size, behavior and sensible
warnings.

~~~
lmm
>Yes, sure, but why design a language that way? Who is it helping?

Compiler writers. C exists mostly to be a portable language that's easy to
compile. A better question is why would you write programs in C in 2012.

>No language implementer WOULD do that to an int.

It certainly happened to float/double on some older architectures where the
internal representation included different flag bits. I wouldn't be at all
surprised if it happened to char and short. I guess there's an argument for
using int for all variables in your program (since memory isn't usually
constrained enough nowadays for it to be worth using short etc.), but again,
if you weren't memory-constrained why would you be using C?

>My development group avoids bool like the plague, using int instead - it has
well-defined size, behavior and sensible warnings

Int might not suffer from this particular behaviour, but if you use an
uninitialized variable it _will_ bite you sooner or later. The article's
takeaway isn't "avoid bool", it's "initialize your variables"

~~~
JoeAltmaier
Maybe you misapprehend the bool issue? It isn't that bool doesn't match a
register; it's that the value is stored in memory in a subset of the storage
unit i.e. 1 bit out of the byte. That's not true of any other scalar.

~~~
lmm
Suppose you have a char followed by a long in a struct; the compiler will
insert seven bytes of padding after the char. There's nothing to prevent it
setting the padding to 0 when initializing the char, and using a 64-bit load
instruction to load it into a register - in which case you'd get exactly the
same behaviour as seen here with bool, only even more confusing.
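
The padding in that struct can be inspected directly. This is a small sketch with a hypothetical struct name; the "seven bytes" figure assumes a platform where long is 8 bytes with 8-byte alignment, so the code computes the gap instead of hard-coding it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct matching the description: a char followed by a long. */
struct char_then_long {
    char c;
    long l;
};

/* Number of padding bytes the compiler placed between c and l. */
static size_t padding_after_char(void)
{
    return offsetof(struct char_then_long, l) - sizeof(char);
}

/* Hypothetical self-check of the layout properties the language guarantees. */
static void demo_padding(void)
{
    /* l must sit at an offset that satisfies long's alignment. */
    assert(offsetof(struct char_then_long, l) % _Alignof(long) == 0);

    /* The struct is at least as large as its two members combined. */
    assert(sizeof(struct char_then_long) >= sizeof(char) + sizeof(long));
}
```

On a typical 64-bit Linux or macOS build `padding_after_char()` returns 7, but other ABIs give other values (3 is common where long is 4 bytes). The contents of those padding bytes are unspecified, which is what leaves the compiler the freedom described above.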

------
AshleysBrain
Heh, interesting. So I guess it's also possible that "b && !b" evaluates to
true... which is probably more ammo for C detractors.

~~~
simias
That one does not work with g++ and actually does not produce a warning with
-Wall without optimisations.

It makes me wonder if that's actually undefined behaviour in C++, because
unlike the example in the fine article this time the behaviour of the program
does not depend on the (undefined) value of b. Or maybe it's just g++ being
clever.

~~~
JamesLeonis
There is plenty of undefined behavior in C++ [1]. It is important to
understand that these aren't bugs in a particular compiler, but deliberate
definitions in the standard itself to give compiler writers leeway in their
implementations. Amusingly, that means "undefined behavior" is explicitly
defined in the standard.

[1] <http://stackoverflow.com/a/367662/712512>

------
reacweb
good old C compilers did not have the bool keyword and did not have this silly
undefined behaviour. Standardisation committees should understand that the
quality of a standard is inversely proportional to its length.

~~~
pmjordan
I'm guessing you missed the MySQL security vulnerability the other week? That
was caused precisely by the lack of a proper boolean in older versions of C
and an innocuous-looking workaround.
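
The comment doesn't spell out the workaround, so the following is only a sketch of the general pitfall, not the MySQL code: a pre-C99 style boolean built as a typedef of a char type silently truncates an int, while C99's bool normalizes any nonzero value to 1. The typedef and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical pre-C99 style boolean: just a narrow integer type. */
typedef unsigned char legacy_bool;

/* Conversion to the typedef keeps only the low-order bits of status. */
static legacy_bool to_legacy_bool(int status)
{
    return (legacy_bool)status;
}

/* Conversion to C99 bool yields 1 for any nonzero status. */
static bool to_c99_bool(int status)
{
    return (bool)status;
}

/* Hypothetical self-check showing the two conversions disagree. */
static void demo_truncation(void)
{
    /* A nonzero status whose low-order byte happens to be all zero bits. */
    int status = UCHAR_MAX + 1;

    assert(status != 0);
    assert(to_legacy_bool(status) == 0);    /* nonzero became "false" */
    assert(to_c99_bool(status) == true);    /* nonzero stays "true" */
}
```

Any code that stores a comparison result such as "nonzero means mismatch" in `legacy_bool` can therefore report a match when the int result happens to be a multiple of `UCHAR_MAX + 1`.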

------
moron
It's not really Zen until "p" evaluates to "mu".

~~~
astrodust
That's essentially what undefined behaviour is.

------
excuse-me
There was a bare unlabelled power line wire hanging from a pole and I touched
it and was fine, then another friend touched it another day and was electrocuted.

How come the voltage on an unknown wire is just allowed to change like that -
what are the laws of physics for!

