

The first split vote about C compiler random test - gossips
http://blog.regehr.org/archives/558

======
cdcarter
More interesting than the split vote on this test are the arguments in the
comments. What actually is the correct behavior? C99 is better than K&R C, but
as the article says, it's vague enough that it's not a chip-agnostic
assembler.

~~~
copper
According to this explanation, the behaviour isn't completely specified:
<http://blog.regehr.org/archives/558#comment-2698>

------
daemin
I'm not sure about this, but isn't the struct bitfield value uninitialised,
therefore random? Unless in C it's default 0 initialised because it's in the
global scope.

~~~
caf
It's initialised to 0 because it has static storage duration (in this case, it
has static storage duration because it's declared at file scope, but there are
other ways to get static storage duration too).
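A minimal sketch of the two common ways to get static storage duration, assuming a hypothetical `struct flags` with a 1-bit field (the struct and function names are invented for illustration, not taken from the article):

```c
/* Hypothetical struct, named only for this sketch. */
struct flags {
    unsigned int f : 1;
};

/* Declared at file scope: static storage duration, so it starts out
 * zero-initialised without an explicit initialiser. */
struct flags at_file_scope;

/* Returns the bitfield of a block-scope object declared `static`,
 * which also has static storage duration and also starts at zero. */
unsigned int static_local_value(void)
{
    static struct flags inside_function;
    return inside_function.f;
}
```

Both `at_file_scope.f` and `static_local_value()` read as 0. An automatic (plain local) `struct flags` would not get this guarantee.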

~~~
a1k0n
Wouldn't that put it in the bss section and therefore not initialize it (but
most OS's will probably zero out the page for security reasons)?

Either way, as it's a 1-bit field, the value can only be 0 or 1 and is always >
-3.

~~~
prodigal_erik
I think the problem is that -3 < 1 but -3 > 1U because the -3 is "promoted"
(quietly corrupted) to UINT_MAX - 2 ( _edit:_ C99 guarantees this value). So
the question is whether a small unsigned bitfield is handled like a signed int
(which can represent all its possible values) or an unsigned int (which is
more like what you asked for).

It's just astounding how many developer-years have been wasted by the
industry's insistence on producing and accepting wrong answers when commodity
hardware has been maintaining under/overflow flags for literally my entire
career.

~~~
caf
-3 > 1U is guaranteed to be true. Conversion of an out-of-range value to an unsigned type is well-defined in C: it is taken modulo one greater than the maximum value representable in the target type. (unsigned)-3 must always be equal to UINT_MAX - 2, there is no choice for the implementation here.

Conversion of out-of-range values to signed types is, however, implementation-
defined (and also allows an implementation-defined signal to be raised, which
in practice makes it something to avoid in correct programs).
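The unsigned case described above can be checked with a small sketch. The function names are hypothetical; each one returns 1 when the stated relation holds:

```c
#include <limits.h>

/* -3 is converted to unsigned int before the comparison with 1U,
 * so it becomes a very large value and the result is 1. */
int minus_three_exceeds_one_unsigned(void)
{
    return -3 > 1U;
}

/* The conversion is taken modulo UINT_MAX + 1, so (unsigned)-3
 * equals UINT_MAX - 2 whatever the width of unsigned int is. */
int converted_value_is_uint_max_minus_two(void)
{
    return (unsigned int)-3 == UINT_MAX - 2;
}

/* With two signed int operands nothing is converted: -3 < 1. */
int minus_three_below_one_signed(void)
{
    return -3 < 1;
}
```

All three functions return 1. A compiler may warn about the mixed signed/unsigned comparison in the first one, which is the point of the example.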

~~~
prodigal_erik
Thank you, I wasn't aware that C99 now mandates the twos-complement result; I
think it had been implementation-defined in C89 (which is all I ever used
seriously).

[http://stackoverflow.com/questions/50605/signed-to-
unsigned-...](http://stackoverflow.com/questions/50605/signed-to-unsigned-
conversion-in-c-is-it-always-safe)

~~~
caf
C89 has this language in §3.1.2.5 Types:

    
    
      A computation involving unsigned operands can
      never overflow, because a result that cannot be
      represented by the resulting unsigned integer type
      is reduced modulo the number that is one greater 
      than the largest value that can be represented by 
      the resulting unsigned integer type.

------
cschneid
Can somebody link me a quick explanation of what the comma operator does in C?

~~~
caf
The comma operator evaluates its left argument, introduces a sequence point,
then evaluates its right argument. The result has the same type and value as
the right argument. It is useful in cases like:

    
    
        while (update_thing(&foo), foo != 0) {
            /* loop body */
        }
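To make the loop above concrete, here is a sketch with a made-up `update_thing()` that simply counts `foo` down towards zero. Both its behaviour and the `count_iterations()` wrapper are assumptions for illustration only:

```c
/* Hypothetical stand-in for update_thing(): counts *foo down by one,
 * stopping at zero. */
static void update_thing(int *foo)
{
    if (*foo > 0)
        --*foo;
}

/* Runs the comma-operator loop and returns how many times the body ran. */
int count_iterations(int start)
{
    int foo = start;
    int iterations = 0;

    /* update_thing() runs first on every pass; the sequence point
     * guarantees foo is up to date when foo != 0 is tested. */
    while (update_thing(&foo), foo != 0) {
        ++iterations;
    }
    return iterations;
}
```

With this stand-in, `count_iterations(3)` returns 2: the update happens before each test, so the third update brings `foo` to 0 and the loop exits without running the body again.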

~~~
owyn
On second thought, let's not go to Camelot. It is a silly place.

It's useful shorthand in a few specific cases, but I would argue that this
sort of code is bad style and if I was in charge (which I am not) it would be
in the "never do this" section.

That this edge case apparently took decades to find implies (to me) that the
comma operator is not to be used in production code. I'll always choose a few
extra lines of code over ambiguity, because someday other people will have to
read my code when there's some kind of a problem with it, and I'd rather make
the problems obvious, not subtle. :)

~~~
caf
The problem is that if you try to rewrite that example to avoid the comma
operator, you either have to duplicate the update_thing(&foo) call, as in:

    
    
      update_thing(&foo);
      while (foo != 0) {
          /* loop body */
    
          update_thing(&foo);
      }
    

Unnecessary duplication like this is itself a potential source of bugs - it's
all too easy to update one but not the other. The other alternative is to hack
up the loop to exit in a strange place:

    
    
      while (1) {
          update_thing(&foo);
          if (foo == 0)
              break;
    
          /* loop body */
      }
    

This is arguably even worse - the actual loop termination condition is not
where you expect to find it anymore. Personally, I find the formulation using
the comma operator to be completely clear.

Note that the bug referenced here is more about the subtleties of bitfield
type promotion in expressions, and the interplay of bitfields with operators
that evaluate to the type of one of their arguments, than it is about the
comma operator. You can show the same bug using the assignment operator
instead of the comma operator.

If you're going to consign anything involved here to the "never do this" section,
I'd start with any use of bitfields, and maybe also include mixing unsigned
and signed types in expressions without explicit conversion.
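A rough sketch of the three forms being compared, assuming a hypothetical `struct bits` with a 1-bit unsigned field (this is not the exact test case from the article). Only the first function has an expected result stated here; what the other two return is the disputed part, so no output is claimed for them:

```c
/* Hypothetical struct for this sketch: a 1-bit unsigned bitfield. */
struct bits {
    unsigned int f : 1;
};

static struct bits s; /* static storage duration, so s.f starts at 0 */

/* Bitfield used directly: int can represent both 0 and 1, so the
 * integer promotions convert s.f to int and the comparison is signed.
 * Returns 1. */
int direct_comparison(void)
{
    return s.f > -3;
}

/* Same comparison on the result of an assignment expression. Whether
 * that result is treated as a 1-bit field or as unsigned int decides
 * the answer. */
int assignment_comparison(void)
{
    return (s.f = 0) > -3;
}

/* Same comparison on the result of a comma expression. */
int comma_comparison(void)
{
    return (0, s.f) > -3;
}
```

If the wrapped expression is treated as plain `unsigned int`, -3 is converted to `UINT_MAX - 2` and the comparison comes out 0. If it keeps its 1-bit width and promotes to `int`, it comes out 1.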

~~~
tb
I completely agree with everything you've said except for "any use of
bitfields." The comma operator is ok in this circumstance, mixing unsigned and
signed is a big no-no, but what have you got against bitfields? They're very
useful when it comes to pulling sub-byte fields out of network packets.

~~~
nitrogen
_They're very useful when it comes to pulling sub-byte fields out of network
packets._

That is, if you don't mind tying your code to a particular platform and
compiler. The alignment and packing of bitfields is implementation dependent:
[http://stackoverflow.com/questions/1490092/c-c-force-bit-
fie...](http://stackoverflow.com/questions/1490092/c-c-force-bit-field-order-
and-alignment)

For what it's worth, I use bitfields on occasion, but only to save memory and
get better warnings when storing range-limited values. It's nice to have the
compiler warn about a comparison always being true or false due to the limited
range of a type. The performance of bitfields is probably worse than just
tossing all my boolean values into an int32_t or int8_t.
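For the packet-parsing case, one commonly used alternative that does not depend on bitfield layout is explicit shifting and masking. A minimal sketch, using the first byte of an IPv4 header (version in the high four bits, header length in the low four); the function names are invented for illustration:

```c
#include <stdint.h>

/* High four bits of the first IPv4 header byte: the IP version. */
unsigned int ipv4_version(const uint8_t *packet)
{
    return (packet[0] >> 4) & 0x0Fu;
}

/* Low four bits of the same byte: header length in 32-bit words. */
unsigned int ipv4_header_words(const uint8_t *packet)
{
    return packet[0] & 0x0Fu;
}
```

For a first byte of 0x45, `ipv4_version()` gives 4 and `ipv4_header_words()` gives 5, regardless of how the compiler would have ordered or packed a bitfield struct.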

~~~
tb
Aha, thank you. As most of my C code has been written for a single compiler
and a single platform I was less aware of the pitfalls than I should have
been. I will know now to be extra vigilant if I ever have to port that code.

