When the Compiler Bites (nullprogram.com)
115 points by ingve 9 months ago | 39 comments

The unit test issues in my real program, which was a bit more sophisticated than what was presented here, gave me artificial intelligence vibes. It’s that situation where a computer algorithm did something really clever and I felt it outsmarted me.

As someone who has been reading compiler output for many years, it seems like these "flashes of inspiration" are relatively sparse compared to the bulk of plain-stupid code compilers normally produce; once in a while you'll see something and think "that's clever... really clever", followed immediately by something so stupid it makes you wonder how the compiler managed to generate such code. You can easily distinguish compiler output from human-written Asm that way. The latter may not have as much "peak cleverness" but certainly keeps up its average level of efficiency.

Has there ever been a program (not including any so-called "optimised assembler") that accepts handwritten asm on stdin and sends "optimised" asm to stdout?

   optasm <1.asm >2.asm
   diff -U0 1.asm 2.asm|less 
The person writing the handwritten asm can then look at the suggested "optimisations" and can learn from/evaluate them.

Looking at x86 asm source code, I have always been amazed at the relatively small number of commonly used instructions versus the large number of available instructions that are listed in the manuals. As someone learning asm, how would I ever be introduced to apropos usage of these uncommon instructions absent real-world examples?

This was interesting, but I don't get this part of the example code:

    if (w == 0 || h <= SIZE_MAX / w)
To me, that condition is just super-strange: if w is zero, the product w x h is zero too, so the entire allocation becomes pointless. Surely it should be:

    if (w != 0 && h <= SIZE_MAX / w)
or something? Having these kinds of discussions based on code that is confusing in itself just becomes too much, for me.

malloc(0) is implementation-defined: https://port70.net/~nsz/c/c11/n1570.html#7.22.3p1

One might, I suppose, decide that new_image will be exactly equivalent to the system malloc in this respect, in which case malloc(0) should be called when appropriate.

My guess would be that they were thinking h <= SIZE_MAX / w, and added the most obvious logic to avoid a division by zero.

I think overflow occurs when w * h > SIZE_MAX

So you just want to make sure that w > 0 and w * h <= SIZE_MAX. But if the true product is large enough, the multiplication wraps around, so a direct w * h <= SIZE_MAX check would evaluate to true anyway; hence the division by w instead.

I agree that it is confusing though.

pointless, but not wrong: calloc on a zero-width image should still yield a memory pointer, even if it points to a zero-length stretch of memory. It would be strictly wrong to return a null pointer just because (one of) the dimensions happens to be zero.

It might read a little funny, but it's actually quite correct with respect to what the code path should be expected to do. Valid dimensions should yield a valid pointer. Even if they're validly zero =)

>It would be strictly wrong to return a null pointer just because (one of) the dimensions happens to be zero.

Strictly wrong according to whom? At least not any C standard I know of.

Besides, even if the function did return a non-null pointer value (which it is allowed to do; the behaviour is implementation-defined) you aren't allowed to dereference it any more than a null pointer, making it hardly any more "valid", either.

According to code design mostly. It's not a syntax issue (both are fine syntax) but a logic issue. You should be able to trust that as long as your dimensions are valid, any for loop over the resulting list works. And in graphics contexts, zero is typically a perfectly valid value to set one (or both) dimensions to (negative values, however, are not). Of course, you _can_ say "I think 0 is an invalid dimension" but given that plenty of frameworks and applications consider 0 just fine, that's kind of an unnecessary restriction for the sake of writing a conditional that is arguably irrelevantly "easier" to read.

You could argue that a null pointer would be a correct pointer, since you are not allowed to dereference it anyway (there is no data to access).

It will probably break some assertions and assumptions though. Edit: I see that 8xde0wcNwpslOw (what a username) made the same point already.

Why disallow zero-pixel images?

You're right. The way you wrote it is how I would have, too. The author has written it weirdly.

GCC float comparison: since C doesn't have a portable way to denote single precision float constants, it's a good idea to cast them in order to avoid implementation specific differences. See: https://godbolt.org/g/t7UYoi

      float x = 1.3f;
      return x == (float) 1.3f;
This correctly returns 1 on all compilers I cared to check.

Couldn't reproduce the clang calloc removal bug with that code. Tried clang 6.0, -O1, -O3.

Clang did remove calloc when there were no new_image calls where the parameter "shade" could be zero. But that's to be expected, as it is a static function, only accessible from the same compilation unit. Entirely correct behavior.


When it comes to the filed clang bug, I fail to see any bug at all: https://bugs.llvm.org/show_bug.cgi?id=37304

When the calloc return value is not stored (or used), how could the memory be accessed or even freed at all? I think the compiler is right to remove the whole call.

Can you clarify this part:

[...] since C doesn't have a portable way to denote single precision float constants [...]

That sounds like the suffix 'f' which you're also using, how is that not a way to "denote single precision float constants"?

My draft copy of the C11 spec says:

An unsuffixed floating constant has type double. If suffixed by the letter f or F, it has type float. If suffixed by the letter l or L, it has type long double.

But this is not new syntax, I just looked in the most recent document I have.

Do you mean that C doesn't have a way to require IEEE 754 floating-point?

See edit below. C99 allows greater precision than required by the type!

Hmm, you're right.

I think something similar bit me long ago and left me under the impression that the f-suffix for a float is not portable.

Perhaps it's time to read the standard through once again...

Then again, explicit casts generate code that works on all compilers, standards compliant or not.

Edit: C99, "The values of floating operands and of the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby."

So I guess explicit floating point casts are needed when explicit operations are wanted!

What you might have encountered in the past is C/C++ silently promoting your floats to double.

For example, `printf("%f", float_value)` will actually promote float_value to a double. The same happens when calling a function from `math.h` with a float constant.

Regarding calloc, the issue is this:

- Since C++14, SIZE_MAX is the maximum size of any valid object; any object larger is ill-formed.

- calloc(SIZE_MAX, SIZE_MAX) is an attempt at creating an invalid object of size SIZE_MAX^2.

- Thus, calloc(SIZE_MAX, SIZE_MAX) should always return null. Shouldn't it?

Compilers often remove malloc calls even if the allocated object is modified and freed. But they shouldn't change the program's visible semantics.

> Compilers often remove malloc calls even if the allocated object is modified and freed. But they shouldn't change the program's visible semantics.

What visible side effects are there for an allocation with an unused return value pointer (cast to a boolean)? There's no way to reference said memory, so there's no way it can affect the execution of the abstract program.

If the allocation succeeded, there's no way to free the memory in any case. So even ignoring possible concurrent allocations elsewhere, it's useless for checking whether future malloc can succeed. Unless, of course, I missed something obvious.

> When calloc return value is not stored (or used), how could it be accessed or even freed at all? I think compiler is right to remove whole call.

The calloc return value _is_ used, to calculate the return value.

    t->counter = calloc(n, sizeof(t->counter));
I think you are checking the sizeof of `unsigned *` here, not `unsigned`, which is itself a bug.

It strikes me that the System V amd64 ABI does not define such an (apparently) obvious point as the padding of values smaller than a word. Is there a reason why it does not get fixed?

I’m amazed that we still have people being bitten by exact floating-point comparisons in 2018, really. Nobody working with floating-point numbers should be surprised by the behaviour in the post. This article [0] is over 6 years old, and is a follow-up from I don’t know when, talking about this exact subject.

[0] https://randomascii.wordpress.com/2012/02/25/comparing-float...

The first example is about optimizing away multiplication by zero. The second is about GCC treating float literals with greater than float precision, and the third is about dead code elimination.

Not convinced your comment is relevant.

I think the point here is to provide an example showing how different compilers implement certain aspects of source code. There is no implication that you should be able to depend on exact floating-point comparisons in programs.

We avoid depending on exact floating-point comparisons, not because of any inherent indeterminism in that comparison, but because it does not generally yield exactly what our algorithm needs.

I doubt I'm the only person a bit surprised to discover that the same literal float in source code won't always result in the same value in the program...

I think you should re-read the article you linked, as it describes a different situation than the one presented here.

It's a frequent misconception among kind-of-informed programmers that floating-point numbers are always "inaccurate", without a full understanding of why. Your linked article mostly describes floating-point arithmetic where, due to various reasons (order of operations, etc.), it's hard to compare arithmetic results because of precision loss.

In this case though, you have a float assigned a constant (1.3f) that is then compared with 1.3f again. There is no arithmetic done, and in fact, 1.3f has an unambiguous binary representation (it's the binary floating-point number that's closest to 1.3, a decimal number). Doing something along the lines of "1.3f == 1.3f" should always return true. The issue here is just the annoying matter of intermediate 80-bit representation vs 32-bit storage, which isn't an intrinsic issue of floating point, but rather a language/platform oddity. As the article described, this isn't actually an issue under x86-64, since compiled code doesn't use the 80-bit intermediate representation there.

Another similar issue is reading/printing floating-point numbers. It's a tricky thing to do, but there is a lot of research on it (e.g. see "dtoa" by David Gay), such that you can accurately print out floating-point numbers, read them back, and get the same binary representation.

You'd be surprised how many devs nowadays still have no idea about floating point numbers.

What every computer scientist should know about floating-point arithmetic is readily available online.

There are new developers every day.

1st/2nd semester stuff, but requesting a CS degree or equivalent qualification is against the SJW agenda.

So, afaik, the bug is not a bug. It should be, but the language specification says it's not.

To quote Richard Smith on a related issue (malloc(-1) succeeds):

" * malloc is part of the implementation, so LLVM is allowed to optimize it to allocate memory from "somewhere else".

* In this case, LLVM allocates SIZE_MAX bytes of memory in some hypothetical space that exists only in its imagination.

* This allocation succeeds, so the pointer is non-null, and errno is not set.

* However, as a result, malloc need not actually be called at runtime. "

Does this lead to silly results? Yes.

I was once told that optimizations like the dead code elimination Clang does here could be bypassed by making a variable volatile... Wouldn't that help here as well?

    volatile unsigned char *p = malloc(w * h);

Not quite, what you want is

  unsigned char * volatile p = malloc(w * h);
This indicates that p itself is a volatile variable, rather than a pointer to memory that is volatile.

Ah, that was the thing I was after. Thanks.

The volatile modifier on a pointer tells the compiler that reads and writes through it can have side-effects -- it is meant for memory-mapped hardware registers. Because of those (presumed) side-effects, reads and writes through that volatile pointer cannot be optimized away by dead code elimination.

Changing the address such a variable points to will still be optimized away when the pointer isn't used later on.

I tested it. It doesn't help on Clang.

Why do you try to allocate if w is zero? I'm assuming w stands for width.

I'm not the author of the website, if that's what you're thinking.

I love when people fail to understand the specification of the language and blame the compiler.
