
That's because the optimizer symbolically executes the program according to the C memory model, not the x86 memory model

...and I think that is a huge problem, because it strays from the spirit of the language in that it's supposed to be "closer to the hardware" than other HLLs.

Edit: care to give a counterargument?




Optimizers follow what the C specification says. When the spec says "this can be reordered", they can reorder.

There are practical reasons why it's not "closer to the hardware":

• it would be harder to have a single compiler front-end for multiple CPU back-ends.

• what programmers have in mind for "closer to the hardware" is actually quite fuzzy, based roughly on how they imagine a naive C compiler would generate machine code. That interpretation doesn't have a spec. The things that seem obvious aren't so obvious in detail, or lead to unexpected performance cliffs (e.g. did you know that non-byte array indexing on 64-bit machines would be slow if signed int were defined to "simply" wrap on overflow? See the sketch below.)
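To make that last point concrete, here's a rough sketch of my own (the function name is made up, not from the thread):

    /* Hypothetical sketch: why undefined signed overflow matters
       for array indexing on 64-bit targets. */
    long sum(const long *a, int n) {
        long s = 0;
        /* Because signed overflow is undefined, the compiler may
           assume i never wraps from INT_MAX back to INT_MIN, so it
           can keep i in a 64-bit register and fold a[i] into a single
           scaled addressing mode. If int were defined to wrap, the
           compiler would have to re-truncate i to 32 bits on every
           iteration before forming the address. */
        for (int i = 0; i < n; i++)
            s += a[i];
        return s;
    }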


I'm not the one who downvoted you, but the counterargument is that such a target-specific approach prevents any reasonable effort to consistently define the semantics of a programming language. "On x86_amd_this_and_that the program means this, on arm_64_some_version it means that, and on x86_intel_something it means something else again." That would not just be hell, it would be unworkable.

In other words, any approach other than an abstract machine model is doomed. Ideally this abstract machine is rather close to real world machines, but it can't be identical to all of them, nor is it reasonable to start special casing.
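To make the "same program, different meaning per chip" problem concrete, here's a sketch of my own (not from the thread): shifting a 32-bit value by 32, which C makes undefined precisely because hardware disagrees about it.

    /* Hypothetical sketch: one expression, two hardware answers. */
    #include <stdio.h>

    int main(void) {
        unsigned x = 1;
        volatile unsigned n = 32;  /* volatile so the compiler can't fold it */
        /* Shifting by >= the operand width is undefined in C exactly
           because machines disagree: x86's SHL masks the count to
           5 bits (result 1), while 32-bit ARM uses the low byte of
           the count register (result 0). A "the hardware decides"
           semantics would have to bless both answers for this one
           program. */
        printf("%u\n", x << n);
        return 0;
    }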


but the counterargument is that such a target-specific approach prevents any reasonable effort to consistently define the semantics of any programming language.

Why does there even need to be one? There's already undefined behaviour to absorb the variance between targets. Otherwise you're just making things harder for programmers who want to solve real problems on real hardware, not in some absurd theoretical world of spherical cows.


Undefined isn't the same as implementation-defined. "Undefined" means you're not even talking about C any more. "Implementation-defined" means the implementation picks one of the allowable semantics and documents which one it chose.
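For instance (my own sketch; the function names are made up):

    /* Hypothetical sketch: the two categories side by side. */
    int shift_right(int x) {
        /* Implementation-defined: right-shifting a negative signed
           value. The compiler must pick a meaning (arithmetic vs.
           logical shift) and document it; the program stays a valid
           C program either way. */
        return x >> 1;
    }

    int increment(int x) {
        /* Undefined: signed overflow. If x == INT_MAX, the standard
           places no requirement whatsoever on the behaviour -- you're
           "not even talking about C any more". */
        return x + 1;
    }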


Aka the "C is a high level assembler" view.

"C Is Not a Low-level Language" (https://queue.acm.org/detail.cfm?id=3212479) argues that this hasn't been true for a long time, roughly since C was first ported off the PDP-11.

Attempts to model C as being "close to the hardware" are at best misguided and usually hopelessly wrong. It's only possible by delving into the "implementation defined" parts of the spec, but even then implementations are often very bad at precisely defining what their semantics actually are.


Exactly this. Undefined behaviour exists for the purpose of portability, not optimization; actively going out of its way to abuse it is a compiler bug.



