> I have programmed in C since ~1986, so please don't try to explain the language to me, and don't assume that my POV comes from a place of ignorance.
I apologize for my tone; it was more patronizing than I had intended it to be.
> The craziness with undefined behavior is a fairly recent phenomenon. In fact, I started programming in C before there even was a standard, so all behavior was "undefined", yet no compiler manufacturer would have dreamed of taking the liberties that are taken today.
I feel that the current renewed focus on optimizing compilers has been born out of the slowing of Moore's law and the general stagnation in hardware advances, as well as improvements in program analysis taken from other languages. That's just my personal guess as to why.
> Exactly: "matches what you are trying to do". The #1 cardinal rule of optimization is to not alter behavior. That rule has been shattered to little pieces that have now been ground to fine powder.
The optimizing compiler has a different opinion than you do of "altering behavior". If you're looking for something that follows what you're doing exactly, write assembly. That's the only way you can guarantee that the code you have is what's being executed. A similar, but not perfect, solution is compiling C at -O0, which matches the behavior of older compilers: generate assembly that looks basically like the C code that I wrote, and perform little to no analysis on it. Finally, we have the optimization levels, where the difference is that you are telling the compiler to make your code fast; however, in return, you promise to follow the rules. And if you hold up your side of the bargain, the compiler will hold up its own: make fast code that doesn't alter your program's visible behavior.
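To make that bargain concrete, here's a minimal sketch (my own made-up example) of the kind of code the rules are about: signed overflow is undefined, so with optimizations enabled the compiler is allowed to assume it never happens and drop a check that only fires after the overflow.

    #include <limits.h>
    #include <stdio.h>

    /* An "after the fact" overflow check: x + 1 overflows first, then we
       inspect the result. Because signed overflow is undefined behavior,
       an optimizer may assume it cannot happen and fold this to 0. */
    int will_overflow(int x) {
        return x + 1 < x;   /* often compiled to "return 0;" at -O2 */
    }

    int main(void) {
        /* May print 1 at -O0 (on wrapping hardware) and 0 at -O2. */
        printf("%d\n", will_overflow(INT_MAX));
        return 0;
    }

At -O0 you get the naive add-and-compare; at -O2 the check is typically gone, and both outcomes are permitted by the standard. That's the disagreement in a nutshell.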
> The optimizing compiler has a different opinion than you do of "altering behavior".
Obviously. And let's be clear: we mean the optimizing compilers of today. This rule used to be inviolable; now it's just something to be scoffed at, see:
> If you're looking for something that follows what you're doing exactly, write assembly.
Er, no. Compilers used to be able to do this, with optimizations enabled. That this is no longer the case is a regression. And shifting the blame for this regression to the programmers is victim blaming, aka "you're holding it wrong". And massively counter-productive and downright dangerous. We've had at least one prominent security failure due to the compiler removing a safety check, in code that used to work.
> Finally, we have the optimization levels, where the difference is that you are telling the compiler to make your code fast;
Hey, sure, let's have those levels. But let's clearly distinguish them from normal operations: cc -Osmartass [1]
The article you linked to in your blog post is most likely not serious; it's a tongue-in-cheek parody of optimizing compilers, though one that's written in a way that brings it awfully close to invoking Poe's Law.
But back to the main point: either you can have optimizations, or you can have code that "does what you want", but you can't have both. OK, I lied: you can have a very small compromise where the compiler does simple things like constant folding and keeps to the intent of the programmer, and that's -O0. That's what you want. But if you want anything more, even simple things like loop vectorization, you'll need to give up this control.
Really, can you blame the compiler? If you had a conditional that had a branch that was provably false, wouldn't you want the compiler to optimize it out? Should the compiler emit code for something like this?
    if (false) {
        // do something
    }
In the security issue you mentioned, that's basically what the compiler's doing: removing a branch that it knows never occurs.
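If I understand the case you're alluding to correctly, it's the classic dereference-before-null-check pattern; a minimal sketch of its shape (not the actual code):

    #include <stddef.h>

    struct device { int flags; };

    int get_flags(struct device *dev) {
        int flags = dev->flags;   /* dereference happens first */
        if (dev == NULL)          /* the compiler reasons: dev was already
                                     dereferenced, so it "cannot" be null;
                                     the check is dead and may be deleted */
            return -1;
        return flags;
    }

From the compiler's point of view, the load above the check already implies dev is non-null (otherwise the program has undefined behavior), so the check is as dead as the if (false) above.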
This is simply not true. And it would be horrible if it were true. "Code that does what I want" (or more precisely: what I tell it to) is the very basic requirement of a programming language. If you can't do that, it doesn't matter what else you can do. Go home until you can fulfill the basic requirement.
> very small compromise
This is also not true. The vast majority of the performance gains from optimizations come from fairly simple things, but these are not -O0. After that you run into diminishing returns very quickly. I realize that this sucks for compiler research (which these days seems to be largely optimization research), but please don't take it out on working programmers.
What is true is that you can't have optimizations that dramatically rewrite the code. C is not the language for those types of optimizations. It is the language for assisting the developer in writing fast and predictable code.
> even simple things like loop vectorization
I am not at all convinced that loop vectorization is something a C compiler should do automatically. I'd rather have good primitives that allow me to request vectorized computation and a diagnostic telling me how I could get it.
C is not FORTRAN.
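To sketch what I mean by "primitives": something along the lines of the GCC/Clang vector_size extension, where the vector width is spelled out in the source rather than inferred behind my back (just an illustration, not an endorsement of that particular extension):

    /* The programmer asks for 4-wide float vectors explicitly; the
       element-wise add below is a vector operation by construction,
       not something the optimizer has to discover. */
    typedef float v4sf __attribute__((vector_size(16)));

    void add4(v4sf *dst, const v4sf *a, const v4sf *b, int n) {
        for (int i = 0; i < n; i++)
            dst[i] = a[i] + b[i];
    }

And if the compiler can't map that onto the target's SIMD unit, tell me, don't guess.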
As another example: condensing a loop whose result you can compute directly at run time. Again, please tell me about it, rather than saying nothing and "optimizing" it away. Yes, I know you're clever; please use that cleverness to help me rather than to show off.
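The kind of case I have in mind (my own toy example): an optimizer like LLVM's will typically recognize the induction variable here and replace the whole loop with the closed-form expression, without a word about it.

    /* Sums 0 .. n-1. At -O2, compilers commonly rewrite this to the
       closed-form n*(n-1)/2 and delete the loop entirely. */
    unsigned sum_below(unsigned n) {
        unsigned sum = 0;
        for (unsigned i = 0; i < n; i++)
            sum += i;
        return sum;
    }

That's a fine transformation; my complaint is only that it happens silently, when a note that my loop collapsed to a formula would be genuinely useful.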
> Really, can you blame the compiler?
Absolutely, I can.
> If you had a conditional that had a branch that was provably false,
"Provable" only by making assumptions that are invalid ("validated" by creative interpretations of standards that have themselves been pushed in that direction).
> wouldn't you want the compiler to optimize it out?
Emphatically: NO. I'd want a diagnostic that tells me that there is dead code, and preferably why you consider it to be dead code. Because if I write code and it turns out to be dead, THAT'S A BUG THAT I WANT TO KNOW ABOUT.
This isn't rocket science.
> security issue you mentioned, that's basically what the compiler's doing: removing a branch that it knows never occurs.
Only for a definition of "knows" (or "never", take your pick) that is so broad/warped as to be unrecognizable, because the branch actually needed to occur and would have occurred had the compiler not removed it!
> The article you linked to in your blog post is most likely not serious
I think I noted that close relationship in the article, though maybe in a way that was a bit too subtle.
Hmm…let's try a simpler question, just so I can get a clearer picture of your opinion: what should the compiler do when I go off the end of an array? Add a check for the bounds? Not put in a check and nondeterministically fail based on the state of the program? How about when you overflow something? Or dereference a dangling pointer?
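Concretely, these are the cases I mean; a deliberately broken sketch, each line doing exactly one of the undefined things above:

    #include <limits.h>
    #include <stdlib.h>

    void what_should_happen(void) {
        int a[4];
        int x = a[4];               /* off the end of the array: insert a
                                       bounds check? trap? read whatever
                                       happens to be there? */
        int big = INT_MAX;
        big = big + 1;              /* signed overflow: wrap? trap? assume
                                       it can never happen? */
        int *p = malloc(sizeof *p);
        free(p);
        int y = *p;                 /* dangling pointer: what is this even
                                       supposed to mean? */
        (void)x; (void)big; (void)y;
    }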
You seem to not be OK with allowing the compiler to trust the user not to do bad things, yet you do trust them enough to out-optimize the compiler. Or am I getting you wrong?