I'm a member of WG14 and the Undefined Behaviour Study Group. We are producing a technical report about UB, and one thing we specifically point out is that statically known UB should be treated as an error, not as "this can't be called", aka "unreachable". From what I have seen, LLVM sometimes issues poison when it should just issue an error. The implications can be quite severe. Instead of assuming that UB is programmer error, it assumes it is the intention of the programmer that the code will never run, and therefore it can use this information to make assumptions about branching behaviour.
Undefined behaviour is the reason I can't use ISO C to do anything sensible in the domains of embedded programming or language implementation. Those are the exact domains that gcc with a lot of flags to disable the WG14 inventions excels at.
I don't need static warnings that the compiler refused to do what I asked it to.
I need the 'UB means arbitrary transforms of your code with a bias towards deletion' invention to go away. It's a dubious premise in C++ where it supports people writing clumsy application code. It's totally inappropriate for things like implementing a JIT.
I would even be fine with dead-code deletion when it carries a warning. But currently all dangerous UB decisions are totally silent, which is a nightmare.
Pretty much all interesting low-level systems stuff is UB in C/C++ and has been for years. That's dumb.
The reason we need low-level systems stuff is to bit-bang on hardware; put data in hardware formats; look at the low-level representation of things that we "shouldn't"--like integers, floats, objects, structs, stack frames; and implement the language runtime itself. The intent is usually pretty clear from what's written, and the mental model of the systems programmer is "I know I need to make the hardware do this thing". Compilers optimizing away UB and otherwise treating it as never going to happen actively get in the way. UB makes a systems language not a systems language anymore.
No, I really want to point into the stack and walk stack frames, look at the guts of objects, arrays, map memory, forge pointers to new code (because I wrote a JIT compiler), etc. That is brazen UB in C/C++ and compilers will do horrible things to you.
If you really want guaranteed frame/stack layout I don't think there is any other way than writing your own compiler/IR. No optimizing compiler written in the last 50 years will give you such a guarantee.
If you "just" want introspection, that's a little bit more reasonable; in principle something like DWARF unwind info could be extended for that. But a) that's a lot of work for an extremely niche problem, and b) there is no guarantee that the in-memory representation of objects is stable at instruction boundaries; I think you would need something like GC write barriers.
With C there is the ABI, which is platform specific but can't change without all hell breaking loose. Also, after twenty terrible years for profilers, frame pointers are returning.
So the ABI allows you to do this sort of stuff reliably as long as the compiler isn't doing inane things with UB.
The ABI doesn't mandate where locals are located in a stack frame though, so I'm not sure how you would inspect those.
If you simply meant that you want to rely on an ABI, then that's fine: although relying on those details might be UB for standard-conforming code, it is obviously well defined for compilers that conform to the ABI. Just because it is undefined from the standard's point of view doesn't mean it is undefined for a specific compiler + platform.
You will still need to use compilation firewalls and barriers to avoid aliasing or pointer provenance issues of course.
UB is a guaranteed compilation error in constexpr.
However, in regular code UB may only become known after multiple optimization passes (e.g. constant propagation, inlining).
It's difficult for the optimizer to meaningfully err at that time. The code may have been heavily modified by other optimization passes, which can add UB to the code (e.g. UB may have been data-dependent (1/x), but optimizer added a specialized copy of a function for x=0 where it became unconditional). The IR may have been generated by a non-C language that used UB/poison intentionally to guide optimizations.
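A rough sketch of that specialization scenario (hypothetical functions, not taken from the article): the UB is data-dependent in the source, but after inlining and constant propagation the compiler may end up holding a copy of the division in which it is unconditional.

    int recip(int x) {
        return 1 / x;            // UB only when x == 0
    }

    int caller(int x) {
        if (x == 0) {
            // After inlining/constant propagation the compiler may create a
            // specialized copy of recip() for x == 0, in which the division
            // is unconditionally 1 / 0.
            return recip(x) + 1;
        }
        return recip(x);
    }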
> Somewhat expectedly, gcc remains faithful to its crash approach, though note that it only inserts the crash when it compiles the division-by-zero, not earlier, like at the beginning of the function. […] The mere existence of UB in the program means all bets are off and the compiler could choose to crash the function immediately upon entering it.
GCC leaves the print there because it must. While undefined behaviour famously can time travel, that’s only if it would actually have occurred in the first place. If the print blocks indefinitely then that division will never execute, and GCC must compile a binary that behaves correctly in that case.
Don't worry; a function blocking indefinitely (i.e., there is some point where it stops giving off side effects, and never returns) is also UB. C++ attempts to guarantee a certain amount of forward progress, for now.
But a function blocking indefinitely while repeatedly writing to a volatile variable is well-defined.
So the compiler cannot remove a function call followed by UB unless it knows that the function won't do that.
In theory, the compiler could know that since `printf` is a well-known standard function.
In practice, `printf` might even exit the program via SIGPIPE, so I don't think any compiler will assume that it definitely will return.
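To make the volatile point above concrete, here is a sketch of a non-terminating function that is nonetheless well-defined (status_reg is a made-up memory-mapped register, not from the thread):

    extern volatile int status_reg;   // e.g. a memory-mapped device register

    void spin_forever(void) {
        for (;;) {
            status_reg = 1;           // a side effect on every iteration, so the
        }                             // loop may not be assumed to terminate
    }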
"undefined behavior - behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements"
It says "for which" and not "for the whole program". So the "interpretation" that a program becomes invalid or other requirements are affected was never supported by the text in the C standard. This was justified by some with arguments such as: "no requirements" includes behavior that can go back in time or otherwise break the rules of the real world. Of course, if you start to interpret a standard text in this creative way, you can justify anything. A reasonable interpretation must assume some general properties of the application domain it applies to (and in C there are also more specific rules about how requirements are to be interpreted, in 5.1) and must assume that wording such as "for which" has actual meaning that needs to be taken into account and cannot simply be ignored. In C23, "Note 3" was added that aims to clarify that such nonsensical interpretations are not intended:
"Note 3 to entry: Any other behavior during execution of a program is only affected as a direct consequence of the concrete behavior that occurs when encountering the erroneous or non-portable program construct or data. In particular, all observable behavior (5.1.2.4) appears as specified in this document when it happens before an operation with undefined behavior in the execution of the program"
Does this mean that compilers cannot in general reorder non-side-effect operations across side effects, even if those operations wouldn't be globally visible if not for UB? Alternatively, is UB ordered with side effects? What's the ordering of UB with regard to other thread operations? Does it guarantee sequential consistency? I guess happens-before is guaranteed only if something else would guarantee happens-before, but it means further constraints on the ordering of, for example, potentially faulting operations across atomic operations.
Ie:
    int ub(int idx, _Atomic int* x) {
        char y[1000];
        int r = *x;    // 2
        r += y[idx];   // 1
        return r;
    }
Statements 1 and 2 can't be reordered as any UB in accessing y[idx] is sequenced-after any side effect that happens-before 2, even if y is a purely local, non-escaping variable. This puts constraints on register allocation for example.
This opens a big can of worms.
edit: from a few quick tests GCC seems quite good at preserving ordering of potentially faulting instructions across side effects, even when reordering would be profitable (for example hoisting operations out of loops). It might be a posix requirement in practice because signals make them "visible".
edit2:
ok, this "fails":
    extern volatile int x;

    int ub(int d, int c) {
        int r = 0;
        for (int i = 0; i < 100; ++i) {
            r += x;      // 1
            r += d / c;  // 2
        }
        return r;
    }
GCC -O3 will hoist the potentially faulting (and expensive) division at [2] out of the loop, above the volatile load at [1] (which is a side effect). Would you consider this a valid transformation or is it time-traveling UB?
Potentially this could be fixed by peeling the first iteration out of the loop and preserving the first volatile access above the division, which can then be cached. This would also help with the variant where the volatile access is replaced by a function call, which gcc currently doesn't optimize.
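A sketch of that peeling idea, applied to the example above (same assumptions as the original snippet):

    extern volatile int x;

    int ub_peeled(int d, int c) {
        int r = 0;
        r += x;                 // first volatile access stays before the division
        int q = d / c;          // division now computed once, after that access
        r += q;
        for (int i = 1; i < 100; ++i) {
            r += x;
            r += q;             // cached result reused, never moved across a volatile
        }
        return r;
    }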
The can of worms is not so big actually. In general, observable behavior is only I/O and volatile accesses. This is not about side effects in general (which can be optimized according to the "as if" rule). So many things can still be reordered vs potentially trapping operations. Also potentially trapping operations can be reordered. For multi-threading we have "happens before" relationship, so a natural interpretation is that everything which happens before the UB is safe.
The reordering of a volatile access after a potentially trapping operation is not conforming. I think it is an important property of volatile that it prevents this optimization, so I hope that GCC will be fixed eventually. A potentially trapping operation can also not be hoisted above a function call, and compilers that did this all got fixed in the mean time.
> If the print blocks indefinitely then that division will never execute, and GCC must compile a binary that behaves correctly in that case.
Is `printf` allowed to loop infinitely? Its behaviour is defined in the language standard and GCC does recognize it as not being a user-defined function.
Your reasoning is incorrect. Here is how I reason about it.
Division by zero is undefined behavior. The compiler can assume that it will not happen.
If the divisor is not zero, then the calculation has no side effects. The compiler may reorder the division above the print, because it would have no observable difference in behavior. This could be useful because division has a high latency, so it pays to start the operation as soon as the operand values are known.
If the divisor is zero, the UB says that there is no requirement on how it's compiled, so reordering the division above the print is legal.
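Roughly the pattern being argued about (a sketch with placeholder names, not the article's code):

    #include <stdio.h>

    // If b != 0, hoisting the division above the printf is unobservable under
    // the as-if rule; if b == 0, the division is UB and (per this argument)
    // imposes no requirements, so the hoist is claimed to be legal either way.
    int f(int a, int b) {
        printf("about to divide\n");   // observable side effect
        return a / b;                  // UB if b == 0
    }

Whether that last step is really licensed by the standard's wording is exactly what the rest of this subthread disputes.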
    const int div = 0;
    if (div) {
        return 1/div;
    }
    return 0;
The statement at line 3 would have undefined behaviour, yet is never reached so this is a perfectly legal program and any transformation that hoists it above the check is invalid.
If you replace 'if(div)' with an opaque function call, that doesn't change anything as the function might never exit the program, never return, long jump or return via an exception.
Division has no side effects, and division by 0 is UB. UBs only occur in invalid programs, so behaviour in case of UB is not relevant to a discussion of side effects or their lack thereof, in language terms these are not programs at all.
> While undefined behaviour famously can time travel, that’s only if it would actually have occurred in the first place.
I've always been told that the presence of UB in any execution path renders invalid all possible execution paths. That is, your entire program is invalid once UB exists, even if the UB is not executed at runtime.
If you do `5 / argc`, that's only undefined behavior if your program is called without any arguments; if there are arguments then the behavior is well defined.
Instead, the presence of UB in the execution path that is actually taken, renders invalid the whole execution path (including whatever happens "before" the UB).
That is, an execution path has either defined or undefined behavior, it cannot be "defined up to point-in-time T". But other execution paths are independent.
Thus, UB can "time-travel", but only if it would also have occurred without time travel. It must be caused by something happening at runtime in the program on the time-travel-free theoretical abstract machine; it cannot be its own cause (no time travel paradoxes).
So the "time-travel" explanation sounds a lot more scary than it actually is.
Yes. It's possible to get `argc` to equal zero, though by invoking the program using `execve(prog, {NULL}, {NULL})` on Linux. This has, rather famously, caused at least one out-of-bounds error in a security-critical program (CVE-2021-4034 "Pwnkit", LPE by invoking Polkit's pkexec with a zero-length argv).
It's possible to call programs without any arguments, not even the path to the binary. I believe passing the path to the binary is merely a shell convention, because when calling binaries directly from code (not through the shell), sometimes it's possible to forget to specify arg 0 (if your chosen abstraction doesn't provide it automatically). I bet this has caused tons of confusion for people.
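A minimal sketch of how that happens on Linux ("./victim" is a placeholder path; error handling omitted):

    #include <unistd.h>

    int main(void) {
        char *const argv[] = { NULL };   // no argv[0] at all
        char *const envp[] = { NULL };
        execve("./victim", argv, envp);  // inside ./victim, main() sees argc == 0
        return 1;                        // only reached if execve fails
    }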
It is not. The presence of UB in an execution path renders that execution path invalid. UBs are behaviours, essentially partial functions which are allowed to arbitrarily corrupt program state rather than error.
However "that execution path" can be extensive in the face of aggressive advanced optimisations.
The "time travel" issue is generally that the compiler can prove some paths can't be valid (they always lead to UB), so trims out those paths entirely, possibly leaving just a poison (a crash).
Thus, although the undefined behaviour which causes the crash "should" occur after an observable side effect, because the program is considered corrupt from the point where it will inevitably encounter UB, the side effect gets suppressed, and it looks like the program executes non-linearly (because the error condition which follows the side effect triggers before the side effect executes).
Hmm, it could be that once UB is encountered the entire program becomes invalid, then. In practice, a lot of UB is quite subtle and may not necessarily result in complete disaster, but of course once it's occurred you could end up in any number of completely invalid states and that would be the fault of the UB.
> Hmm, it could be that once UB is encountered the entire program becomes invalid, then.
The UB doesn't actually need to be encountered, just guaranteed to be encountered eventually (in a non-statistical meaning), that is where the time travel comes from e.g. if you have
    if (condition) {
        printf("thing\n");
        1/0;
    } else {
        // other thing
    }
the compiler can turn this into
    if (condition) {
        // crash
    } else {
        // other thing
    }
as well as
    // other thing
In the first case you have "time travel" because the crash occurs before the print, even though in a sequentially consistent world where division by zero was defined as a crash (e.g. in python) you should see the print first.
Yes, in this case the title without the "how" is misleading, since it suggests they actually "handle" it and not just do whatever. So I think this title passes the howey test. Edited.
> While the compiled programs stayed the same, we no longer get a warning (even with -Wall), even though both compilers can easily work out statically (e.g. via constant folding) that a division by zero occurs [4].
Are there any reasons why that is so? Do compilers not reuse the information they gather during compilation for diagnostics? Or is it a deliberate decision?
In the second example, the constant is propagated across expression/statement boundaries. It is likely that this happened at the IR level, rather than at the AST level.
I'd imagine the generic case becomes a non-trivial problem if you don't want to produce fluke/useless diagnostic messages.
The compiler might already be several optimization passes in at this point, variables long since replaced by chained SSA registers, when it suddenly discovers that an IR instruction produces UB. This itself might end up being eliminated in a subsequent pass, or entirely depend on a condition you can't statically determine. In the general case, by the point you definitely know, there might not be enough information left to reasonably map this back to a specific point in the input code, or to produce useful output about why the problem happens there.
Correct. And to add to this answer slightly: there might not be enough information because to keep all the context around, the compiler might need exponentially more memory (even quadratically more memory might be too much; program sizes can really add up and that can matter) to keep enough state to give a coherent error across phases / passes.
Back in the day when RAM wasn't so cheap that you could find it in the bottom of a Rice Krispies box, I worked with a C++ codebase that required us to find a dedicated compilation machine because a standard developer loadout didn't have enough RAM to hold one of the compilation units in memory. Many of these tools (gcc in particular, given its pedigree) date back to an era where that kind of optimization mattered and choosing between more eloquent error messages or the maximum scope of program you could practically write was a real choice.
There is very strong bias in clang not to emit any diagnostics once you get to the middle-end optimizations, partially because the diagnostics are now based on whims of heuristics, and partially because the diagnostics now become hard to attribute to source code (as the IR often has a loose correlation to the original source code). Indeed, once you start getting inlining and partial specialization, even figuring out if it is worth emitting a diagnostic is painfully difficult.
Consider, say, the compiler transforming:

    void foo(bool cond) {
        int a = 0;
        if (cond) a = 10;
        if (cond) printf("%d\n", 10 / a);
    }
into:
    void foo(bool cond) {
        int a = 0;
        if (cond) {
            a = 10;
            if (cond) printf("%d\n", 10 / a);
        } else {
            if (cond) printf("%d\n", 10 / a);
        }
    }
and then screaming that (in its generated 'else' block) there's a very clear '10 / 0'. Now, of course, you'd hope that the compiler would also recognize that that's in a never-taken-here 'if' (and in most cases it will), but in various situations it might not be able to (perhaps most notably 'cond' instead being a function call that will give the same result in both places but the compiler doesn't know that).
Now, maybe there are some passes across which false-positives would not be introduced, but that's a rather small subset, and you'd have to reorder the passes such that you run all of them before any "destructive" ones, potentially resulting in having to duplicate passes to restore previous optimization behavior, at which point it's not really "reusing".
They do reuse information. But you have no guarantee that the point at which information is used is run after the point at which something is discovered.
They do try to run things so everything's used. They also try to compile quickly. There is a conflict.
Undefined behavior isn't a threat; it's a promise.
That came out wrong. What I mean is that it's not a threat from the standard that its buddy the compiler will ruin your computer, your day, and your life. Instead, it's a promise from the programmer to the compiler that the programmer won't perform some operations.
Since the programmer promised not to do some things, the compiler assumes those things aren't done while reasoning about the code, but if it turns out one of those things was done, the compiler's whole chain of reasoning is potentially invalid. So the compiler's options are to trust the programmer, and make the assumptions, or to instead treat the programmer as a dirty liar, and not make any of the assumptions while expending much effort considering all the subtle ways the programmer could try to trick it. Actually, there's a third option, trust but verify, where the compiler trusts the programmer to make a best effort, but verifies the assumptions when convenient. Unfortunately, it's not often convenient to verify in C or C++, but broadly compilers are getting better at it over time.
I'll just add that the third option is usually not considered viable because the main point of using C/C++ is for performance, thus people writing code want their compiler to optimize it as much as possible.
I don't think compilers consider undefined behavior a license to kill; I think the problem is genuinely hard. The principle of explosion[1] tells us that any wrong assumption, even the most innocuous looking, can lead to an unboundedly wrong conclusion. The more complex your graph of logical deductions, the worse the problem gets (and good optimizations require a lot of complex logic).
It may be possible to limit the bad effects of faulty assumptions by being very careful about how deductions propagate from unreliable assumptions. However it's certainly not easy, especially without throwing out all optimizations that interact with a potentially undefined operation.
It is possible to design a language such that a compiler doesn't need to make assumptions to perform optimizations, if programs are required to give enough information to compilers that they can statically prove the properties they need to perform optimizations. But C and C++ are not designed that way, and it can't easily be retrofitted on top of them.
I ask because when I wrote that post, I couldn't remember if I had heard the term before. In other words, I am not sure if I copied someone else or not, and if I did, I want to add attribution.
It's more insane to me that the people who are putting in these "sanity checks" don't know enough about the language to find out that their "sanity checks" are both wrong and useless in the first place.
UB is usually not as obvious as a divide-by-zero-constant in plain sight. The divide-by-zero-constant could be the result of a complex constant folding operation, involving values injected into the build process via build options or code generation, similar for any other instance of undefined behaviour.
I don't think any programmer puts UB on purpose into the code, even if they can enumerate from memory all 200 or so cases of UB just in the C standard (no idea how big the list is in the C++ standard - thousands maybe?).
The original sin was compiler writers exploiting UB for optimisations in bizarre ways instead of working with the C and C++ committees to fix the language to enable those types of optimizations without requiring UB, or at least to classify UB into different 'hazard categories' (most types of UB in the standard are completely irrelevant for code generation and optimization)
> I don't think any programmer puts UB on purpose into the code
I agree with you. But the problem is not people putting UB in their code, either on purpose or by mistake: we all do that, every day!
The problem is people trying to defend their code once it has been made clear to them that it contains UB, and trying to fight the compiler rather than fix their error.
Do you maintain this stance when the UB is scattered around third party libraries that aren't especially welcoming of patches intended to obfuscate the code while placating some subset of compilers?
How about when code is written correctly and then later the standards body makes previously implementation defined behaviour into undefined? I've got some code that calls realloc that WG14 declared to be undefined years after I wrote it and I doubt that experience is unique.
The evangelical attitude that the standard committee knows best and your code is wrong and you should immediately down tools to work around the compiler noticing the opportunity to miscompile it is quite popular. I think it's a really compelling argument to build nothing whatsoever on the ISO C/C++ stack and replace what you do have with something less hostile, aka anything whatsoever - rust, python, raw machine code written in hex - none of them have this active hostility towards the dev baked into the language design.
ISO C only specifies minimal requirements. It exists because people sit together and agree on those. If people do not agree, we cannot standardize it. We are also not the ones making your compiler miscompile your code. The compiler people write the compiler that miscompiles your code! The same people sit on the committee and do not agree to changes that would specify other behavior. So if you do not like what your compiler does, this would be the place to go to and complain.
For realloc, different implementations did different things and clearly said they will not change. There wasn't really any other choice. If your program was written for one implementation where it works, it can continue to do so, but it was never portable to other implementations. The standard now simply reflects this reality.
ISO says it's ok to aggressively rewrite programs on the assumption that no undefined behaviour ever executes. At least some compiler developers are incentivised to do whatever makes benchmarks faster.
Put the two together and you get a fast and fragile language implementation. I know why the benchmark people push the compiler in that direction. I'm doubtful that WG21 or WG14 especially want this emergent property.
My suspicion is that this is an accident of history that has too much unwarranted inertia behind it. The moral stance that it's all lesser programmers erroneously writing wrong code is aggravating in that context as it actively opposes anyone making things better.
But ISO does not explicitly say anything like "it is ok to aggressively rewrite programs...". ISO C says, "xy is defined and z is not defined in the standard", and then an implementation is free to do whatever it wants. Note this is generally how standards work. You put the blame on ISO for giving implementers the freedom, but not on the implementers for exploiting it aggressively. Why do programmers still rush to the compilers that exploit this the most instead of choosing other ones? Why do they not complain more? Unfortunately, the effect of blaming ISO is exactly the opposite of what you may want to achieve. It deflects the criticism from the ones who make the decisions onto ISO, which can only harmonize behavior but does not really have the power to force implementers to do anything, as the realloc story shows. Even worse, this criticism weakens ISO even more, so it becomes even less likely that we will be able to fix things using the limited de-facto power a standard has.
No, it is definitely ISO C's (and C++'s) fault [1]. Without guidance on what's acceptable behaviour and what isn't, DWIM and "please don't break my code" is not a spec. Consider type punning via union, which is still UB in C++ while explicitly supported by most implementations. Most users are wary of making use of it because of possible, if only theoretical, portability problems. Generally provenance, aliasing and the basic memory model are a mess and need a more rigorous underpinning after having been patched over the years with sometimes conflicting wordings.
Compare to the concurrent memory model: while DRF-SC still has plenty of UB, it is at least possible for a competent programmer to figure out the correctness of their code.
I certainly do not believe that it is realistic to stamp out all UB from C and C++ (at least while pretending that the resulting languages have anything to do with the original ones), but there is a lot that the standard could do to try to limit the most egregious cases, possibly providing different levels of conformance (like is done for floats and IEEE 754).
[1] of course implementors are part of the committee so they are not blameless.
I agree about the state of many things as you say and especially the point about more rigorous underpinning. But the flow of innovation and progress is not meant to come from ISO and flow to the compilers, the process is designed the other way round: Compilers implement things and ISO integrates this into the standard to avoid divergence. An ISO committee neither has the power nor resources to do what you want, and it is also not meant to work this way (we try nevertheless). But ideally, compilers vendors would need to work on fixing all these things and then ISO could simply standardize it. But for this, one would need to put pressure onto compiler vendors to actively work on those problems, and not blame ISO for not doing things it wasn't designed to do.
C++ has an absurd amount of UB to the point where there's an entire thought exercise devoted to how malicious a compliant compiler can be (Hell++.) There are things that are allowed and even common in C but UB in C++ (union type punning,) and things that people reasonably can assume would work but are UB anyway (signed integer overflow.) Then you have weird edge cases like assigning the return value of a two argument std::max involving temporaries to a reference. There are so many UB foot guns, no reasonable developer can be expected to keep track of them all.
When all hardware built for decades uses two's complement arithmetic and even the standards bodies have noticed this (e.g. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm) it's not remotely necessary to assume that overflowing a signed integer is undefined behaviour. It's totally defined exactly what that instruction is going to do on any hardware any electrical engineer is willing to build.
However, some benchmarks use iteration on a signed integer, and assuming that loop terminates makes it slightly faster, so in order to retain that marginal advantage over other languages, signed iteration shall be assumed to never overflow.
That's not why it's UB. It's UB to allow compilers to optimize x*2/2 to just x. If you want overflow, you can use unsigned, which has been defined to follow two's complement semantics for quite some time.
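A sketch of the difference (assuming 32-bit int and unsigned):

    // Signed: overflow is UB, so the compiler may fold (x * 2) / 2 to just x.
    int fold_signed(int x) { return x * 2 / 2; }

    // Unsigned: wraparound is defined, so the wrapping must be preserved;
    // fold_unsigned(0x80000000u) is 0, not 0x80000000u.
    unsigned fold_unsigned(unsigned x) { return x * 2 / 2; }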
> things that people reasonably can assume would work but are UB anyway (signed integer overflow.)
No, they cannot. You don't have the right to make any assumption about integer overflow in C.
Most of the whining about UB is from people who still refuse to accept that you don't have the right to think about your processor family once you write in any language that is not assembly.
Cannot find any articles discussing Hell++. Is it something specific? The thought seems very interesting and I'd like to read about it if you have any links :-)
Thanks!
I'm not sure about Hell++, but DeathStation 9000 is a "common" jocular name for a hypothetical system which is as pathological as possible with respect to implementation-defined and undefined behavior.
The working draft standard for C++ (2022-12-18, document N4928) is 2,110 pages long at a reasonable-sized font.
I'd actually go the other way: I think most practicing programmers have not read the entire standard specifying most languages they use day-to-day and have no real idea what the abstraction-break looks like that turns their code into a format consumable by the next layer down.
How many Java programmers do we assume know anything about the bytecode of the JVM, for example?
To be fair, even if you look at just the language alone, C++ is still extremely large. But the real problem is when you combine this sheer size with the amount of subtle footguns.
If we compare to equivalent Python code, for example, the behavior is simple and straightforward: division by zero causes a runtime error that is reported in a well-defined manner. Similarly with other arithmetic issues in C++ that can trigger UB - e.g. integer overflow is just not a thing in Python (short of OOM). The detailed rules may well be complicated, but it doesn't matter as much when the behavior is intuitive and conforms to common sense expectations.
And that's why Python is slow and a director at a previous employer initiated a project to rewrite all Python in C++ and the CPU savings far exceeds the cost of some programmers.
Python is slow because it is insanely dynamic - much more so than, say, JavaScript. Look into what gets executed for something as simple as a method call before it even gets to the method body. Descriptor protocol etc.
It depends on what you're writing. I've seen slow C++ and fast Python. Not to imply your director made a bad call; especially if you already had a reference Python implementation to check against the new C++ implementation for correctness, you can gain quite a bit, depending on the domain you're operating in, by swapping out a Python engine with an engine written in another language (if for no other reason than it gives you an opportunity to check all your assumptions against a system solving real problems for real people).
Many sanity checks are preserved with -O0, so they're still useful in debugging. And they serve as documentation of expected conditions. I often put in checks that I expect to be statically eliminated to remind myself what is supposed to be true.
While it is a fair assumption that a programmer using a given language is fluent in it, it's generally unreasonable to expect them to be an expert. Here I bet that the overwhelming majority of C programmers are not aware of the different handling of division by zero depending on whether it's a floating-point or integer operation.
Further, those 'checks' might be in place for a long time, outlasting compiler versions and perhaps even language standards.
I don't see how it's wrong to use an if statement to check that nobody's passing in a zero to your function that later does division using that variable. C has a whole load of footguns in this area, as UB can impact code appearing earlier in the file. You are right about them being useless or counterproductive in effect sometimes though.
> I don't see how it's wrong to use an if statement to check that nobody's passing in a zero to your function that later does division using that variable.
Indeed, it isn't. The check becomes useless if it happens after the division, though.
Did you notice that the check of divisor being zero happens after the division, not before? It's undefined behavior because the behavior cannot be defined. On Intel you get a #DE before you do the check; this behavior cannot be standardized and that's why it's UB.
Sanity checks are as much a message to the human reading the code, as they are a message to the compiler/runtime.
As a slightly contrived example, assert(sizeof(char) == 1) is true by definition and could be elided, but it might be a useful reminder to see it in the source, next to code that implicitly relies on this truth.
Clang has a mode (UBsan) that flags and optionally aborts if you execute UB. It runs several times slower because it has to check lots of things, but it might make sense for some programs.
UBSan is not exhaustive. There are large and important cases of UB it cannot detect. There are also cases where the compiler simply won't insert the proper checks even for detectable issues. None of the sanitizers are broadly intended for production use either, though UBSan offers a specific mode you can select (fsanitize-minimal-runtime) that's appropriate for some production uses.
GCC can also use UBSan (and ASan) [1], and furthermore, it shouldn't be "several times slower" (are you thinking of Valgrind?). Clang itself describes the checks as "small runtime cost" [2] and there's a minimal variant as well.
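For example, a signed-overflow bug like the one below is reported at runtime when built with -fsanitize=undefined on either compiler (a sketch):

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int x = INT_MAX;
        x += 1;                // signed overflow: UB, flagged by UBSan at runtime
        printf("%d\n", x);
        return 0;
    }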
It cannot optimize away correct sanity checks. The sanity checks that it removes are ones that attempt to detect UB after it has occurred. If you instead make sure that checking for UB is checking before the dangerous operation then it is a correct program and the compiler won't surprise you.
> The sanity checks that it removes are ones that attempt to detect UB after it has occurred.
The point is that there is still value in that - your running program can at least say "oops" after the fact. You'll at least get errors when running your tests!
Removing the `printf ("oops")` altogether under the guise of "well, something 3 lines above was UB, so we can delete the rest of the program" is what users are complaining about.
The way the compilers go about it now, is that they don't even let you fail tests because you made an oopsie a few lines ago and now everything after that is considered safe for deletion.
I literally want every single one of those lines emitted in the final output.
1. I don't want the 'myvar[0]' assignment removed because the compiler determined that myvar is NULL. If it's there, at least I'll get a crash before the 'oops'. If the crash doesn't come, at least I have the 'oops' to warn in testing.
2. I don't want the `myvar == null` check removed. If the crash in the assignment didn't happen, I **want** the message 'oops' printed.
The compiler is clearly going against the intention of the programmer if it fails to emit code for any of those lines, because it turns a failing test into a passing one.
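For reference, a minimal sketch of the pattern being complained about (names follow the comments above, not actual project code):

    #include <stdio.h>

    void check_after_use(char *myvar) {
        myvar[0] = 'A';          // if myvar is NULL, this is already UB
        if (myvar == NULL) {     // so the compiler may treat this check as dead...
            printf("oops\n");    // ...and drop the diagnostic entirely
        }
    }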
The challenge is that a compiler cannot really provide consistent "after UB occurs" behavior that you can rely on for these sorts of post-facto checks without pessimizing behavior in pretty bad ways in a lot of situations.
If you want to be able to observe values through invalid pointers then the compiler can't do things like store locals in registers, since maybe you care about checking a value through one of these pointers. Huge mess! Like, who's to say that the line "myvar[0] = 'A';" doesn't write to the location in memory where the constant string "oops\n" is stored? How will the compiler ensure that your program does what you expect after a bogus write somewhere in memory?
> I'm not asking that it ensures that the line after the error executes, I'm asking that it not be eliminated.
I'm not sure what 'eliminated' even mean here. If the pointed to object has been replaced by a register, there might not even be a pointer to check. I.e. there might not be any meaningful code the compiler could generate for the line.
Compiler transformations attempt to preserve the semantics of the code; they do not attempt a faithful line-by-line translation (beyond toy compilers). If there is no semantic for a line, how can the compiler translate it?
Consider this code:
static inline int * get_x(void* ctx) { return (int*)ctx; }
static inline int indirect( int*(*getter)(void*), void*ctx) {
int * ptr = getter(ctx);
if(!ptr) { return 0; } // 1
return *ptr;
}
int main() {
int x = 10;
return indirect(get_x, &x);
}
After optimization, the code can be (and indeed is: https://godbolt.org/z/5vb9PdnYs) transformed to a simple "return 10". There is no meaningful translation for the line at [1] as there isn't actually any pointer or memory location the pointer could be pointing to in the translated program.
It doesn't even make sense to warn, as main, get_x and indirect could be in completely independent libraries and each makes sense as written on its own (getter+ctx is just a hand-built closure, so the code is far from unrealistic).
> If the pointed to object has been replaced by a register, there might not even be a pointer to check.
I dunno if that is relevant. In the example you posted, simply removing the `static inline` leaves a pointer to check. The value isn't moved to a register.
We are arguing the cases when the source lines are there, but then are not emitted. In the code samples under discussion, the pointer values are still there and can be checked. Their values are not sitting in some register.
> (getter+ctx is just an hand built closure, so the code is far from non-realistic).
Qualifying with `static inline` is unusual, and none of the objections are using static+inlined code as samples of poor code generation. The examples being presented are a lot more real (as they are taken from existing project, not contrived to make an argument).
The important bit is that the functions are inlined into main, so for that specific code path the check is removed. The fact it would be left there in the non-called function is not really relevant. I used static inline (just inline would have been enough) just to get rid of irrelevant code.
The code under discussion does not compile, so it is hard to discuss it. Please consider this variant: https://godbolt.org/z/EMzr4naMP
As you can see, while the check is still present in the non-inlined foo, main doesn't actually call it and omits the check. There is nothing left in main to check as there is no myvar object left.
This example and my previous one are similar: in both examples, not only the compiler could statically compute the value of the pointer, but it can track the pointee directly ('int x' in my example, the null pointer in your), hence it doesn't need to allocate any memory or registers for the pointer.
Separately, in both cases the compiler was able to statically prove that the pointer must be pointing to something (in my example because it knows the target, in yours because of the assignment through it), so the null check can be removed as the value of the condition is statically known to be always true.
Finally, I believe that in your example GCC found the contradiction with the pointer being both null and not null and realized that the whole main function is not possibly reachable, hence it isolates it with ud2 (clang is just silly).
What code would you expect GCC to generate in main to test the condition? Should it allocate a register, initialize it just for the purpose of performing the branch? Should it do it just for your example or also for mine?
I'm coming round to the idea that entirely eliminating lines of code in the optimizer is something that should be a warning instead, to be fed back to the user.
(NB entirely. That is, it's reasonable to coalesce arithmetic expressions together or inline functions such that a single operation in the output has subsumed multiple lines of code. However, if for a source S you can remove one or more entire lines of code to produce source S' with the same output, then something is wrong with the intent of the program or the compiler has misinterpreted it)
It's quite common in large C programs to have setups where some code might be under an if() statement and the if() condition is something that is compile-time 0 if support for a feature wasn't compiled in. (The idea is that you get syntax checking in a way that you don't for just ifdeffing-out the code, and you don't get weird nesting of preprocessor ifdefs with C ifs.) If you warn about the elimination of code like that, it makes that technique rather awkward.
(For instance, in QEMU we have code like: "if (kvm_enabled()) { do_something_kvm_specific(); }". In a no-KVM build, kvm_enabled() evaluates to compile-time false, and in a KVM build it does a runtime check of whether the user turned on KVM for this run.)
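A sketch of that pattern with made-up names (CONFIG_FOO, foo_enabled, etc. are placeholders standing in for the KVM example):

    int  runtime_foo_check(void);          // hypothetical runtime query
    void do_something_foo_specific(void);  // hypothetical feature code

    #ifdef CONFIG_FOO
    #define foo_enabled() runtime_foo_check()
    #else
    #define foo_enabled() 0                /* compile-time false in a no-FOO build */
    #endif

    void frobnicate(void) {
        if (foo_enabled()) {
            do_something_foo_specific();   /* still syntax-checked, then eliminated */
        }
    }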
This would produce at least one warning for almost every nontrivial function in your program, even with your concession towards simplifying arithmetic expressions.
Copy elision and dead store elimination are extremely common examples. Redundant bounds check elimination on containers is another. I'd consider hoisting outside of a loop also to be eliminating code.
I suppose some of this depends on what you mean by "eliminating a line of code", as an optimizing compiler pretty quickly transforms the program into a state where any kind of 1:1 mapping between lines of code and instructions is impossible.
I suspect I have misunderstood the point you were making in the previous comment, are not the first two already things one can get warnings for, and the third a useful thing to know you're doing redundantly?
I don't think hoisting would be covered by this suggestion, pjc50 suggested deletion rather than moving.
I read the proposal very literally, which also means I'm unclear what options you have in your final paragraph:
Compile the function as is; for each line, make a copy with that line missing and recompile; when this is identical to the original, flag that line as "warning: optimised out of existence".
This approach would increase compile times by an outrageous amount, since you now need to compile each function N times.
Optimizing compilers can also produce different output but with identical semantics depending on a lot of heuristics. What happens when deleting a line of code produces different inlining or specialization decisions? What is a rigorous definition of "identical"?
"Deleting a line" is also not the only way that unexpected and perhaps unwanted behavior from UB reasoning appears. Unexpected code motion is absolutely a thing that people have complained about. I don't think you can so quickly insist that things like hoisting are out of scope here.
As a compiler--indeed, optimization--developer, one of the things that continually frustrates me about discussions on UB is that so many suggestions are based on just complete incomprehension of how compilers work. One of the first things compilers do is throw away the original source code; outside of initial front-end codegen, there is no reference to the original AST, and instead everything is based on a pseudo-assembly IR. A concept like an "entire line of code" just doesn't exist in IR; even mapping instructions back to line of codes is a difficult process because frequently they aren't even attributable to a line.
> even mapping instructions back to line of codes is a difficult process
Enough is retained to allow line-step debugging. Godbolt will even do a nice mapping for you between source and object lines. Yes, a lot of it gets thrown away. My point is "this line of code has no effect on the output" is a problem, and the developer should be warned rather than lulled into a false sense of security that their check is doing something.
(Code coverage has the same issue of doing line attribution, plus the difficulty of hitting error conditions in practice, but it would also show unexecuted lines of code)
> Enough is retained to allow line-step debugging.
Only without optimization. Every compiler I have ever used (and I have used a lot of them!) starts having difficulty with step-debugging at the lowest-level of optimization, typically -O1. Some functions work, most don't.
> One of the first things compilers do is throw away the original source code; outside of initial front-end codegen, there is no reference to the original AST
That doesn't have to be an immutable fact for all implementations of compilers for all languages. The Rust compiler has multiple IRs (AST, HIR, THIR, MIR & LLVM IR), and for all but the last one (codegen), we keep around, if not a reference to the HIR node that allows us to navigate its context, at least a Span that points at the user's code (sometimes a whole struct or linked list of "facts" we've collected so far, each containing Spans). rustc also throws away information to reduce memory consumption, much to my dismay, but we still perform later checks to re evaluate (sometimes as an approximation) what the reason for a certain "decision" was, so that we can talk to the user in terms they understand, and not in terms of compiler internals.
The ideal design would be to have a single implementation of all logic with two modes of operation: the low memory consumption one (what you state as the way compilers work), and a "track everything without discarding metadata until the end" mode, that can provide better human readable output.
I watched the video and didn't learn anything new. I know UB pretty well.
> Compilers generally do not abuse UB
It's disheartening to see this take from the guy who said that one word broke C.
It is also false, in my experience because some people, including compiler developers, think that compiler freedom is the definition of UB, not a side effect of that one-word change:
So your claim in the video that such things are scary to compiler developers is sad. They have broken existing code; they are not afraid to.
I used to be somewhat confident (https://gavinhoward.com/2024/05/a-grateful-open-letter-to-je...) in the direction of C with you and JeanHeyd Meneide on WG14. But if you sincerely believe what you just said, then it seems you have changed your mind since you wrote that one word changed C. So I am not as confident anymore.
My "one word broke C" article is wrong in all kinds of ways. I understand C way better than I used to. The things that need fixing in C are not UB, it's very complex memory model stuff. I have no interest in changing any fundamentals or adding new features.
00UB does need fixing, and C needs less UB. If you don't think so anymore, then C will become untenable for me to use.
> I understand C way better than i used to.
Maybe, but I did not see it in that video.
What I do know is that you do not know much about the humans using C or the environments where C is used. C is a tool for humans, not machines. Prioritizing machines over humans will make C worse over time, and implementors will use that to further excuse their behavior.
> Compilers generally do not abuse UB (outside of compiler bugs), it's just that UB is a very misunderstood subject.
Perhaps it is. It wasn't always so.
In K&R the term "behavior is undefined" occurs often. Everyone understood its meaning. It meant "you get what the hardware gives you", meaning the compiler will output the same instructions it always does, but K&R didn't say what those instructions would do (typically because it couldn't). Of course what the hardware did on any given arch was perfectly well defined.
The definition had its upsides and downsides. On the upside, programmers took advantage of their knowledge of the hardware they were targeting to write efficient code. Embedded programmers tend to do that sort of thing fairly aggressively (for example, there is often something useful stored in location 0). The downside is that if they did that then their code wasn't portable.
The definition gets ugly if the programmer is trying to write portable code, because it means they don't get warned if the code they wrote wouldn't port easily. As a consequence, writing non-portable code was and remains an easy mistake to make. The sane solution was a --error-if-not-portable compiler option.
But that's not what we got, is it? Instead the meaning of "the behavior is undefined" morphed from "hardware defined" to "implementations are allowed to assume that the respective runtime condition does not ever occur". From what I can tell, compiler writers turned that definition into "the behavior is compiler-writer defined" so they could gain some edge in the "who has the best optimiser" games they love to play. Consequently the definition the compiler writer uses is almost always "delete the code".
But doing that made it harder to write correct code. Whereas before the code always had the same meaning on some hardware, it now changes its meaning on the same hardware depending on whether you supply -O0 or -O2. And it does so without warning, because we never got the --error-if-not-portable option. The result has been numerous bugs. For example take:
    int parse_packet()
    {
        uint8_t buffer[1500];
        int len = read_from_internet(buffer, sizeof(buffer));
        uint32_t field_size = ntohl(*(uint32_t *)buffer);
        if (buffer + field_size >= &buffer[len] || buffer + field_size < buffer)
            return -1; /* error return */
        /* continue parsing the packet. */
        return 0; /* success */
    }
In the K&R world it's clear what the programmer intends, and on all architectures I know a straightforward compilation would produce the behaviour he intended; indeed "gcc -O0" produces what they expected. But "buffer + field_size < buffer" could only be true if field_size is so large it wraps to before buffer. That's UB, so it triggers the "implementations are allowed to assume that the respective runtime condition does not ever occur" clause. Consequently gcc -O2 deletes that test. This really happened, and the result was a CVE.
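For comparison, one way to express the same bounds check without relying on pointer overflow (a sketch using the snippet's variables):

    if (len < 0 || field_size >= (uint32_t)len)
        return -1; /* error return */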
There were two reasonable outcomes for this code. One is the K&R approach. The other was the --error-if-not-portable approach, which means refusing to compile the code. I think most compiler users (as opposed to people playing word games in order to win some optimisation game) would call what actually happened "compilers abusing UB". That's because no one wins from that particular "optimisation", except the compiler writer doing some micro benchmark. At best the programmer had a flaw the compiler knew about and exploited, but didn't warn him about. The users of the compiler's output got hit with a CVE.
That's the best interpretation. The worst is that the C standard committee has lost the plot. Their goal should be to produce a simple, clear standard even a novice programmer could safely pick up and read to learn the language. That is what K&R was. Instead we've arrived at the point where governments are saying the language is too dangerous to use.
> clang used the fact that division by zero is undefined and thus argc must not be zero to entirely remove the condition if (argc == 0), knowing this case can never happen [2].
Serious logic error, surely. "Can never happen" does not follow.
"is undefined" there is rather hand-wave-y; really what that's supposed to say is that doing division by zero can do arbitrary things.
Sure, looking at the assembly it looks like the compiler has removed the 'if' branch, but, under "as if", it's equivalent to the compiler having rewritten 'a/b' to 'if (b==0) { printf("B\n"); } a/b', which strictly only modifies behavior at the UB.
And, while it may seem strange that UB results in such "copying" of code with minor alterations, that's not particularly rare at all. Take '(uint32_t)100 / someU32' on RISC-V, which'll result in -(uint64_t)1 on someU32==0 via the simple implementation of divu or divuw, despite that not being a valid uint32_t value, which'll result in significantly weird following behavior despite just being the most literal lowering.
> language specification doesn't define what should happen during execution.
This is subtly misleading; unspecified and undefined behavior are not quite the same.
UB means it's (simplified) specified that the compiler is allowed to do whatever it wants. More concretely, in most cases it is allowed to assume that a specific thing is impossible when optimizing, to the point where, if it does happen, pretty much anything can happen, including things like an int seeming to be two different values at the same time (which might still be one of the more harmless WTFs that can happen).
Note that this article is incorrect about "The mere existence of UB in the program means all bets are off and the compiler could choose to crash the function immediately upon entering it.". In C this is not true.
I have to do a lot of stuff to avoid UB in my code. I even implemented UB-free two's complement arithmetic using unsigned types just to avoid all of the problems.
And it has definitely gone too far when people think that the compiler doing anything on UB is the definition of UB, not just a side effect.
I am trying my best. As a first step, C23^1 will have this clarification:
"Note 3 to entry: Any other behavior during execution of a program is only affected as a direct consequence of the concrete behavior that occurs when encountering the erroneous or non-portable program construct or data. In particular, all observable behavior (5.1.2.4) appears as specified in this document when it happens before an operation with undefined behavior in the execution of the program."
Regarding signed overflow, I find it extremely useful that it is UB, which allows me to instruct the compiler to insert a trap. This is very helpful to find bugs. For the same reason, the Linux kernel community (or some of them) want an attribute that would make overflow of annotated unsigned types also be UB.
It can be done as a library, but that is far less ergonomic than having the checks for actual arithmetic operators.
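For what it's worth, the library-ish route today typically means compiler builtins rather than plain operators (a sketch; __builtin_add_overflow is a GCC/Clang extension, not ISO C):

    // Returns 0 and stores a + b in *out, or returns -1 if the mathematical
    // result does not fit in an int (the wrapped value is stored regardless).
    int checked_add(int a, int b, int *out) {
        if (__builtin_add_overflow(a, b, out))
            return -1;
        return 0;
    }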
However, I think that consistency should win here. Given that unsigned arithmetic already wraps around - and this cannot be changed for backwards compatibility reasons - making signed arithmetic do something different is just bad design. It would certainly be better if the default behavior was to trap rather than wraparound or saturate, but, well, we're talking here about a language that is >50 years old, so that ship has sailed.
It would sure be nice to get proper dedicated operators for trapping and saturating arithmetic in some future version of the C standard, though.
What is consistent depends on the perspective. For me signed integers are a model for integers, and overflow means that I exceeded the capabilities of the machine to model those correctly. Unsigned is a model for modulo arithmetic. Wrapping signed integers would not model anything useful.
If signed ints were truly a model for true mathematical integers, they would be unbounded (as in e.g. Python).
C is low-level enough that it's not particularly useful to think in those terms, IMO. If you're writing in C, it's usually for one of two reasons: either you want to be "close to the metal" because you're doing low-level stuff, or it's a legacy codebase. From the first perspective, I can't think of any "modern" - as in, past three decades - architecture for which signed wraparound is not the default & fastest behavior. From the second perspective, a lot of legacy code actually assumes signed wraparound (because compilers used to just defer to what the hardware actually did).
FWIW the reason why C had UB for signed overflow historically is because back when it was being standardized, there were still machines around that used something other than two's complement to represent signed values. The original ANSI C89 straight up refused to specify the signed representation, and C99 narrowed it down to several options; thus, the obvious hardware implementation would in fact produce different results on overflow. Conversely, unsigned integers always had a well-defined representation for which wraparound was simply the natural hardware behavior. I don't think this was ever meant to have some kind of higher meaning.
Given that C23 explicitly mandates two's complement for signed now, IMO, the signed overflow behavior should just be made consistent with that as well as real-world hardware. The only claimed downside to this is that compilers can no longer "optimize" code that was previously broken but would now be well-defined, but I don't see why that is an actual problem.
You realize that I am part of the standard's committee? ;-)
I am programming C because Python is too slow for my use case (and too annoying and too unstable), and yes, I use C's integer types as an abstraction for mathematical integers. And I would say this is the case in almost all C code I see and for most programmers I talk to. How would two's complement wrapping for int be useful for anything? It does not model anything useful IMHO.
And the problem with compilers not optimizing is that people want them to optimize... Compilers no longer being able to do this as well would indeed be an actual problem for many people.
It would be extremely useful e.g. because you could perform overflow checks as easily as:
if (x + 1 < x) // overflow
as opposed to the much more complicated dance that you have to do today to be conformant.
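For comparison, one version of the conformant "dance" (a sketch, with a made-up function name): the check has to be done before the addition, so the overflowing operation is never evaluated.

    #include <limits.h>
    #include <stdbool.h>

    bool add_would_overflow(int x, int y)
    {
        if (y > 0 && x > INT_MAX - y) return true;  /* would overflow upward */
        if (y < 0 && x < INT_MIN - y) return true;  /* would overflow downward */
        return false;
    }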
I don't disagree that people want compilers to optimize in general. But can you give an example of an optimization that 1) relies on the compiler treating signed overflow as UB, and 2) is actually useful (i.e. makes conforming code run faster)?
As far as modeling goes, the very fact that C ints are bounded - and lest we forget, unless you use `long long` everywhere, even `long` doesn't guarantee you more than 32 bits per the Standard, which is not all that large! - means they are not a good abstraction for mathematical integers. C developers use them as such in practice because it is convenient, but consequently tend to ignore the overflow behavior, not least because dealing with signed overflow is so inconvenient. Even today, I suspect that the vast majority of production C code out there is not actually safe with respect to signed-overflow-induced UB on overly large inputs (e.g. how many CLI tools do such validation for values that come out of argv?).
There's a reason zig made unsigned overflow undefined: it allows for more optimisations (Zig can also afford it thanks to their wrapping addition (+%) operator).
> I use C's integer types as an abstraction for mathematical integers.
Nothing against you, but in my mind, this is a sign of just how braindead and unprofessional our industry is.
Using hardware integers as mathematical integers is like using aluminum instead of titanium for a high-stress part in a fighter aircraft. Sure, it does just fine in normal flight; what's the big deal?
And then the pilot needs to pull a 10-g maneuver, putting 10 times the stress on that part. And then the pilot dies because the aircraft breaks apart in midair.
If you need mathematical integers, use GMP. Or implement your own. It isn't hard. I implemented rationals using 32 bytes and no allocations when integers are no bigger than hardware integers.
Putting all of that together with this:
> You realize that I am part of the standard's committee? ;-)
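For reference, a minimal sketch of the GMP route mentioned above (the values are arbitrary): arbitrary-precision integers grow as needed instead of wrapping or trapping.

    #include <gmp.h>

    int main(void)
    {
        mpz_t a, b, sum;
        mpz_init_set_str(a, "9223372036854775807", 10);  /* INT64_MAX */
        mpz_init_set_ui(b, 1);
        mpz_init(sum);

        mpz_add(sum, a, b);       /* never overflows; the result simply grows */
        gmp_printf("%Zd\n", sum);

        mpz_clear(a);
        mpz_clear(b);
        mpz_clear(sum);
        return 0;
    }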
Look at any C program. What are integers used for? Counting things, loop indexing, etc. These are all semantics of mathematical integers. Yes, it is an approximation that only works as long as the integers do not get too big. But the limits are still bigger than most numbers people use in their daily lives. Most people could not calculate with larger numbers in their heads; are you saying their mental model of integers is a small-integer model different from the mathematical model? This misunderstands the purpose of mathematical abstractions at a fundamental level.
Big-number integers also have use cases, but those are quite specialized. And big numbers are not "true" mathematical integers either if you want to be really pedantic, because memory is limited, so that abstraction also breaks down at some point. Models of true mathematical integers do not exist in the physical world, so pointing out that "int" isn't one is meaningless pedantry.
> Look at any C program. What are integers used for? Counting things, loop indexing, etc.
Both of those should use unsigned integers, and they are limited by the size of the machine, so no, they don't need a mathematical model of integers. A good machine will have a size_t that can hold the size of any object, and by extension, any number of elements of any size, including char.
So for counting things and loop indices, size_t should be used, not any signed types.
In my code, I essentially use just size_t and unsigned char (because the standard leaves the signedness of plain char implementation-defined, up to at least C11). If I use something else, I am checking bounds.
> Most people could not calculate with larger numbers in their heads; are you saying their mental model of integers is a small-integer model different from the mathematical model?
Are you seriously trying to accuse me of this? I am the one saying that hardware integers are not a sufficient abstraction for mathematical integers, yet you say that I assume that people have a small integer mental model?
No, I am not. I am the one telling you that using hardware integers in place of big ints is not good.
> Models of true mathematical integers do not exist in the physical world, so pointing out that "int" isn't one is meaningless pedantry.
True, every abstraction is leaky, but big integers and rationals can get so large that it doesn't matter.
And yes, big ints are a leaky abstraction, but they leak so much less than hardware integers because they don't wrap and because they are not subject to UB when unsigned types are used.
This is great. But why do you care about what that standard says?
Making it UB in ISO C ensures that no portable program can rely on a specific behavior and this is what makes it possible to find bugs this way, because it is plausible to assume that a program with overflow is buggy. This is also why this does not work for unsigned. We can easily change a compiler to trap on unsigned wraparound. But this is mostly useless because there would be far too many false positives.
The program does not need to be able to rely on the behavior that causes the bug being visible. We just need tools that make the bug visible, e.g. the undefined behavior sanitizers. Those tools work extremely well for signed overflow, but not for unsigned, exactly because unsigned wraparound is defined and many programs therefore use it. So you cannot distinguish between intended wraparound and incorrect wraparound.
For this reason, signed overflow is essentially a solved problem, while unsigned wraparound will be the source of many interesting bugs - exactly because it is not UB!
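As an example of the workflow being described (a sketch; `cc` stands for either gcc or clang, both of which accept the flag): building with -fsanitize=signed-integer-overflow makes the overflowing operation get reported at runtime.

    /* overflow.c -- build with:  cc -g -fsanitize=signed-integer-overflow overflow.c */
    #include <limits.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        (void)argv;
        int x = INT_MAX;
        int y = x + argc;   /* argc >= 1, so this signed addition overflows: UB, flagged by UBSan */
        printf("%d\n", y);
        return 0;
    }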
An integer constant expression with value 0 is guaranteed to convert to a null pointer (in fact, the "null pointer constant" was traditionally defined to be precisely that). But an integer with value 0 that's not an integer constant expression can potentially be legally a pointer value which is not a null pointer. Which means, confusingly, that it is legal for this to print true:
if (((void*)0) != (void*)(0, 0))
    puts("true");
(Conversion between integers and pointers is a lot less well-specified than many people suspect, and that's before people start pondering provenance.)
"A pointer whose value is null does not point to an object or a function (the behavior of dereferencing a null pointer is undefined), and compares equal to all pointers of the same type whose value is also null."
My point was that the dereference itself could be a valid dereference depending on the architecture. You can have a pointer to address 0 or 42 or whatever, and it all depends on what is mapped to that address. Could be RAM, could be some other hardware, etc.
I guess I was nitpicking. For gcc and the target architecture it is compiling that code for, it is UB, so it is optimizing things with that information.
Does "A pointer whose value null" always means that it is literally value 0? That is how it is in practice but how C++ handles architectures where 0 is a valid address that a program can dereference? Does cpp compilers use another value for nullptr in such architectures?
It is UB for any C compiler and for any architecture because the language defines it as such.
However, because it is UB, the compiler doesn't need to do anything special if there's no trap on null dereference, since "reading something" is a perfectly valid subset of UB. The only thing that it needs to ensure is that no global, local, or heap-allocated C object can end up at address 0, which is easily done by "wasting" a single word at that address (or using it for some internal purposes).
It may still be that there is something useful at address 0 that some code may want to access. For example, in real-mode x86, the interrupt vector table starts at that address, so if you want to change what gets called to handle INT 0h, you'd need to write into it. But at this point you're already deep into architecture-specific stuff, and so it's not unreasonable to expect you to use some special facilities that a compiler for this architecture might provide to enable it, instead of the usual pointer dereference.
I am a bit confused; in another post you mentioned:
The C spec only requires that integer constant expressions that have zero value are nullptrs. The moment you assign it to an int variable, it's no longer a constant expression as far as the language is concerned, so if you then cast that int variable to a pointer, the result is indeterminate
It is not an integer constant that gets assigned to a pointer, so it is not a null pointer in that case? It is just a pointer with an address of 0 (which very well might be the implementation of nullptr, but from my understanding that is not required?), so the null-pointer-dereference rule shouldn't apply here?
Indeed, it's UB not because it's dereferencing a nullptr; it's UB because it's dereferencing an invalid pointer (i.e. a pointer that does not point to any C object).
An implementation could declare that there is a valid object at address 0, I suppose, but strictly conforming code cannot rely on this. As far as I remember, the only ints you can cast to a pointer and then safely use are those that were cast from (valid) pointers in the first place.
> For gcc and the target architecture it is compiling that code for, it is UB, so it is optimizing things with that information
For every C and C++ standards compliant compiler it is UB. Every single one. By definition, whether or not the compiler then takes advantage of that. It may or may not be legal on the target machine in question, but C and C++ don't target the machine in question, they target an abstract machine with certain properties. The behavior of the abstract machine is then translated to behavior on the target machine. Dereferencing the nullptr on the abstract machine can result in any behavior at all, including what you describe here, but it is not promised by the standard.
If you want to access memory at address zero, you cannot do it in a defined way from within the language. You can drop to assembly or use compiler-defined extensions, but if you don't do it like that, you are playing with UB.
This is where I get a bit lost in the spec, I think... Can the compiler know that the value of `i` is 0?
If yes, then the spec says regardless of how the implementation defines nullptr, assigning literal 0 to a pointer variable is assigning a nullptr (i.e. the compiler must turn `0` into whatever the nullptr is in the underlying implementation).
If no, then 0 is just some unknown value and the assignment gives you an implementation-dependent (not necessarily invalid) pointer to whatever address `0` means on this architecture.
The C spec only requires that integer constant expressions that have zero value are nullptrs. The moment you assign it to an int variable, it's no longer a constant expression as far as the language is concerned, so if you then cast that int variable to a pointer, the result is indeterminate.
(Note also that in C, unlike C++, "const" really means that it is read-only, not that it is a constant expression, so that doesn't change anything.)
However, even if the resulting pointer is not a null pointer, dereferencing it is still UB unless it points at a valid C object. So if the compiler can statically prove that for this architecture, there's no valid C object at that address (e.g. because the compiler itself never produces code that would result in objects placed at that address), it can still treat it as UB, same as dereferencing a literal 0.
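A small sketch of the distinction being made here (not from the thread): only an integer constant expression with value 0 is a null pointer constant; converting a runtime int that happens to be 0 is implementation-defined and need not yield a null pointer.

    #include <stdio.h>

    int main(void)
    {
        void *p = 0;          /* 0 is an integer constant expression: p is a null pointer */

        int i = 0;
        void *q = (void *)i;  /* implementation-defined conversion: q need not be null */

        printf("%d %d\n", p == NULL, q == NULL);  /* typically "1 1" in practice, but only the first is guaranteed */
        return 0;
    }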
How does that interact with using C++ to access hardware features via direct memory, such as UART registers on embedded systems? Is all of that considered UB because the compiler can't prove there's anything at the UART "magic" addresses?
In practice, of course, a useful C++ compiler would let you do it, somehow. But that is firmly in the realm of implementation-specific stuff (keeping in mind that UB can mean anything, including being well-defined by a specific implementation).
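For what it's worth, the usual embedded idiom looks something like this sketch (the address and register name are made up for illustration; the real ones come from the chip vendor's documentation and toolchain, not from the standard):

    #include <stdint.h>

    #define UART0_BASE 0x4000C000u                          /* hypothetical MMIO address */
    #define UART0_DR   (*(volatile uint32_t *)UART0_BASE)   /* hypothetical data register */

    static void uart_putc(char c)
    {
        UART0_DR = (uint32_t)c;   /* volatile forces the store to actually be emitted */
    }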
That summary talks about creating a nullptr out of an int, but not dereferencing it. And, in fact, takes care to note that dereferencing the created pointer is invalid.
"A pointer whose value is null does not point to an object or a function (the behavior of dereferencing a null pointer is undefined), and compares equal to all pointers of the same type whose value is also null."
Developers must avoid UB in their code, at all costs. Wondering what a specific compiler will do with your UB code is useless: as soon as you realise you have UB, go and fix it!
I would argue that one good point of the article is to illustrate to its readers just what it means that compilers assume code to be free of undefined behavior.
The resulting "reasoning" can be counter-intuitive, and seeing what happens and what problems it can introduce should be instructive for people who are not aware.
Better informed/educated C programmers is an improvement on the state of the world, in my opinion.
More knowledge is good. Hopefully it inspires people to avoid UB and demand better diagnostics and debugging tools for UB. Avoiding UB can include such activities as completely avoiding programming in the language that has UB.
UB is a post-facto justification for compiler optimizations that only work on non-buggy programs. It's a circus because programs have bugs.
I don't think removing all UB is a tractable goal in most languages that are used and allow a lot of UB. In general, devs are choosing those languages because of some other feature that makes dancing on the rim of the cliff perpetually worth it.
Given that reality, knowing what your compiler will do with UB is useful and important because you're going to see it all the time in debugging. Even if it's ugly, it needs to be treated as a familiar face.
I mean, I "fix my code" by avoiding languages with a lot of UB like the plague. I don't think writing a million lines of UB-free C++ is a human-shaped problem. We can pretend this is just a "skill issue," but even if it is it's a skill issue that's endemic to the industry (https://cwe.mitre.org/data/definitions/758.html), so unless you're working on a project solo you're going to come across code that has UB and it's going to be your problem to fix it.
Given that, plus the reality that we insist on continuing to write a million lines of C++, treating UB as something you'll see frequently in your career is probably the reality-based behavior.
> treating UB as something you'll see frequently in your career
Sure it is! I'm not talking about skills: everybody writes UB, at any time. I'm talking about attitude: when you see UB, you must strive to remove it, not to protect it!
I didn't recognize it as your position because you presented "Developers must avoid UB in their code, at all costs" as supporting the preceding statement "This post has no point;" if your point is "developers should remove UB when they see it," then the point of this post is helping people see what UB looks like at the next layer of abstraction down as an aid to identifying and removing it.
(The C++ standard is over 2,000 pages. It's impractical to believe developers spot UB by reading the C++ source code; most is identified by an application behaving incorrectly, followed by inspection of the machine code to understand where the developer's assumptions about the C++ specification failed to match the actual specification).
Just to be clear, I don't dislike this post for showing practical examples of UB. I'm just afraid that readers could take the wrong path and reason "This compiler does so-and-so, now I know how to better protect my current code", instead of the correct path of "Undefined means that I don't have the right to any expectation, thus I'll try and improve the portability of my algorithm".
Maybe I did not choose my words properly, I apologise.
If a compiler is certain there's UB should it format the developer's hard drive immediately, or does it have to wait until the program is run before formatting to be standards compliant?
On a more serious level, should it emit a warning? If it knows UB is reached, a warning sounds reasonable, but it's pretty rare that a compiler is 100% certain something is reached, and that analysis is probably not worth the cost.
> If a compiler is certain there's UB should it format the developer's hard drive immediately, or does it have to wait until the program is run before formatting to be standards compliant?
The standard has enough variants of UB that both options are possible!
For normal runtime UB (e.g. division by zero), it needs to wait until it is certain that the undefined code will actually be executed. So compilation/linking must not fail, the behavior is only undefined if the program is actually run.
IFNDR (ill-formed, no diagnostic required) is a different kind of UB; here it is expected that compilation/linking may already fail.
Then there's also the infamous "preprocessor UB" (e.g. https://wg21.link/p2621), which reuses the "undefined behavior" terminology despite not having anything to do with runtime semantics. I guess this is where the compiler might format your hard drive (old gcc versions actually launched nethack on undefined pragmas!).
You seem to care about developers avoiding UB. Exploring how compilers warn or don't warn for some of these cases is very meaningful - maybe it can lead to new warnings, or better tools so that developers can avoid UB.
To me, the very fact that one of the most used programming languages ever has undefined behavior boggles the mind. IMO it is one of the worst parts of C (along with using pointers for arrays, its string implementation, and 0-based indices).
I really don't understand why the C standards body can't just define the intended failure behavior for specific UB cases, and then have compiler developers adopt that spec. There should be no impact on backwards compatibility because
a. this will be a new C version
b. No program ever should be defined around the UB behavior
But UB still exists in 2024 and likely will until I am too old to whine on the internet about it.
What would be, for example, the desirable failure mode of a use-after-free? Or a write past the end of an array? Or any of the many UBs that are extremely hard to detect but have unbounded effects on the program?
Remember that the failure mode must be a) implementable, b) implementable with at best a minor loss of performance, and c) not break the ABI.
I agree that C and C++ have way too much UB, but no UB is a pipe dream.
> What would be, for example, the desirable failure mode of a use-after-free?
Implementation defined
> Or a write past the end of an array?
Ditto
> Remember that the failure mode must be a) implementable, b) implementable with at best a minor loss of performance, and c) not break the ABI.
And "Implementation defined" matches that for the two examples you gave.
What it does is force the compiler vendor to say "If you do $THIS, we will do $THAT."
At least then users can make informed choices, sort of like "Clang guarantees that dereferencing a NULL pointer results in an effective address of zero, but GCC makes no guarantees, therefore we'll be going with Clang".
I mean, seriously, there's a lot of footguns in C only because the standards writers said "Undefined behaviour" and never expected that compiler authors will take that to mean "remove code, or anything else, including nasal demons".
"Implementation-defined" means that there's something consistent that happens for a given implementation that can be documented for it. What consistent behavior would you expect to see from a typical implementation on use-after-free or writing past the end of an array? Can you give an example of how that would be documented?
> What consistent behavior would you expect to see from a typical implementation on use-after-free or writing past the end of an array? Can you give an example of how that would be documented?
Well, yes.
"Writing to memory that is not available or allocated will always cause the instructions for the write to be emitted. The statement, expression or function call that performs the write will not be eliminated."
The definition above is better than UB, which boils down to "Writing to memory that is not available or allocated may or may not cause all past and future lines of code to be eliminated from the emitting stage of compilation."
See the difference? The implementation-defined definition forces the compiler documentation to be honest: "We'll remove lines of code from your input" is better than "You broke the standards rules, so we get to do anything."
> Writing to memory that is not available or allocated will always cause the instructions for the write to be emitted. The statement, expression or function call that performs the write will not be eliminated.
That would be a significant constraint on the optimiser.
> That would be a significant constraint on the optimiser.
Maybe. Maybe you work on GCC (or on Clang) and know the dirty details of the code inside ... I don't.
What I do know is that the performance of past C compilers, which emitted every memory access unconditionally, was considered "blazing fast".
If this is a blow to performance, how significant is it? Can you compile two benchmarks with and without a 'perform every memory access unconditionally' flag? Is there a flag for this? If no one ever did this, how can we tell there would be an impact on performance, never mind a significant impact?
Because I recall seeing benchmarks for things like eliminating `if (x+100 < x)` showing the performance impact to be not even a rounding error. I also recall seeing a benchmark somewhere for with and without the flag that removes null-checks, and once again that performance impact was, for all practical purposes, zero.
In the era of speculative pipelined execution, where a test/check would be speculated in the pipeline well before it is needed, how exactly will the code run faster if the test/check is removed from the source?
Once again, I admit that I am not in the weeds in GCC or LLVM or CLang development, but I fail to see how eliminating an instruction that never slows the processor can have a noticeable impact on performance.
I'm not a compiler writer, just a programmer that tries to understand the transformations the compiler is capable of.
-O0 is close to what you want. For hand-optimized code there isn't a huge difference between the various optimization levels, but for higher-level C++ code, the difference between -O0 and -O3 is sometimes one or two orders of magnitude.
And yes, not all, or even most, optimizations apply to all programs. Loop optimizations are really most relevant for numerical, vectorizable code; branchy pointer-chasing code won't benefit from them. Branchy code can benefit from other optimizations, though, which can bring their own issues.
> To me, the very fact that one of the most used programming languages ever has undefined behavior boggles the mind. IMO it is one of the worst parts of C (along with using pointers for arrays, its string implementation, and 0-based indices).
Because the effect wasn't always this bad; past C compilers did not look at a null-dereference and say "This means that we can simply skip the dereference". They emitted the faulty code which would then crash reliably.
Same with things like writing past the end of an array: you'd overwrite whatever was in that memory or on the stack. Now, if the compiler detects UB, it may silently discard the code.
It's the whole "compiler silently doing something unexpected, other than exhibiting the wrong behaviour that the programmer expected." that makes the issue worse now than it's been in the past.
I agree it's surprising and can only suggest the following explanations:
* Undefined behaviour lets compiler developers improve the performance of certain standard benchmarks (slightly but significantly) and they don't want to give that up.
* Most of the people who are interested in programming language design have already deserted C/C++ so the C standards bodies are dominated by the remaining traditionalists.