This is an issue that is ignored by just about everyone in practice. The reality is that most developers have subconsciously internalized the compiler behavior and assume that will always hold. And they are mostly right: I’ve only seen a few cases where this has caused a bug in real systems over my entire career. I try, to the extent possible, to always satisfy the requirements of strict aliasing when writing code. It is difficult to determine if I’ve been successful in this endeavor.
Here is why I don’t blame the developers: writing fast, efficient systems code that satisfies the requirements of strict aliasing as defined by C/C++ is surprisingly difficult. It has taken me years to figure out the technically correct incantations for every weird edge case such that they always satisfy the requirements of strict aliasing. The code gymnastics in some cases are entirely unreasonable. In fairness, recent versions of C++ have been adding ways to express each of these cases directly, eliminating the need to use obtuse incantations. But we still have huge old code bases that assume compiler behavior, as was the practice for decades.
I am not here to attribute blame; I think the causes are pretty diffuse, honestly. This is just a part of the systems world we failed to do well, and it impacts the code we write every day. I see strict aliasing violations in almost every code base I look at.
One of the issues that worked against Euclid's adoption was that its compiler strictly disallowed aliasing. That said, https://dl.acm.org/doi/pdf/10.5555/800078.802513 claims that while they tended to write potentially-aliased code at first, after one made the Euclid compiler happy, subsequent development wasn't likely to reintroduce it.
The insight in languages like Rust is that aliasing is actually fine if we can guarantee all the aliases are immutable, and that's facilitated by default reference immutability. These alias-related bugs only arise when you have mutable aliasing, which is why that doesn't exist in safe Rust.
That paper also highlights that checking is crucial, their initial Euclid compiler just required that there's no aliasing, but never checked. So of course programmers will make mistakes and without the checks those mistakes leak into running code. The finished compiler checked, which means the mistake won't even compile.
Shifting left in this way is huge. WUFFS shifts bounds misses left. When you write code which can have a bounds miss, in C it simply does have a bounds miss at runtime: there's a stray read or overwrite and chaos results, maybe Remote Code Execution. In Rust the miss panics at runtime, so maybe a Denial of Service or at least a major inconvenience. But in WUFFS it won't compile, so you find out about your bug, likely before it gets sent out for code review.
Most software can't be written in WUFFS, but "most" is doing a lot of work there, plenty of code which should be in WUFFS or an analogous language is not, meaning mistakes are not shifted left.
Indeed, another problem is that we have no tools, other than very imperfect linters/compiler warnings, to identify aliasing violations. Even today I don't think sanitizers can catch most cases.
More often than not when I realise that I am violating strict aliasing it is because I am doing something that I want to do and the language is not going to let me. Much hand wringing, language lawyering and time wasting typically follows.
However, it is generally too hard (in C and C++) for compilers to tell whether you were wanting to do the thing at any one particular place.
So compilers have two options: assume that you never do the thing, or always assume that you might do the thing.
The former is often better for performance in practice, and it's true most of the time, so here we are.
As has been pointed out elsewhere, one of the strengths of Rust is that it shifts how pointers (references) work and allows the compiler to more often know for certain that you don't do the thing, without making assumptions.
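To make "the thing" concrete, here is a minimal C sketch of what the type-based assumption buys the compiler. The function names are made up for illustration. Because `int` and `float` are incompatible types, the compiler may assume the two pointers refer to different objects and is free to return 1 without re-reading `*i`. The code is only well defined when the caller really does pass distinct objects.

```c
#include <assert.h>

/* Hypothetical example: the compiler may assume *i and *f do not overlap,
 * because int and float are incompatible types. */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 0.0f;   /* under strict aliasing this store cannot modify *i */
    return *i;   /* so an optimizer may fold this to "return 1" */
}

/* Well-defined use: two distinct objects of the matching types. */
int store_both_demo(void)
{
    int i = 0;
    float f = 1.0f;
    int r = store_both(&i, &f);
    assert(f == 0.0f);
    return r;
}
```

If a caller instead passed the same address through both parameters via a pointer cast, the program would have undefined behaviour and either answer could come out.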
> The reality is that most developers have subconsciously internalized the compiler behavior and assume that will always hold.
I blame this on how people like to teach C and present C.
It's very important that the second anyone conceives of the idea of learning C, they are first informed that trying things and seeing what happens is a highly unreliable method of learning how C programs behave, and that C is not a high level assembly language.
If you teach C in relation to the abstract machine instead of any real world machine you will understandably scare off most people. Which is good, since most people shouldn't be learning or writing C. It's a language which can barely be written correctly even by people with the necessary self discipline to only write code they're 100% certain is well defined.
> It is difficult to determine if I’ve been successful in this endeavor.
Why is your program so full of casts between pointer types that you have difficulty determining if you've avoided strict aliasing?
Yes, if you treat C as a high level assembly language (like the linux kernel likes to do) then it becomes difficult to reason about the behaviour of your programs where 50% of them are in the grey area of uncertainty of whether they're well defined or not.
If you are forced to write C in a non-learning context, don't write any line of code unless you're certain you could tell someone which parts of the standard describe its behaviour.
> Here is why I don’t blame the developers: writing fast, efficient systems code that satisfies the requirements of strict aliasing as defined by C/C++ is surprisingly difficult.
C/C++ isn't a language. So I will stick to C because I don't know nor care about C++.
That being said, it's not hard to write efficient C which satisfies the requirements of strict aliasing except when you're dealing with idiotic APIs like bind or connect. Most code by default, assuming you use appropriate algorithms and data structures, is performant. The only time it becomes difficult with regards to strict aliasing is if you're micro optimizing.
While non-trivial, the case of converting between unsigned long and float shown in the article is entirely possible to do with completely safe C constructs. Likewise serialization/deserialization of binary data never requires coming close to aliasing unless you're dealing with a "native" endian protocol. In the case of general serialisation and deserialisation, compilers will reliably optimise such operations into one or two instructions (depending on whether you're decoding same-endianness or not).
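As a rough sketch of what that looks like (not the article's code): the bit conversion below copies the object representation with `memcpy`, and the decoder reads bytes one at a time, so no pointer to an incompatible type is ever dereferenced. The function names are mine, `uint32_t` stands in for the article's unsigned long, and the sketch assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Copy the object representation instead of casting pointers. */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Decode a little-endian 32-bit value from a byte buffer.
 * Works on any host endianness; only unsigned char reads are used. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Whether a given compiler collapses these into single instructions is worth checking in its output rather than assuming.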
> Why is your program so full of casts between pointer types that you have difficulty determining if you've avoided strict aliasing?
I write database storage engines. Most of the runtime address space is being dynamically paged to storage directly by user space. You can't use mmap() for this. Consequently, objects don't have a fixed address over their lifetime and what a pointer actually points to is not always knowable at compile-time. These are all things that have to be dynamically resolved at runtime with zero copies in every context the memory might be touched. Fairly standard high-performance database stuff. The intrinsic ambiguity about the contents of a memory address creates many opportunities to inadvertently create strict aliasing violations.
I've been doing it a long time, so I know the correct incantation for virtually every difficult strict aliasing edge case. Most developers are ignorant of at least some of these incantations because they are surprisingly difficult to look up; it took me years to figure out some of them. When developers don't know, they tend to YOLO it and hope the compiler does the desired thing. Which mostly works in practice, until it doesn't.
Recent versions of C++ have added explicit helper functions, which is a big improvement. Most developers don't know the code incantation required to reliably achieve the same effect as std::start_lifetime_as and they shouldn't have to.
> These are all things that have to be dynamically resolved at runtime with zero copies in every context the memory might be touched. Fairly standard high-performance database stuff. The intrinsic ambiguity about the contents of a memory address creates many opportunities to inadvertently create strict aliasing violations.
So how would you do this in Rust, if at all? (That's the context of this subthread and the admonition not to play type punning games.)
Rust assumes noalias even for objects of the same type, and that's because the entire language is built on the foundational assumption that you cannot have both a mutable and immutable reference (or two mutable references) to the same object alive at the same time.
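For comparison, the closest thing C has is the `restrict` qualifier (C99), which lets the programmer promise no aliasing even between pointers of the same type. Unlike Rust, nothing checks the promise. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>

/* `restrict` is the programmer's promise that dst and src do not overlap
 * for the duration of the call; the compiler does not verify it. */
void add_into(int *restrict dst, const int *restrict src, size_t n)
{
    /* Cheap sanity check only: it cannot detect partial overlap. */
    assert(n == 0 || dst != src);

    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Calling this with overlapping arrays is undefined behaviour in C, which is exactly the class of mistake the Rust borrow checker rejects at compile time.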
The main issue in C and C++ is that the aliasing rules are based on types, and the other aspects of the rules make type punning by casting pointers undefined behaviour no matter how valid either type is for a given region of memory. Rust actually does allow type punning through the obvious routes of std::mem::transmute or the same casting of raw pointers. It's wildly unsafe, and you need to have done your homework on whether the pointers represent valid states for the type you're punning to (right layout, alignment, and values), but it's defined behaviour if you're following those rules (one reason being that the lifetimes of the two references don't ever overlap in Rust's model). In C and C++ you need to follow all those rules but also you need to jump through some extra hoops with the right incantations; just a pointer cast will be undefined behaviour even if you got everything else right.
(What's 100% not OK in Rust is casting away constness. If you have a '&T' you must not touch, on pain of a thousand bugs.)
The implementations of this type of thing in Rust I've seen just use "unsafe" everywhere. Even in strict C++ it is tricky to express the dynamic resolution of pointer type correctly. Type punning is a bad idea but there also weren't many alternatives so that was the idiom everyone used for a long time. Bugs do occur due to this, so the admonition against type punning has a purpose. It can be done without strict aliasing violations in C++, but the methods were a bit arcane until quite recently.
These scenarios cause other problems for Rust e.g. DMA hardware tacitly holds an invisible mutable reference to objects, but most developers never have to deal with cases like this. C++ provides some tools to annotate the code so that the compiler understands it cannot see all references to an object or that the lifetime is ambiguous.
This type of code is not common but high-performance storage engines are kind of a perfect storm of architectural requirements that break the core Rust invariants.
Rust's aliasing restrictions only apply to references. If you only stick to pointers and never at any point create a reference, you can alias to your heart's content.
For example, the equivalent[0] of the article's Offset Overlap example is perfectly valid according to Rust's abstract machine. What makes it hard is avoiding the creation of references. If I create a reference, then there's a good chance that the lifetimes don't get correctly linked up, and I accidentally have shared mutation/use after free/other UB.
Oh, wait, I just saw your name, I know who you are. But you are one of very few people on this planet writing C or C++ who get a pass on this kind of thing.
Almost nobody is using C or C++ to write super duper large data high performance databases. And even people who do work on databases don't need these breakneck levels of performance that you've dealt with.
In most cases people are breaking aliasing rules for no real performance advantage. These people should just stop, a large majority of code doesn't need to worry about aliasing rules because the vast majority of code written in C doesn't have these crazy performance requirements.
> The intrinsic ambiguity about the contents of a memory address creates many opportunities to inadvertently create strict aliasing violations.
I don't get what you mean, at least the way you've explained it. Your memory might be volatile in the sense that it gets reused but if code is still operating on that memory then you don't have aliasing issues, you just have issues.
You can operate in terms of char * when it comes to your userspace paging implementation, and your code which requested this paging (do you use a segfault handler to implement this?) just operates in terms of whatever type it originally cast the void * value returned by your userspace mmap reimplementation. Am I misunderstanding something here?
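To illustrate the char * approach I have in mind, here is a small sketch. The page size, the record layout and the function names are all made up. The pager only ever sees `unsigned char`, and typed access goes through `memcpy`. This does copy, so it sidesteps rather than answers the zero-copy requirement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u   /* hypothetical page size */

/* Hypothetical record layout, for illustration only. */
struct record {
    uint32_t id;
    uint32_t length;
};

/* Read a record stored at byte offset `off` in a raw page. */
struct record page_read_record(const unsigned char *page, size_t off)
{
    struct record r;
    assert(off <= DEMO_PAGE_SIZE - sizeof r);
    memcpy(&r, page + off, sizeof r);
    return r;
}

/* Write a record at byte offset `off` in a raw page. */
void page_write_record(unsigned char *page, size_t off, struct record r)
{
    assert(off <= DEMO_PAGE_SIZE - sizeof r);
    memcpy(page + off, &r, sizeof r);
}
```

Because the typed object is always a local copy, the offset does not even need to be aligned for `struct record`.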
> std::start_lifetime_as
I got into reading, since I don't know C++, I only know C, and this sounds like a relevant whitepaper:
By the sounds of it, this is a problem in C++ only, so it explains why I wasn't aware of such an issue. So you're telling me that in C++ you can't reliably implement a userspace mmap (or even use normal mmap) implementation before C++23 because without std::start_lifetime_as the C++ abstract machine doesn't provide a way of specifying when an object's lifetime starts?
This makes me wonder, what even is the incantation you're referring to?
> So you're telling me that in C++ you can't reliably implement a userspace mmap (or even use normal mmap) implementation before C++23 because without std::start_lifetime_as the C++ abstract machine doesn't provide a way of specifying when an object's lifetime starts?
std::start_lifetime_as is just a nice wrapper around an older incantation: do a no-op memmove and cast followed by a constant-folding barrier. C allows type punning with unions but I would assume the constant-folding issue would still exist. Compilers finally became clever enough about constant-folding to cause problems when you reinterpret the type at runtime.
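For the C side of that, union punning looks roughly like the sketch below. It covers only the union case, not the memmove-and-barrier incantation. The type and function names are hypothetical and a 32-bit IEEE 754 `float` is assumed.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

union float_bits {
    float    f;
    uint32_t u;
};

/* Reading a member other than the one last stored reinterprets the bytes.
 * This is valid in C (C99 TC3 and later) but not in C++. */
uint32_t bits_of(float f)
{
    union float_bits pun;
    pun.f = f;
    return pun.u;
}
```

The access has to go through the union object itself; taking pointers to the two members and passing them around separately brings the aliasing problem back.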
First of all, compilers disagree on many interpretations and consequences of abstract machine rules. Also, compilers have bugs.
So a proficient C/C++ programmer does have to learn what compilers actually do in practice and what they guarantee beyond the standard (or how they differ from it).
> C/C++ isn't a language.
It isn't, but it is a family of languages that share a lot of syntax and semantics.
> First of all compilers disagree on many interpretations and consequences of abstract machine rules.
List them. I am not aware of any well defined parts of the C standard where GCC and Clang disagree in implementation. Only in areas where things are too vague (and are effectively either unspecified or undefined), or understandably in areas where they're "implementation defined".
If there are behaviours where a compiler deviates from the standard it is either something you can configure (e.g. -ftrapv or -fwrapv) or it's a bug.
> Also compilers have bugs.
Nothing you do can defend against compiler bugs outside of extensively testing your results. If you determine that a compiler has a bug then the correct course of action is definitely not: "note it down and incorporate the understanding into your future programs"
> So a proficient C/C++ programmer does have to learn what compilers actually do in practice and what they guarantee beyond the standard (or how they differ from it).
There are situations where it's important to know what the compiler is doing. But these situations are limited to performance optimisation, the knowledge gained through these situations should only be applied to the single version of the compiler you observed it in, and you should not use the knowledge to feed back to your understanding of C or the implementation.
It's almost impossible to decipher how modern C compilers work exactly and trying to determine what an implementation does based on the results of compilation is therefore extremely unreliable. If you need to rely on implementation defined behaviour (unavoidable in any real program) then you should be relying solely on documentation, and if the observed behaviour deviates from the documentation then that is, again, a bug.
> It isn't, but it is a family of languages that share a lot of syntax and semantics.
I am not a C/C++/C#/ObjectiveC/JavaScript/Java programmer.
C++ and C might share a lot of syntax but that's basically where the similarities end in any modern implementation. People who know C thinking they know enough C to write reliable and conformant C++ and people who know C++ thinking they know enough C++ to write reliable and conformant C are one of the groups of people who produce the most subtle mistakes in these languages.
I think you could get away with these kinds of things in the 80s but that has definitely not been the case for quite a while.
> List them. I am not aware of any well defined parts of the C standard where GCC and Clang disagree in implementation.
Perhaps it's not "well defined" enough for you, but one example I've been stamping out recently is whether compilers will combine subexpressions across expression boundaries. For example, if you have z = x * y; a = b + z; will the compiler optimize across the semicolon to produce an fma? GCC does it aggressively, while Clang broadly will not (though it can happen in the LLVM backend).
This behavior is mostly just unspecified, at least for C++ (not sure about C).
I'm aware of some efforts to bring deterministic floating point operations into the C++ standard, but AFAIK there are no publicly available papers yet.
P3375R0 is public now [0], with a couple implementations available [1], [2].
Subexpression combining has more general implications that are usually worked around with gratuitous volatile abuse or magical incantations to construct compiler optimization barriers. Floating point is simply the most straightforward example where it leads to an observable change in behavior.
You're very right that this goes above and beyond anything the C standard specifies aside from stating that the end result should be the same as if the expressions were evaluated separately (unless you have -ffast-math enabled which makes GCC non-conformant in this regard).
If the end result of the calculation differ (and remember that implementations may not always use ieee floats) then you can call it a bug in whatever compiler has that difference.
I have no idea how C++ defines this part of its standard but from experience it's likely that it's different in some more or less subtle way which might explain why this is okay. But in the realm of C, without -ffast-math, arithmetic operations on floats can be implemented in any way you can imagine (including having them output to a display in a room full of people with abaci and then interpreting the results of a hand-written sheet returned from said room of people) as long as the observable behaviour is as expected of the semantics.
If this transformation as you describe changes the observable behaviour had it not been applied, then that's just a compiler bug.
This usually means that an operation such as:
double a = x / n;
double b = y / n;
double c = z / n;
printf("%f, %f, %f\n", a, b, c);
Cannot be implemented by a compiler as:
double tmp = 1 / n;
double a = x * tmp;
double b = y * tmp;
double c = z * tmp;
printf("%f, %f, %f\n", a, b, c);
Unless in both cases the same exact value is guaranteed to be printed for all x, y, z, and n.
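A quick way to see that the two forms are not interchangeable, assuming IEEE 754 doubles evaluated without excess precision: with x = n = 49 the direct division gives exactly 1, while multiplying by the rounded reciprocal gives 1 - 2^-53. The function names below are just for this sketch.

```c
#include <assert.h>

double div_direct(double x, double n)
{
    return x / n;
}

double div_via_reciprocal(double x, double n)
{
    double tmp = 1.0 / n;   /* rounded once here */
    return x * tmp;         /* and rounded again here */
}

/* Returns 1 if the two forms disagree for x = n = 49. */
int forms_disagree_for_49(void)
{
    double direct = div_direct(49.0, 49.0);
    double via_reciprocal = div_via_reciprocal(49.0, 49.0);
    assert(direct == 1.0);   /* 49/49 is exact */
    return direct != via_reciprocal;
}
```

Printed with %f both values show as 1.000000, so the difference only becomes visible with more digits or in a later comparison.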
No, it's not a compiler bug or even necessarily an unwelcome optimization. It's a more precise answer than the original two expressions would have produced and precision is ultimately implementation defined. The only thing you can really say is that it's not strictly conforming in the standards sense, which is true of all FP.
I read up a bit more on floating point handling in C99 onwards (don't know about C89, I misplaced my copy of the standard) and expressions are allowed to be contracted unless disabled with the FP_CONTRACT pragma. So again, this is entirely within the bounds of what the C standard explicitly allows and as such if you need stronger guarantees about the results of floating point operations you should disable expression contraction with the pragma in which case, (from further reading) assuming __STDC_IEC_559__ is defined, the compiler should strictly conform to the relevant annex.
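For reference, the pragma is used roughly as in the sketch below. The function names are hypothetical. Not every compiler honours the pragma, and some rely on a command-line option instead, so treat this as the standard's spelling rather than a guarantee.

```c
#include <assert.h>

/* Ask the implementation not to contract x*y + z into a fused multiply-add
 * for the rest of this translation unit. Support varies by compiler. */
#pragma STDC FP_CONTRACT OFF

double mul_then_add(double x, double y, double z)
{
    double product = x * y;   /* rounded to double when contraction is off */
    return product + z;
}

/* Exactly representable inputs give the same answer with or without fusion. */
void mul_then_add_selfcheck(void)
{
    assert(mul_then_add(2.0, 3.0, 4.0) == 10.0);
}
```

The pragma may also be placed at the start of a compound statement to limit its effect to that block.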
Anyone who regularly works with floating point in C and expects precision guarantees should therefore read that relevant portion of the standard.
"Strictly conforming" has a specific meaning in the standard, including that all observable outputs of a program should not depend on implementation defined behavior like the precision of floating point computations.
It can be controlled through compiler options like -ffp-contract
In my opinion every team finds fp options for their compiler through hard time bug fixing :)
and I am still in shock that many game projects still ship with fast math enabled.
> I am not aware of any well defined parts of the C standard where GCC and Clang disagree in implementation. Only in areas where things are too vague
Well, the parts of the standard that are vague and/or underspecified are a very large "Here be dragons" territory.
Time-traveling UB, pointer provenance, aliasing of aggregated types, partially overlapping lifetimes. When writing low level codes, it makes sense to know how exactly the compilers implement these rules.
In particular, regarding aliasing, GCC has a very specific conservative definition (stores can always change the underlying type, reads must read the last written type) that doesn't necessarily match what other compilers do.
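The shape of that rule can be sketched for allocated storage, where the C standard's effective-type wording works the same way: a store gives the memory its type, and a later read has to use the type of the last store. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reuse one allocation for two unrelated types, one after the other. */
float reuse_storage(void)
{
    /* malloc memory has no declared type and is aligned for any object. */
    void *mem = malloc(16);
    assert(mem != NULL);

    uint32_t *as_int = mem;
    *as_int = 42u;             /* store: effective type is now uint32_t */
    uint32_t seen = *as_int;   /* read matches the last store: fine */
    assert(seen == 42u);

    float *as_float = mem;
    *as_float = 1.5f;          /* store: effective type is now float */
    float result = *as_float;  /* read matches the last store: fine */
    /* Reading *as_int at this point would be the aliasing violation. */

    free(mem);
    return result;
}
```

For objects with a declared type (locals, globals) the C standard does not allow a store to change the type, which is one of the places where a compiler's model and the standard can differ.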
>> It isn't, but it is a family of languages that share a lot of syntax and semantics.
> I am not a C/C++/C#/ObjectiveC/JavaScript/Java programmer.
C#, Java, JS share a bit of syntax, but certainly not semantics. ObjectiveC/C++ definitely belong. There is a trivial mapping from most C++ constructs to the corresponding C ones.
> Well, the parts of the standard that are vague and/or underspecified are a very large "Here be dragons" territory.
Sure, but the answer as I said earlier is: don't touch those parts of C.
The subset which _is_ well defined is still perfectly powerful enough to write highly performant software.
It's not like I'm advocating for you to use the brainfuck subset of C.
> When writing low level codes, it makes sense to know how exactly the compilers implement these rules.
Almost nobody is writing C low level enough for this and I've written embedded code which didn't need to worry about strict aliasing.
This is again just a misconception, almost no real programs need to delve this deeply into the details.
> In particular, regarding aliasing, GCC has a very specific conservative definition (stores can always change the underlying type, reads must read the last written type) that doesn't necessarily match what other compilers do.
It doesn't matter what other compilers do as long as in terms of the abstract machine these differences do not break the rules set out in the standard. Again, you do not need to know these details for 99.99% of program code.
> C#, Java, JS share a bit of syntax, but certainly not semantics. ObjectiveC/C++ definitely belong. There is a trivial mapping from most C++ constructs to the corresponding C ones.
There's a mapping from any of these languages to any other one, in some cases also quite trivial, the amount of overlap is immense, but C and C++ have heavily deviated.
I am a C expert, I do not claim to be a C++ expert, every time I look at C++ I am increasingly surprised at just how it redefines something core about C. Something I just learned in this very thread is https://en.cppreference.com/w/cpp/memory/start_lifetime_as which doesn't exist in C because apparently C and C++ define object lifetimes completely differently.
It's dangerous to keep pushing this notion that C and C++ are very similar because it constantly leads to expert C++ programmers confidently writing subtly broken C code and vice versa.
Not sure if this is exactly the same scope as what you're asking about, but here's an ESSE '21 paper titled "The Impact of Undefined Behavior on Compiler Optimization": https://doi.org/10.1145/3501774.3501781
Obviously it’s going to vary from program to program. And you always have to be skeptical that removing the safety for performance hasn’t given you a faster but faulty program.
That being said, my intuition matches what little anecdotal data I’ve seen from real perf-sensitive systems, and I’d ballpark 10-15% where it matters.
Real-world performance: not enough to be measurable, certainly not remotely enough to make up for the time we lose to debugging.
But no-one cares about real-world performance, people pick C and pick a C compiler because they want the thing that's fastest on artificial microbenchmarks.
I learned C on the Amiga, back in the late 80's, and the OS made heavy use of "OO-ish" physical subtyping with structs everywhere. I don't think anybody even thought about strict aliasing violations.
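For anyone who hasn't seen the idiom, it looks something like the sketch below. These are hypothetical types, not the actual OS definitions. The "base struct as first member" form is one of the cases C does define: a pointer to a struct, suitably converted, points to its first member, and vice versa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types loosely modelled on the "base struct first" idiom. */
struct node {
    struct node *next;
    int          type;
};

struct task {
    struct node node;   /* base must be the first member */
    int         priority;
};

/* Upcast: a pointer to a struct converts to a pointer to its first member. */
struct node *task_as_node(struct task *t)
{
    return &t->node;
}

/* Downcast: valid only if n really is the first member of a struct task. */
struct task *node_as_task(struct node *n)
{
    assert(offsetof(struct task, node) == 0);
    return (struct task *)n;
}
```

The trouble starts when two unrelated struct types that merely share a common prefix are accessed through each other's pointers.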
Compilers in the 1980's really weren't sophisticated enough to have this problem. A function call was a hard barrier that was going to spill all GPRs, inlining was almost unheard of. What the code did was what you saw, and if you had an aliased pointer it's because that's what you wanted.
And when it became an issue c. late 90's, it was actually "NO strict aliasing" that was the point of contention. Optimizers were suddenly able to do all sorts of magic, and compiler authors realized they were getting tripped up by the inability (cf. the halting problem) to know for sure that this arbitrary pointer wasn't scribbling over the memory contents they were trying to optimize. You'd get better (often much better) code with -fstrict-aliasing, which was tempting enough to turn it on and hope for better analysis tools to come along and save us from the resulting bugs.
The Amiga C compilers most likely didn't do a lot of optimizations where strict aliasing would matter though (at least from what I remember it was pretty straight forward, a memory read or write in C typically resulted in a memory read or write in assembly).
Basically, C code compiled to assembly in the Amiga era looked much more straightforward than the output produced by modern C compilers (with optimizations enabled at least), you could put both side by side and see a near 1:1 relationship between the C code and the assembly code (maybe also because the Motorola 68000 seems to have taken a lot of inspiration from the PDP instruction set).
There are no obvious changes to a widely used programming language standard.
Even small changes often require years and many revisions to be accepted, and burnout is common. You would need to build a consensus that this change is desirable, which is highly unlikely at best. Strict aliasing has been widely implemented since the 1990s and many compilers benefit from the rules; many compiler vendors are on the committee. You'd have to convince them that they should make their customers' code slower.
What might be achievable, however, is some kind of technical report on undefined or implementation defined behavior. Many compilers have options that allow programs with some undefined behavior to behave as the user would expect. Microsoft's C and C++ compilers, for example, don't enforce strict aliasing and allow some forms of integer overflow in loop conditionals. There would be substantial value in defining a common profile for these options. It would still be an uphill battle to get it through the committee, though.
A while ago the C++ committee tried to standardize function argument evaluation order. It actually made it to the draft standard, but it had to be reverted when it was presented with real world performance regressions.
If we can't even get that, I doubt strict aliasing will ever be voted out.
Compiler writers tell me that it makes a big difference to optimization. I am careful to never cast anything in ways that cause problems, and so I run with strict aliasing. My project started in 2010 though, so we had plenty of prior best practices to help us know better and no legacy code that is hard to refactor to make correct. We have had our share of memory issues, but never anything that could be blamed on strict aliasing.
How do you accomplish the waiting operation? If it does not synchronize with the other thread, the compiler will optimize away the load. This isn't too surprising once you assume that not every *x in the source code will result in a memory access instruction. I would even say that most C programmers expect such basic optimizations to happen, although they might not always like the consequences.
In a multithreaded, hosted userspace program the wait operations should synchronize with another thread. This involves inserting optimization barriers that are understood by the compiler, therefore it can't optimize this case to always return 0.
In embedded this situation is quite common when x points to a hardware register. The typical solution is to declare x as volatile[1] which tells the compiler to omit these optimizations.
It's very common for beginner embedded programmers to forget to do this and spend hours debugging why the register doesn't change when it should.
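A minimal sketch of the embedded case, with a made-up register bit and function name. The `volatile` qualifier on the pointee is what forces a real load on every pass through the loop.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_READY 0x1u   /* hypothetical "ready" bit */

/* Poll a status register until the ready bit is set or we give up.
 * Without `volatile` the compiler may hoist the load out of the loop. */
int wait_ready(volatile const uint32_t *status, unsigned max_spins)
{
    assert(status != 0);
    for (unsigned i = 0; i < max_spins; i++) {
        if (*status & STATUS_READY)
            return 1;
    }
    return 0;
}
```

On real hardware `status` would be the address of a memory-mapped register; `volatile` alone says nothing about ordering between threads.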
No, because in practice that "wait until" operation will act as a memory barrier. The obvious one is a function call. Functions are allowed to have side effects, one possible side effect is to change the value pointed to by an externally-received pointer.
At lower levels, you might have something like an IPC primitive there, which would be protected by a spinlock or similar abstraction, the inline assembly for which will include a memory barrier.
And even farther down still, the memory pointed to by "x" might be shared with another async context entirely and the "wait for" operation might be a delay loop waiting on external hardware to complete. In that case this code would be buggy and you should have declared the data volatile.
> No, because in practice that "wait until" operation will act as a memory barrier.
This is wrong: a memory barrier would not salvage this code from UB. The read from `x` must at the very least be synchronized, and there might be other UB lurking as well.
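By "synchronized" I mean something along the lines of the C11 atomics sketch below, where the flag itself is an atomic object and the wait loop uses an acquire load. The function names and the spin budget are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Publisher side: make the flag (and earlier writes) visible to waiters. */
void publish(atomic_int *flag)
{
    atomic_store_explicit(flag, 1, memory_order_release);
}

/* Waiter side: spin until the flag is set or the budget runs out.
 * Returns the last value observed. */
int wait_for_flag(atomic_int *flag, unsigned max_spins)
{
    assert(flag != 0);
    int seen = 0;
    for (unsigned i = 0; i < max_spins && !seen; i++)
        seen = atomic_load_explicit(flag, memory_order_acquire);
    return seen;
}
```

Because every iteration performs an atomic load, the compiler cannot replace the loop with a constant, and the release/acquire pair orders the data written before `publish`.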
No, synchronization is a different issue entirely. The question upthread was whether it was OK for the compiler to optimize the code to return a constant zero without actually reading from the pointer. And it's not, because any reasonable implementation of "wait for" will defeat aliasing analysis.
You're right that if you try to write async code with only compiler instrumentation, you're very likely to be introducing race conditions (to be clear: not necessarily on architectures with sufficiently clear memory ordering behavior -- you can play tricks like this in pure C with x86 for instance). But that wasn't the question at hand.
That's the most straightforward example of undefined behavior badness you'll find. Things in practice are usually way less intuitive than this (mostly because people notice and avoid writing those straightforward problems).