This example will do just fine, in fact. It is arguably not undefined, as you say. It is also silently miscompiled by both Clang and GCC.
And there you have one of the problems with type-based aliasing optimizations: the people who write compilers have not read the standard and just make it up as they are going along. They have been acting like the six-year-old who pretends to be reading rules from the back of the box in a game of Monopoly.
If you split the type definitions across different translation units, it will stop working (because the compiler can't know they are compatible), but this is one of the weird edge cases.
"And there you have one of the problems with type-based aliasing optimizations: the people who write compilers have not read the standard and just make it up as they are going along. They have been acting like the six-year-old who pretends to be reading rules from the back of the box in a game of Monopoly."
This is, well, bullshit. The situations get incredibly complex very quickly, and it's completely unclear in a lot of cases what the standard meant to happen.
You should not assume bad faith without a good reason.
The vast majority of people implementing this stuff either are committee members, or work closely with them, so saying "they haven't read the standard" just makes you look petty, because, in a lot of cases, they helped write the standard.
At the same time, while implementing it, I think we filed something like 15 DRs (defect reports) against the standard, some of which are still unresolved because the committee didn't know what to do and punted on it due to lack of consensus. So if you want to say who makes it up as they go along, I think you may be pointing fingers in the wrong direction:
Take, for example, DR 236, which was filed in 2000 and not resolved until 2006, at which point they basically punted (note how the answer does not answer the questions).
(note also that most DR's from the same time period were resolved in 1 year or less)
GCC had at least 3 committee members who were consulted on most of this stuff, and agreed with the current set of interpretations.
In short, if you think it's so easy to do, feel free to fix it.
I think you will find yourself quickly in a world of trying to figure out what anyone meant to happen.
Contrary to what you seem to think, there are rarely objectively right answers, just interpretations that you can get consensus or no consensus for.
(Not sure what one would expect from a programming language standard built by something akin to the UN)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14319 (12 years ago, status: suspended)
This is the union case you can't make work right all the time, and is actually two DR's, as the bug says. This is a case where the standard is completely broken (and will likely never be fixed)
The viewpoint of committee members I spoke with is that explicitly visible union accesses should be required to get the right answer here, because anything else is insanity.
I could place your two memory accesses in a union, in a different translation unit, and pass it to this function, and you would never have any reason to know they alias, because it looks like I handed you two struct pointers. I.e., imagine the function and main were in two different files, one of which had the union, and the other did not.
So either the compiler assumes that literally all memory, everywhere, aliases, forever (despite any other rules that the standard says exist, which directly contradict this), or we require programmers to make explicit union accesses (or your answers change depending on how much code the compiler can see, and whether it does whole program optimization or not or ....)
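To make the trade-off concrete, here is a minimal sketch in C. The types and function names (`s1`, `s2`, `opaque`, `visible`) are made up for illustration and are not taken from the bug report; everything is in one file only so that it compiles on its own, and you should imagine `opaque` living in a separate translation unit that never sees the union.

```c
#include <assert.h>

/* Hypothetical types for illustration; not taken from the bug report. */
struct s1 { int x; };
struct s2 { int x; };
union u { struct s1 a; struct s2 b; };

/* Imagine this function alone in its own file: no union is visible here,
   so type-based alias analysis may assume p and q do not overlap. */
int opaque(struct s1 *p, struct s2 *q)
{
    p->x = 1;
    q->x = 2;
    return p->x; /* a compiler may return 1 without reloading */
}

/* Every access names the union, so the overlap is visible. */
int visible(union u *v)
{
    v->a.x = 1;
    v->b.x = 2;
    return v->a.x; /* reads the int just stored through b */
}

/* The "other file": both members occupy the same storage. */
int call_opaque(void)
{
    union u v;
    assert((void *)&v.a == (void *)&v.b);
    return opaque(&v.a, &v.b); /* result depends on compiler and flags */
}

int call_visible(void)
{
    union u v;
    return visible(&v);
}
```

The point of the sketch is only that `opaque` has nothing in its own source to tell it the two pointers overlap, while `visible` does. What `call_opaque` returns can change with the compiler, the optimization level, and how much of the program the compiler gets to see.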
Like I said, this is a case where the standard is truly broken, and the best you can do is try to build consensus about what to do.
Note also the bug was suspended to figure out what the language was supposed to mean here. Nobody said it was not a bug, they said "no idea what's supposed to happen here".
The committee punted (contrary to the last comment, they didn't really resolve the question), so it's never been worked on.
Naturally. The GCC developers have not documented their choices; what GCC actually does is fuzzy enough that a GCC developer who actually worked on the implementation of type-based aliasing optimizations gets it wrong. In these conditions, I am going to fix the fact that they didn't document their intentions (let alone do what the standard says, since we have established that the standard committee, composed of compiler authors, is not helpful; I don't know why you think these are extenuating circumstances, because to me they make things worse) by reverse-engineering GCC's assumptions, extrapolating to what GCC might do in the near future, and helping programmers determine whether their C programs might betray them now or soon.
In fact, this is exactly what we have been doing. Drop me an e-mail if you wish to help beta-test it: cuoq at-sign trust-in-soft.com
       0:  c7 07 03 00 00 00              movl   $0x3,(%rdi)
       6:  c7 05 00 00 00 00 04 00 00 00  movl   $0x4,0x0(%rip)
      10:  8b 07                          mov    (%rdi),%eax
      12:  c3                             retq
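The source for the listing above isn't shown, so the following is only a guess at its shape: a store through a pointer argument, a store to a global, then a reload through the pointer. The names `g` and `store_and_reload` are invented. In this sketch both stores have type `int`, so the pointer may legitimately point at the global and the compiler has to reload rather than return the constant 3.

```c
#include <assert.h>

int g; /* hypothetical global; the real source is not shown in the thread */

/* Store through p, store to the global, then read back through p.
   Because *p and g have the same type, they may alias, so the final
   read cannot be replaced by the constant 3. */
int store_and_reload(int *p)
{
    *p = 3;
    g = 4;
    return *p;
}

/* Usage: the result differs depending on whether p points at g. */
void demo(void)
{
    int local = 0;
    assert(store_and_reload(&local) == 3); /* no overlap: reads back 3 */
    assert(store_and_reload(&g) == 4);     /* p aliases g: reads back 4 */
}
```

If the two objects had types that the compiler treats as non-aliasing, the reload in the third instruction is exactly what type-based alias analysis would be allowed to drop.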