Signed integer overflow is UB in part because on some architectures it trapped.
In practice for the last 30+ years the default behaviour has been non-trapping. So much so that making it trapping would break vast amounts of software that depend on it, so you can't change the general case behaviour in C, C++, etc, or "safe" languages like Java, C#, etc.
Newer languages do recognize this and make trapping the default behaviour, but "rewrite everything at once" is simply not a tractable problem.
> In practice for the last 30+ years the default behaviour has been non-trapping. So much so that making it trapping would break vast amounts of software that depend on it, so you can't change the general case behaviour in C, C++, etc
You can change it in C and C++, since the current behaviour is undefined, i.e. it can already hand control of your computer to attackers.
GCC and Clang should make -ftrapv the default. They won't, because whichever one does it first will then perform worse on benchmarks than the other, and that's the only thing the devs care about. But they should.
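For anyone following along, here is roughly what trapping semantics amount to for a single addition. This is only a sketch, not what `-ftrapv` literally emits: `add_would_overflow`, `trapping_add` and `demo_trapping_add` are made-up names, and `__builtin_add_overflow` is a GCC/Clang extension rather than standard C.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true if a + b does not fit in an int.
   __builtin_add_overflow is a GCC/Clang extension, not ISO C. */
static bool add_would_overflow(int a, int b)
{
    int unused;
    return __builtin_add_overflow(a, b, &unused);
}

/* Hypothetical helper: add two ints, aborting instead of overflowing. */
static int trapping_add(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result)) {
        abort(); /* the "trap": stop rather than carry on with a bad value */
    }
    return result;
}

/* Small usage example; only exercises inputs that do not trap. */
static void demo_trapping_add(void)
{
    assert(trapping_add(2, 3) == 5);
    assert(add_would_overflow(INT_MAX, 1));
    assert(!add_would_overflow(INT_MAX, 0));
}
```

The extra branch on every arithmetic operation is the cost the benchmark argument is about.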
You can change it to trapping behavior, but doing so is a problem for architectures that cannot detect overflow in hardware at all of the supported widths, because the software checks are slow.
Unfortunately (IMO), C and C++ have a sizable community that is unwilling to accept pessimizing behavior for various atypical architectures. This is not unreasonable, but it hugely limits the ability of the language to make decisions that work great for the 99th-percentile case.
Because too much code is completely broken if you do.
The only things that make use of overflow being UB are optimizing compilers, and they have reliably broken code because of this for 20 years. This means most developers have realized that pretending non-two's-complement architectures still exist is nonsense, and both C and C++ are under significant pressure to actually define overflow as two's complement wrapping.
No. It isn't broken. There is a huge amount of code that assumes this behavior, and works correctly, because that is the way hardware works. That is a huge amount of code that has worked reliably for decades, because it is correct, because the expected behavior matches the hardware behavior.
Saying "it is UB, therefore it is a security bug" is nonsense.
Saying it shouldn't be UB is useful, and that is being addressed by the C and C++ standardization committees. That work will not change the behavior; it will simply remove the "it's UB" nonsense that optimizers occasionally exploit. At that point the defined behavior will be two's complement overflow.
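To make "two's complement overflow" concrete, here is a sketch of a wrapping add that is well-defined today, without waiting on the committees. As far as I know, C++20 and C23 require two's complement representation but still leave signed overflow itself undefined, so code that wants wrapping has to ask for it, either with `-fwrapv` on GCC/Clang or by going through unsigned arithmetic as below. `wrapping_add_i32` and `demo_wrapping_add` are made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 32-bit signed add that wraps modulo 2^32
   and never performs a signed overflow. */
static int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    /* Unsigned arithmetic is defined to wrap, so do the add there. */
    uint32_t sum = (uint32_t)a + (uint32_t)b;

    /* int32_t is two's complement with no padding bits by definition,
       so copying the bits yields the wrapped value without relying on
       an implementation-defined unsigned-to-signed conversion. */
    int32_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}

/* Small usage example. */
static void demo_wrapping_add(void)
{
    assert(wrapping_add_i32(-2, 5) == 3);
    assert(wrapping_add_i32(INT32_MAX, 1) == INT32_MIN);
    assert(wrapping_add_i32(INT32_MIN, -1) == INT32_MAX);
}
```

Optimizing compilers generally turn this into a plain add instruction, so the defined version need not cost anything at run time.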
Saying it's UB and therefore can be arbitrarily broken is equally nonsense. Breaking code that is correct, within the confines of actual real machines, for no reason other than "it's UB" is no more helpful than saying "why don't you just rewrite it all in X".
It's actually incredibly difficult given your definition of what is allowed, to write anything in C that is not UB.
> There is a huge amount of code that assumes this behavior, and works correctly, because that is the way hardware works.
It doesn't though, because GCC et al. don't care how the hardware works; they can and do happily miscompile that kind of code into security vulnerabilities instead. If you're talking about embedded code that's compiled with a specific vendor's non-optimizing compiler, then yes (but changing GCC and Clang's defaults will have no effect on that kind of code). If you're talking about code for mainstream desktop/server systems, then no: it already doesn't and can't rely on wrapping overflow.
> Saying it's UB and therefore can be arbitrarily broken is equally nonsense. Breaking code that is correct, within the confines of actual real machines, for no reason other than "it's UB" is no more helpful than saying "why don't you just rewrite it all in X".
But it's not correct, not just in theory but in practice. In real life, code that does this and gets compiled with a modern optimizing compiler like GCC or Clang is already an RCE unless proven otherwise. Yes, wrapping is what a naive assembly translation would do. But the compilers don't do naive assembly translation and haven't for decades.
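The textbook illustration of that point is an overflow check written after the fact. The sketch below uses made-up function names. The first version relies on wrapping, so an optimizer is allowed to assume `x + 1 > x` and may fold the whole test to `false`; whether a given compiler actually does so depends on version and flags. The other two check before the add using only in-range comparisons.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: tests for overflow AFTER the add. When x is
   INT_MAX the add itself is UB, so the compiler may assume it never
   happens and delete the check. */
static bool next_overflows_broken(int x)
{
    return x + 1 < x;
}

/* Checks BEFORE the add, so no overflow is ever evaluated. */
static bool next_overflows_safe(int x)
{
    return x == INT_MAX;
}

/* General pre-check for a + b. INT_MAX - b and INT_MIN - b cannot
   overflow for the sign of b each one is guarded by. */
static bool add_overflows_safe(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Small usage example; the broken check is only called with an input
   that does not overflow. */
static void demo_overflow_checks(void)
{
    assert(!next_overflows_broken(0));
    assert(next_overflows_safe(INT_MAX));
    assert(add_overflows_safe(INT_MAX, 1));
    assert(add_overflows_safe(INT_MIN, -1));
    assert(!add_overflows_safe(INT_MIN, 1));
}
```

A bounds or length check built on the first pattern is the kind of code that can silently lose its safety net at higher optimization levels.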
> It's actually incredibly difficult given your definition of what is allowed, to write anything in C that is not UB.
Yes, which is why we keep getting security vulnerabilities like this one.