
How does this relate to the change in the C++20 standard about two's complement? "signed integers are now defined to be represented using two's complement (signed integer overflow remains undefined behavior)"



That change was simply to reflect de facto reality, removing some undefined and implementation-defined behavior in order to make life easier for people writing portable code. I haven’t seen a ones' complement machine in decades.


That makes it sound like, in the old spec, you could totally ignore the problem so long as your particular architecture is two's complement. But that's not quite true: remember that C and C++ are (notoriously) not just high-level assembly, but are instead defined against an abstract machine. What that means in practice is that if your code invokes any undefined behaviour, even if it seems like it ought to work in a particular way on the actual machine it runs on, then the optimiser is allowed to do whatever it likes.

For example, signed arithmetic overflow is undefined, so if `x` is an int and you test `if (x + 1 < x)`, the compiler is allowed to optimise away that whole if statement: no standards-compliant program ever overflows a signed int, so the condition can never be true. As another example, if the compiler can prove that a signed computation will exceed INT_MAX, it is allowed to replace that computation, and the code following it, with a call to `abort()` – after all, no compliant program could possibly follow that path, so it can assume that what it has really proved is that that code will never be executed. (Out-of-range unsigned-to-signed conversion, by contrast, was implementation-defined rather than undefined, so the optimiser couldn't go quite that wild, but portable code still couldn't assume it would wrap.)
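
To make that concrete, here's a minimal sketch (the function name is my own) of the kind of branch the optimiser is entitled to delete:

    int one_past(int x) {
        // Signed overflow is undefined behaviour, so the compiler may
        // assume x + 1 never wraps. The condition below folds to false
        // and the branch is deleted, even on two's complement hardware
        // where the wrap would "obviously" happen.
        if (x + 1 < x) {
            return -1;  // intended overflow check; never survives -O2
        }
        return x + 1;
    }

GCC and Clang really do this at -O2; compiling with -fwrapv, which makes signed overflow wrap, brings the branch back.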

The C++20 change means you no longer need to worry about the conversion case: an out-of-range unsigned-to-signed conversion is now defined to wrap, even in portable code. Overflow in signed arithmetic, though, remains undefined behaviour, so the optimiser examples above still apply, even on machines that actually are two's complement.


The main effect is that you can convert from an unsigned number to a signed number and it will do what you would expect from a two's complement representation. Previously, if the value was greater than would fit in the signed type, the result was implementation-defined (and in C could even raise an implementation-defined signal), so portable code couldn't count on getting the wrapped value. This was true even if you were doing an explicit cast, so doing the conversion in a fully portable way was notoriously fiddly. [1]
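
For the flavour of the old workaround, here's a sketch along the lines of the answers at [1] (the helper name is made up, and it assumes int and unsigned have the same width, as on all common platforms):

    #include <climits>

    // Pre-C++20: unsigned -> int with two's complement semantics,
    // without touching the implementation-defined out-of-range cast.
    int to_signed(unsigned n) {
        if (n <= INT_MAX)
            return static_cast<int>(n);  // in range, value unchanged
        // n >= 2^(N-1): shift it into the representable range using
        // only fully defined unsigned arithmetic, then shift back.
        return static_cast<int>(n - static_cast<unsigned>(INT_MIN)) + INT_MIN;
    }

In C++20 all of that collapses to `static_cast<int>(n)`, which is now defined to wrap modulo 2^N.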

Converting signed to unsigned had always been fine, all the way back to C89 I believe. That was specified as being converted modulo 2^N, which matches what happens naturally with two's complement, even on architectures that aren't. (With the change in C++20 the wording for this conversion is simplified, but the effect is unchanged.)
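
So, for example, this has been guaranteed since C89/C++98, regardless of how the hardware represents negative numbers:

    #include <cassert>
    #include <climits>

    int main() {
        // Signed -> unsigned is defined as reduction modulo 2^N,
        // so -1 always maps to the largest unsigned value.
        assert(static_cast<unsigned>(-1) == UINT_MAX);
        assert(static_cast<unsigned>(-2) == UINT_MAX - 1);
    }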

The introduction of the proposal [2] lists some other changes and non-changes. It mentions that mandating wrapping behaviour for arithmetic operations had been a goal of the proposal, but there was strong resistance to it, so it was dropped. One thing that stands out is that right shift of negative numbers is now defined as sign-extending rather than implementation-defined, which I think had probably caught a lot of people out in the past. Oddly, it lists signed-to-unsigned conversion as a change even though it's effectively the same as before, while unsigned-to-signed isn't mentioned in the introduction at all even though that's the major change. The rest are mainly to do with representations, e.g. if you're memcpying to or from signed numbers.
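
A quick illustration of the shift change (guaranteed from C++20 onwards; in practice nearly every compiler already behaved this way):

    // Compile with -std=c++20: right shift of a negative value is now
    // defined as an arithmetic (sign-extending) shift.
    static_assert((-8 >> 1) == -4);   // previously implementation-defined
    static_assert((-1 >> 7) == -1);   // the sign bit keeps propagating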

[1] https://stackoverflow.com/questions/13150449/efficient-unsig...

[2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p090...





