
Catching integer overflows in C - fanf2
https://www.fefe.de/intof.html
======
WalterBright
D has an unusual approach where the characteristics of an integer type can be
mixed and matched as needed by the user:

[https://dlang.org/phobos/std_experimental_checkedint.html](https://dlang.org/phobos/std_experimental_checkedint.html)

~~~
saagarjha
Interesting. Does this compile down to the appropriate assembly instructions
for checking for overflow, if available, or does it end up emulating it with a
couple dozen instructions? That is, is this low-cost or does it have a
measurable performance penalty and should only be used in debug builds?

~~~
WalterBright
It depends on the implementation, which does what it has to do to work with
the backend. If the backend supports intrinsics that efficiently check for
these things, then they'll be used.

It's really the same issue any language would have with the particular backend
it uses.

What's different about the D approach, however, is that the user can select
which of several behaviors applies on overflow, and then select which of
several responses to overflow to use, instead of being stuck with just one or
two options. The tests we've run show it generates the same code as if one had
done it by hand.

------
olliej
The correct thing to do these days is to use compiler builtins, but assuming
you can’t (such misery), the correct thing to do for multiplication involves
division (because the performance cost of treating multiplication as fixed
width 2s complement is so high ;) )

Specifically for

    X = a * b

You have to do:

    if (b == 0) X = 0
    else if (MAX(type of X) / b >= a) X = a * b
    else handle overflow error

Oof
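
For reference, the division-based check above might look like the following in standard C. This is only a sketch for unsigned operands; the helper name `checked_mul_size` is made up for illustration and does not come from the article or any library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name is an assumption): stores a * b in *out and
   returns true, or returns false and leaves *out untouched when the
   product would not fit in a size_t. Unsigned operands only. */
bool checked_mul_size(size_t a, size_t b, size_t *out)
{
    /* The zero check avoids dividing by zero; the division tells us
       whether a * b would exceed SIZE_MAX without ever computing it. */
    if (b != 0 && a > SIZE_MAX / b) {
        return false;
    }
    *out = a * b; /* cannot wrap here; the product is 0 when b == 0 */
    return true;
}
```

Signed multiplication needs more cases (negative operands, the most negative value), which is part of why the builtins are attractive.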

~~~
Annatar
Using compiler built-ins makes one’s code instantly unportable and is
therefore unacceptable.

~~~
exDM69
On the contrary, compiler built-ins can make your code _more_ portable (across
different hw architectures) at the expense of some other compilers.

GCC and Clang have mostly compatible compiler intrinsics and with those two
compilers you can compile your software to almost any platform out there.

If you need to get your code compiled on MSVC or ICC or some niche C compiler,
you can work around this by adding some kind of wrapper around them with
#ifdefs (not pretty).

With compiler built-ins you get SIMD, atomics, bitops (popcnt, clz, etc) that
are portable to almost any CPU instruction set out there. Without them, you
have to re-write that code for every instruction set you intend to support.

So your options are:

1) Stick to standard C and write code that compiles to suboptimal output code
(such as using division to get multiply-with-overflow)

2) Use compiler built-ins and get portability across different HW at the
expense of not supporting other compilers

3) Write wrappers around compiler built-ins with #ifdefs to support all the
compilers you want (add fallback to standard C if you need to)

4) Write your code using HW-specific intrinsics or inline assembly

Out of these, option 2) will give you maximum portability with the least
amount of effort. The other options are more effort and their benefits are
arguable. If you think there are other viable options, please do share.
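
To make option 3) concrete, here is a rough sketch of such a wrapper, assuming GCC 5 or later or a Clang that reports `__builtin_mul_overflow` through `__has_builtin`, with a standard C fallback for everything else. The names `mul_u64_checked` and `HAVE_BUILTIN_MUL_OVERFLOW` are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Detect the builtin. The nested #if keeps compilers that lack
   __has_builtin from choking on the function-like use. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_mul_overflow)
#    define HAVE_BUILTIN_MUL_OVERFLOW 1
#  endif
#elif defined(__GNUC__) && __GNUC__ >= 5
#  define HAVE_BUILTIN_MUL_OVERFLOW 1
#endif

/* Hypothetical wrapper (name is an assumption): returns true and stores
   a * b in *out when the product fits in 64 bits. On failure it returns
   false and the contents of *out should not be relied upon. */
bool mul_u64_checked(uint64_t a, uint64_t b, uint64_t *out)
{
#ifdef HAVE_BUILTIN_MUL_OVERFLOW
    return !__builtin_mul_overflow(a, b, out);
#else
    /* Standard C fallback: zero check plus division. */
    if (b != 0 && a > UINT64_MAX / b) {
        return false;
    }
    *out = a * b;
    return true;
#endif
}
```

Callers only ever see `mul_u64_checked`, so the #ifdef ugliness stays in one place.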

~~~
BeeOnRope
A variant of (3) that seems close to ideal is to use conditionally defined
functions as suggested, that someone else has already written and generously
shared, such as portable-snippets [0]. It has a component specifically to
offer cross-platform builtins with fallback/emulation on platforms without
direct support.

[0] [https://github.com/nemequ/portable-snippets](https://github.com/nemequ/portable-snippets)

~~~
exDM69
Thanks for the link, that looks quite useful if compiler portability is
needed. It's still more effort than option #2, but quite an acceptable
trade-off.

------
pixelbeat__
I've found int overflow one of the most pernicious issues in C programming,
and have been maintaining notes (which reference this article) at:

[http://www.pixelbeat.org/programming/gcc/integer_overflow.ht...](http://www.pixelbeat.org/programming/gcc/integer_overflow.html)

------
bvinc
Rust decided for overflows to cause panics in debug mode, but continue
normally in release mode. They have special function calls for the few times
that you actually WANT overflow. This seemed to be the only good option. It
allows the programmer to express intent. In the future, if hardware allows for
cheap hardware traps in the case of unintentional overflow, they reserve the
right to use them in release mode. C seems pretty hopeless since there is no
way to know whether or not the coder intended on overflow or not.

~~~
millstone
> It allows the programmer to express intent.

Can the programmer express the intent that this will not overflow? i.e. is it
possible to coerce the Rust compiler to optimize (x*2)/2 to x? "Cheap hardware
traps" cannot substitute for such a compile-time optimization.

> C seems pretty hopeless since there is no way to know whether or not the
> coder intended on overflow or not.

Well this doesn't matter: if signed arithmetic overflows the code is
undefined, regardless of intent.

In practice comments can express intent, and unsigned arithmetic can, with
some finessing, produce most of the desired codegen. Explicit operators would
be a big improvement though.
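
One possible reading of the "unsigned arithmetic with some finessing" idea, sketched in C: do the addition on unsigned values, where wrapping is well defined, and copy the bits back into a signed type. The function name `wrapping_add_i32` is an assumption for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (name is an assumption): two's complement
   wrapping addition of signed 32-bit values without signed overflow. */
int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    /* Unsigned addition wraps modulo 2^32 by definition. */
    uint32_t wrapped = (uint32_t)a + (uint32_t)b;

    /* int32_t is required to be two's complement with no padding, so
       copying the bits yields the wrapped signed result and avoids the
       implementation-defined unsigned-to-signed conversion. */
    int32_t result;
    memcpy(&result, &wrapped, sizeof result);
    return result;
}
```

The intent (wrapping is wanted here) is carried by the function name rather than by an operator, which is the gap the comment above is pointing at.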

~~~
lyinsteve
I don’t know much about rust specifically, but the overflow traps happen in
debug (i.e. -O0) builds, and are not emitted when optimizations are enabled.

LLVM’s constant folding and canonicalization passes will convert (x * 2) / 2
from

    %0 = mul i64 %x, 2
    %1 = udiv i64 %0, 2

to

    %0 = shl i64 %x, 1
    %1 = lshr i64 %0, 1

which constant folding will reduce to just %x.

~~~
millstone
Well ok but (x*3)/3 will be a lot worse :)

~~~
Franciscouzo
Not by a lot, the divide by constant can be optimized to a multiplication, if
you're doing it in a loop (why would you care if you do it only once), you can
reach a throughput of one multiplication per cycle, so (x*3)/3 would take 2
cycles on average.

~~~
millstone
This is a silly line of argument; optimizing (x*3)/3 to x unlocks further CSE,
inlining, etc. so the benefit may be arbitrarily large.

Rust made the right tradeoff here but it is a tradeoff!

------
phibz
Most (all?) modern processors have instructions and flags specifically for
querying and reacting to integer overflow/carry.

This seems like a situation where dropping to assembly to perform the addition
and testing the flag makes sense. GCC has a mechanism for inline assembly
where you list the variables you read from, you write to, and registers you
modify and the optimizer plays nice with you. You could provide it as a
function to use instead of doing the addition in C.

~~~
olliej
The issue isn’t what the hardware can do, but rather that the C spec leaves it
undefined. Over the last decade compilers have increasingly abused this idea,
basically making the spec a land mine of UB and then claiming that UB allows
them to do whatever they want, even when the programmer’s intent is obvious.

That said, in modern clang and gcc you can (and should) use the overflow
builtins, which are all essentially:

    bool __builtin_[operation]_overflow(type a, type b, type *out)

returning true if the operation overflowed. These produce “ideal” code, e.g.

    if (__builtin_[operation]_overflow(x, y, &out)) ...

produces the platform’s branch-on-overflow instructions.
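
As a small illustration of that pattern, here is a saturating add built on `__builtin_add_overflow`. It assumes GCC 5 or later, or a Clang that provides the builtin; the function name `add_saturating` is made up for the example.

```c
#include <limits.h>

/* Hypothetical helper (name is an assumption): adds two ints and clamps
   the result to INT_MAX or INT_MIN instead of overflowing. */
int add_saturating(int a, int b)
{
    int sum;
    if (__builtin_add_overflow(a, b, &sum)) {
        /* Overflow can only happen when both operands share a sign,
           so the sign of b tells us which way the sum went. */
        return b > 0 ? INT_MAX : INT_MIN;
    }
    return sum;
}
```

With optimizations enabled, GCC and Clang will typically turn the `if` into an add followed by a conditional branch or move on the overflow flag, though the exact output depends on target and compiler version.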

~~~
jcranmer
> The issue isn’t what the hardware can do, but rather that the C spec leaves
> it undefined.

Eh, the undefined nature of signed overflow in C is a bit of a red herring.
The most common overflow bug in practice is actually 100% well-defined and
still 100% wrong: multiplying two numbers to feed into a malloc.

In practice, the undefined signed overflow really only comes into play when
you're trying to explicitly check for overflow by doing the operation and then
seeing if it overflowed. But, as you mention, compilers give you builtins to
do this kind of checked overflow, which is quite frankly a much easier way to
solve the problem, especially since it's more likely to actually get matched
to using hardware flag bits to check for overflow.
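
The malloc case can be sketched like this: the multiplication is on `size_t`, so it wraps silently and well-definedly, and the allocation ends up smaller than the caller believes. The helper name `alloc_array` is an assumption for this example, and it uses the portable division check rather than a builtin.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (name is an assumption): allocates room for
   count elements of elem_size bytes each, or returns NULL when
   count * elem_size would wrap around. */
void *alloc_array(size_t count, size_t elem_size)
{
    /* Without this check, malloc(count * elem_size) could be handed a
       tiny wrapped-around size and still succeed. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL;
    }
    return malloc(count * elem_size);
}
```

`calloc(count, elem_size)` is generally expected to perform an equivalent check internally, at the cost of also zeroing the memory.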

~~~
olliej
Yup, you are right on that, but the obvious way to check for overflow is UB,
so it gets optimized out. That has led to many security bugs in code people
thought was correct. The correct, standard-compliant, non-UB way to check for
the overflow requires a zero check and a division. It’s super non-obvious. I
like to imagine the C committee has made a standardized definition for the
current builtins, but I also have low expectations.
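
The same "check before you compute" idea applies to signed addition, where no division is needed. A minimal sketch, with the made-up name `add_would_overflow`:

```c
#include <limits.h>
#include <stdbool.h>

/* The tempting check, shown only as a comment because it is UB:
 *
 *     if (a + b < a) { ... }   // for b > 0; a + b may already overflow
 *
 * A compiler may assume signed overflow never happens and delete it. */

/* Hypothetical helper (name is an assumption): reports whether a + b
   would overflow an int, without performing the addition. */
bool add_would_overflow(int a, int b)
{
    if (b > 0) {
        return a > INT_MAX - b; /* INT_MAX - b cannot overflow for b > 0 */
    }
    if (b < 0) {
        return a < INT_MIN - b; /* INT_MIN - b cannot overflow for b < 0 */
    }
    return false;               /* adding zero never overflows */
}
```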

------
Annatar
These are all work-arounds: the permanent fix is to expose the processor’s
carry and zero flags in C.

~~~
saagarjha
Not every processor has these flags.

~~~
Annatar
Processors have had these en masse (6502, Z80, MC68000) since the 1970s.
Without those, it’s extremely difficult to implement basic arithmetic with
numbers larger than what the processor’s registers support.

Which contemporary processors don’t have the C and Z status bits?

Post scriptum, since “Hacker News” is of the opinion that I’m posting too
fast:

MIPS does have a status register, and this register does have an overflow bit:

[https://www.doc.ic.ac.uk/lab/secondyear/spim/node10.html](https://www.doc.ic.ac.uk/lab/secondyear/spim/node10.html)

~~~
saagarjha
MIPS?

\------------------

Addendum addressing your updates:

> MIPS does have a status register, and this register does have an overflow
> bit:

Yes, it does, but it doesn't function as a simple carry or zero bitfield. In
MIPS, there are two instructions for performing addition: add and addu. add
generates a trap on overflow, which you can respond to in an exception handler
(which is where you check for this exception code), as opposed to addu, which
overflows silently. Most programs explicitly looking to check for overflow
just use addu followed by a sltu (set less than, unsigned).
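
In C terms, the addu + sltu idiom corresponds roughly to the following sketch for unsigned operands. The function name `add_u32_carry` is an assumption, and whether a compiler actually emits exactly those two instructions depends on the target and optimization level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (name is an assumption): stores the wrapped sum
   in *sum and returns true when the addition carried out of 32 bits. */
bool add_u32_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;    /* wraps modulo 2^32, like addu */
    return *sum < a; /* the sum is smaller than an operand only if it
                        wrapped, which is the comparison sltu performs */
}
```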

~~~
Annatar
“u” in “addu” means unsigned, so if I’m using it, I understand what the
implication is.

~~~
saagarjha
What implication are you talking about?

~~~
Annatar
That there is no bit used for the sign, and so I know when the registers will
overflow. A lot of the assembler code I wrote exploits this very fact, since
when used cleverly it’s a performance optimization.

