
Fast integer overflow detection (2012) - luu
http://kqueue.org/blog/2012/03/16/fast-integer-overflow-detection/
======
revelation
(This article may need a [2012])

GCC 5 has builtin, efficient overflow checks:

[https://gcc.gnu.org/gcc-5/changes.html](https://gcc.gnu.org/gcc-5/changes.html)

 _A new set of built-in functions for arithmetics with overflow checking has
been added: __builtin_add_overflow, __builtin_sub_overflow and
__builtin_mul_overflow and for compatibility with clang also other variants._

~~~
TheLoneWolfling
...Which only helps you if you discard compatibility, unless I'm mistaken.

The question remains of how to detect overflow on add in portable C code.

~~~
forrestthewoods
Detect on add? Not sure. Detect before add? Simple!

    
    
        bool int_add_safe(int a, int b) {
            if (a >= 0 && b >= 0) return INT_MAX - a >= b;
            if (a < 0 && b < 0)   return INT_MIN - a <= b;
            return true;
        }
    

If you ever have an instance that will result in an overflow, you've got some
kind of bug on your hands, or at least some kind of circumstance that requires
special handling. You don't want to overflow, not perform the add, or perform
the add and discard the overflow. All three suck and sweep a bug under the
rug in some instances. Computers suck sometimes. :(

~~~
nly
It's possible to simplify your code.

    
    
        bool int_add_safe(int a, int b) {
            if (b < 0) {
                return (a >= INT_MIN - b);
            } else {
                return (a <= INT_MAX - b);
            }
        }
    

Codegen comparison: [http://goo.gl/Tgj7nu](http://goo.gl/Tgj7nu) (if you
switch to GCC, it makes a mess of your version). You're right that all
portable solutions suck, but on x86 it will certainly be faster just to do the
add and check the overflow flag.

~~~
TheLoneWolfling
Neither of those two compile down to what I imagine would be the best on x86,
namely an add and then a check of the overflow flag.

Is there any portable version that compiles down to that on GCC or Clang on
x86?

~~~
acqq
Once you have a function consistently used in your code, the portable version
can serve as the fallback for targets where you don't provide an asm
implementation, and the asm (or compiler-specific) implementation can be
selected otherwise, covering the most common processors of today.

~~~
Someone
Alternatively, bug the compiler writers to make their compiler recognise that
portable version. That way, you don't have to maintain all those special code
paths.

Getting the compiler writers to help you is easier if your project is
significant, or if you use a portable version that a big open source project
uses.

~~~
acqq
My proposal costs the implementer just a few #if lines in one file. Your
alternative costs many more people much more time, and the gain is then
just... the implementer saves a few #ifs.

~~~
TheLoneWolfling
By that logic most HLLs shouldn't have been made.

Remember - write once run many times. Implementing it in GCC / etc _once_
means that _many_ people won't have to do it themselves - especially as it's
far more likely that one implementation will be bug-free than many people
writing their own.

~~~
acqq
It's not a feature anybody would use as is; when you'd use it, you'd prefer
to call a function. But then what's behind the function is irrelevant to the
users of the function. We don't force compiler authors to put into the
compiler what can be nicely put in a library.

It's: write one header with a few #if's once, and everybody uses it whenever
they need it.

You'd just call

    
    
        is_safe_to_add_ints( s, t[i] ) 
    

and be sure that on the common platform it's optimal thanks to the #if's in
the header, and not write

    
    
        bool bsafe = (t[i] < 0) ? s >= INT_MIN - t[i] : s <= INT_MAX - t[i];
    

every time you'd want to check if the addition would be safe.

~~~
Someone
But you want

    
    
       if( !canSafelyAdd(a,b)) return ERROR;
       c = a + b;
    

to compile down to an add and a check of the overflow bit. For that, you need
help from the compiler.

There are many examples where compilers special-case what is in a library. For
example, compilers know they do not have to call strlen every time through
this loop:

    
    
        for(int i = 0; i < strlen(s); ++i)

~~~
acqq
All the help you need is, in the implementation of the library function, #if
X86, #if GCC, etc., and then the assembly for each case, with the slow but
standard implementation otherwise. And that's one file, with code never used
directly but only through the function name. Current libraries already do this
for a lot of processor-specific stuff.
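A sketch of what that one file could look like (names are illustrative, borrowing `is_safe_to_add_ints` from above; the builtin branch assumes GCC 5+ or Clang):

```c
/* add_safe.h -- hypothetical single-header dispatch */
#include <limits.h>
#include <stdbool.h>

static inline bool is_safe_to_add_ints(int a, int b) {
#if defined(__GNUC__) || defined(__clang__)
    /* Compiles down to an add plus a jump on the overflow flag */
    int discard;
    return !__builtin_add_overflow(a, b, &discard);
#else
    /* Slow but standard fallback, same logic as posted earlier */
    return (b < 0) ? (a >= INT_MIN - b) : (a <= INT_MAX - b);
#endif
}
```

Callers never see which branch was taken; supporting a new target only means adding another #if case.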

------
danbruc
Integer overflow is really a strange problem: it is no problem at all at the
assembly level, but then we abstracted the details away in our higher-level
languages, for convenience or whatever, only to realize that we now regularly
shoot ourselves in the foot. It is nothing more than a language design mistake.

~~~
sillysaurus3
Hm? In Lisp or Python, you don't have overflow issues because ints can't
overflow. It's only a problem at the assembly/low level.

EDIT: The point of the parent comment is that integer overflow "isn't a
problem at the assembly level, only in high level languages." But high level
languages actually prevent the problem. You don't even have to worry about
ints overflowing when you use Python or Lisp. You _do_ have to worry if you're
writing assembly or using a lower level language like C. So the situation is
actually the opposite of what the parent comment describes.

~~~
danbruc
Arbitrary-precision arithmetic is probably not a general solution, but it
might indeed be worthwhile to consider using it more often, especially if
performance is not a major concern.

~~~
sillysaurus3
Performance is not a major concern. Security and reliability trump performance
considerations. In my experience, especially at large companies, unfounded
"performance concerns" are the root of much evil.

This isn't true all of the time. HFT and gamedev, for example. But gamedev
only uses C++ because everyone else uses C++, and HFT often uses languages
other than C/C++ from what I've heard. Though the performance-critical
sections are usually transformed into C.

Though as I write this, I think of Chromium and Firefox and realize that C++
was probably the correct choice for both. On the other hand, I hear Rust is
making some pretty great strides in that context.

~~~
danbruc
I would be interested in the impact of just replacing every fixed-size integer
with variable-length ones. Plain integer arrays - gone. Single-cycle math
operations - gone. I really can't tell what the impact would be.

~~~
dezgeg
A sane implementation of bigints uses pointer tagging such that arithmetic on
sufficiently small numbers (e.g. two 63-bit operands on a 64-bit machine) is
still just a couple of cycles, with just an extra comparison and a (most
likely optimally predicted) branch in the hot path.
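A minimal sketch of such a tagging scheme, assuming a 64-bit machine (all names are hypothetical and the slow path is stubbed out): the low bit marks an inline small integer, so the hot path is one tag check plus a plain add:

```c
#include <stdint.h>
#include <stdbool.h>

typedef uintptr_t value;  /* tagged: low bit set = inline int, clear = heap bigint */

#define SMALLINT_MAX ((int64_t)1 << 61)  /* conservative inline payload bound */

static inline bool    is_smallint(value v)     { return (v & 1) != 0; }
static inline value   make_smallint(int64_t n) { return ((uintptr_t)n << 1) | 1; }
/* >> on a negative signed value is implementation-defined, but is an
   arithmetic shift on every mainstream compiler and target */
static inline int64_t smallint_val(value v)    { return (int64_t)v >> 1; }

/* Stub: a real runtime would allocate an arbitrary-precision bigint here */
static value heap_bigint_add(value a, value b) { (void)a; (void)b; return 0; }

static value value_add(value a, value b) {
    if (is_smallint(a) && is_smallint(b)) {
        /* Payloads fit in 63 bits, so this 64-bit add cannot overflow */
        int64_t r = smallint_val(a) + smallint_val(b);
        if (r > -SMALLINT_MAX && r < SMALLINT_MAX)  /* still fits inline? */
            return make_smallint(r);
    }
    return heap_bigint_add(a, b);  /* promote to a heap bigint */
}
```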

Then, assuming the bigints are implemented in the core language, we could make
the compiler optimize the bigint checks away when possible. For example,
consider this hypothetical function:

    
    
        void memzero_range(char* array, int start, int end) {
            for(int i = start; i < end; i++)
                 array[i] = 0;
        }
    

Now, what can we infer from the 'array[i] = 0;' in the body of the loop?
Certainly, on a 64-bit machine it's not possible for i to go over 2^64 without
experiencing undefined behaviour. So a smart (and conforming) compiler could
just do if (is_bigint(start) || is_bigint(end)) abort(); once, and then
execute the loop itself without any overhead from bigints.

Or well, that's at least my personal dream of what should happen.

------
UnoriginalGuy
It seems like a lot of problems in C/C++ are caused by "this code might be
removed during optimisation." It makes the whole environment undefined.

Why has nobody pushed for marking key sections of code with "don't optimise
away" flags?
In C# you have this:

    
    
        [MethodImplAttribute(MethodImplOptions.NoOptimization)] 
        public static string GetCalendarName(Calendar cal)
        {
           return cal.ToString().Replace("System.Globalization.", "").
                     Replace("Calendar", "");
        }

~~~
TheLoneWolfling
(Most) things that are undefined behavior are so for a reason.

In this case, you don't know if the system you're on even stores numbers as
two's complement. And if it doesn't, it may not have the same behavior. So even
the unoptimized version may fail.
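For what it's worth, there's a middle ground: do the arithmetic in unsigned types, where wraparound is fully defined, and inspect the sign bits afterwards. This sketch (mine, not from the article) never executes a signed overflow, so the optimizer can't legally strip it; interpreting the sign bits does still assume two's complement:

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would overflow int. Overflow happened iff both operands
   share a sign bit that the wrapped result lacks. */
static bool add_overflows(int a, int b) {
    unsigned int ua = (unsigned int)a;  /* int -> unsigned is well-defined */
    unsigned int ub = (unsigned int)b;
    unsigned int ur = ua + ub;          /* unsigned add wraps, no UB */
    return (((ua ^ ur) & (ub ^ ur)) >> (sizeof(int) * CHAR_BIT - 1)) != 0;
}
```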

~~~
to3m
I think in most cases, people writing code to check for overflow _do_ know
whether the system they're on stores integers as two's complement, and they'll
have a pretty good idea what will happen when overflow occurs. A portable
approach for this sort of thing would be very useful, but in the meantime,
perhaps we could settle for compilers that don't strip out reasonable code?

I know, I know - I'm asking a lot.

My favourite two links on the subject: [http://blog.metaobject.com/2014/04/cc-
osmartass.html](http://blog.metaobject.com/2014/04/cc-osmartass.html),
[http://robertoconcerto.blogspot.co.uk/2010/10/strict-
aliasin...](http://robertoconcerto.blogspot.co.uk/2010/10/strict-
aliasing.html)

~~~
heinrich5991
The good thing is that this allows the compiler to optimize code away in
reasonable places, too.

------
wfh
Chromium has implemented a safe numerics library for overflow detection; see:

[https://code.google.com/p/chromium/codesearch#chromium/src/b...](https://code.google.com/p/chromium/codesearch#chromium/src/base/numerics/&sq=package:chromium)

