
Catching integer overflows with template metaprogramming (2015) - jgeralnik
https://capnproto.org/news/2015-03-02-security-advisory-and-integer-overflow-protection.html
======
DannyBee
"Some day, I would like to design a language that gets this right. But for the
moment, I remain focused on Sandstorm.io. Hopefully someone will beat me to
it. Hint hint. "

One language already beat him to it: Ada

[http://www.ada-auth.org/standards/12rat/html/Rat12-2-4.html](http://www.ada-
auth.org/standards/12rat/html/Rat12-2-4.html)

Ada's rules for type invariants check them in all the places you would expect,
and are straightforward to use.

It's about as correct as you can get without it being impossible to compute :)

~~~
kentonv
I thought Ada did only runtime checks? I am asking for (and my template
hackery implements) compile-time checks. Runtime checks are slow and don't
fully mitigate security bugs, though they probably downgrade the bugs to DoS
attacks. But if these are actually checked by the compiler, that's pretty cool
and I might actually have to play with Ada at some point!

~~~
jackmott
Idris or F*? I think they are not practical because compile times end up
extremely slow. I think what you want will necessarily have this effect of
very slow compile times.

~~~
indolering
Came here to suggest Idris, as it compiles to C. F* is still experimental and
Idris is just coming up on an RC. They could also drop down into C and
manually verify it.

------
michaelt

      Why don’t programming languages do this?
    

I worked on a contribution to the Clang static analyser a few months ago. Two
of the things I learned: (1) no one likes code analysers that raise false
alarms, and (2) it's really difficult not to raise false alarms. For example,
consider checking this program for divide-by-zero bugs:

    
    
      for (int i=-99999 ; i<99999 ; i+=2) {
        printf("%d\n", 100/i);
      }
      for (int i=-100000 ; i<100000 ; i+=2) {
        printf("%d\n", 100/i);
      }
    

A programmer can clearly see that in the first loop there is no divide-by-zero
as i will jump directly from -1 to +1 - whereas in the second loop the counter
will hit 0 and trigger a divide-by-zero.

But for a static analyser, your options are:

1\. Model i as a single value, and simulate all ~200,000 passes through the
loops. Slow - you basically have to run the program at compile time.

2\. Same but truncate the simulation after, say, 20 loops on the assumption
that most bugs would have been found on the first few passes. This misses the
bug in the second loop.

3\. Model i as "Integer in the range -99,999 to +99,999" and generate a false
alarm on the first loop.

4\. Support arbitrarily complex symbolic values. Difficult - as the value of a
variable might depend on complicated business logic.

I guess the benefit of starting a new programming language is you can choose
option 3 and say "Working code is illegal, deal with it" from the start.

~~~
kentonv
For my purposes, I would be happy with a way that I can specify constraints
specifically in certain critical code sections, where I'd be happy to work
around some false positives. Most other code could go on ignoring overflow as
it does today.

------
staticassertion
That's a pretty impressive response - committing to fuzzing with AFL,
improving current fuzzing test coverage, adding new coding guidelines, and a
paid security audit.

In regards to #4 (valgrind), perhaps utilizing other sanitizers would be a
good idea.

Awesome that they tested whether those methods would actually have caught this
bug - and multiple did. That's a smart metric.

And a writeup about TMP and compile time safety to boot.

I'm struggling to come up with anything more that I'd want in a disclosure.

~~~
kentonv
[https://twitter.com/BenLaurie/status/575079375307153409](https://twitter.com/BenLaurie/status/575079375307153409)
:)

The sad part of the story is that Cap'n Proto actually hasn't had a release
since then, so (so far) I've effectively gotten out of doing any of those
release-oriented things I promised. Many changes have gone into git master,
but Sandstorm used git master directly, meaning there wasn't sufficient
incentive to do the work of a release. (And as an understaffed startup we
could never quite justify spending the time.)

The good news is that I joined Cloudflare two weeks ago, which is a big Cap'n
Proto user, and it looks like I'll have more time and justification to do
releases there. (And yes, I do intend to fulfill all the promises in doing
so.)

~~~
staticassertion
> The good news is that I joined Cloudflare two weeks ago, who is a big Cap'n
> Proto user, and it looks like I'll have more time and justification to do
> releases there

That's great news, thanks for the disclaimer/info.

------
jbapple
Two other ways of checking integer bound invariants:

1\. Undefined behavior sanitizer, which checks integer overflows at run-time,
like INT_MAX + 1. Can optionally check other invariants; see
[https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-
Options.h...](https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html)
and
[https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html).

2\. Abstract interpretation - using a lot of constraint solver cycles and
enough math to choke a dinosaur, you can check some integer bounds at compile
time "relationally", like knowing that x < y. See, for example, "Pentagons: A
Weakly Relational Abstract Domain for the Efficient Validation of Array
Accesses": [https://www.microsoft.com/en-
us/research/publication/pentago...](https://www.microsoft.com/en-
us/research/publication/pentagons-a-weakly-relational-abstract-domain-for-the-
efficient-validation-of-array-accesses/). I'm not sure, but that code might be
available in
[https://github.com/Microsoft/CodeContracts](https://github.com/Microsoft/CodeContracts).

~~~
kentonv
#1 doesn't get you very far since it can only catch cases that are locally
detectable -- where the compiler can prove that some lines of code will always
result in an overflow.

Consider the code:

    
    
        int f(int x) {
          return x + 1;
        }
    

This code could overflow, but surely the compiler won't complain about it --
it would instead assume that you don't intend to pass a value for x that is
too large. In order for the compiler to have enough information to complain,
it would need you to specify explicitly what the range of x is -- so that it
can detect violations both within the function definition and at call sites.
There is no built-in way to specify this.

But, this is exactly what my template hacking adds -- a way to specify the
actual range of each variable.

I haven't looked closely at your #2. Presumably it also requires some
constraint annotations to be added to your code. But if they've provided tools
that can process those annotations and detect violations, that's pretty cool.

~~~
nickpsecurity
The best ones you can straight-up buy, that I'm aware of, are SPARK and
Astree. I've already linked to SPARK. Astree is limited in application but
pretty amazing in what it does:

[https://www.di.ens.fr/~cousot/publications.www/CousotEtAl-
ES...](https://www.di.ens.fr/~cousot/publications.www/CousotEtAl-ESOP05.pdf)

My idea was to subset as many components as possible into the static-ish form
those tools can handle, then focus more on integration testing. Then one
could reap the benefits in a larger project. Similar idea for SPARK, which is
probably cheaper.

~~~
touisteur
SPARK is indeed well suited to some kinds of code (protocols, serialization,
deserialization, index manipulation, masks...). Proving the absence of
runtime errors in such code is easy (the provers don't need much help; just
type your data structures well) and it makes you think of every case. Fun.
Even better: once AoRTE is proved, you can remove _all_ runtime checks and
get C-like performance. You can also prove some isolated functional
properties ('the answer to message X can only be message Y or Z'... just
sprinkle some asserts or post-conditions, or check out the nice
Contract_Cases syntax). Or you can go to 'platinum' level and prove your
complete protocol/algorithm. The tech is great and it keeps improving; it
gets easier every year...

~~~
nickpsecurity
Oh yeah, Rod Chapman and others at Praxis did amazing work. It's probably the
single most practical tool in engineered software since abstract state
machines for specification or Cleanroom for building. Their error rate was so
low.

The traditional problem, though, was that some tools were good for specifying
full correctness of high-level models while others were good for verifying
properties of the code itself. They rarely did both well. That's still true
with SPARK. Before DeepSpec, I was already pushing for mixing them up a lot:
different tools on different components, with some sort of unified logic. You
might find it interesting that one of those combinations was SPARK with the
successful Event-B method. I'll give you a paper showing how hard it is to
use Event-B by itself down to low-level code, followed by one showing how a
combo improves things.

[http://eb2all.loria.fr/html_files/files/landingsystem.pdf](http://eb2all.loria.fr/html_files/files/landingsystem.pdf)

[http://journal.ub.tu-
berlin.de/eceasst/article/view/785/782](http://journal.ub.tu-
berlin.de/eceasst/article/view/785/782)

One last thing before I head out: the landing-system paper is great for
illustrating the difficulty of correctness. It shows a small number of
straightforward requirements and design specs. Then, verification conditions
are derived down to the low-level code, all of which must be maintained true
for total correctness. As in, it makes explicit all the stuff a developer
must get right in their head to correctly solve this simple problem. I find
it's a nice reality check on the intrinsic complexity of software and how
that affects verification, even for people who don't know the notation.
Especially why you'd want to use simpler, automated methods rather than hope
to test your way out of that level of complexity with corner cases. ;)

~~~
touisteur
Thanks for the paper.

It's indeed a game-changer to be able to mix Ada 2012, SPARK, and C in an
industry-grade programming environment. You can gradually convert a legacy
codebase. I've been replacing some crufty C parts with Ada-with-SPARK-
restrictions, then full-SPARK-with-AoRTE-proof, and even some functional
proof, with great success, and can now stop worrying about crashes in
security-sensitive attack-surface parts. Network/file interfaces, etc. Most
of my experiments are very encouraging! Coupled with CodePeer and/or afl-fuzz
for Ada code, almost-bug-free code becomes somewhat more accessible.

Rod Chapman is also a great showman with great success stories, also very
funny... and Yannick Moy at AdaCore is an amazing teacher and doing great work
leading Spark further, including on small embedded targets.

I think you're doing great work around here reminding everyone once in a while
about Ada & Spark, especially in articles that devolve into 'Rust-would-have-
caught-it' :-).

~~~
nickpsecurity
Appreciate the kind words. I plan to get back to experimenting myself. One of
my ideas is coding in both Rust and Ada/SPARK, letting the borrow checker
catch temporal errors while Ada/SPARK handles the rest. Optionally a C
version too (or an Ada-to-C compiler, if Taft's is still available) so I can
throw every free analyzer or tester for C at it. As far as apps go, check out
Ironsides DNS, which was done in SPARK. Borrow-checking and testing it will
be my first project if I get back into Ada.

Note: There's also work right now on a verified front-end for SPARK to
integrate it into CompCert. Then, the ASM is verified as well. Exciting times.
:)

EDIT to add: Yeah, the "Rust would've caught it" bit was a bit annoying.
Worst was them saying it was the first safe systems language. pjmlp and I
demolished that one, though. Now we just get the other claim, which is at
least accurate if it's memory bugs. Progress. (Shrugs)

~~~
touisteur
Well then you're going to be happy: there's a new SPARK-to-C compiler.
[http://blog.adacore.com/spark-tetris-on-the-
arduboy](http://blog.adacore.com/spark-tetris-on-the-arduboy) Maybe that's the
one you're thinking of?

Also, I heard Rod Chapman (at the last AdaCore tech day) say they had great
success using 'the' Ada-to-C compiler some years ago. It looks like AdaCore
still has the code for it, though they're not doing much with it. You might
get the code if you ask nicely (maybe T. Taft directly, now that he works for
AdaCore) :-D.

If you get any new ideas on some combination of SPARK, Ada, and other tech,
I'd send Yannick Moy an email :-D. Always helpful and insightful.

If you plan on doing your borrow checker as some kind of new static-analysis
pass for Ada, I'd also look into libadalang; there's some great (open
source!) progress there. I also played a lot with gnat2xml, but I don't
recommend it: it works, but it was painful...

I also wondered some time ago about a SPARK-to-Rust compiler. But I'm waiting
for my Rust skills to improve first.

Anyway, it's very strange to see Rust so often mentioned for embedded
software. I get the interest for 'normal' unconstrained software. But after
10 years of large-scale real-time embedded-systems programming in Ada, I have
maybe faced one memory-safety bug that the borrow checker would have
prevented. Mostly the bugs you get are constraint_errors (overflow,
underflow, array bounds...), most of them already caught by the type system
at compile time or by static analysis. The rest are mostly 'real' bugs: logic
or temporal. You never doubt the state of the program when reading the code;
it's such a lower cognitive load... Going back to C from time to time is so
painful, especially when you think of all the help you're missing.

When in need of a pointer, go for not-null access types; when in need of
dynamic allocation, go for controlled types (scoped deallocation) and storage
pools... Everyone (except Haskellers...) is missing out on so much; it feels
like living in the future, where the compiler works (a lot!) for you. Sorry
for the rant...

By the way, I found one of the ways to catch temporal errors in SPARK (other
than uninitialized variables, which are already caught) is to use an
enumerated state variable and check it all the time (in pre/post conditions
or invariants). It's almost always a good idea anyway to have an explicit
state instead of some 'nah, no one will call this code in such a
sequence...'.

Also interesting: like functional programming, the borrow checker, or strong
typing, SPARK changes the way you think about your code. You think about how
'provable' your design and code are, and that makes the code more readable
and simple. You can also just start with data-flow annotations (also great
for safety- and security-critical code); they make reading code easier.

I'll be looking out for your next dive back into Ada then !

------
norswap
> Some day, I would like to design a language that gets this right. But for
> the moment, I remain focused on Sandstorm.io. Hopefully someone will beat me
> to it.

Two languages that do it: Whiley ([http://whiley.org/](http://whiley.org/))
and Dafny ([http://rise4fun.com/dafny](http://rise4fun.com/dafny)).

I expect (but am not sure) that their proving power is about the same as what
can be achieved by template metaprogramming.

------
winstonewert
I'm unclear on how this would work out practically. Cases where my integers
belong to a fixed range seem rare. I'd like to have seen more discussion of
how it caught the bugs it did.

~~~
kentonv
Cap'n Proto by its nature is doing a lot of work with integers that represent
sizes, indices, offsets, addresses, etc. These tend to have well-defined
ranges -- and unusually bad failure modes. So it works unusually well here.

------
danschumann
Boo, I thought it was a reference to fictional Tom Paris's holodeck program

