
Formal proof and analysis of an incremental cycle detection algorithm - lelf
http://gallium.inria.fr/blog/incremental-cycle-detection/
======
superpermutat0r
I remember reading somewhere about programs being turned into generating
functions, with the asymptotic complexity then following from that. Not just
the worst case, but also the best case and average case.

It did not require a formal proof. It just turned code into generating
functions.

It was probably work by Philippe Flajolet, also at Inria.

I think I also remember it being used to analyze memory reads and writes in
addition to comparison operations.

Really great stuff.

~~~
Armael
(author here)

There are indeed several interesting lines of work on automated approaches to
complexity analysis, using various techniques. To name a few, these include
the very cool work of Jan Hoffmann and others on RAML
([http://raml.co](http://raml.co)), relying on a type-based analysis, or the
work of Dan Licata, Norman Danner and others on "recurrence extraction", which
is closer to what you mention (a program is turned into a recurrence equation
which can be in turn analyzed further to deduce its complexity).

The strength of the approach (based on a program logic and formal proofs)
that we demonstrate in the work linked here is basically that it scales up to
very complicated programs and/or very subtle complexity analyses.

Automated approaches are intrinsically limited in the class of programs they
can handle (often, they either handle purely functional programs, or first-
order integer-manipulating C-like programs), and in the class of complexity
bounds that they can infer without human assistance (most of the time, only
polynomial bounds, no log/polylog/...).

We do require a (very qualified!) human to carry out the formal proof, but in
return we get a system that can handle higher-order code with mutable data
structures, and subtle complexity analyses that _do depend on the functional
correctness of the overall algorithm_. This is something that an automated
tool cannot hope to achieve in a push-button fashion.

In case some people would like to know more about the various approaches to
(formal) time-complexity analysis, I should mention that the related work
section of my PhD thesis contains such an overview :-)
([http://gallium.inria.fr/~agueneau/phd/manuscript.pdf](http://gallium.inria.fr/~agueneau/phd/manuscript.pdf))

~~~
anonsivalley652
Nice work.

Formal methods, automated theorem provers and safer languages have come a
long way since I started in the era of staring at glowing CRTs and fiddling
with keys. To me, if we can't prove we're building things on a solid
foundation, we're just operating one order of magnitude removed from writing
code without tests.

0) Prove the compiler is correct.

1) Prove the OS is correct. (seL4, for example)

2) Prove programs are correct.

I think we're rapidly approaching barriers in effort, error-detection, and
cognitive abilities that even the best handful of highly-skilled homo sapiens
sapiens have difficulty proving/doing/scaling. At what point (or tapering era)
do we more seriously consider systems for autonomously generating code# with
automated correctness, worst-case and timing analysis? We have massive,
massive code-bases of Linux, LLVM, OpenBSD, etc. without formal proofs but
with unknown-unknown numbers of bugs... just the other day, someone working in
Rust ran into an LLVM bug involving bounds checking. (Yikes!)

# Let's first make it easy by constraining the solution-space to simplistic,
imperative, SSA w/ liveness (lifetimes) logic.

------
chubot
Not sure how related this is, but one thing I found interesting is that the
recent QuickJS by Bellard uses ref counting + cycle collection.

[https://bellard.org/quickjs/quickjs.html#Garbage-collection](https://bellard.org/quickjs/quickjs.html#Garbage-collection)

That is, it's the same strategy that CPython uses (although CPython's detector
is generational). People seem to say that CPython's design is old or
suboptimal, but a brand new interpreter made the same choices.

And as the manual says, a major reason is that the interpreter can hold
pointers to values without any special consideration. In contrast, v8 has to
carefully wrap everything in Handle<>, and there are a bunch of elaborations
on that which seem tricky to me. Then again, doing ref counting with 100%
accuracy by hand also seems tricky to me.
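The cycle problem this strategy has to deal with is easy to demonstrate in CPython itself: pure reference counting never reclaims objects that point at each other, so the interpreter's cycle detector (exposed through the `gc` module) has to find them. A minimal sketch, where the `Node` class is just an illustration and not anything from QuickJS or CPython internals:

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

gc.disable()              # rely on reference counting alone
a, b = Node(), Node()
a.ref, b.ref = b, a       # two-node reference cycle
del a, b                  # refcounts stay nonzero, so the pair leaks

# The cycle collector finds and frees the unreachable pair.
collected = gc.collect()
print(collected >= 2)     # True: at least the two Nodes were reclaimed
gc.enable()
```

`gc.collect()` returns the number of unreachable objects it found, which here includes the two `Node` instances (and their instance dicts) even though automatic collection was disabled.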

~~~
andreareina
AIUI, part of the trade-off between reference counting and tracing is ease of
implementation vs. performance. I also remember reading somewhere
(unfortunately the where is lost to the mists of my memory) that, as the
reference implementation, simplicity was an explicit goal of CPython, to the
extent that there's little optimization done to the bytecode.

