
Coming Soon: Machine-Checked Proofs in Everyday Development - fuklief
https://media.ccc.de/v/34c3-9105-coming_soon_machine-checked_mathematical_proofs_in_everyday_software_and_hardware_development
======
haskellandchill
My current reading list:

\- Fundamental Proof Methods in Computer Science

\- Handbook of Practical Logic and Automated Reasoning

\- Software Abstractions

\- The Little Prover

Yes, I currently have Athena, OCaml, Alloy, and ACL2 (via DrRacket/Dracula) all
up and running to work through the exercises in these books. I'm very
skeptical anything will come of this, but it interests me and I can see the
value.

Hopefully the speaker's next book, Formal Reasoning About Programs
([https://github.com/achlipala/frap](https://github.com/achlipala/frap)), will
be a nice next step. I have Certified Programming with Dependent Types but
have not cracked it yet. I'm also keeping an eye on Verified Functional Algorithms
([https://softwarefoundations.cis.upenn.edu/vfa-current/index.html](https://softwarefoundations.cis.upenn.edu/vfa-current/index.html)).

~~~
meneame2
Three extra classics:

* [https://books.google.es/books?id=YseqCAAAQBAJ](https://books.google.es/books?id=YseqCAAAQBAJ)

* [http://www.concrete-semantics.org/](http://www.concrete-semantics.org/)

* [https://softwarefoundations.cis.upenn.edu/](https://softwarefoundations.cis.upenn.edu/)

~~~
adultSwim
+1 for Software Foundations

------
jondubois
Mathematical proofs of code are not suitable for most systems.

Even unit tests, which don't involve proofs, can be a problem sometimes because
they lock down API inputs and outputs. It's already a major problem when a
developer wants to change some code and has to spend hours updating unit
tests every time. In small teams, heavy 100%-coverage unit testing slows
down productivity, possibly by 5x or 10x.

I imagine that adding proofs as part of the unit test suite would further
increase the productivity cost overhead by a massive factor.

In software development, you should avoid treating the code as if it's
special or perfect, because it's not, and it will need to be changed in the near
future when requirements change (and they will always keep changing). So
locking it down into a specific state with proofs all around it is a bad idea
for the vast, vast majority of cases.

We need more wisdom in software development and that means taking a step back,
looking at the big picture and asking questions like what are the negative
side effects of each new methodology that we are introducing into our
projects? There are always downsides, and we're crazy to pretend that every
fancy new methodology is a silver bullet.

~~~
hmottestad
It's hard to know the intent of a developer by reading their code. You could
always add a bunch of comments, which would help, but in the long run you risk
them getting out of date.

My unit tests are basically "comments" for my code, that show me how the code
is supposed to work. When I make changes, the tests will tell me if I broke
the intentions of the initial developer.

Sure, when making big changes, just updating and maintaining the tests is a lot
of work, but so is maintaining documentation and comments. So we have
less documentation, fewer comments, and more tests.
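
The idea of tests as executable documentation can be sketched like this (the function and its intended behavior are invented purely for illustration):

```python
# Sketch: a test acting as executable documentation of intent.
# The function and behavior here are made up for illustration.

def slugify(title: str) -> str:
    """Turn a title into a URL-friendly slug."""
    return title.strip().lower().replace(" ", "-")

def test_slugify_documents_intent():
    # The test records the original developer's intent more durably than
    # a comment would: if a later change breaks it, the test fails
    # instead of silently rotting.
    assert slugify("  Hello World ") == "hello-world"

test_slugify_documents_intent()
```

Unlike a comment, this "documentation" is checked on every run, so it can't quietly drift out of date.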

~~~
jondubois
I find integration tests more useful for documentation because they show you
how the system works at a higher level which is precisely what new developers
need to understand.

If the code itself is written properly, then the individual classes and objects
used throughout should have high cohesion and be easy to read without needing any
comments or tests.

There are always cases where unit tests are important, though, like the parts of
the code that deal with money.

But there is no benefit in adding unit tests if you're building a standard
chat system for example; integration tests are sufficient in that case.

Also getting 100% test coverage is a complete waste of time if you don't
actually test the right kinds of input and output parameters.

~~~
zxcmx
I think a part of the difficulty is in the language of coverage.

"100% test coverage" usually means statement coverage due to the nature of
popular tooling.

If you deeply care about correctness, you are usually more interested in _path
coverage_, which is more tied to things like heap shape or state space
(which, depending on your system design, might encompass stuff in your
database). Also, yeah, technically the exceptions that can be thrown at each
point in your software, if you work in a language with those.

Unfortunately there's not much tooling for that so it's not visible in the
same way as the more naive metrics.

Anyway, my point is that you can waste time pushing for "100% statement
coverage" as measured by naive tools and still not exercise important paths,
which I think is your point.
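
The gap is easy to demonstrate with a made-up function: the two tests below execute every statement (and even both outcomes of each branch), yet never exercise the one path that crashes.

```python
# Sketch: 100% statement coverage without covering every path.
# The function and values are invented for illustration.

def scale(x: int, invert: bool, offset: int) -> int:
    y = x
    if invert:
        y = -y           # statement A
    if offset:
        y = y + offset   # statement B
    return 100 // y      # crashes when y == 0

# Together these two tests execute every statement (A via the first,
# B via the second), and even cover both outcomes of each branch:
assert scale(5, True, 0) == -20   # 100 // -5
assert scale(5, False, 5) == 10   # 100 // 10

# But the path taking BOTH branches with x == offset is never exercised:
# scale(5, True, 5) divides by zero, despite "100% statement coverage".
```

A coverage tool reporting on statements (or even branches) would show nothing left to test here; only path-level reasoning reveals the failing combination.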

~~~
z3t4
I want to clarify that the whole point of unit tests is to have 100% coverage:
that all code paths are exercised by the tests. If you do unit testing and
have code that is not covered, it's either a bug or dead code that should be
removed.

------
jpochtar
We already have a basic form of machine-checked proofs in everyday code: type
systems. Type systems encode and prove simple theorems like "f(x: int) -> int
returns an int if x is an int". TypeScript and Mypy show us that these proof
checkers don't need to be built into a language. Even better, they show that
proofs can be over subsystems of a program, since TypeScript and Mypy both
allow for partial/incremental typing. We just need to extend these systems to
support more complex theorems.
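
A minimal sketch of both points in Mypy-style gradually typed Python (the function names are illustrative, not from any real codebase):

```python
# Sketch of the "simple theorem" a type checker proves, and of
# partial/incremental checking. Names here are invented for illustration.

def double(x: int) -> int:
    # Mypy verifies the theorem "if x is an int, double(x) is an int"
    # at every call site, without ever running the code.
    return x + x

def untyped_helper(x):
    # Unannotated code is left dynamically typed. This is what makes
    # checking partial/incremental: one subsystem can be fully verified
    # while the rest of the program stays unchecked.
    return x + x

result = double(21)           # checked statically
loose = untyped_helper(21)    # unchecked; errors surface only at runtime
```

Running `mypy` over this file checks only the annotated subsystem, which is exactly the "proofs over subsystems" property.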

~~~
bcherny
Incremental type systems are super interesting. Lots of work has been done in the
last 10 years that blurs the lines between static and dynamic typing
(MonkeyType, Reticulated Python), types and values (via the triumph of
inferred types), kinds and types (powerful trait systems like Scala's), and
even values and code (code synthesis like MagicHaskeller, or Excel's
FlashFill).

A bit of self-promotion: I touched on this in a recent blog post if you're
interested in this sort of thing:
[https://performancejs.com/post/hde6a33/JavaScript-in-2017:-Year-in-Review,-Predictions-for-2018](https://performancejs.com/post/hde6a33/JavaScript-in-2017:-Year-in-Review,-Predictions-for-2018)

------
qznc
Machine-checked, usable compilers exist (CakeML, CompCert). Usable kernels
exist (seL4). A few people officially work as proof engineers.

The future is here, but not evenly distributed. ;)

~~~
haskellandchill
Were you able to find a position? You are qualified, but it's very
competitive: few jobs, and many PhD-level candidates with experience.

~~~
qznc
I didn't try. I'm only looking in southern Germany, and afaik nobody does that
here.

------
adamnemecek
Cryptocurrencies and digital contracts are going to pump an insane amount of
money into this.

~~~
seanwilson
I'm not so sure, to be honest. I've been super surprised at just how few
high-profile hacks there have been against cryptocurrency protocols so far,
even with clients implemented in C++. I wish I could explain why, because
you'd think all you'd need is a small flaw to attack the whole network
when most nodes are running the same software.

~~~
firethief
Normal security practices are applicable to the node/wallet software. Smart
contracts are another story -- think of the DAO heist. Short programs whose
behavior determines the fate of $$$ are exactly the kind of thing where
verification excels.

------
Animats
In ten years.

That's where I thought we were in 1983 - ten years away.

~~~
wolfgke
> In ten years.

> That's where I thought we were in 1983 - ten years away.

Obligatory xkcd:

> [https://xkcd.com/678/](https://xkcd.com/678/)

~~~
MikkoFinell
> Obligatory xkcd

God I cringe every time someone posts that. It's like the "Big Bang Theory" of
internet comics.

------
ted_dunning
I think the classic ironic demonstration of the limits of proof systems
is the fact that CoqIDE 8.7.1 crashes immediately on OSX if you try
to cut or paste anything.

The problem, it turns out, was that they included a new version of OCaml that
was incompatible with the older version of GTK being used. Result: crash
due to a segfault.

We really need both proofs and pragmatics. Proofs can give us confidence in
the fundamentals when used appropriately for the processor, or for consensus
algorithms like Raft, or key parts of the OS. Tests and statistical analysis
can give us confidence that we haven't overlooked something crucial.

Some systems obviously aren't very appropriate for formal methods (take
machine learning, for instance: there can be no formal spec for what fraud
looks like). There, statistics and testing will have to suffice, because we have
clear value on average even if we cannot formally guarantee the system.

Other systems are much more appropriate for formal methods. I should hope that
a heart monitor, CPU, voting machine or aircraft flight management system will
have substantial amounts of formally proved code (I know that I am dreaming
with the voting machines; let's settle for getting it down on paper). I don't
want to collect a statistical sampling of whether an airplane works as we
iteratively work out the bugs and specs in an agile fashion.

------
mcguire
Good talk!

So, one question: Adam gives one example of a proof where the initial
automation fails because it does not consider a theorem that was previously
proved. He also gives an example of some complex code where the proof is too
tedious to do without lots of automation. What happens when your complex code
changes in such a way that you need some extra lemmas but don't know what they
are?

~~~
yaantc
It's manageable in practice, I believe, and Adam briefly shows a feature
related to this in the demo.

First, if new features are added with associated properties, then some
theorems/lemmas related to those properties will have to be added and proven.
I don't think that part is problematic, because it's a "local" decision: the
work is focused on the change.

The possible problems come from the impact of these changes on existing
results. If you need to go and make other changes all over the place, it
wouldn't scale. But that's where automation and Coq's hint databases can help.
In the demo example you referred to, Adam adds the lemma as a rewrite hint, and
then it's picked up by Coq's automation elsewhere. That's how local changes can
automatically be leveraged globally by proof automation in existing proofs,
with no changes (OK, not always, but it limits the impact).

------
skybrian
What progress has been made recently in making writing proofs easier? Are
there any research projects targeting mainstream programmers?

~~~
agentultra
Plenty of advances in logic and category theory have enabled dependently typed
languages to be both theorem prover and practical implementation language.

See Lean.
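
A tiny sketch of what that looks like (Lean 4 syntax, which may vary between versions): the same file contains an ordinary program and a machine-checked theorem about it.

```lean
-- A minimal Lean sketch: one tool checks both the program and the proof.
-- Definition and theorem are invented for illustration.

def double (n : Nat) : Nat := n + n

-- A machine-checked theorem about the program above.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

Here `double` is a runnable function, and the theorem below it is verified by the same type checker, with `omega` discharging the linear arithmetic.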

~~~
vladf
I'm interested, but "Lean programming language" and its variants are
difficult to Google. Would you mind posting a complete reference?

~~~
cf
I think this is a good application paper for Lean, showing how you can use it
to develop machine learning algorithms.

[https://arxiv.org/abs/1706.08605](https://arxiv.org/abs/1706.08605)

------
crb002
We already have them: compiler type checks. Though having an SMT
solver in your unit tests is nice too.
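
What an SMT-backed unit test buys you is a property asserted over *all* inputs rather than a handful of examples. A real setup would hand the property to a solver (e.g. via Z3's Python bindings); the stdlib-only sketch below substitutes bounded exhaustive checking for the solver, so the function and bounds are purely illustrative:

```python
# Sketch of a property check of the kind an SMT solver would discharge
# symbolically for all ints. Here we check a small bounded domain instead
# so the example stays stdlib-only. Function and ranges are invented.

def clamp(x: int, lo: int, hi: int) -> int:
    return max(lo, min(x, hi))

def check_property() -> bool:
    # Property: for all inputs with lo <= hi, clamp(x, lo, hi) lies in
    # [lo, hi]. A solver would prove this once; we enumerate a bounded box.
    for x in range(-10, 11):
        for lo in range(-5, 6):
            for hi in range(lo, 6):
                r = clamp(x, lo, hi)
                if not (lo <= r <= hi):
                    return False
    return True

assert check_property()
```

The difference in a real SMT setup is that the quantifier is handled symbolically, so the check covers the full integer domain instead of a sampled box.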

~~~
UncleMeat
Type checking is (usually) intraprocedural and flow-insensitive. That's what
makes it easy. The other stuff is harder.

