
How did software get so reliable without proof? (1996) [pdf] - elcapitan
http://www.gwern.net/docs/math/1996-hoare.pdf
======
jedharris
Thanks for this! Hoare (partially) answers a question I asked in the comments
on "The Little Typer" (a good treatment of programming with dependent types).

That approach to dependent types brings our code closer to being verifiable.
Unfortunately as a number of commenters testified it also makes software
(currently) impossible to scale to complex algorithms, dependence on loosely
coupled libraries, and so forth.

Since Hoare wrote 22 years ago, I'm now wondering if there's a fundamentally
intractable problem with improving formal validation of complex software.
Dependent types _seem_ like a natural way of embedding the programmer's
assumptions and reasoning into software, but clearly current technology in
that area is still very very far short of helping with software at current
scale.

Meanwhile our practical means of building complex software have improved
considerably and we can develop systems that are reliable "enough", even at
enormously greater scale than Hoare anticipated. The gap between viable
software complexity and formal methods of describing software semantics has
grown much wider since Hoare, rather than shrinking.

I wonder if we could step back from this style of formal methods, and look for
an alternative approach. We need ways to capture the actual reasoning and
methods that work for programmers building complex systems -- rather than
trying to pretend they are mathematical objects.

~~~
mikekchar
For me, notation is everything. A computer program is a model of a problem.
There is more than one representation of that problem: in the static code and
in the running code. If you go back to the kind of naive way of writing
programs (that we all went through when we were first starting out), you write
the code and then step through it (either on paper or with a debugger),
watching how the code runs.

There are many techniques available for helping you reason about your code.
However, it's important to understand that software code is a notation for
modelling the problem. Often formal methods introduce _another_ notation that
proves that the encoding of the problem in the first notation is "correct"
according to some definition of "correct". The more we move into modelling the
correctness in the second notation, the more we move the burden of reasoning
about the issue to the "code" in the second notation.

It's often very useful to model the problem more than once. For example, I'm
quite a big fan of unit tests. "Given the code in the system, I expect X to be
true/false in this situation". I'm just as likely to make mistakes in my unit
tests as I am with the production code. However, it's the comparison of the
two that allows you to reason effectively about the code. They are two sides
of the same coin (intentionally coupled -- which, as an aside, is why you
should avoid trying to decouple your tests from your production code!!!).
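
To make that concrete, here is a minimal sketch of the "two models" idea
(hypothetical names, written in Haskell only because it comes up later in this
thread): the test restates the expectation independently of the production
code, and a disagreement between the two points at a bug in one of them.

    -- Production code: a hypothetical function under test.
    applyVat :: Double -> Double
    applyVat net = net * 1.20

    -- The second model: assertions restating what we expect to be true,
    -- independently of how it is implemented. When the two models
    -- disagree, one of them is wrong.
    main :: IO ()
    main = do
      check "20% VAT on 100" (applyVat 100) 120
      check "VAT on zero"    (applyVat 0)   0

    -- Compare with a small tolerance since these are floating-point values.
    check :: String -> Double -> Double -> IO ()
    check label actual expected
      | abs (actual - expected) < 1e-9 = putStrLn ("PASS: " ++ label)
      | otherwise = putStrLn ("FAIL: " ++ label ++ ", got " ++ show actual)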

Even though it is useful to model twice, I'm not really sold on the idea of
doing it in general (and I have a very specific idea of how to write unit
tests that's heavily coloured by this point of view). Everything that you add
to the code should make reasoning about something _easier_ and _more
convenient_ than not doing it. Otherwise you are better off with just writing
the production code. You will have less complexity to deal with overall.

In my opinion, a good example of a tool that helps enormously is a strongly
typed language _with_ a type inference system. For a small penalty of
type notation, the system can tell me, without a doubt, when I've made a
mistake with the types of my variables. Not only that, but the notation itself
allows me to reason about why I've gotten myself into a problem. It can even
help me reason about what the function does, without having to even look at
the implementation. This is incredibly helpful.
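
As a small, hypothetical illustration in Haskell (not from any real codebase):
the compiler infers a type for an unannotated function, an explicit signature
documents the intent, and a call with the arguments mixed up is rejected
before the program ever runs.

    -- No annotation needed: GHC infers discount :: Num a => a -> a -> a
    discount rate price = price * (1 - rate)

    -- Spending a little notation on more descriptive types lets the
    -- checker tell us, without running anything, when we confuse them.
    newtype Rate  = Rate  Double
    newtype Price = Price Double

    discount' :: Rate -> Price -> Price
    discount' (Rate r) (Price p) = Price (p * (1 - r))

    main :: IO ()
    main = do
      print (discount 0.1 100)            -- works, but nothing stops swapped arguments
      -- discount' (Price 100) (Rate 0.1) -- rejected at compile time: arguments swapped
      let Price p = discount' (Rate 0.1) (Price 100)
      print p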

I think as we go forward, we need to be looking for this kind of tool.
Dependent types seem like a perfect example of this. However, there is always
going to be a small price to pay. Writing code in a statically typed language
is harder than writing code in a dynamically typed language -- precisely
because the dynamically typed language allows you to write code without having
to think up front about the consequences. Of course, you may leave bugs in
your code because of it, but the initial cognitive load is smaller. Similarly,
it's much better to write code where you can reason about the run time
behaviour without it running, but that is much harder than writing some code
and walking through it, playing "computer".
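
As a toy sketch of that trade-off (using Haskell's type-level machinery to
approximate the dependent-types idea; hypothetical code, not a full
dependently typed language): the length of a list is tracked in its type, so
taking the head of an empty list is a compile-time error rather than a
run-time one, at the cost of some extra up-front notation.

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Z | S Nat

    -- A list whose length is part of its type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- Only accepts non-empty vectors; no case for VNil is needed,
    -- because the type rules that call out entirely.
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x

    main :: IO ()
    main = do
      print (vhead (VCons 1 (VCons 2 VNil)))
      -- print (vhead VNil)  -- rejected by the type checker, not at run time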

We need to balance these kinds of factors because a tool that is not used is a
useless tool. I'd be perfectly happy to work in a pure functional, statically
typed language because it is _much_ easier to reason about my code. Many of my
colleagues simply are not willing (or possibly even able) to pay the upfront
cost of working in that kind of ecosystem. Hence, I'm not able to work in it
either.

~~~
thrmsforbfast
This was a great comment.

The fact that new notations/representations (be they tests, simulations,
formal methods, etc.) must justify their additional cognitive overhead is a
great point.

Aside from unit and integration tests, I think that substantial new
notation/infrastructure is most justifiable when the second notation/setting
allows you to model parts of the problem that can't be modeled in the original
notation.

Embedded software is a good example. Although there are some tools that allow
you to model the physical system alongside the control software, typically the
code is written in C and so there's no way of expressing very important
aspects of the problem in the implementation language. So a second notation
can help. For example, simulators and HiL setups for testing or things like
[https://github.com/LS-Lab/KeYmaeraX-release](https://github.com/LS-Lab/KeYmaeraX-release) for formal methods.

Security is another good example. Typically code does not include an explicit
threat model or explicit security properties, and these aspects of the problem
are often scattered across the code and difficult to reason about. So being
able to capture those things directly in a second notation can add a lot. For
example, automatic vulnerability scanners for testing, or
[https://tamarin-prover.github.io/](https://tamarin-prover.github.io/) for formal methods.

Distributed/fault-tolerant systems are yet another example where it's very
hard to state the thing you want to state in the implementation language.
Netflix's chaos monkey comes to mind.

------
grandinj
We've gotten to where we are using slow incremental improvements. Better
tools, better compilers, vastly better linters, unit testing, CI, etc. Now and
then we manage to take some academic research and turn it into practical
stuff, e.g. the current generation of linters like FindBugs, Coverity, etc.

And that is pretty much as it has always been, a slow drip of practical
improvements fed by theoretical research.

------
nikofeyn
was 1996 really that different than today? because reliable is the last word i
would use to describe software today.

~~~
passwd
It surely is different. We have at least managed to contain some of the issues
in normal operation. Back in the day, crashing the whole OS in BSOD/panic
style was more common, drivers were unstable, and things written in
not-so-safe languages were more popular.

So ultimately it depends on what you mean by reliable - by itself,
functionality might not be so much better, but yes, sometimes you don't mind
as much as you would have back then.

------
threatofrain
As a small data point, I'd say that Leslie Lamport claims Amazon has used
TLA+ on a good number of big projects; many people think of formal proving as
an activity that is layered separately on top of programming, as opposed to
being expressed in a proving language itself. Also, when people use Rust, are
they using proving techniques?

~~~
steveklabnik
Not yet; we’re still working on the semantics of the language, and proofs for
it; you need that before you can prove things about the code in the language.
Or at least, the stuff we’re most interested in.

The EU is funding some of this work.

~~~
timClicks
For people interested in looking deeper, this work is called RustBelt. Useful
web pages are the project's homepage
([https://plv.mpi-sws.org/rustbelt/](https://plv.mpi-sws.org/rustbelt/)) and
the research paper describing the project
([https://dl.acm.org/citation.cfm?doid=3177123.3158154&preflayout=flat](https://dl.acm.org/citation.cfm?doid=3177123.3158154&preflayout=flat))

------
exabrial
1996 being the pre-Windows 98 era

------
repolfx
Hmm. The paper argues that a predicted software crisis caused by unreliability
didn't materialise, and software became tremendously more reliable, despite
industry essentially ignoring decades of research into formal methods and
proof systems.

With this I agree. I'd also add another couple of factors that came into play
perhaps too late for a paper written in 1996: successfully bringing garbage
collection and managed-memory languages to widespread usage, starting with
Java and JavaScript, and the growth of the web, which allowed much more
software to be written in server-side scripting languages like Perl and
Python. And the deployment of crash reporters and online update tools, which
massively tightened the feedback loop between crashes occurring in the wild
and fixes being deployed.

Intuitively at least, I feel these two things made a huge difference to the
reliability and scalability of software. It's typical for large Java or
JavaScript programs to now incorporate so many libraries that dependency
version management is a significant pain point that is bottlenecking the
growth in software complexity. In the 1990s the closest you got to this was
DLL hell and that was more of a Windows deployment issue - the software worked
fine on the developer's box, it just didn't work when deployed to OS installs
that had different bits on them. Now we face issues where dependency graphs can
pull in two or even three different incompatible versions of the same library
_on the developer's machine_. This is a testament to the new scales we are
exploring in software complexity.

However, the paper then goes on to make some possibly very dubious arguments,
I suppose because the author is a famous academic and maybe doesn't want to
contemplate the obvious conclusion. He says:

 _The conclusion of the enquiry will be that in spite of appearances, modern
software engineering practice owes a great deal to the theoretical concepts
and ideals of early research in the subject; and that techniques of
formalisation and proof have played an essential role in validating and
progressing the research._

 _However, technology transfer is extremely slow in software, as it should be
in any serious branch of engineering. Because of the backlog of research
results not yet used, there is an immediate and continuing role for education,
both of newcomers to the profession and of experienced practitioners. The
final recommendation is that we must aim our future theoretical research on
goals which are as far ahead of the current state of the art as the current
state of industrial practice lags behind the research we did in the past.
Twenty years perhaps?_

This is a very problematic set of conclusions which speaks to more general
concerns growing in my mind in recent times about the value of much academic
research.

Because of course, it's now more than 20 years since C.A.R. Hoare wrote this,
and yet formal methods are no more used now than they were in 1996. It's not
that there's a backlog of research that isn't used. It's that academia has
gone down this rabbit hole for decades and _it has never been widely used_. 40
years and hardly any success! Why would anything be significantly different
given another 20?

The paper does an admirable job of trying to congratulate everyone, whilst
putting a brave face on lack of adoption - saying that a 20 year lag time
between research and adoption is perfectly reasonable and a sign of maturity
in the software industry. With another 20 years of hindsight, we can perhaps
see now that industrial practice isn't lagging behind academic research, when
it comes to formal methods and proofs. It's written it off and ignored it for
good. The closest industrial practice gets to this is we use slightly stronger
type systems now, but nothing a programmer teleported from 1996 wouldn't
recognise.

~~~
setr
At least one aspect would be that stricter guarantees don’t really matter,
except when they do. If ease of use, learning, compile time, efficiency, etc.
were equivalent to a non-strict language, then presumably industry would
naturally make the leap (ignoring the inherent inertia of established norms
and actual code). It would be dumb not to, if you could get fewer bugs for free.

But they’re obviously not equivalent, and most businesses only care about
correctness up to a certain limit (at which point it’s superseded by other
factors; diminishing returns and all that), so formal guarantees will
naturally only appear in common use up to that limit.

To truly claim futility of formal systems, you should be looking at the guys
who prioritize them over basically anything else: if, say, even space companies
couldn’t give a shit about advancements in formal systems, then it’s clearly
divorced itself from practicality. If NASA prefers JavaScript over whatever
formal system, then clearly the academics have gone down a dark road.

But at the same time, is it the academic’s job to build practical systems, to
advance industry? Isn’t that the role of the industrial researcher? As I
understood it, academia’s ideal role is simply to advance human understanding
in interesting ways: whether it has immediate practical consequence _should
not_ be in their domain, i.e. research ML decades before it has any real hope
of being usable. It’s industry itself that takes on the role of poring over
the academics’ work and figuring out whether it can be used for anything. And
whether it’s cost-effective.

And if industry specifically wants academia to look into things of immediate
interest, they shouldn’t be waiting on universities for it; they should be
hiring an academic to specifically research it.

~~~
repolfx
Resource allocation is a zero sum game. If taxes are being levied to pay for a
vast intellectual class living in ivory towers, that's less funding available
to pay for industrial research.

Academics love to claim that practical application should not be any concern of
theirs. But why should this be so? After all, they eat practical food and live
in practical houses. Someone has to pay for it all. And the idea, espoused in
this essay, that _eventually_, on a _long enough_ timescale, their research
will all be adopted and it's just that industry is super slow to recognise the
awesomeness of their research and productize it - well, this claim seems to be
getting weaker all the time. We now have many areas of computer science that
have spent decades going down very expensive rabbit holes with no serious
adoption and none on the horizon.

Not just formal methods, but pure FP languages like Haskell remain rarely
adopted despite huge hype and massive academic investment. A few days ago
there was a post asking what happened to the semantic web, with many commenters
positing that the semantic web became an academic money pit that absorbed huge
quantities of grant money and delivered nothing whatsoever. I'm sure there are
many other examples lurking if you scratch the surface.

If academic funding was cut significantly and returned to companies in the
form of lower corporation taxes, perhaps connected to research incentives,
would it be so bad? There'd probably be less research done overall but if lots
of current research goes nowhere and is never adopted that'd be a net win.

~~~
int_19h
What do you make of corporations subsidizing seemingly pure academic research,
then? Take the aforementioned Haskell, for example - you speak of "massive
academic investment", but Simon Peyton Jones is still a Microsoft [Research]
employee, and Simon Marlow is a Facebook employee (and former MSR as well)!
And yes, you don't see Microsoft use Haskell in its products - but the ongoing
funding clearly indicates that there's substantial perceived value in that
investment. Given the amount of time this has been going on, surely that's
more than can be explained away by "hype"?

~~~
repolfx
I think the corporate world is still learning how to do effective research.
Too many companies think that doing research means replicating a tiny
university within their own walls.

I'm not saying industry always gets it more right than academia, but I think
even orgs like Microsoft Research that are basically just clones of
academia are still on a slightly better path. For instance, MSR has done a lot
of cutting edge research into operating system design. Still, the money was in
the end wasted - nothing of Midori ever surfaced in real products.

The gold standard for me is Google. They're one of the few firms that managed
to completely fuse research and industrialisation into a totally cohesive
whole. Whilst they technically speaking have a research department, the
research they're known for is largely not done there and Google has managed in
the past to routinely create new innovations from all over the company.

It's worth realising that a lot of industrial and corporate research isn't
labelled as such. It's just "innovative product development".

Example: If we look at JetBrains, they developed a new programming language
from scratch (Kotlin) that has several innovative and cutting edge features.
Do we think of JetBrains as a sponsor of research, or of Kotlin as the product
of research? Probably not, yet it solves many problems that were previously
unresolved, like how to make a new language that has seamless interop with
Java, and how to improve the usability of features like generics and nullity
checking. The work involved in creating it explored new frontiers that
academia has ignored, simply because usability improvements and practical
topics like interop with existing code are not considered 'smart' enough to be
worth the title of research.

You see this pattern crop up repeatedly. Academia, left to its own devices,
researches ever more exotic extensions to Haskell, knowing full well most
programmers don't use Haskell. The _corporate funded_ academic research done
by JKU Linz and Oracle has focused on how to make existing languages like Ruby
and Python super fast to execute, super debuggable, cheap to develop runtimes
for, how to solve the language interop problem once and for all and so on.
They've produced something real - Graal - which looks on track to actually let
Ruby call into C++ call into Java call into JavaScript call back into Ruby
again, in a seamless and fully optimised manner.

Now _that_ is cutting edge but incredibly practical research that could have
industry-shaking impact within the next 10 years. It isn't primarily funded by
the NSF or EU though. It's funded by a corporation.

~~~
irundebian
It's just lame to play practical research off against theoretical research,
and academic research off against corporate research. Each of them has its
advantages and disadvantages.

