
Why Don't People Use Formal Methods? - pplonski86
https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/
======
mikekchar
The really simple answer to this question is: because at a high level of
abstraction, it isn't useful. Most applications are being specified by people
who don't understand in any detail what they want the application to do. They
can't even imagine the problem well enough to discuss it let alone specify it.
A very large proportion of the failed projects I've seen are due to this
breakdown. The people who have the time and inclination to investigate the
details are not the ones who care what it does. You can have an incredibly low
defect count and still not build the thing you need.

But if you look at a lower level of specification, then formal methods start
making a lot more sense. For example, let's say I have to implement a
communications protocol. This is something I can specify and that I can build
against that specification. Companies sell protocol testers that test your
implementation against a protocol. It might be nicer to have a formal
specification than a back end test -- especially because there are often more
scenarios you care about than you can feasibly test.
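A toy illustration of what checking against a protocol spec buys you over a finite test suite (the protocol model and names here are my own sketch, not from any real tester): an explicit-state exploration of a stop-and-wait (alternating-bit) protocol over a lossy FIFO channel, asserting a safety property in every reachable state rather than in a hand-picked handful of scenarios.

```python
from collections import deque

# Exhaustively explore a stop-and-wait (alternating-bit) protocol over
# a lossy FIFO channel. Every interleaving of sends, losses, deliveries
# and acks is enumerated, and a safety property is asserted in each
# reachable state. The model is illustrative, not any real protocol spec.

DATA = ["a", "b", "c"]  # messages the sender wants delivered, in order

def initial():
    # (next index to send, sender's bit, receiver's expected bit,
    #  data channel, ack channel, messages delivered so far)
    return (0, 0, 0, (), (), ())

def successors(s):
    i, sbit, rbit, dchan, achan, delivered = s
    # Sender (re)transmits the current message tagged with its bit.
    if i < len(DATA):
        yield (i, sbit, rbit, dchan + ((DATA[i], sbit),), achan, delivered)
    # The channel may drop any in-flight message or ack.
    for k in range(len(dchan)):
        yield (i, sbit, rbit, dchan[:k] + dchan[k + 1:], achan, delivered)
    for k in range(len(achan)):
        yield (i, sbit, rbit, dchan, achan[:k] + achan[k + 1:], delivered)
    # Receiver consumes the head of the data channel.
    if dchan:
        (msg, bit), rest = dchan[0], dchan[1:]
        if bit == rbit:  # new message: deliver, flip expectation, ack
            yield (i, sbit, 1 - rbit, rest, achan + (bit,), delivered + (msg,))
        else:            # duplicate: re-ack, do not deliver again
            yield (i, sbit, rbit, rest, achan + (bit,), delivered)
    # Sender consumes the head of the ack channel.
    if achan:
        bit, rest = achan[0], achan[1:]
        if bit == sbit and i < len(DATA):  # ack for the current message
            yield (i + 1, 1 - sbit, rbit, dchan, rest, delivered)
        else:                              # stale ack: ignore
            yield (i, sbit, rbit, dchan, rest, delivered)

def check(max_in_flight=2):
    seen, frontier = {initial()}, deque([initial()])
    while frontier:
        s = frontier.popleft()
        # Safety: what's delivered is always an in-order,
        # duplicate-free prefix of what was sent.
        assert s[5] == tuple(DATA[: len(s[5])]), f"violation in {s}"
        for t in successors(s):
            if len(t[3]) <= max_in_flight and len(t[4]) <= max_in_flight \
                    and t not in seen:
                seen.add(t)
                frontier.append(t)
    return len(seen)

print(f"explored {check()} states, property held in all of them")
```

Bounding the channels keeps the state space finite; a real checker does the same kind of enumeration with far better engineering, which is exactly the "more scenarios than you can feasibly test" point.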

At an even smaller scale, we are now realising some benefits of this mode of
thinking. For example, we've had types in programming languages for a long
time, but lately we're able to actually make statements about what the code
does, simply by analysing the types. This is a massive step forward and I'm
really looking forward to things like constrained types making it into the
mainstream. Similarly things like Rust's borrow checker can allow you to make
analytical statements about how much memory you are using (if you are
careful), etc, etc. These are great.
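As a sketch of what "constrained types" buy you (illustrative Python of my own devising; in a language with real refinement or range types the checks below would be static rather than construction-time):

```python
from dataclasses import dataclass

# The "constrained type" idea: bundle an invariant with a type at
# construction time, so any function accepting the type can rely on the
# invariant without re-checking it. Plain Python only enforces this at
# runtime, but the shape of the guarantee is the same.

@dataclass(frozen=True)
class Percentage:
    value: float
    def __post_init__(self):
        if not 0.0 <= self.value <= 100.0:
            raise ValueError(f"{self.value} is not in [0, 100]")

@dataclass(frozen=True)
class NonEmptyList:
    items: tuple
    def __post_init__(self):
        if not self.items:
            raise ValueError("must be non-empty")
    def head(self):
        # Total function: cannot fail, because emptiness was
        # ruled out when the value was constructed.
        return self.items[0]

def weighted_first(xs: NonEmptyList, share: Percentage) -> float:
    # No defensive checks needed here: the types carry the proof.
    return xs.head() * share.value / 100.0

print(weighted_first(NonEmptyList((10.0, 20.0)), Percentage(25.0)))  # 2.5
```

This is the "making statements about what the code does by analysing the types" move in miniature: `weighted_first` is evidently division-safe and index-safe just from its signature.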

I think that the success of formal methods is going to be through the backdoor
and at this design/implementation level for a long time. In fact, pretty much
forever.

~~~
jondubois
Completely agree. This line sums it up:

>> Before we prove our code is correct, we need to know what is “correct”.
This means having some form of specification

That is the problem. The people who come up with the spec don't fully know
what is required. It's even worse than that actually; the requirements change
constantly to account for changes in the technological and business
environment within which the project exists.

~~~
pytester
I think it's worse than that.

I did a course on formal methods at university. The 'spec' ended up being a
lot more complicated than the actual program.

I wasn't really confident afterwards that the "proved" software actually did
work.

I think it's possibly a suitable method for building very low-level code (e.g.
sorting methods) where the spec is going to be exceptionally simple, but its
effectiveness drops off quickly the higher up the stack you go.

~~~
mpweiher
I had exactly the same experience. We did specification in Z, but the code
that was supposed to be "proved" correct was a clearer reflection of what we
wanted to do than the "spec".

So I actually had more confidence in the code than I did in the spec.

And I think this answers the question of the OP: people don't use formal
methods because

(a) the vast majority of problems don't start out with a formal spec

(b) the gap between informal and formal is vastly more relevant than the one
between different formalisms

(c) formal methods are significantly _worse_ at bridging the formal/informal
gap than most programming languages

And yes, when you have a sub-problem that _is_ mostly formal, then it can make
sense, particularly if there's a lot riding on it.

~~~
brazzy
> And I think this answers the question of the OP

Do you realize that the OP is actually linking to an article explaining this
(and a bunch of other aspects) in great detail?

~~~
mpweiher
While it's a good article, I didn't see it covering this aspect, particularly
(c).

------
twblalock
Formal methods don't make a lot of sense when the product changes from week to
week, or month to month, or even year to year.

It seems like the advocates of formal methods assume a waterfall method of
software development wherein you develop a spec, formalize it, prove that it
works, and then deliver it. That's just not how software works in industry. A
lot of it never had formal specs and never will, and whatever specs exist
change frequently. By the time a spec is perfected it is out of date and we
are already moving on to the next thing.

Agile development killed waterfall development and formal methods have not
adapted to the agile methodology.

~~~
Jtsummers
Formal methods make sense when you focus them on the right _part_ of your
system. Even if it's changing week-to-week, there's an underlying structure or
design that's staying consistent.

You wouldn't use formal methods on, say, the GUI layout of your system. But
you _could_ use it to prove that you're accessing your database in a way that
won't cause corrupted or incorrect data to appear. If you're working on
anything with concurrency, this is actually crucial. Show, for instance, that
your multi-user Google Docs clone won't end up with a corrupted document when
two users try to edit the same sentence or word at the same time.
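That "won't end up with a corrupted document" property is exactly the kind of thing a model checker verifies by brute force. A hand-rolled toy version (the document model and step names are mine, not from any real spec): enumerate every interleaving of two editors' read-modify-write steps and check the invariant "no edit is ever lost".

```python
from itertools import permutations

# Enumerate all interleavings of two editors doing a read-modify-write
# against a shared document, and check that neither edit is lost.
# This is what a model checker like TLC automates at much larger scale.

def run(schedule):
    doc = []                        # the shared document
    local = {1: None, 2: None}      # each editor's private snapshot
    for user, step in schedule:
        if step == "read":
            local[user] = list(doc)              # take a snapshot
        else:  # "write": publish snapshot plus my own edit
            doc = local[user] + [f"edit{user}"]
    return doc

def interleavings():
    # Every ordering of the four steps that keeps each editor's
    # read before their own write.
    steps = [(1, "read"), (1, "write"), (2, "read"), (2, "write")]
    for p in permutations(steps):
        if (p.index((1, "read")) < p.index((1, "write"))
                and p.index((2, "read")) < p.index((2, "write"))):
            yield p

all_runs = list(interleavings())
violations = [p for p in all_runs
              if not {"edit1", "edit2"} <= set(run(p))]
print(f"{len(violations)} of {len(all_runs)} interleavings lose an edit")
# → 4 of 6 interleavings lose an edit
```

A real spec would also model the fix (an atomic append, a lock, or a merge function) and re-check the same invariant; the point is that the checker explores interleavings no test suite would think to write.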

That behavior won't change from week-to-week (both the multi-user nature and
the desire for stability/reliability), though the nature of the interface or
target platforms may change.

Also, Waterfall only existed because some people at NASA/DOD were idiots and
couldn't read past the intro to Royce's paper. It needs to die a terrible
death.

Formal methods can happily exist within an agile environment once you realize
that you don't have to deliver _code_ each week. You are able to deliver
prototypes, mockups, and documents along the way. Formal models can exist in
that group.

~~~
adrianratnapala
> That behavior won't change from week-to-week

Maybe "shouldn't" is a better word than "won't".

It is sadly common for a sudden whim to cause a new feature to thread a change
all the way from the GUI down to the database representation, with plumbing at
multiple stages in the middle.

~~~
TeMPOraL
At GP's level of specification, things shouldn't change even then, unless
you're doing an actual redefinition of what the software does and why. Even if
a new feature is cutting across all layers in your code, it's unlikely that
it'll alter the requirements like "parallel edits don't corrupt the document".

~~~
logicchains
>Even if a new feature is cutting across all layers in your code, it's
unlikely that it'll alter the requirements like "parallel edits don't corrupt
the document".

What if you're building MongoDB a decade ago, and the new feature is "perform
faster on benchmarks"?

~~~
AstralStorm
That is not a properly specified or useful requirement.

1) Your benchmarks have to be validated to correspond to real life usage.
Especially problem sizes and specific structure. (Whether collisions are
needed, what kind of queries etc.)

2) A proper performance specification is WCET or runtime histogram for given
piece of code, which can be checked with a model checker.

3) It is a requirement to pass a given test "better". If that sounds good to
your clients, well, you're selling a database to clueless people who somehow
care about a specific benchmark... Competitors will pick a different benchmark
to brag about.

4) Problematic parts of the code have to be identified, specified, fixed, and
tested. If you still don't meet the performance spec by then, you have a
different problem and most likely need to redesign the software, including the
algorithms used. Very expensive.

------
ghayes
For me, working on contracts on Ethereum, I have found formal methods to be
invaluable (we use a tool by Certora [0] based on the CVC4 solver). The trade-
off shifts once you enter the world of immutable code, and the time we've spent
designing, reviewing and now formally testing has been invaluable to shipping
our product with confidence. We tend to think of our formal specifications
like the ultimate unit test (instead of testing a few cases, test every
possible case). Also, the prover has found subtle bugs which have really put a
smile on my face.

An example spec might look like this:

    description "Admin (and only admin) can set new admin" {
      // e0, e1 can hold any possible environment and are ordered sequentially
      env e0; env e1;

      // free variable for possible values of the new admin
      address nextAdmin;
      havoc nextAdmin;

      // call to set the new admin
      invoke setAdmin(e0, nextAdmin);

      // assert that one of two conditions must hold in all cases:
      // 1. We were the admin, and the next admin is what we sent to `setAdmin()`,
      // or 2. We were not the admin, and the next admin is whatever it was before our call to `setAdmin()`.
      static_assert (
        e0.sender == invoke admin(e0) &&
        invoke admin(e1) == nextAdmin
      ) || (
        e0.sender != invoke admin(e0) &&
        invoke admin(e1) == invoke admin(e0)
      );
    }


It's a few lines of code to verify a simple function, but it's incredibly
powerful for us to know that our code works in all cases.

[0] [http://www.certora.com/](http://www.certora.com/)

~~~
josh2600
This makes sense, but I still think it’s really hard to formally verify
immutable code against a mutable VM.

Your code can be correct today but the VM can change tomorrow.

~~~
UncleMeat
This is like the quote about getting the right answer if you put in the wrong
inputs.

Of course if you change the program semantics then the program changes. This
is a limitation of literally all techniques, including testing.

------
maxyme
I did an internship at a company building aviation hardware and software.
Since everything was safety critical we used formal methods and things moved
at a snail's pace.

When verification becomes the most important thing everything changes. The
design specification doesn't make sense? It will take months to get it
changed, write whatever code you can convince someone verifies against the
design. Compile times are longer than an hour? Not a problem since the
majority of the job isn't writing code but instead doing code reviews,
verifications and writing tests. In fact writing code was about 15% of that
job for the lowest-level engineer. It was a whole different world from working
in unregulated codebases.

~~~
devoply
Sounds like a good job.

------
im_down_w_otp
As someone who does use formal methods, and who founded a company partly on
the basis of their value, my opinion is that, in part, it's because the
ecosystem is extremely fractured, the tools are often an esoteric projection
of the inside walls of one mad scientist's head turned into software, and they
often don't scale very well (mostly due to the two previously mentioned
conditions).

Additionally, the tools are often designed to empower a kind of pre-existing
"priesthood" of a particular kind of verification practitioner, not
necessarily to generally enable a more rigorous software development life-
cycle by being approachable.

Granted we're trying to change a fair chunk of that, but that's sort of the
status quo at the moment.

Lastly, there are some cultural forces in the software industry which aren't
all that aligned or interested in making commonplace software development as
rote, boring, and predictable as steel beam construction. But, I think those
can eventually be overcome when people see the value of enabling a new bar for
the sophistication of software innovations (e.g. like FEA and CFD did for
mechanical systems).

~~~
agentultra
What's your company? I'm interested in the commercial success of these tools
and bringing down the cost of using them to industry-acceptable levels so they
can be used in more projects.

~~~
im_down_w_otp
Auxon Corporation ([https://www.auxon.io/](https://www.auxon.io/) a
placeholder site, obviously. Low priority). Founded late last year. We're
working on tools and technology for enabling people to build, automatically
validate, and verify life-critical automation (e.g. highly-automated driving
systems, industrial automation, etc.). But the ultimate goal is broad
application.

Currently we work primarily with AV companies, automotive OEMs, and partners
in their respective supply-chains. Our ambition is to create for Software & AI
Engineers the kinds of sophisticated, accessible CAE tools & capabilities that
have been enjoyed by Mechanical & Electrical Engineers for years. Software
Engineers have done a phenomenal job enabling other disciplines, but we've
done a less compelling job enabling ourselves in similar ways.

------
tanilama
If you want certain technology to be commercially successful, it needs to
provide value.

For the most part, buggy code is tolerated in exchange for fast delivery. And
for most internet applications a single error has limited scope, so the damage
is contained. For those applications, formal methods are not something that
could help them succeed; they don't even write a lot of unit tests.

However, in different scenarios, like financial trading or consistency-
protocol building, where a single mistake will either cost you millions or
result in unrecoverable data loss that makes customers lose confidence in your
product, formal methods ARE indeed being used.

So it is simply a cost-efficiency problem. For most disposable application
projects it is not worth it, so they don't invest. If the code of interest is
critical, and deserves every effort to get it right, then formal methods will
be favored.

~~~
TheFattestNinja
Can you give me more examples of FM used in finance? I feel like they _should_
be used, but in my limited experience with finance-related gigs I've been
aghast at the lack of even basic testing put in place...

~~~
bronsa
At Imandra ([https://www.imandra.ai/](https://www.imandra.ai/)) we've been
collaborating with a few big names in finance, from our website: "In 2017
Aesthetic Integration partnered with Goldman Sachs to help deliver the SIGMA X
MTF Auction Book, a new orderbook venue implementing periodic auctions.
Aesthetic Integration used Imandra, our own automated reasoning engine, to
formally model the Goldman Sachs design of the SIGMA X MTF Auction Book,
verify its key properties, and automate rigorous testing. Now that the venue
is live, Imandra is being used to perform ongoing validations of trading
operations."

------
agentultra
To counter the exhaustive number of comments about how expensive,
impenetrable, and unrealistic formal methods are in practice: _I have been
using formal methods in my work_.

To borrow the parlance of the author, I use TLA+ -- a design verification tool
-- in my work. Some of my work is in lightly regulated food and drug
industries but most of it is not. It hasn't cost my company a tonne of money
or delayed projects. In fact I don't think anyone has really noticed it too
much.

Some features in our products use an event sourcing pattern and the
architecture is built using a few different components that all need to work
in concert. The system can't be validated to be correct with the source code
alone. The integration tests can't capture every possible path through the
code to ensure we only hit valid, desired states. For those features I
maintain a couple of TLA+ specifications in order to ensure our design
maintains the properties we desire: that we only hit correct states, that we
never drop messages in the face of component failures, that our event store is
serializable, etc. And that has informed how I direct the team to design our
software implementation.

I don't go to great lengths to validate that our implementation faithfully
implements the specification but I do use the TLA+ models to inform the
implementation and vice versa: sometimes we discover things during the
development process that I use to update my models. One informs the other and
it helps me to manage the complexity of those features with confidence.

It doesn't cost us a lot of time or effort for me to do this work. The
customer doesn't care at all: they only see the user interface but they do
expect their data to be consistent and that failures are not catastrophic.
Formal methods have been helpful in this regard for my team.

 _Edit_ I started using TLA+ a few years ago to find and resolve bugs in
OpenStack; it has been an effective tool and I always recommend learning it.

~~~
ausimian
I learnt and used TLA+ while working at msft, and I'm still using it today:

- anytime I have some protocol or state machine whose behaviour isn't obvious,
I write a TLA+ spec.

- the process of writing it clarifies my understanding and leads to new
insights. These insights directly inform the code and the tests.

- the model checker makes it very easy to check sophisticated properties of
the algorithm.

I don't use it all the time, but it's a tool I'm very happy I invested in.

------
jacques_chester
The trite answer is: because it's hard.

(An entire section in the article is devoted to this point.)

Programming languages and tooling have attracted phenomenal amounts of
attention and iteration in just a few decades. Folks hereabouts just about go
to the damn mats over the smallest syntactic quibbles.

Meanwhile, over in mathematician land, they're limping along with a write-
optimised mishmash of "the first thing that a genius in a hurry scribbled
three hundred years ago and now we're stuck with it".

I have very limited attention and brainpower and remembering what ∂ means in
_this_ book vs the other twenty places it's used in my library and where it
means something completely, entirely, utterly different, is just hard. So too
is remembering what Q(x) and W(z) mean for two hundred densely-written pages
of stuff optimistically described as accessible to anyone who has taken
highschool algebra (not mentioned: _and remembering it exactly_ ).

Notational ergonomics matter, dammit.

~~~
umanwizard
I think this is an exaggeration/strawman. Can you give an example of a
mainstream introductory math book that defines some esoteric notation like
"Q(x)" and then continues using it hundreds of pages later? Note that I don't
mean defining it locally within the context of one proof -- that's something
different.

The closest thing I can think of is notations like U(f, P) and L(f, P) for the
upper and lower Riemann sums of a function f with respect to a partition P.
These do show up in introductory real analysis textbooks, but they're pretty
basic/fundamental notions.

By the way, as far as I know, ∂ in introductory textbooks can only mean either
"partial derivative" or "boundary". Do you have examples of it meaning other
things?

~~~
jacques_chester
Most of what I've seen lately is queueing theory, markov processes or
statistics, and yes, I have seen notation used for hundreds of pages at a time
and I've seen these symbols used to mean different things (not including
partial differential).

One of the books I read recently[0] has a table of where symbols and formulae
are first defined. It's a godsend.

But I would prefer self-describing names to a lookup table. I don't put
comments at the top of each file of code with such a table.

[0]
[https://highered.mheducation.com/sites/0073401323/informatio...](https://highered.mheducation.com/sites/0073401323/information_center_view0/index.html)

------
zanny
I think it goes beyond conservatism for most devs as to why you don't formally
specify your design, and it's mostly rooted in why agile is so popular.

The sooner you show a finished product, no matter how broken and incomplete,
the sooner your client can realize all the things they said they wanted but
actually didn't, and replace them. A formal design document might be ironclad
and foolproof, but it doesn't put the finished product in their face to
scrutinize.

Money doesn't judge. It ends up in the hands of all manner of people. As
programmers we want money for time, and the people with the money rarely have
the education or willpower to even be able to comprehend a formal design. But
they are the ones calling the shots with some esoteric vision locked up in
their head we are meant to realize.

As someone who dabbles in art too, I see substantial parallels between the
two. You sketch your piece and constantly refer back to your commissioner to
verify _this_ is what they want. At almost every single step of a painting or
animation or music production or anything creative, a commissioner will change
their mind, either because you suggest something better or because they
recognize flaws in their realized vision.

That's where "move fast and break things" even comes from. Making some piece-
of-garbage "sketch" is way more informative to your average "commissioner"
than having them write down their vision for the finished piece in every
intricate detail. Because, as with all other creation, those we write code for
rarely have the education to understand what they are even asking for. They
have a nebulous idea of a fraction of a finished product, and you need to
basically teach them to see the whole picture along the way.

Until they actually have even a remotely reasonable grasp on what they want,
all the testing, specs, design docs, etc. are wasted, because you still don't
know what you are actually making. The more endemic problem is that we still
don't have an accurate way to convey the long-term value of doing it right
once we have a glued-together, rapidly iterated abomination that now needs to
be redone properly. That's where I feel formal methods have the most value: a
developer can produce a formal spec for a near-finished product, so that when
the spec inevitably goes unmet (because the money stops when it "looks done"),
at least the next person who has to pick up and fix the mess months or years
later has a better idea of the intent than the person paying them would on a
repeat attempt.

~~~
Tempacc78534
You are right in general where the cost of failure is not that high, but
consider software on aircraft, in a pacemaker, or (the best example everyone
in FM quotes) on a space mission. A small bug can cost huge amounts of money
or lives. In these cases companies have an incentive to verify their code.

Hey, you don't even have to go to these special cases. Amazon, Google, and
Microsoft all do formal verification (or have started somewhat sophisticated
testing) of their critical pieces of code, since a bug in them, even if it
only causes an hour of downtime, will cost them millions.

~~~
hwayne
I didn't include it in the article because I wasn't 100% confident in saying
so, but I'm reasonably sure that medical devices don't use FM. There's not a
whole lot of oversight in that area, and most of the standards are (to my
knowledge) about software process versus software methods.

Aviation also doesn't really use FM, but that's because they follow DO-178C,
and that is a scary scary standard.

~~~
Tempacc78534
It would be way cheaper for them to verify their code than to follow these
standards, but bureaucracies are bureaucracies.

Also, a little nitpick on your write-up: you should have also talked about
type systems. They carry the spirit of formal methods, but rather than
construct-then-check for correctness, construction is correct by default. Type
systems are widely accepted and show that some of these ideas can gain
traction.

------
goto11
Formal methods assume the difficult part of software development is producing
code which precisely conforms to a specification. In reality the difficult
part is getting to an unambiguous specification which represents the actual
business needs, which are often vague or unknown when development begins.

This led to iterative development. But iterative development does not go well
with formal methods. With vague specs and iterative development, the code
itself ends up being the spec. If well written, the code even reads like a
spec, and during iterative verification it is checked whether the spec/code
conforms to the product owner's actual (often unstated) requirements. Just
take user interface design: it is incredibly hard to design a good GUI for a
complex application, and it requires multiple rounds of user testing and
refinement. But if you actually have a spec for a UI, implementing it is
trivial and can be done by the intern.

More fundamentally: if you do have the spec stated completely and
unambiguously in a formal language... why don't you just compile the spec?
Seriously, why translate it by hand into a different language and then spend a
lot of effort proving you didn't make any mistakes in the translation? DSLs
and declarative languages seem more promising as ways to bridge the gap
between spec and code.

That said, there are obviously domains where formal methods are worth it,
because you really want the extra verification of having the logic stated
twice. Especially in hardware development, which doesn't lend itself to
iteration. It is just a small part of software development overall.

The most promising future for formal methods is in type systems, since they
are not separate from the code, so they lend themselves much better to
refactoring.

------
timClicks
Absolutely love that Hillel seems to have single-handedly assumed the role of
taking formal methods to the people. His Apress title
([https://www.apress.com/gp/book/9781484238288](https://www.apress.com/gp/book/9781484238288))
is a wonderful example of technical writing that's somehow accessible, yet
deep.

~~~
hwayne
<3

You might also like _Software Abstractions_
([https://mitpress.mit.edu/books/software-abstractions-
revised...](https://mitpress.mit.edu/books/software-abstractions-revised-
edition)), by Daniel Jackson. I read the whole thing in two days and then
immediately emailed him gushing about how much I loved the book. It's
absolutely fantastic.

------
saurabh20n
As some other people have pointed out, you have to pick the domain and
application. Airbus controllers (abstract interpretation [1]), Microsoft
software driver verification (SLAM model checker [2]) were ones where formal
methods found use in the pre-2010 era. Since then, work in the field has
brought Excel the capability to auto-fill data [3], toy demonstrations of how
to synthesize programs while verifying them correct [4], verified file systems
[5], and now verification and synthesis of smart-contract code (whose
deployments are expected to be permanent) [6]. This is just a sampling. There
are many other uses.

[1] Astree [http://www.astree.ens.fr](http://www.astree.ens.fr)

[2] SLAM SDV [https://www.microsoft.com/en-
us/research/project/slam/](https://www.microsoft.com/en-
us/research/project/slam/)

[3] flashfill [https://www.microsoft.com/en-us/research/blog/flash-fill-
giv...](https://www.microsoft.com/en-us/research/blog/flash-fill-gives-excel-
smart-charge/)

[4] Invariant based synthesis [https://www.microsoft.com/en-us/research/wp-
content/uploads/...](https://www.microsoft.com/en-us/research/wp-
content/uploads/2016/12/popl10_synthesis.pdf)

[5] FS verification [https://www.usenix.org/conference/atc17/technical-
sessions/p...](https://www.usenix.org/conference/atc17/technical-
sessions/presentation/sigurbjarnarson)

[6] Solidity verification and synthesis [https://synthetic-
minds.com](https://synthetic-minds.com)

------
jfries
Formal verification is alive and well in ASIC-design land. I suppose it's
especially suitable here because of the mix of late bugs being very expensive
and the designs being fundamentally restricted to not too many fancy tricks
(for example, hardware can't be monkey-patched at runtime).

But where I think formal really shines and saves a lot of time is in bring-up
of new code. Using formal, it's possible to write a simple unit-test style
test environment, and rely on the formal tool to insert all tricky stall
conditions, etc. It's then possible to verify things that are normally tricky
to both design for correctly, and also to test.

------
eirikma
The problem with formal methods is that any useful program must interact with
its environment (customers, filesystems, networks, sensors, etc.) in order to
make a difference, and nobody is able to capture, comprehend, or relate to the
full specifications of those items and their subsystems. This results in
proofs based on a lot of incorrect and incomplete assumptions.

This problem apparently even extends to the model-checking software itself,
resulting in systems reported to be correct when the network message saying
they were flawed was lost:

[https://blog.acolyer.org/2017/05/29/an-empirical-study-on-
th...](https://blog.acolyer.org/2017/05/29/an-empirical-study-on-the-
correctness-of-formally-verified-distributed-systems/)

------
polskibus
My biggest issue with TLA+ is how detached the tooling and syntax is from what
is used by developers daily. PlusCal is only a bit better, but has other
shortcomings. A lot of what it offers can be simulated by writing data
driven/parametrized tests with NUnit/JUnit (it can be a lot of effort and
error prone though).

Ideally, I'd like to have a toolkit with less friction and to have more real
life examples that show not just how to do the first spec, but how to write
code afterwards, update spec with time, update the code, etc etc.

------
phtrivier
I might be missing something obvious, but from my (small) reading about TLA+
(and the lamport video course), my biggest concerns were:

A: even if you understand the spec you want to write, it's possible to get the
_implementation of the spec in the spec language_ wrong (see the Die Hard
exercise in the course); isn't there some kind of recursive catch-22 there?
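The Die Hard exercise mentioned here is the 3- and 5-gallon jug puzzle: in TLA+ you assert the invariant "the big jug never holds 4 gallons" and read the solution off TLC's counterexample trace. The same search, written out by hand in Python (a toy, but it shows what the checker is doing for you):

```python
from collections import deque

# The "Die Hard" puzzle from Lamport's TLA+ course: with a 3-gallon and
# a 5-gallon jug, measure exactly 4 gallons. TLC finds the solution as
# a counterexample to "big jug never holds 4"; this is the equivalent
# breadth-first search over the same state space.

def moves(state):
    small, big = state              # gallons in the 3- and 5-gallon jugs
    yield (0, big)                  # empty the small jug
    yield (small, 0)                # empty the big jug
    yield (3, big)                  # fill the small jug
    yield (small, 5)                # fill the big jug
    pour = min(small, 5 - big)      # pour small into big
    yield (small - pour, big + pour)
    pour = min(big, 3 - small)      # pour big into small
    yield (small + pour, big - pour)

def solve():
    start = (0, 0)
    parent = {start: None}          # doubles as the visited set
    frontier = deque([start])
    while frontier:
        s = frontier.popleft()
        if s[1] == 4:               # the "invariant violation" we wanted
            trace = []
            while s is not None:    # walk the parent links back to start
                trace.append(s)
                s = parent[s]
            return trace[::-1]      # shortest sequence of jug states
        for t in moves(s):
            if t not in parent:
                parent[t] = s
                frontier.append(t)

print(solve())
```

Point A above shows up concretely here: get one of the pour rules wrong and the search will still cheerfully produce a trace, just for the wrong puzzle, which is why the spec-in-the-spec-language needs the same scrutiny as code.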

B: even if you understand the spec you want to write, and manage to write the
spec you want to write, and get its model checked, etc... then you still have
0 lines of working code, and (unless I'm mistaken) you're going to have plenty
of occasions to _not_ implement the spec properly, and still get all the bugs
that you proved would not happen if you had written the code right (you
dummy).

So it seems like the interesting part is the "game of thinking harder about
what you want to do", and it is (arguably) done (at least in part) by writing
automated tests beforehand, and the tests also have value _after_ you've
written the production code.

That at least could explain why the jump to formal method is not natural.

Looking forward to being wrong, though.

~~~
mbrock
There’s a huge misunderstanding that pops up in every discussion about “formal
methods.”

These methods are not perfect silver bullets to guarantee bug-free software
forever. They are supposed to be useful engineering tools that help you think
and verify your thinking especially for tricky corner cases and interactions.

So then it’s a bit like “Why would I think about the code I’m going to write
when I could later make a mistake in implementing what I thought?” Well, even
so, it has advantages over not thinking.

Not to suggest that “formal methods” are the only way of thinking. They’re
just not really what people imagine they are when they criticize them.

~~~
phtrivier
Fair enough.

If you have experience, what would be the rebuttal to my fear of the
"catch-22"?

What I mean is: how often does it happen that you think about a problem
correctly but fail to transcribe that into the formal spec, which either gives
you too much confidence in a model that does not work, or makes you feel you
have wasted your time on mental self-flagellation?

How do you approach "experimenting" with formal methods to avoid this risk?

------
jonathanstrange
Most of the answers already given are good, but there is a simpler one: _It's
too expensive._

It's so expensive that even the highly regulated and safety aware aerospace
industry avoids formal verification for non-essential systems when they are
allowed to.

Moreover, formal verification only makes sense in combination with certifying
the software on some specific hardware, because it would be a pointless waste
of time to run software that is guaranteed to behave well on hardware that
makes no guarantees that it will execute the software correctly. Consequently,
even small changes in hardware may force you to re-evaluate the whole system.
So the maintenance costs of high integrity systems are also very high.

------
elihu
I think the biggest reason is that most projects depend on an enormous
mountain of software written in languages like C and C++, where we can't even
be sure we aren't indexing past the end of an array or casting a NULL pointer
to the wrong type, let alone prove higher-level correctness properties. (To be
fair, it's possible to generate C from a higher-level language/theorem proving
tool, but few people actually do this.) For instance, most of us use operating
systems, compilers, language runtimes, databases, web servers, web browsers,
command line utilities, desktop environments, and so on written in not-
terribly-safe languages.

Formal verification might be worthwhile for its own sake, but for most
projects you just need your own code to not be any more buggy than the rest of
the technology stack you use, which you aren't going to re-create from scratch
in a formally verified way because that's way too much work and the existing
tools are just so useful and convenient and just an rpm/yum/git command away.

The low-hanging fruit right now is to eradicate undefined behavior at all
levels of the software stacks we use. This is a daunting task, but I think
we'll get there eventually. Once we've banished heap corruption, segfaults,
race conditions, and similar problems to the point where developers say "hey,
I remember those things used to happen in the bad old days but these days I
couldn't cause them if I tried" and we can be sure that if we call a function
with the same input it will return the same result every time consistently,
then is the time to consider formally proving that the result returned is
correct.

(That isn't to say that there aren't some domains now where formal methods are
tremendously useful. Also, sometimes one can make a productivity case rather
than a product quality case for at least using semi-formal methods; using a
language with a strict type system can mean spending less time trying to
resolve weird bugs.)

------
jillesvangurp
I haven't encountered formal methods in the wild since I left university in
1998. I don't know of anyone in my wider circle of friends using formal
methods. Using it is by and large a niche thing in our industry. You see it
pop up in places where people are desperate to commit millions in extra work
just to avoid disastrous failures. I've never worked in such places. Use of
formal methods is a last resort in our industry: you use it when you've
exhausted all other options.

That being said, in the same period of time, people have learned to appreciate
static typing and unit tests. Unit testing was not a thing until people
started pushing for this in the late nineties. Around the same time, computer
science students were infatuated with languages that were dynamically/weakly
typed. Now 20 years later, type systems are a lot less clunky and those same
people seem to be upgrading their tools to having stronger types.

E.g. Kotlin has a new feature called contracts, which you can use to give
hints to the compiler; the compiler can then use those hints to deduce that
accessing a particular value is null-safe. This allows for methods like
isNullOrBlank() on nullable strings (String? is a separate type from String in
Kotlin). So calling this on a null string does not throw an NPE, and even
better, it enables a smart cast to String after you call it.

Also, annotations are widely used nowadays for input validation and other non-
functional things. I think annotations were not a thing until around 15 years
ago (I might be wrong on this). I think Rust is a good example of a language
with very strong guarantees around memory management.

Compilers of course have a very big need to be correct since compiler bugs are
nasty for their users. So, there is a notion of some of this stuff creeping
into languages. Combined with good test coverage, this goes a long way in
being minimally intrusive and giving you stronger guarantees. But probably
calling this formal methods would be a bit of a stretch.

------
AnimalMuppet
A few points resonated with me:

Proving that the spec is the _right_ spec is impossible (stealing a point from
Roger Penrose). At some layer, you have a transition from an informal idea in
someone's head to a formal specification. Proving that the formal
specification accurately matches the informal idea is impossible by formal
means, _since the idea being compared to is informal_. Put another way, it is
impossible to formally prove that the transition across the informal-to-formal
boundary was done correctly.

Formal methods burn up a lot of time, which could be spent on other things -
like iterating and refining the design and (informal) spec. If you build the
wrong thing, formally proving that what you built matches the formal spec
doesn't actually help. You might be better off getting a prototype in front of
users/customers and getting feedback, rather than formally proving that you
implemented your initial spec or design.

And, one thing the article doesn't mention: Formal verification isn't
bulletproof. Quoting from memory: "It's amazing how many bugs there can be in
a formally verified program!" (I believe said by Don Knuth.) Worse, _the
verifier can also have bugs_.

Note well: This does not mean that formal verification is useless. It's just
not 100% perfect. 95% is still worth something, though. If formal methods are
worth doing, they're worth doing for only 95% of the benefit.

------
jpollock
I've always wondered how I can write a specification in a language which is
complex enough to cover the problem, clear enough to be verified by the
customer, and not have it be as complex as writing the code.

~~~
Tempacc78534
The solution is to write partial specs.

For example, suppose your client wants a piece of code which doesn't crash and
sends a message over the network. You can write two partial specs: 1) it
doesn't crash, 2) it sends a message over the network.

The partial specs can be individually simple but the program which satisfies
all of them will be complex.

This idea of partial specs follows general SE design practices as well, where
we describe fine-grained tasks the program has to do (like: when I click this
button, this should happen).
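A minimal sketch of the idea, with invented names and a toy "program": each partial spec is an independently checkable property, and the real program has to satisfy all of them at once:

```python
def run_program(events):
    """Toy stand-in for the client's program: it must not crash,
    and it must emit a network-send event into `events`."""
    events.append(("net", "send", b"report"))
    return "ok"

# Partial spec 1: the program doesn't crash on any tried input.
def spec_no_crash():
    try:
        for _ in range(100):
            run_program([])
        return True
    except Exception:
        return False

# Partial spec 2: the program sends at least one message over the network.
def spec_sends_message():
    events = []
    run_program(events)
    return any(kind == "net" and op == "send" for kind, op, _ in events)

# Each spec is simple on its own; the conjunction constrains the program.
print(spec_no_crash() and spec_sends_message())  # True
```

Real spec languages check these properties over all behaviors rather than sampled runs, but the decomposition into small independent obligations is the same.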

Another trick is to create domain-specific languages for certain kinds of
specs. There is a DSL for writing memory-safety constraints and a special DSL
for specifying distributed coordination between machines. Each spec written in
its respective DSL is easy to understand, where the equivalent written in a
general-purpose language wouldn't be.

~~~
magicalhippo
Any message? Or a specific one?

I mean, at my work, a lot of our customers don't even know what our program
needs to send to some other program until months after the change needs to be
live...

------
westurner
Which universities teach formal methods?

\- q=formal+verification [https://www.class-
central.com/search?q=formal+verification](https://www.class-
central.com/search?q=formal+verification)

\- q=formal-methods [https://www.class-
central.com/search?q=formal+methods](https://www.class-
central.com/search?q=formal+methods)

Is formal verification a required course or curriculum competency for any
Computer Science or Software Engineering / Computer Engineering degree
programs?

Is there a certification for formal methods? Something like for engineer-
status in other industries?

What are some examples of tools and [OER] resources for teaching and learning
formal methods?

\- JsCoq

\- Jupyter kernel for Coq + nbgrader

\- "Inconsistencies, rolling back edits, and keeping track of the document's
global state"
[https://github.com/jupyter/jupyter/issues/333](https://github.com/jupyter/jupyter/issues/333)
(jsCoq + hott [+ IJavascript Jupyter kernel], STLC: Simply-Typed Lambda
Calculus)

\- TDD tests that run FV tools on the spec and the implementation

What are some examples of open source tools for formal verification (that can
be integrated with CI to verify the spec _AND_ the implementation)?

What are some examples of formally-proven open source projects?

\- "Quark : A Web Browser with a Formally Verified Kernel" (2012) (Coq,
Haskell) [http://goto.ucsd.edu/quark/](http://goto.ucsd.edu/quark/)

What are some examples of projects using narrow and strong AI to generate
perfectly verified software from bad specs that make the customers and
stakeholders happy?

From reading through the comments here, people don't use formal methods
because: they are cost-prohibitive, inflexible, and perceived as incompatible
with agile/iterative methods that are more likely to keep customers (who don't
know what formal methods are) happy; there is a lack of industry-appropriate
regulation; and the often-incompatible shorthand notations impose a cognitive
burden.

~~~
tdonovic
Almost a University of New South Wales (Sydney, Australia) alum, and it's the
biggest difference between the Software Engineering course and Comp Sci. There
are two mandatory formal methods courses and more electives to take. They are
really hard, and a lot of the SENG cohort ends up graduating as COMPSCI, since
they can't hack it and don't want to repeat formal methods.

------
fmap
> Before we prove our code is correct, we need to know what is “correct”. This
> means having some form of specification

I suspect that most bugs are introduced because programmers do not have a
clear idea of what their code is supposed to do. There are simply too many
levels to keep track of at the same time. E.g., in C you are manually keeping
track of resource ownership and data representation at the low-level, while at
the same time keeping the overarching architecture present in your head. The
latter typically has several moving pieces (different threads, processes,
machines) and keeping a clear picture of all possible interactions is a
nightmare.

Formal specifications can help with this, by first specifying the full system
and then deriving precise _local_ (i.e. modular) specifications. In my
experience, programming is typically easy once I have a problem broken down
into isolated pieces with clear requirements. It's just that most systems are
so large that getting to this point takes serious effort.

Here you run into a cultural problem. You need to convince people to put
effort into something that they do not implicitly regard as valuable. It makes
sense to me that people aren't using formal methods, unless there is very
little friction when getting started.

------
motohagiography
The unasked question about formal methods is: who is the customer of the
design? To accept a formal design, you must also be bound by the constraints
and timelines it specifies, which reduces your flexibility.

The perception, which the OP or someone here might correct, is that formal
methods frustrate corner-cutting and other product-level optimizations that
return margins, optionality, performance bonuses, positioning and leverage,
"moving the needle," and other higher-order fudges that are realities of human
organization.

One example is how producing a security design to provide assurances to
exogenous stakeholders becomes a forcing function on feature commitments. e.g.
"if you do this feature, you need these controls, which means if you don't
have those controls, you aren't delivering the feature by date X, and now
there is a paper trail that you know that. have fun at the customer meeting!"
This is a crude example, but makes the point.

From a product and exec perspective, that forcing function knocks the ball out
of the air / collapses the wave function in regard to "who knew what when," as
it relates to making feature commitments to customers, sales people, and
channel partners, etc. These are all people whose interests are not
necessarily aligned. Much of enterprise tech business is keeping that
equilibrium unstable in a kind of strategic fog while you build a coalition of
stakeholders that gets your company paid. A methodology that forces decisions
in those relationships will get mutinied by sales pretty fast.

(It's often hard to tell whether people are solving these higher order
alignment problems or just being dishonest, and many of them don't even know
for themselves.)

In this sense, formal methods solve engineers' problems but not necessarily
business problems. Playing the cynical customer, I don't care how much
marginal residual risk there is, I only care who holds it. Above a certain
level, detail just means "caveat emptor."

It's a critical view of org behavior, but finding a customer for the models
themselves (other than compliance) would go a long way to getting them
adopted.

------
statguy
Surprised that no one has mentioned this yet
[https://www.joelonsoftware.com/2005/01/02/advice-for-
compute...](https://www.joelonsoftware.com/2005/01/02/advice-for-computer-
science-college-students/)

> After a couple of hours I found a mistake in Dr. Zuck’s original proof which
> I was trying to emulate. Probably I copied it down wrong, but it made me
> realize something: if it takes three hours of filling up blackboards to
> prove something trivial, allowing hundreds of opportunities for mistakes to
> slip in, this mechanism would never be able to prove things that are
> interesting.

------
cjensen
Do people read "The Mythical Man-Month" anymore? The answer to questions like
these is part of the book. First, there is no "Silver Bullet" which makes
software development easy. Second, adding a layer of additional effort like
Formal Methods necessarily either (a) increases development time significantly
or (b) exponentially increases headcount.

I'm not suggesting that you avoid Formal Methods. It might really fit your
needs -- for example, if you need high-quality code and can afford the
expense, Formal Methods may fit the bill. The important thing is that not all
development projects have the same needs.

~~~
chubot
I think you're making a good point, but your tone is unnecessarily strident.

~~~
Jtsummers
I'll disagree, I don't think they're making a good point.

GP implies that formal methods always _add_ time to projects, which is going
to be true for:

1) Projects that are building the wrong thing and spend time verifying the
wrong thing (failed validation)

2) Projects with people inexperienced in formal methods (most projects today)

The latter is just a matter of experience. Once people get used to thinking
about things in a more formal way, they'll get faster. Additionally, they'll
learn which parts need to be shown formally and which don't (or which parts
they can show via their type system versus those which need to be modeled
outside the language and type system, etc.). I don't need to formally prove
that my program writes message X to memory block Y, it can be verified by:
inspection, tests. I _do_ want to prove that my two processes _correctly_
exchange messages (since the nature of the system is such that we needed to
write our own protocol rather than being able to rely on a pre-built and
verified/validated one). But here I'm proving the _protocol_ , I still have to
test the implementation (unless my language permits me to encode proofs in a
straightforward way).

The former is a separate issue from formal methods. It's failed validation
efforts which is solved (or mitigated) by something else: lean/agile
approaches to project management (versus delaying feedback from customers, and
consequently delaying validation activities).

~~~
verinus
well if I read you correctly he IS making a good point:

> 2) Projects with people inexperienced in formal methods (most projects
> today)

which is exactly what matters and what adds a lot of time to dev.

1) cannot be addressed by formal methods, and is one of the most important
things when building software.

~~~
Jtsummers
It’s a temporary problem, not a permanent problem. People need training and
practice. I’d expect most people were slower when first using unit tests or
TDD than they are today. It takes time to build that experience into the
industry but it’s hardly an impossible problem. Especially as the tooling
(PlusCal, Alloy) for lighter weight formal methods gets better and more
accessible.

------
ummonk
Most software complexity is in interactions between systems. Container
frameworks, operating system, hardware drivers, random libraries, 3rd party
APIs, cloud provider APIs, client browser, etc.

Unless all of these are specified in a single language / framework, you can't
really apply formal methods to the system. Instead you'll only be applying
formal methods to individual library functions / features. At least for code
that isn't security-critical, these are already quite robust thanks to the use
of strong typing (especially with powerful typecheckers for algebraic data
types) and unit testing. Formal methods are merely an improvement on advanced
typecheckers and unit testing frameworks.

The real difficulty is in verifying complex interacting systems - the kind of
thing for which you have integration tests, which are quite difficult to write
in an exhaustive way and catch edge conditions.

Note also that even for security-critical code, formal methods don't really
guarantee safety either, since many classes of attacks are based on timing
attacks, especially cache-based timing attacks.

------
xvilka
Part of the problem is that TLA+ syntax and tooling are horrendous. Creating
something more modern and widely adoptable is key to mass-scale adoption.

~~~
blub
TLA+ is an ugly language, just like PlusCal. I'll never understand how people
can come to the conclusion that it's fine to have syntax like:

* \A p \in people

* acc = [p \in people |-> 5]

* (* --algorithm ... end algorithm; *) comments for wrapping PlusCal

* /= for inequality

* /\ for and

* seq1 \o seq2 for append

* and so on

All of these are from the Practical TLA+ book.

Honestly, TLA+ (or PlusCal) looks very impractical. If I want to use
mathematical notation I'll do it on paper, not write some frankenstein version
of it in Eclipse.

At this point the whole syntax should be written off as a failed experiment
and they should kindly ask Anders Hejlsberg or Guido van Rossum for help.

~~~
pron
That's not the syntax, that's just the ASCII representation. The syntax, which
is what you get when the Toolbox presents your spec in pretty-printed form,
is:

    
    
      * ∀ p ∈ people

      * acc = [p ∈ people ↦ 5]

      * ≠ for inequality

      * ∧ for and

      * seq1 ∘ seq2 for append
    

That's fairly standard mathematical notation. In fact, one of the things I
enjoyed about TLA+ is that I didn't really need to learn the syntax, as it's
so standard. Even the non-standard bits (like alignment that can replace
parentheses) are quite intuitive.

And if you say, well, I still have to _type_ the ASCII, then you're missing
the point. Specifications are not internet rants where you dump a lot of
characters on the screen. They are usually quite short, and you spend nearly
all your time thinking or reading them.

------
LeanderK
I think it's probably also a programming-language thing. When I write in
Haskell, the syntax of the language and the construction of the abstraction
hierarchy, with all the simple and well-specified rules you have to follow,
make it easier to see what to prove and why. Also, property-based testing is
essentially the substitution of proofs with anecdotal, empirical evidence. I
feel that if I had the option, there are a lot of places where I would just
naturally prove properties (I know about the practical difficulties of a lazy
language... but I think they are a bit overstated; see Richard Eisenberg's
thesis on Dependent Haskell).

I don't think this is obvious in Python, at all. You are programming without
thinking much about invariants and properties.
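To make the property-based-testing point concrete, here is a hand-rolled miniature of the QuickCheck idea in Python (real tools like Hypothesis or QuickCheck do the input generation and shrinking for you); the run-length coder is an invented example:

```python
import random

def encode(xs):
    """Run-length encode a list into (value, count) pairs."""
    out = []
    for x in xs:
        if out and out[-1][0] == x:
            out[-1] = (x, out[-1][1] + 1)
        else:
            out.append((x, 1))
    return out

def decode(pairs):
    """Expand (value, count) pairs back into a flat list."""
    return [x for x, n in pairs for _ in range(n)]

# Property: decode is a left inverse of encode.
# Checked on 1000 random inputs -- empirical evidence, not a proof.
random.seed(0)
for _ in range(1000):
    xs = [random.randint(0, 3) for _ in range(random.randint(0, 20))]
    assert decode(encode(xs)) == xs
```

The assertion is exactly the statement one would try to prove in a dependently typed language; the random loop merely samples it.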

------
wvenable
> This means having some form of specification, or spec, for what the code
> should do, one where we can unambiguously say whether a specific output
> follows the spec.

This seems to ignore a vast majority of possible software. It's easy to
confirm/test whether a function, with no state, takes a certain input and
produces a certain output. You can write that in a spec. You can unit test
that. But that ignores the whole concept of state; the whole state of the
system. If X happened yesterday, Y happens now, and X and Q are true, and it's
a Wednesday and a full moon then the output is Z. By the time you've encoded
that and every other variation into a spec, congratulations you've written the
software!

------
ejcx
Even aircraft software doesn't require formal methods. I have a copy of the
DO-178C on my laptop. It's mentioned once in the glossary, once in the
appendix A, and once in appendix B.

Formal methods would be great, but most people don't have extremely complex
state machines where lives depend on it being right.

~~~
hwayne
I had a section on "high-availability software" that I cut due to not being
able to research it thoroughly. For the most part cars and medical devices and
stuff don't use FM because there's pretty much no oversight. Aircraft, though,
are a super interesting case here. I'll be honest, I'd probably trust code
that's gone through DO-178C more than I would trust code that's been formally
verified.

I _believe_ there's been a push to add formal methods to avionics, mostly
because it'd potentially save money. But any FM stuff has to go through DO-330
and DO-333, which are just as rigorous and intense as DO-178C. The only stuff
that has so far is, unsurprisingly, very expensive and very proprietary.

~~~
Jtsummers
Having worked on aircraft software, including some very safety critical stuff,
I wouldn't necessarily trust code that went through DO-178(A|B|C).

There's an unfortunate tendency (just like CMMI and other certifications
needed/desired in that industry) to achieve the certification on paper by
backfilling. DO-178 requires a formal spec document (NB: Not the same as
formal methods) with traceability to the design and test procedures. Oops, we
let it get stale (Read: Didn't keep up to date). I've seen people turn that
into 5k pages with a good enough looking RTM that no one was ever going to
read, and it got certified.

A system I recently worked on (left the project, it was toxic, or maybe it was
me) had a "design" document that consisted of running the code through
doxygen. While that did produce documentation, the code was reasonably
completely annotated, it did not produce a _design_ document (which should
include rationales and things like, I don't know, state transition diagrams or
similar to illustrate the _design_ , not just the _structure_ ). And that
system is flying today (has been for a while).

These things turn into games by the developers and managers, like hitting 100%
code coverage in testing (which doesn't mean you've actually
verified/validated anything).

~~~
ejcx
Jtsummers, I don't disagree with you. The reality is the DO-178C is no
different than PCI or other compliance standards, and there are ways to game
it with docs.

With that said, if formal methods were really required to fly the plane then
they would add it to the DO-178[D] without hesitation =]

------
xiphias2
Bitcoin is at a point in valuation where a huge effort should be put into
formal verification: a bug can be worth hundreds of billions of dollars. One
problem is that it's already written in C++, which is a language with
incredibly complex semantics for program verification (and it's now too hard
to move away, so the C++ program has to be proven correct). The main blocker
is what others have written as well: it's very hard to do the formal
verification.

~~~
dranov
I agree, and there are efforts to develop formally verified implementations of
Nakamoto consensus (which I'm involved in) [1][2]. We published a paper about
our efforts and first results a year ago, and I'm currently extending that
work for my Master's thesis. Others are also building on top of it [3].

We're nowhere near having a verified Bitcoin implementation (much less
verifying the existing C++ implementation), but we're making steady progress.
One could reasonably expect having a verified, deployable implementation in
3-4 years if people continue working on it.

It remains to be seen whether people would switch to that from Core (very
unlikely) or Core would adopt it as a component to replace its existing
consensus code (interesting possibility, but also unlikely).

[1] -
[https://github.com/certichain/toychain](https://github.com/certichain/toychain)

[2] - [http://pirlea.net/papers/toychain-
cpp18.pdf](http://pirlea.net/papers/toychain-cpp18.pdf)

[3] -
[https://popl19.sigplan.org/track/CoqPL-2019#program](https://popl19.sigplan.org/track/CoqPL-2019#program)

~~~
iso-8859-1
First you mention Nakamoto consensus, then you mention "existing consensus
code". Are you including Bitcoin script when you say consensus? It seems like
a huge task to formally model the whole consensus system, including
OP_CODESEPARATOR. How would you replicate something like the BIP-0050 problem?

[https://github.com/bitcoin/bips/blob/master/bip-0050.mediawi...](https://github.com/bitcoin/bips/blob/master/bip-0050.mediawiki)

~~~
dranov
> Are you including Bitcoin script when you say consensus?

Currently, no. Our model considers "consensus" to be agreement on the ordering
of transactions. We don't yet interpret the transactions.

However, agreement on the state follows directly _provided_ nodes have a
function (in the "functional programming" sense), call it `Process`, that
returns the current state given the ordered list of transactions. If all nodes
have the same function, they will have the same state.
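A hedged sketch of that argument in Python (the toy transaction format and the `process`/`apply_tx` names are made up): state is a pure fold over the ordered transaction list, so nodes that agree on the list and share the fold function necessarily compute the same state:

```python
from functools import reduce

def apply_tx(state, tx):
    """Apply one toy transaction (sender, receiver, amount) to a balance map."""
    sender, receiver, amount = tx
    state = dict(state)  # pure update: no mutation of the input state
    state[sender] = state.get(sender, 0) - amount
    state[receiver] = state.get(receiver, 0) + amount
    return state

def process(txs, genesis=None):
    """Derive the current state from the ordered transaction list."""
    return reduce(apply_tx, txs, genesis or {})

txs = [("alice", "bob", 5), ("bob", "carol", 2)]
# Two nodes with the same ordered list and the same `process`
# compute identical states.
print(process(txs) == process(list(txs)))  # True
print(process(txs))  # {'alice': -5, 'bob': 3, 'carol': 2}
```

The BIP-0050 failure mode corresponds to two nodes running *different* `process` functions over the same list, which this determinism argument cannot rule out.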

The BIP-0050 issue (in my understanding) is that different Bitcoin versions
had different consensus rules, i.e. their `Process` functions were different.
As far as I know, Bitcoin script is deterministic, so it should be possible to
define it functionally. We haven't begun looking at this, though, and I don't
know of anyone that has -- some people are working on verified implementations
of the EVM, however.

> It seems like a huge task to formally model the whole consensus system,
> including OP_CODESEPARATOR.

It is a huge task, yes, but it can be done modularly such that reasoning about
script interpretation can be almost entirely separated from reasoning about
Nakamoto consensus. (This is a bit subtle, since the fork choice rule needs to
check if blocks are valid, which relies on script interpretation.)

I don't really understand OP_CODESEPARATOR. As long as it is deterministic, it
shouldn't be an issue.

------
baybal2
I think all formal verification methods are built bottom-up, and verification
begins at the standard C library. The bigger your program gets, the more
dependencies you have to verify.

If your tech platform does not have a long history of formal methods, you will
have a hard time building the verification chain on your own.

In JavaScript, for example, you would have to formally verify the whole
browser, and all its dependencies, before you could meaningfully verify the
code running inside it. That's very hard.

------
namelosw
The main reason is that formal methods are still not cost-efficient, not even
close.

The problem is similar to statically typed languages. Of course, having a
sound type system can help you avoid a lot of stupid mistakes. But the key
point is that a modern statically typed language doesn't cost you much
velocity. Sometimes it's even faster, because there are better IDEs out there
for auto-completion, importing, and much other stuff. I switched from
JavaScript to TypeScript only to find I write code faster without losing
anything.

Before the industry widely accepts formal methods, we probably need
dependently typed languages first. An excellent example is Idris.
[https://www.idris-lang.org/](https://www.idris-lang.org/) It's a dependently
typed language, and it can also apply formal methods to prove theorems. With
its IDE, it is more like pair programming with the type system; it can really
fill in a lot of code according to the types. This kind of system has the
potential to be much more powerful than, say, IntelliJ IDEA with Java.

You could then write code much faster, and optionally write proofs for the
core domain of your system if you like.

------
Nokinside
I have used static analysis and formal methods in production code. Mainly
Astree,
[https://www.absint.com/astree/index.htm](https://www.absint.com/astree/index.htm)

When you have bought the tools to verify code in some safety critical project
and know how to use it, it's easier to use it in less critical software too.
You will soon hit the wall where formal correctness loses to time constraints
and new features. You end up verifying only the code that is used repeatedly
in multiple projects and does not change.

Formal verification is not substitute for testing and code review. You can
verify that the code is correct but you don't know if you have the correct
code.

The answer is: because correctness of programs produces only limited value for
normal use cases. It's more important to add more features than it is to
produce solid code. If you don't have code reviews where all new code and
changes are read and analyzed by others (increasing the cost of the software
5-10 times), your interest in correct code is limited.

------
ggm
because (as the fine article says) "proofs are hard"

we don't do hard. we don't use APL. we don't use Z or Coq or Isabelle or any
number of things, because we don't like hard things.

proofs are hard. it's that simple.

------
01100011
I love the concept of formal methods. As someone who pushes for more
determinism in software(yay state machines), I have long dreamed of using them
to prove correctness. And then I go look at the goofy, domain-specific
language I'd have to learn, roll my eyes, and get back to doing 'real work'.

Formal methods will win when I can point the tools at a mess of C/C++/whatever
code and have the tool figure out the rest with minimal overhead from me.
Sure, sprinkling a few hints in my code would be fine, but trying to implement
my design in a formal language spec and then... convert it to my chosen
language, with the chance that I'll introduce bugs, seems like a waste of
time.

~~~
jen729w
I found this quote a while ago while researching state machines. The author is
using a moderately complex hypothetical system as a reason _not_ to use a
state machine.

“... For instance, you are asked to create an online order system. To do that
you need to think about all the possible states of the system, such as how it
behaves when a customer places an order, what happens if it is not found on
stock or if the customer decides to change or delete the chosen item. The
number of such states could reach 100 or more.”

This is why so much software is broken. The author just admitted that he can’t
be bothered figuring out how the system he’s about to build should actually
work. That’s a deeply strange mindset to me.

~~~
01100011
Well, state explosion is a real thing, which is why we have things like
hierarchical state machines. There's almost always a trade off between
capturing everything in a state variable and allowing a few flags and
conditionals. The classic example is a keyboard state machine where you don't
actually have states for each key pressed, you just store the key in a
variable.
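A small sketch of that classic example (hypothetical API): the mode is the state machine's state, while the identity of the held key is ordinary data, so you avoid one state per key:

```python
from enum import Enum, auto

class Mode(Enum):
    IDLE = auto()
    PRESSED = auto()
    REPEATING = auto()

class Keyboard:
    """Three states total. WHICH key is held is plain data (self.key),
    not a separate state per key -- avoiding state explosion."""

    def __init__(self):
        self.mode = Mode.IDLE
        self.key = None

    def key_down(self, key):
        self.mode = Mode.PRESSED
        self.key = key

    def hold_timeout(self):
        # Only a held key starts auto-repeating.
        if self.mode is Mode.PRESSED:
            self.mode = Mode.REPEATING

    def key_up(self):
        self.mode = Mode.IDLE
        self.key = None

kb = Keyboard()
kb.key_down("a")
kb.hold_timeout()
print(kb.mode, kb.key)  # Mode.REPEATING a
```

The trade-off is exactly the one described above: `self.key` is a variable the state machine does not enumerate, so properties about it must be checked separately from the mode transitions.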

------
chriscaruso
"Before we prove our code is correct, we need to know what is 'correct'. This
means having some form of specification, or spec, for what the code should do,
one where we can unambiguously say whether a specific output follows the spec"

What's the difference between the specification and the code at that point? If
it's that detail-oriented, then isn't a bug at least as likely to be in the
specification as in the code?

------
obtino
Back when I did my CS degree, we were taught the "B-method":
[https://en.wikipedia.org/wiki/B-Method](https://en.wikipedia.org/wiki/B-Method)

The course declined in popularity with students and the university eventually
dropped it.

Looking back on it, it really should've been a mandatory course for all
software engineering students.

~~~
meh2frdf
In the 90s, Z notation was taught in many UK CS courses.

------
edpichler
Since I started designing before coding I am at least 2x more productive, and
rework has been reduced to almost zero. Still, it is difficult for us to
think about things really abstractly, and to measure the future gain of
not-so-fun activities.

------
oggy
Last week I took part in a formal methods workshop where one of the topics
discussed was "why we don't have formal specs for most things" (which is a
more modest ask than verification). Many things we came up with were what
you'd expect: additional cost (which has to be taken into account when
changing programs), missing incentives, lack of awareness of FM existence,
lack of training, unpolished tooling.

Other, less obvious ones included the inherent difficulty in writing the
specs. Namely, FM is about verifying that a system model conforms to a
specification. For this to bring value, the specification has to be
sufficiently simpler than the model that we can deem it "obviously correct".
But such specifications can be extremely difficult to find. On one hand, they
are abstractions of the model, and finding good abstractions is one of the
hardest parts of software engineering. Another way to think about it is
Kolmogorov complexity; at some point you cannot "compress" the description of
a problem any further, i.e., some problems are just difficult by nature.

On the incentives front, one of the interesting viewpoints brought up was that
having formal specs for your product can make you more liable for defects, and
that some (in particular, hardware vendors) might prefer to have wiggle room
there. E.g., consider the Spectre/Meltdown fiasco and Intel's response.
Another thing brought up there was that software vendors often don't directly
feel the cost of the bugs in their software, at least not short-term. This is
different in hardware; e.g. ARM uses FM extensively, though I don't know if
they commit to specs publicly. The RISC-V architecture comes with a formal
spec, though.

On the other hand, I think that many of the current trends in the computing
world are making FM more aligned with the incentives of a core part of the
industry. The cloud brings centralization, with few large companies (Amazon,
Google, etc) providing a lot of complicated software infrastructure that's
critical to their success. An outage of an AWS service directly costs them
significant money, and can easily completely halt the customers' operations.
Moreover, they operate in an open and potentially hostile environment; a
corner case software bug in MS Office wasn't a big deal, and maybe could be
"fixed" by providing the customer a workaround. But such a bug in a cloud
service can become an exploitable security vulnerability. And in fact you do
see AWS having a dedicated formal methods team. Also, the TLS 1.3 development
included formal verification. On the other hand, AFAIK Google doesn't employ
FMs that much, even though they are in the same game as Amazon with GCP; so
there's definitely a cultural component to it as well. But I still believe
that it's likely that we end up with a small core of companies that provides a
lot of critical infrastructure that heavily uses FM, and that we see little
adoption of FM outside of this circle.

As for the article, I agree with most points made, though it unsurprisingly
has a bias for using TLA with TLC. I agree that TLA/TLC can often be the best
solution, but there are caveats. For example, models can be awkward (e.g.,
recursive data structures), the specification language is limited, and "just
throwing more hardware at it" doesn't really work that great for exponential-
time problems.
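
The "throw more hardware at it" caveat is easy to see with a toy
explicit-state exploration (an invented model for illustration, not TLC
itself): with n independent boolean flags, the reachable state space is 2^n,
so each extra component doubles the checker's work.

```python
def reachable_states(n_flags):
    """Exhaustive exploration of a system of n independent boolean flags.

    Each step may flip any one flag, so every combination is reachable:
    the search eventually visits all 2**n states.
    """
    start = (False,) * n_flags
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for i in range(n_flags):
            nxt = state[:i] + (not state[i],) + state[i + 1:]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# Doubling the hardware buys roughly one more flag:
for n in (4, 8, 12):
    assert len(reachable_states(n)) == 2 ** n
```

Exponential growth in the number of components is why extra workers only push
the feasible model size up by a small constant.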

------
mbrodersen
It's simple really: people are happy to pay $ for software that kinda works
most of the time as long as it (most of the time) solves the problem they want
solved. Always follow the money!

------
earthicus
Does anyone here have any experience using semi-formal methods, like
'cleanroom' techniques? The commentaries I've heard (e.g. [1]) tend to be very
enthusiastic, but I don't have enough experience to judge their credibility so
I'd love to hear what others have to say on the matter.

[1] [http://infohost.nmt.edu/%7Eal/cseet-
paper.html](http://infohost.nmt.edu/%7Eal/cseet-paper.html)

~~~
hwayne
At one point a while back I emailed Stavely to learn more about how it's
changed over time, but he's been out of the software world for over a decade
now. Of the people I've met who've been on cleanroom projects, the _tentative_
impression I got is that it's fast and it works but it's really taxing on the
people involved. One of these days I want to do more careful interviews and
the like.

------
cttet
In some areas it is almost impossible to write complete specs.

CSS was originally designed as an end-user spec language for changing how a
website looks. But layout is so complicated that it takes a lot of experience
to specify every position dynamically as requested.

And now people are using programs (JS) to replace the job of CSS, to some
extent giving up on writing specs for layout, even with years of experience
with it.

The hard thing is not the language itself, it is often the domain.

------
xwvvvvwx
The cost/quality trade-off isn't worth it for most applications. In fields
where it is (aerospace, smart contracts, etc.), formal methods are used.

~~~
cs02rm0
Not a very exciting answer, but I think it's just that; cost vs. benefit.

------
lolive
Am i not already using it when I Ctrl-Space in IntelliJ?

------
mitchtbaum
It's also a lot easier to do change management on an abstract model of the
software you're working on than on the design of that one program itself,
because the design choices made by other sister programs can be immensely
helpful. It's the inversion of making decisions in a vacuum. What do you call
that?

---

Edit for "random" keywords: model-driven engineering with domain-specific
meta-modelling languages

------
kolinko
There is a lot of work done now with formal methods used for smart contract
verification. MakerDAO formally verified correctness of their contracts last
year: [https://github.com/dapphub/k-dss/](https://github.com/dapphub/k-dss/)

------
luord
> In fact, the vast majority of distributed systems outages could have been
> prevented by slightly-more-comprehensive testing. And that’s just more
> comprehensive unit testing,

Key takeaway for me as far as code verification goes.

Design verification reminded me of a few subjects I studied in college.

~~~
zzzcpan
Take that claim with a grain of salt. More comprehensive testing still
requires deep expertise in resilience/reliability to even understand what to
test for. And you can't have that unless you both study the domain and
experience first-hand the sheer number of problems such systems have in the
wild.

For example, you might assume there is no difference between establishing a
TCP connection to port 12345 or to port 12346 of the same server. But there
is: they might belong to different buckets in a network stack somewhere, and
one bucket might have too many packets in it and be slow and overloaded,
while another isn't and is perfectly fast. This could easily cause outages.
Distributed systems are strange like that.

------
sys_64738
It takes too long.

------
DanielBMarkham
I think the easy answer is: "because life isn't formal"

There is an intersection.

The places you might need defined and formal methods to make people happy and
the places you might actually want to sling some code to make people happy?
That intersection point?

Tiny

------
Graham24
When I was at university we did a module on formal methods, including writing
Z schemas. Out of a class of 30, only two people really understood them.

That's why they are not generally used: no one (or at least not enough
people) has a clue how to create them.

I doubt they are taught on general Computer Science degrees anymore.

------
RavlaAlvar
Does life-dependent code, like NASA's, use these types of methods to validate
the system?

------
coldtea
Because they're slower than brute forcing your way.

------
pytyper2
Excellent E & O insurance policies.

------
flerchin
Because the Agile Manifesto.

------
edoo
It is because formal methods rarely come out ahead in the cost-benefit
analysis.

At some point it is less expensive to just build the car and see if it rolls.

------
anth_anm
it's hard and mostly doesn't justify the cost.

------
zozbot123
> Why Don't People Use Formal Methods?

Because they're an emerging and unproven technology? Why don't people rewrite
everything in Rust? That would surely be a _lot_ more straightforward than
adopting the sorts of "formal methods" where you only write a handful of
(C-language equivalent) LOC per day, across the software industry! So why
doesn't it happen? There's your answer.

~~~
UncleMeat
Static type checking is abstract interpretation. If you program in Java you
are using formal methods.
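
A type checker really is an abstract interpreter: it evaluates the program
over an abstract domain (types) instead of concrete values. A minimal sketch
in Python, using signs rather than Java types as the abstract domain (the
names here are invented for illustration):

```python
# Abstract interpretation over the sign domain {NEG, ZERO, POS, TOP}:
# like a type checker, it computes facts that hold for *all* runs.
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "top"

def sign(n):
    """Abstraction function: concrete number -> abstract value."""
    return NEG if n < 0 else ZERO if n == 0 else POS

def abstract_mul(a, b):
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG  # signs agree -> positive

def abstract_add(a, b):
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    if a == b:
        return a
    return TOP  # pos + neg: sign unknown, precision is lost

# Sound conclusions without running the concrete program:
assert abstract_mul(sign(-3), sign(-5)) == POS  # neg * neg is always pos
assert abstract_add(sign(2), sign(-2)) == TOP   # analysis admits ignorance
```

Java's type system does the same thing with a richer domain: it proves "this
expression is always an `int`" without ever executing the code.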

~~~
barbecue_sauce
So we do use formal methods! Problem solved.

