
The Great Theorem Prover Showdown - panic
https://www.hillelwayne.com/post/theorem-prover-showdown/
======
tunesmith
I'm very much not at the level of accomplishing formal proofs. But isn't his
challenge different from what he's trying to prove? I guess my objection is in
the category of "even if formal proofs aren't easier in FP, it doesn't
matter". In particular, when people talk about the advantages of FP and
"correctness", they aren't literally talking about formal provability -
they're talking about a spectrum of correctness: that the purely functional
approach moves _more_ buggy behavior from runtime to compile-time, that the
functional implementation is _more_ correct, and so on.

If you adopt the strict definition of "correct", then "more" correct makes no
sense, of course.

I recently ported a fairly hairy, lengthy algorithm from Python to Scala -
from something with a ton of mutability, scope confusion, and exceptions
serving as GOTOs for business logic, into pure Scala functions. The Python
was inscrutable, with a team of people treating it gingerly. The Scala is
easily refactorable, and each time we take another crack at it, it shrinks
into smaller and smaller code (and not by using crazy Scala); I suspect that
much of the confusing, hairy complexity will disappear. I don't really know
how to quantify this, but I don't think it's captured by an exercise that
compares the difficulty of formally proving methods in IP and FP.

So yeah... it's an awesome exercise that showed a lot, and I really want to
dig more into provers, but I just don't see that his "therefore" follows, that
"it's (not) easier to reason about FP than imperative".

~~~
fusiongyro
> If you adopt the strict definition of "correct", then "more" correct makes
> no sense, of course.

On the other hand, it's pretty easy to widen the scope of what you're proving
correct about a program in such a way as to make it impossible for Haskell to
help you. For instance, "accomplish this in O(n)", "use less than 1 MB", or
"finish in less than 1 second". People usually concern themselves only with
the aspects of the implementation that Haskell prioritizes by surfacing them
as return values. But there are certainly settings where the method used is
as important as getting the answer mathematically correct - in fact, the rise
of NoSQL shows us that there are even situations where doing the wrong thing
quickly is better than aborting in the face of unrecoverable errors.

~~~
thesz
Haskell can help you with "O(n)", "less than 1 MB", and many other
constraints of that sort.

You should use a domain-specific language embedded in Haskell for that.

If you do it that way, Haskell can help you with particular details of the
implementation via types (I think O(n) would be easy - I did parametrization
on the length circa 2008) and help you connect your implementation to the
rest of the program, again via types.

I did exactly that. From well-typed parsers for stream processing hardware to
(not-so-well-typed, for historical reasons) strict realtime algorithms.
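
For a flavor of what "parametrization on the length" can look like, here is a
minimal sketch of my own (not the GP's actual code), assuming GHC with the
DataKinds and GADTs extensions:

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    -- Length-indexed vectors: the length lives in the type, so the
    -- compiler checks length-related properties at compile time.
    data Nat = Z | S Nat

    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- The type forces vmap to preserve length: Vec n a in, Vec n b out.
    vmap :: (a -> b) -> Vec n a -> Vec n b
    vmap _ VNil         = VNil
    vmap f (VCons x xs) = VCons (f x) (vmap f xs)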

~~~
fusiongyro
Link?

~~~
pseudonom-
Not sure what GP is thinking of, but here's an example of embedding tools for
reasoning about performance in Idris:
[https://www.youtube.com/watch?v=4i7KrG1Afbk&feature=youtu.be...](https://www.youtube.com/watch?v=4i7KrG1Afbk&feature=youtu.be&t=1254)

~~~
fusiongyro
No offense, but this is like asking to see an example of someone doing OOP in
C and being handed C++.

------
philipn
I loved this passage! --

"..There was zero overlap between the provers and the bulldogs. I was
expecting at least some overlap: somebody who mocked me but also provided a
valid solution, or even tried but failed. But that didn’t happen.

I normally assume these people are “brilliant jerks”: they’re assholes online,
but I still have to listen to them in case they say something important. This
really cracked that assumption: none of the “brilliant jerks” were willing to
put any skin in the game. You don’t have to listen if they have nothing to
say."

~~~
ms013
I came to a similar realization a few years ago. I followed a number of
people who were/are widely followed in the generally caustic corners of the
FP community, assuming that their popularity implied they had something
useful to say. I eventually unfollowed/muted/blocked many of them, because I
reached a point where I realized: I'm not dumb, I actually do know the domain
pretty well, and I'm well regarded in that community, yet these people are
just spewing hot air (aggressively negative, confrontational, or self-
promoting). To be honest, I don't think I've missed anything of consequence
since my block/mute/unfollow-fest. It does disappoint me when I look and see
that some of these toxic personalities are still widely followed and still
spewing the same garbage, even though I can't see any tangible contribution
they've made.

There are many who are good at making noise and opining, and even talking
about wonderful "tech" that they are working on, but most of the "brilliant
jerks" ultimately produce nothing of value and contribute nothing to the
ecosystem.

------
choeger
> it's all about the heap!

This. If you only use local (i.e., stack) variables, you should be able to
transform your code directly into FP using a state monad. Hence I do not
think there is a large difference between the two paradigms here.
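
As a minimal sketch of that transformation (my own example, assuming the mtl
State monad), an imperative accumulator loop over a single local variable
becomes:

    import Control.Monad.State

    -- Imperative original:  total := 0; for x in xs: total := total + x
    -- The local "total" becomes the state threaded through the loop body.
    sumLoop :: [Int] -> Int
    sumLoop xs = execState (mapM_ bump xs) 0
      where
        bump x = modify (+ x)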

Plus: functional code is easier to reason about, at least for the person
doing the formalization. You simply need fewer constructs (and less
complicated ones). The same argument holds for parallel code, btw.

~~~
fmap
Yes! Functional or imperative programming makes no difference in his
challenge problems.

Tail-recursive functions and loops are the same thing. Proving a loop correct
using invariants and showing (partial) correctness of a tail-recursive
function by (functional) induction are the same thing.
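
A toy illustration of the correspondence (my example, not the parent's): the
loop invariant "acc equals the sum of the elements consumed so far" becomes
the induction hypothesis for the tail-recursive version.

    -- while-loop:  acc := 0; while xs nonempty: acc += head xs; xs := tail xs
    -- Tail-recursive twin, with the invariant carried in the accumulator:
    sumAcc :: [Int] -> Int
    sumAcc = go 0
      where
        go acc []     = acc
        go acc (x:xs) = go (acc + x) xs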

------
dbkaplun
> Fulcrum. Given a sequence of integers, returns the index i that minimizes
> |sum(seq[..i]) - sum(seq[i..])|. Does this in O(n) time and O(n) memory.

This is doable in O(n) time and O(1) memory: sum the list once, then walk it
from the beginning, keeping a running prefix total and subtracting it from
the overall sum for a running difference, tracking the index with the
smallest absolute difference.
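
A sketch of that approach in Haskell (my code, not dbkaplun's, and not
formally verified); the split index i ranges over 0..n, with seq[..i] the
prefix of length i:

    {-# LANGUAGE BangPatterns #-}

    -- O(n) time, O(1) extra space: one pass for the total, one pass
    -- keeping a running prefix sum and the best split index seen so far.
    fulcrum :: [Integer] -> Int
    fulcrum xs = go 0 0 (abs total) 0 xs
      where
        total = sum xs
        -- go currentIndex bestIndex bestDiff prefixSum rest
        go _ bestI _ _ [] = bestI
        go !i !bestI !bestD !pre (y:rest) =
          let !pre' = pre + y
              !d    = abs (pre' - (total - pre'))
              !i'   = i + 1
          in if d < bestD then go i' i' d pre' rest
                          else go i' bestI bestD pre' rest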

------
hackermailman
The concrete examples the author is looking for, where it's easier to analyze
programs according to their behavior instead of their imperative structure,
can be found in Per Martin-Löf's paper _Constructive Mathematics and Computer
Programming_ (DOI: 10.1016/S0049-237X(09)70189-2). For a brief and intuitive
explanation, see this post by Robert Harper on the RedPRL prototype, which
uses cubical computational type theory:
[https://existentialtype.wordpress.com/2018/01/15/popl-2018-t...](https://existentialtype.wordpress.com/2018/01/15/popl-2018-tutorial/)

You can also have your legacy cake and eat it too with this excellent recent
paper on gradual typing
[https://arxiv.org/pdf/1802.00061v2.pdf](https://arxiv.org/pdf/1802.00061v2.pdf)

"This allows for the introduction of new typing features to legacy languages
and codebases without the enormous manual effort currently necessary to
migrate code from a dynamically typed language to a fully statically typed
language. Gradual typing allows exploratory programming and prototyping to be
done in a forgiving, dynamically typed style, while later that code can be
typed to ease readability and refactoring."

------
kevintb
Great stuff. I followed the thread when he released the challenge and enjoyed
reading the comments and quibbles people had along the way.

Also, strongly agree on zero overlap between bulldogs and theorem provers.
Those who were assholes didn’t make a single effort to contribute productively
to the discussion. A good reminder that jerks (no matter how brilliant they
think they are) can be safely ignored.

------
xamuel
One way to see that there's a lot of hot air coming from the FP crowd is to
look at how logicians and mathematicians write pseudocode in real papers. The
"working" logician/mathematician usually writes imperative pseudocode. Nobody
bats an eye at subroutines having side effects, structures being mutable, or
variables being global... sometimes they even use "goto"!

------
js8
"I keep hearing that it’s easier to analyze pure functional code than mutable
imperative code. But nobody gives rigorous arguments for this and nobody
provides concrete examples."

I believe the first sentence, and here is my personal justification for it
(maybe it can be made into a rigorous argument). I base this on my experience
with Haskell, and it may be that typing and control of side effects are
actually essential to the claim.

In Haskell, for the most part, the results of a function depend solely on its
arguments. So you can treat the inner workings of a function as a black box
and worry only about the "boundary": the type information of its parameters.

Now imagine you're facing a large program and want to estimate the effort
required to make a change. In a functional program, if you look at a
function, you can see the scope of what it does from its type (its
parameters). In imperative programming, a function can do anything to the
system under the hood; there are no natural boundaries.
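
To make that concrete with a toy example of my own (not from the comment
above): the pure type already bounds what a change can touch, while an IO
type advertises that anything may happen.

    -- This function can only inspect its arguments and build a new list;
    -- the type rules out touching files, globals, or the network.
    scale :: Double -> [Double] -> [Double]
    scale k = map (k *)

    -- An IO-typed function makes the opposite explicit: arbitrary effects
    -- may occur under the hood, and the type says so.
    report :: FilePath -> IO ()
    report path = readFile path >>= putStrLn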

It's very similar to the real world, where you can estimate, say, the effort
of a construction job, like renovating an apartment, solely from the size of
the object. You know immediately that renovating a small apartment is (most
likely) less effort than building a skyscraper, because of its size. In
imperative programming, we're often faced with TARDIS-type deals - bigger on
the inside than on the outside.

In physics, the divergence theorem formalizes this notion of "you don't have
to look inside the object, you need only look at what happens at the
boundary". Similarly, functional programming (when done right) creates that
boundary at the function level, so you can apply the same type of reasoning
to it.

"I chose correctness because it was the easiest to objectively verify. I’m
sure that if I posed a similar challenge about refactoring code, everybody
would be telling me that “being easier to reason” isn’t about refactoring,
it’s clearly about correctness!"

I am not sure he is right there. The setting up of boundaries in functional
programs is real. Yet with current type systems the boundaries are not
watertight, and of course you can always write your program in an imperative
style in a functional programming language, without an effective boundary.

I think the OP ignores the boundary-setting question because, if all you want
is a full proof of something, you don't strictly need the boundaries. But
boundaries are what ultimately help understanding. Maybe if he had tried to
prove several different statements about the code, the boundaries would have
helped, because he could have reused the theorems about things delineated at
the boundary.

I think working with FP correctly essentially forces you to prove small
statements about the code (for example, that this function has no side
effects), which you can then reuse when formally proving something else about
it. In IP, by contrast, you don't build these statements as you go; you have
to establish them explicitly for the formal proof. The resulting proofs,
written down, will have the same length, but in the FP case some of the work
may already have been done.

So what the OP is saying amounts to: the divergence theorem is really
useless, because at the end of the day you have to correctly account for all
your sources and sinks. Which is true in a way, but it is a useful theorem
nonetheless.

------
amelius
You can transform purely functional programs into imperative programs
trivially.

And you can transform imperative programs into purely functional programs,
e.g. by using immutable arrays to represent the heap, etc. (basically, you
could write a CPU emulator in Haskell, and let your program call it).

So from a theorem proving point of view, both paradigms are equivalent.
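
A toy sketch of the second direction (my example, using Data.Map rather than
immutable arrays): the mutable store becomes an immutable value threaded
through each step.

    import qualified Data.Map.Strict as M

    type Addr = Int
    type Heap = M.Map Addr Int

    load :: Addr -> Heap -> Int
    load = M.findWithDefault 0

    store :: Addr -> Int -> Heap -> Heap
    store = M.insert

    -- The imperative statement "mem[a] := mem[a] + 1" becomes a pure
    -- Heap -> Heap function, so ordinary equational reasoning applies.
    incr :: Addr -> Heap -> Heap
    incr a h = store a (load a h + 1) h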

~~~
pron
Your reasoning is faulty for two reasons. First, the transformation may not
preserve the size of the program: if a transformed program is exponentially
bigger than the original, it is not equivalent from the prover's perspective.
Second, even if the program size is preserved, the prover still needs to
write the proof. Finding a proof is a non-trivial problem, and there is no
general mechanized way to do it, so we rely on human practice and intuition.
It is therefore possible that human reasoning works better in one formalism
than another (although the advantage, if there is one, may swing toward one
formalism or the other depending on the specific problem). Now, I'm not
saying either of these actually occurs, and it is possible that the two are
more or less equivalent, but if so, it is not an immediate consequence of
there being a transformation from one to the other.

------
Kenji
The thing is that proving small functions correct is often quite useless; you
can cover them well enough with a set of unit tests. The big challenge is the
larger parts of the code, like entire classes or whole sections of a program,
and those are very hard to prove formally. Even if you did prove them
formally, the spec you programmed against might be wrong, or the compiler
might be wrong (I've seen a variety of compiler bugs; larger projects
especially uncover them). Formally proving code is like making sure that
every single screw of a self-driving car is tightened to an exact torque, but
then the car runs over a person anyway. It's solving the wrong problem.

