
New Paper: Theory of Programs - fermigier
https://bertrandmeyer.com/2015/07/06/new-paper-theory-of-programs/
======
fizixer
> Programming, wrote Dijkstra many years ago, is a branch of applied
> mathematics.

This is as effective as saying "X is a branch of philosophy", where X is any
discipline. It's probably true for every discipline, but what's the point?

Also every X is a branch of physics, because we live in a physical universe
and everything is inside it (so even pure mathematics is a branch of physics).

And finally X is a branch of computer science, because computation and
information processing sometimes appear to be more fundamental than
everything else. E.g., physics cannot answer the question "are we living in a
matrix-like simulation?", but if we really are, then the whole physical
universe is mere computation happening in some higher dimension.

I like to think of Mathematics, Computer Science, Physics, and Philosophy as
deep and insightful realms of human thought; they have many connections but
also many unique attributes.

~~~
amelius
Programming is like mathematics, except all proofs are omitted.

~~~
mafribe
By Curry-Howard, the program is the proof, albeit usually not of an
interesting proposition.

~~~
groar
In a rich type system, a program is the proof of its own specification. Which
is usually as interesting as the program itself.
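
A minimal example of the idea, in Lean syntax (the theorem name is just
illustrative): the definition below is an ordinary program, and its type is
the proposition it proves.

```lean
-- Under Curry-Howard, inhabiting a type proves the corresponding
-- proposition. This two-line "program" proves A ∧ B → B ∧ A:
theorem and_swap {A B : Prop} : A ∧ B → B ∧ A :=
  fun ⟨a, b⟩ => ⟨b, a⟩
```

The same mechanism scales up: in Agda or Idris one can give, say, a sort
function the type "returns a sorted permutation of its input", and getting the
definition past the type checker is the proof.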

~~~
mafribe
Sure, but as of July 2015 the programming languages (e.g. Agda, Idris) that
can give full specifications of their own programs are experimental, not
mainstream.

------
buttproblem
At least since 1935 [1, 2] people have been trying to formalize all of
mathematics with set theory. I always thought this was fairly interesting,
similar to how the real numbers can be modeled in a language using dependent
types like Coq.

However, I always found Hoare logic, and its concurrent extension
Rely-Guarantee from Jones, to be quite easy to understand. The more interesting
part is how to do this automatically for a user. Abstract interpretation is
one way to do this, but this necessarily requires mapping some programming
language to your mathematical model. However, determining the formal semantics
of mature programming languages, even C, is still open research (e.g., see
papers on Semantics in PLDI [3] 2015).

TLDR: verification is hard.

[1]
[https://en.wikipedia.org/wiki/Nicolas_Bourbaki](https://en.wikipedia.org/wiki/Nicolas_Bourbaki)

[2] Topoi, the categorial analysis of logic. Goldblatt, Robert.
[http://digital.library.cornell.edu/cgi/t/text/text-
idx?c=mat...](http://digital.library.cornell.edu/cgi/t/text/text-
idx?c=math;cc=math;idno=gold010;node=gold010%3A6;view=toc)

[3]
[http://conf.researchr.org/track/pldi2015/pldi2015-papers#pro...](http://conf.researchr.org/track/pldi2015/pldi2015-papers#program)

~~~
mafribe

        At least since 1935 people have been trying to formalize all of mathematics with set theory.
    

Cantor invented set theory for this purpose in the 1870s. By 1935, Gödel's
incompleteness theorems (1931) had already shown the limitations of this (and
any other) foundation of mathematics.

Hoare logic is not a foundation of maths, but an approach towards
specification and verification of programs. As such it is in the same space as
Curry-Howard based programming languages like Idris and Agda with dependent
types. The main differences between the two are that (1) Hoare logic is not
restricted to constructive reasoning and (2) Hoare logic is not integrated
with the programming language, so programming and proving are _distinct_
activities. Depending on your philosophical outlook, either (1) or (2) or both
can be seen as a major advantage or a major disadvantage.
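
To make point (2) concrete: in Hoare logic the specification is written as a
triple around the program, and the proof is carried out in a separate
calculus. For instance, an instance of the assignment axiom:

```latex
% {P} C {Q}: if P holds before C and C terminates, then Q holds after.
% Assignment axiom {Q[e/x]} x := e {Q}, instantiated:
\{\, x = n \,\}\quad x := x + 1 \quad\{\, x = n + 1 \,\}
```

In a dependently typed language the analogous fact would instead be part of
the increment function's type, so the proof travels with the program.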


        how to do this automatically
    

What do you mean by "this"? Automatically specifying program semantics, or
automatically proving program properties?


        determining the formal semantics ... is still open research
    

This is somewhat misleading. Every compiler gives a perfectly formal semantics
to the programming language it compiles. What you probably mean is something
like _nice_ abstract accounts of memory models for C that at the same time
capture all the optimisations modern C-compilers want to do, while still being
implementable on existing CPUs. That is indeed an open problem [1].

[1]
[https://www.cl.cam.ac.uk/~pes20/cpp/c_concurrency_challenges...](https://www.cl.cam.ac.uk/~pes20/cpp/c_concurrency_challenges.pdf)

~~~
groar

        What you probably mean is something like nice abstract accounts of memory models for C that at the same time capture all the optimisations modern C-compilers want to do, while still being implementable on existing CPUs. That is indeed an open problem [1].
    

True. Although CompCert [1] is a nice effort toward that goal: a proved C
compiler that covers almost all C99 with many optimizations implemented and
proved sound.

[1]
[http://compcert.inria.fr/compcert-C.html](http://compcert.inria.fr/compcert-C.html)

~~~
mafribe
CompCert uses tame optimisations and compiles to mainstream architectures,
nothing exotic. Most of all, CompCert does not deal with concurrency, as far
as I'm aware, and concurrency is where memory models matter.

The problem with memory models is not so much verifying it in a mechanised
way, but inventing something suitable at all.

~~~
groar

        The problem with memory models is not so much verifying it in a mechanised way, but inventing something suitable at all.
    

Agreed.

I'm aware of some work done a few years ago by people working on weak memory
models and extending CompCert with some concurrency primitives:
[http://www.cl.cam.ac.uk/~pes20/CompCertTSO/doc/](http://www.cl.cam.ac.uk/~pes20/CompCertTSO/doc/)

------
nmrm2
What I couldn't find was an answer to the question "why?"

I could see this being an interesting basis for teaching Computer Science in
high schools, where a text like this functions in a way similar to Euclid in
high school Geometry courses. I'm pretty sure there aren't any non-exceptional
examples of high school _Computer Science_ courses (e.g., the AP CS course is
better described as an Intro to Computer Programming course).

And in any case this is a nice exposition.

But aside from that, I'm not sure I see any new insights here about
CS/verification, nor any suggestions for research directions that aren't
already extensively explored. Perhaps I'm missing something, though.

(Edit: There's a list of suggested future work at the end of the paper. I
guess I get it now; although all of these things have been done in
verification/PL -- and even by non-type-theorists -- they almost always
involve the development of a new logic, and aren't done in pure set theory. So
certainly there's a lot of work to do if you want to do things in this style.
But I'm still trying to see the benefit of this style, aside from pedagogic or
philosophical benefits. Is my inexperience in this area blinding me from some
obvious potential? I don't know much about non-high-school set theory.)

~~~
AnimalMuppet
Building a unified framework for all of programming theory is useful, even if
it provides no new insights. It provides a clearer basis for thinking about
what we already know, and thereby makes the new insights easier. So even if
the new insights aren't here yet, it makes it more likely that they eventually
come.

~~~
nmrm2
I guess my question is, why set theory, as opposed to building [new] logics on
top of other semantic models (e.g. operational semantics or reachability
relations)?

(I don't doubt there are compelling reasons, I just don't know enough about
set theory or programming theory to know what they are. Other than the clear
benefit of this approach over others in "elementary" educational settings,
e.g. US high schools)

~~~
AnimalMuppet
Lowest total cognitive load?

That is: If I build my theory on a complicated foundation, then you have to
learn the complicated foundation before you can even start to learn my theory.

On the other hand, if I build it on a simple foundation, but that simple
foundation means that the theory itself has to jump through a bunch of hoops
because the foundation is too simple, that can also make the total (foundation
+ theory) harder to learn and understand.

So the sweet spot is to use the simplest foundation that does not unduly
complicate the theory. (And that may change, depending on target audience.)

Is set theory the best answer? I have no idea, but all of programming in 28
pages, built on a foundation only of set theory, is very impressive.

~~~
nmrm2
Thanks for your responses. I think I'm expecting interesting and impactful
results out of an exposition/proposal, which is probably unreasonable.

 _> but all of programming in 28 pages, built on a foundation only of set
theory, is very impressive_

Yes, it is :-)

------
ble
Anyone find the definition of `functional` a little bit odd?

Is there a reason why a "no-op" or identity function should be considered
imperative? Or why a program must be imperative if at least one post-execution
state is also a valid pre-execution state?

~~~
Garlef
Agreed. Some definitions are really bad. I guess the cause is noble, but it is
executed very poorly. (The definition of "functional" is also used nowhere
else in the document, although it is mentioned.)

Another oddity: "Object oriented" programs are defined as functions {0,...,n}
-> O, where O is "a set of objects". Yet a program is defined to be an
endo-relation S <-> S. Accordingly, we have O = {0,...,n}.

I don't see how this definition gives any insight on the nature of object
oriented programming.

Now for the strangest thing: The definitions of "object oriented" and
"procedural" involve differentiating between possible candidates for S. Yet
the only property a set without any context really has is its cardinality.

So we can rephrase the definitions of "object oriented" as follows:

An "object oriented" program is a set S together with a total function S -> S
with S finite. ([total function] AND [S finite])

Negating, we arrive at the definition of "procedural program":

A "procedural" program is either a relation S <-> S that is not a total
function (no restrictions on the cardinality of S) OR any relation S<->S but S
must be infinite. ([not a total function] OR [not S finite])

I don't see how this definition captures the notions of object oriented
programming and procedural programming in any meaningful way.

But maybe the author will appear here to clarify his ideas.

------
black_knight
Just skimming the definitions they seem more than a bit naïve. A relation is
really a more general thing than a program, since it need not be computable in
any way.

The _real_ mathematical models of programming are not all that difficult to
understand. The most famous, the Turing machine, is really just a finite set
of states and a transition function over a tape. The complications which arise
later are real complications, describing things such as side effects and
complexity — it's not trivial stuff!

~~~
ble
Meyer is trying to move up one level of abstraction; the `programs` he is
talking about are synonymous with `specifications` so right away those words
are being used in an atypical fashion.

Calling them `progspecs` would be gross and newspeak-y and calling them
`program specifications` would be verbose. Regardless, the theory of these
`programs` may indeed be useful.

Also, he's not saying that all relations are progspecs, but defining that
progspecs are pairs of (relation, set).

~~~
Garlef
Yes: the theory of relations/Chu spaces is used in computer science.

See for example [
[http://chu.stanford.edu/guide.html](http://chu.stanford.edu/guide.html) ]

------
jmount
I'd say formalizing programming is desirable, but harder than you would
suspect. I'd take a look at denotational semantics and Homotopy Type Theory
for some existing formalisms.

