
Can Programming Be Liberated From The Von Neumann Style? (1977) [pdf] - jw2013
http://www.thocp.net/biographies/papers/backus_turingaward_lecture.pdf
======
slacka
While we have achieved greatness with the Von Neumann evolutionary line, now
that the MHz free lunch is ending, it’s time for us to recognize that its
sequential nature is holding us back. There are still many branches of
Computer Science that researchers and industry have barely explored. We need
to reevaluate these other branches like Functional and Dataflow programming.

Ivan Sutherland explains it best when he says that concurrency is not
fundamentally hard. The problem is that we are constraining ourselves to the
fundamentally sequential Von Neumann architecture.
[https://www.youtube.com/watch?v=jR9pAaQlVRc#t=267](https://www.youtube.com/watch?v=jR9pAaQlVRc#t=267)

~~~
userbinator
On the contrary, I think that we haven't exploited sequential machines to
their full potential, and what's "holding us back" is all this theoreticism
and overabstraction. There likely is a point where concurrency is really
necessary, but one only has to look at the demoscene, full of people who have
never formally studied CS, to see something closer to what hardware can really
do -- and then wonder how so many others working in software, with formal
backgrounds and extensive education in CS, couldn't.

~~~
weland
This crosses my mind every time I hear about the MHz free lunch.

[https://www.youtube.com/watch?v=oBegD7k2wvo](https://www.youtube.com/watch?v=oBegD7k2wvo)
, 02:20. JS and WebGL, which seem to be all the rage nowadays, would smoke the
shit out of my i7 doing that. In case younger chaps are watching, the C64 has
a 1 MHz CPU and a whopping 64 KB of RAM. For comparison, minified jQuery 2.0.0
is about 80 KB.

~~~
icebraining
The JS/WebGL demos actually run on all of my machines, though.

~~~
weland
Albeit barely :-).

------
cheepin
I feel like "industry" doesn't really want functional style programming to
take over. Languages like OCaml provide great speed while also working at a
relatively high level of abstraction; Lisp may not have as much speed, but it
takes your brain on a journey to another dimension.

However, in practice, people seem to just want "C with Lambdas" languages like
Go.

~~~
emsy
This is probably going to be unpopular: My main problem with most functional
languages is that they encourage cryptic code. From the tutorials I've read
Lisp seemed somewhat readable, but the tutorials encouraged me to use list
comprehension all over the place because it's so convenient. I need to decrypt
all the map/zip/range calls in order to grasp what's happening. Disclaimer: I
don't know if Lisp is used like this in a production environment.

Haskell is often described as "elegant" (Here for instance:
[http://www.haskell.org/haskellwiki/Why_Haskell_matters#Elega...](http://www.haskell.org/haskellwiki/Why_Haskell_matters#Elegance)).
To me, the examples are far from elegant. They just prove how much
functionality one can cram into an operator. But if I have to decrypt code in
order to understand it, then the language is doing it wrong. Math works like that
and I never understood why they called variables "x" instead of "potatoes" or
"time". It's simply non-descriptive. I don't know if you can write code that
is as descriptive as code from an imperative language, but the FP community
doesn't encourage it either.

~~~
nostrademons
map/zip/range isn't cryptic once you get used to it. You start thinking of
them as additional CS concepts, which you just apply in one "chunk" to problems.
This is working-as-intended: it's you learning how to think at a higher level
so that you can solve more complicated problems quickly. Once you've done this
you can actually "back port" this knowledge to traditional programming
languages: I often think of Java or C code in terms of "Oh, this is a map" or
"this is a fold over a tree traversal".

There are a bunch of other, legit readability problems with typical Haskell
style: the fetish on single-character variable names is just nuts (it comes
from math, where it's customary because mathematics culture arose before
computers and autocomplete), and once you get into point-free style,
combinator libraries, and monads/arrows things really get nuts. In general
I've found that abstractions should only be used when they're _re_ -used
frequently, so you can internalize them, and Haskell tends to encourage rather
overzealous abstracting. But map/zip/range show up all the time in typical
industrial programming, and learning how to think of it as a unit frees up
brainpower to think about higher-level algorithms.
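
For what it's worth, here is a toy Python sketch of that "chunking" (the names and data are made up for illustration): the explicit loop and the map/zip version compute the same thing, but the second reads as a single concept rather than index bookkeeping.

```python
labels = ["a", "b", "c"]
scores = [10, 20, 30]

# Imperative version: index bookkeeping by hand.
result = []
for i in range(len(labels)):
    result.append((labels[i], scores[i] * 2))

# "Chunked" version: zip the lists, map over the pairs.
result2 = list(map(lambda p: (p[0], p[1] * 2), zip(labels, scores)))

assert result == result2  # same answer, one concept per line
```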

~~~
pekk
After people learn something they don't find it entirely cryptic any more, but
that doesn't mean it isn't cryptic.

No evidence that using map instead of loops "frees up brainpower."

~~~
loup-vaillant
What do you mean by "cryptic"? Something _you_ struggle with? An abstraction?

You're not used to maps, folds, and filters (the bulk of sequence
comprehension), so they don't free up _your_ brain power. But seriously,
they're not complicated, and using them instead of regular loops often reduces
the amount of code by a factor of 3 to 5. _Simple_ code, where the same idioms
come back over and over, just like in regular loops, only shorter.

Don't tell me that doesn't free brainpower, eventually.
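
To make the comparison concrete, a small hypothetical Python example: the loop version needs an accumulator and branching spread over several lines, while the filter/map idiom is a single expression.

```python
nums = [1, 2, 3, 4, 5, 6]

# Imperative version: explicit accumulator and branching.
squares_of_evens = []
for n in nums:
    if n % 2 == 0:
        squares_of_evens.append(n * n)

# Functional version: the same idiom as one expression.
squares_of_evens2 = [n * n for n in nums if n % 2 == 0]

assert squares_of_evens == squares_of_evens2 == [4, 16, 36]
```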

~~~
seanmcdirmid
Functional origami is hella hard to debug. There is a reason functional
programming relies so heavily on equational reasoning and pure FP relies heavily
on static typing: the alternative, actually debugging programs, is so
painful that you just want to avoid it at all costs.

A loop is way easier to debug than a filter or a map, you just step through
it. Who cares about code length? Haskell doesn't become a usable language
until someone comes up with a decent dataflow debugger, if that's even
possible.

~~~
infruset
I don't agree with this. Usually, the typing does half the debugging for you
and when you have a real bug (as in, the program does something but not what
you intended), you debug at the reasoning level, not at the "is there an
overflow here?" level.

~~~
loup-vaillant
Some people tend to reach for the symbolic debugger the _instant_ their
program behaves funny. I hear Haskell is at a disadvantage here, being non-
strict and all. Though I wouldn't know, I never tried Haskell's symbolic
debugger.

~~~
seanmcdirmid
Some people think to program, others program to think. What kind of person you
are greatly determines if you prefer programming in Python or Haskell.

~~~
loup-vaillant
> _Some people think to program, others program to think._

That one needs to be framed on a wall.

I'm definitely in the first camp (think to program), and have no problem with
paranoid type systems, and I seldom need a debugger. Now I understand why
other people do.

------
signa11
i would rather have a mix-n-match of different paradigms based on the task at
hand. here is something that works best with an imperative style so let's just
use it, for something over-there functional style is best suited, rule-based
stuff will be exactly right in this instance, let's use that etc. etc.

if you have not seen sussman's 2011 strange-loop talk "We Really Don’t Know
How to Compute!", please do, it is quite enlightening (whatever that means)

~~~
chas
I think the key bit of the title is "Algebra of Programs". This is a piece
about being able to _calculate_ programs. In section 12, Backus goes into some
detail on how this construction makes algebraic reasoning easier and gives
some specific examples. While it is not required to have a pure FP system to
have a language that can be reasoned about algebraically, introducing
arbitrary ad-hoc iteration will certainly destroy those algebraic properties.

I think this ties in well with Erik Meijer's essay from a few days ago[1]. If
you are trying to produce a framework for mathematical reasoning, it does not
make sense to allow arbitrary exceptions. As a species, humans are pretty new
to this whole computing thing, so I would agree we don't have it figured out
yet. However, over the last 5000 years or so we have developed a shockingly
effective toolkit for reasoning about arbitrarily-defined abstract objects and
I don't think it is unreasonable to try to bring those tools to bear on a
newer class of abstract object, that is to say programs. The cost of this is
that we have to figure out how to formalize our ad-hoc methods. This will be
painful, but every engineering discipline does it. In essence, formalized
design methods applied to practical problems are the core of engineering
practice in general. For example, I think software engineering right now is in
the same place as electronic filter design was in the 1920's. We have methods
that work, given enough design effort, but our results are brittle and it is
very difficult to make changes to working systems without introducing
problems.

[1]
[https://news.ycombinator.com/item?id=7654601](https://news.ycombinator.com/item?id=7654601)
[2]
[http://en.wikipedia.org/wiki/Butterworth_filter#Original_pap...](http://en.wikipedia.org/wiki/Butterworth_filter#Original_paper)

Note: I am blown away by Sussman's talk every time I watch it, but I think
throwing out math as a tool for understanding computing is very premature
given its success as a tool for building and understanding abstraction,
especially given that computing as a field started as an attempt to understand
some deep questions about the foundations of mathematics.

~~~
discreteevent
"but I think throwing out math as a tool for understanding computing is very
premature given its success as a tool for building and understanding
abstraction"

But the problem with mathematics is that it doesn't deal with time. This may
(in a sapir-whorf fashion) be limiting our understanding of the universe (see
Lee Smolin's Time Reborn). But even if the universe is actually timeless in
many circumstances it still does not suit us to model it like that. We can
have some timeless physical law expressed mathematically (say F=ma) and if we
know all the initial conditions in a fairly pure environment we can predict
the state of the system at any time. But this kind of system is useless in
many real world situations. For example we don't build a control system for a
robot or a car based on this kind of thinking. We don't try and predict the
state of the world. We just treat it as a stateful system that changes and we
keep reading those changes and then try to behave accordingly: it's all I/O
and state. Similarly, our brains, in order to be efficient, do hardly any
calculation; instead they are a massive cache against which we pattern match (see
"On Intelligence"). As Leslie Lamport states: "Computer scientists
collectively suffer from what I call the Whorfian syndrome: the confusion of
language with reality. Since these devices are described in different
languages, they must all be different. In fact, they are all naturally
described as state machines."

------
picktheorem
a note on Haskell's failure to liberate us from imperative IO, with a view
towards frp:

[http://conal.net/blog/posts/can-functional-programming-be-
li...](http://conal.net/blog/posts/can-functional-programming-be-liberated-
from-the-von-neumann-paradigm)

------
stiff
It is funny how this paper, like the writings of Dijkstra, is widely cherished
while its attitude and conclusions are simultaneously completely ignored.
Seems to me that people simply really like to complain about the current state
of the matter, how dirty programming is and how everything is inelegant just
to get back to churning code a minute later using exactly the style
criticized.

People like Backus and Dijkstra really wanted to prove theorems about programs
and do formal derivations as they were used to doing in mathematics; that's
practically all they cared about, as far as their writings go. The "liberation"
is in fact an act of bondage, an attempt to limit programming techniques to a
narrow range to try to get all the noble benefits of mathematics, which is a
priori assumed to be the best way to go about reasoning. It's really ironic in
the case of Dijkstra, who writes about how people cannot appreciate "radical"
novelties and instead keep old mental habits, and then proceeds to argue how
programming is just a branch of mathematics.

I wonder how many people who mention those papers as cornerstones of CS
actually prove their programs correct on a daily basis or derive them from
"axioms". I actually think many people in the formal methods camp had a very
narrow and limited vision of what programming is and what it would become,
and as a result turned out to be very much wrong about the importance of
proofs in programming. Some of them actually admitted it:

[http://www.gwern.net/docs/1996-hoare.pdf](http://www.gwern.net/docs/1996-hoare.pdf)

In the end, reducing all programming to formal manipulations of this kind
turned out to be as successful as the ideas of axiomatization of biology.
Formal methods are useful in mission-critical software, functional programming
penetrated mainstream languages and is interesting in its own right, but even
its proponents hardly go about writing programs the way Backus imagined;
mathematical theories of programming turned out to be very limited and to yield
poor crops compared to the mathematical theories that were
successful; hardly any insights emerged beyond what had been found earlier by
dabbling, and that is that. I love math, but it seems that programming, like
biology, is actually richer than math in some sense. Consider the example
Peter Norvig gives in "Coders at work": how do you prove Google is correct?
You immediately run into issues with the very notion of correctness, not
everything can be convincingly formalized to the last detail, and we build
more and more of those "fuzzy" systems; just consider the rise of machine
learning and data mining in recent years.

I see no reason to go around pretending programming didn't live up to the
noble vision of the ancient masters. The masters were frequently wrong (no
shame for they were pioneers of the field and nothing was known), and we
simply moved on.

~~~
loup-vaillant
> _Dijkstra […] then proceeds to argue how programming is just a branch of
> mathematics._

Which it is.

In a trivial sense, programming languages are formal systems. Even when poorly
specified, the _computer_ they run on is a formal system.

In a less trivial sense, we have the Curry-Howard correspondence.

In a very real sense, programming language designers have denotational and
operational semantics.

Fuzzy systems need not get rid of mathematics. They just switch from first
order logic to probability theory (or a computationally tractable
approximation).

Ignorance of basic mathematics is why so many fools believe lambdas or monads
are hard to understand. It feels like we're scared of abstractions. For
instance, we solve linear equations every time we go to the grocery store and
pay cash. Yet being presented with those same linear equations _as such_ gets
us running away screaming into the night.

\---

Programming is applied math, period. The question is, _which_ kind of math do
we want to use? Which are the more useful notations, formalizations, theorems?
How does all that relate to the old math?

~~~
stiff
Computers are formal systems about as much as an alternating current is a
complex number; the same goes for programming languages, even though both
programming languages and formal systems can be viewed as abstract entities.
speaking all that can be said is that certain programming languages can be
models for certain formal systems, and there is a big difference between being
a model for a formal theory and something simply being only a formal theory:

[http://en.wikipedia.org/wiki/Reification_%28fallacy%29](http://en.wikipedia.org/wiki/Reification_%28fallacy%29)

Programming language designers of most popular programming languages have no
clue about either denotational or operational semantics, and make use of no
formal theory whatsoever.

Anyhow, I do not deny the importance of various kinds of mathematics for
programming, in some selected places and in the sense of writing programs that
do mathematics. I am saying two things:

\- The attempts so far to turn programming itself into a single mathematical
theory have failed miserably. Can you name one really important algorithm that
was discovered by doing derivations in a formal axiomatic theory of
programming? There is the "Algebra of programming" theory by Bird, the
"Elements of programming" by Stepanov, but nothing particularly striking seems
to ever come out of it, only proofs of things we already know.

\- There are hundreds of aspects of programming that have nothing to do with
mathematics and turned out to be of far bigger importance than finding
effective means of formal correctness proofs of programs. To give one example,
programmers have to work in teams with the size of problems we are dealing
with today, something the formal methods people never seemed to address much,
and issues with communication and coordination cause much more problems than
simple logical errors in algorithms.

~~~
loup-vaillant
> _Programming language designers of most popular programming languages have
> no clue about neither denotational nor operational semantics and make use of
> no formal theory whatsoever._

That may explain why so many of our popular languages have several glaring
flaws. Java doesn't support tail calls dammit! We had to wait for _scheme_ to
finally have lexical scope! C++ is impossible to parse!

> _The attempts so far to turn programming itself into a single mathematical
> theory have failed miserably._

Wait, what attempts? I've read many Haskell papers, and did not stumble upon
such a thing.

~~~
stiff
I have specifically mentioned two such attempts.

~~~
loup-vaillant
Okay. Though if it's not a peer-reviewed paper, it doesn't exist. Anyway, I
have a more positive outlook: those "failures" are just the beginning. We'll
build on those. We failed to fly for a long time, for instance.

~~~
stiff
There are lots of papers, dating back to the 1970s:

[http://www.cs.ox.ac.uk/activities/publications/date/algprog....](http://www.cs.ox.ac.uk/activities/publications/date/algprog.html)

Actually Dijkstra also was a proponent of deriving programs, it's discussed in
his "Discipline of programming" and 30 years ago there was some interest in
this, with lots of papers published, but it never gained much traction:

[http://en.wikipedia.org/wiki/Program_derivation](http://en.wikipedia.org/wiki/Program_derivation)

------
chubot
_You know when toddlers try to put a round peg in a square hole? That's how
it feels using functional programming on a Turing-machine-like CPU. It just
doesn't fit._

Yup, I agree. Functional languages failed in part because they modelled the
machine poorly (Richard Gabriel has some reflections on Lisp in this area).

But the funny thing is that hardware has changed! Imperative programming
languages are now the square peg in the round hole :)

Multicore machines go to elaborate lengths to maintain the illusion of
a uniform address space. But if we programmed with message passing and
heterogeneous processes, then our programs would have more mechanical sympathy
for the machine.

Shared memory and locks are basically the result of not wanting to disturb too
much source code when converting from a single-threaded to a
concurrent/parallel program. But it results in poor performance and
scalability, not to mention untestable and unmaintainable programs. You can
look at the evolution of multicore scalability in the Linux and FreeBSD
kernels, or Python and Ruby GILs (coarse grained locks in a Turing machine
model), for examples of this.
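
As a toy illustration of the message-passing alternative (a deliberately crude Python sketch, not a model of any real kernel): the workers below communicate only through queues, so there is no shared mutable state to protect with locks.

```python
import threading
import queue

tasks = queue.Queue()
results = queue.Queue()

def worker():
    """Receive work as messages; send results back as messages."""
    while True:
        n = tasks.get()
        if n is None:        # sentinel: no more work
            break
        results.put(n * n)   # no shared mutable state, just messages

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)          # one sentinel per worker
for t in threads:
    t.join()

total = sum(results.get() for _ in range(10))
assert total == sum(n * n for n in range(10))
```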

The other thing that has changed is that our hardware is distributed (not just
clusters, but also client-server). Networks are fundamentally unreliable, and
that means that idempotency of subprograms is a very important property.
Idempotency is of course a word from the same area of mathematics that
functional programming comes from (abstract algebra).
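
A hypothetical sketch (invented names, single process) of why idempotency matters once a network can drop acknowledgements and force retries: a retried increment double-counts, while a retried "set to value" is harmless.

```python
balance = {"alice": 100}

def add_ten(store):
    """Non-idempotent: applying it twice changes the result."""
    store["alice"] += 10

def set_to(store, value):
    """Idempotent: applying it twice leaves the same final state."""
    store["alice"] = value

add_ten(balance)
add_ten(balance)                 # a blind retry after a lost ack
assert balance["alice"] == 120   # oops: we meant 110

set_to(balance, 110)
set_to(balance, 110)             # the retry changes nothing
assert balance["alice"] == 110
```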

The other algebraic property that is important in distributed systems is
commutativity. This basically has to do with physics[1]. Commuter[2] is a nice
project which addresses commutativity in interface design.

It's not an accident that MapReduce and all the newer big data frameworks are
based on the model of functional programming. They are using the imperative
model at the scale of a single machine, where it's appropriate, but the
functional model at the scale of the cluster, so subprograms can be retried
and executed in any order.
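
A minimal single-process sketch of the word-count shape MapReduce popularized: the per-document map step is pure, and the merge step is associative and commutative, which is exactly what lets a cluster retry and reorder work.

```python
from functools import reduce
from collections import Counter

docs = ["to be or not to be", "be here now"]

# Map phase: each document independently becomes a bag of counts
# (pure, so it can be retried or rerun on another worker freely).
mapped = [Counter(doc.split()) for doc in docs]

# Reduce phase: merging counts is associative and commutative,
# so partial results can be combined in any order.
counts = reduce(lambda a, b: a + b, mapped, Counter())

assert counts["be"] == 3
assert counts["to"] == 2
```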

The alternative to explicitly thinking about these properties in your program
design is to (only) use slow state machine protocols like Paxos. You could
naively apply this transformation to your sequential programs, but you will be
making very poor use of the hardware. There is a reason that they are only
used for very small bits of state in distributed systems.

tl;dr Distributed systems (multicore included) are not Turing machines. I see
a lot of bugs that are a result of people in the imperative mindset not
understanding this.

[1]
[http://scholar.google.com/scholar?cluster=489252740511712348...](http://scholar.google.com/scholar?cluster=4892527405117123487&hl=en&as_sdt=0,5&sciodt=0,5)
[2] [http://pdos.csail.mit.edu/commuter/](http://pdos.csail.mit.edu/commuter/)

~~~
seanmcdirmid
The MapReduce use case is quite narrow; we also have replicated storage
systems that require, as you said, Paxos. Many HPC loads are using CUDA these
days, which, even though heavily based on SIMD, is very imperative. As soon as
you want iterative computation, or long-running incremental computation on
streams, you need to worry about fault tolerance again, and state becomes
very useful.

You can leverage commutativity without going functional...heck, that's how
most physics engines work and they are very far away from FP.

~~~
chubot
I'm not only talking about MapReduce; I'm saying there are ideas from
functional programming that have more affinity for the underlying hardware
than state machines. This is only going to get more important in the future.
Right now we're straddling the two worlds kind of awkwardly.

State is useful; however _exactly_ as in "traditional" functional programming,
it should be used sparingly in distributed systems. State requires very
special handling and thought. Statelessness or purity is the default. Clusters
have a tiny bit of state and lots of stateless workers, just like a Haskell
program has a tiny bit of state and lots of pure functions.

To be clear, I don't think Haskell makes sense at the single machine level. (I
personally don't want to program in it.) I'm saying that the ideas are more
appropriate at the level of architecture and not code.

------
MarkPNeyer
the von neumann architecture is the internal combustion engine of computing.
we've made it do some pretty amazing stuff, but it'd be awesome to try
something new.

~~~
fractallyte
I think that's a very appropriate (if oblique) 'car analogy'.

To extend it further, one might consider an alternative to the ubiquitous IC,
the steam engine: external combustion, clean(er), more efficient, multi-
fuel...

Aside: steam cars were actively developed throughout the 20th century, even
advancing to a sports car in the 1960s. The best article I read on this
subject was 'Steamer Time?' by Wallace West, in the September 1968 issue of
Analog Science Fiction.

Fittingly, the same article mentions the concept of 'steam engine time', a
term coined by Charles Fort, who "contended that the steamboat was invented
only when economic, political, scientific and engineering developments
combined to usher in steamboat time."

~~~
mjcohen
And in the early 60's there was the Dean drive (in Analog).

------
stevesun21
If you cannot write code with immutability, then don't blame the programming
language you are using for having no functional programming features.

