
Can Programming Be Liberated From The Von Neumann Style? (1977) [pdf] - adgasf
http://worrydream.com/refs/Backus-CanProgrammingBeLiberated.pdf
======
jerf
The answer was "not while computing was getting exponentially better every
year". This sort of improvement curve is really difficult to compete with.

It's in the process of being liberated now. GPUs, with a fundamentally
different architecture, are already deployed in many machines. And there's a
bubbling ferment of alternatives being developed now.

In the end, I don't think it was necessarily a case of us getting stuck on
"von Neumann" per se... we would have gotten stuck on anything we settled on
in the 1960s. And despite familiarity breeding contempt, I'm not that
convinced we could have done much better in the 1960s. A lot of the
alternatives are not _universally_ better; they're better in some limited
domain, and it's not obvious that they are going to uniformly beat von Neumann
for general computing, even accounting for the fact that "general computing"
has coevolved for 50 years with von Neumann.

See also
[https://news.ycombinator.com/item?id=12155561](https://news.ycombinator.com/item?id=12155561)
(which may well have been the inspiration for this submission)

~~~
_yosefk
GPUs are much closer to von Neumann machines than to anything the people
typically deriding the latter have in mind (such as, say, the Reduceron
architecture). Every commercially significant accelerator, GPUs included, has
a RAM, often multi-banked to support vectorized random access; it has an
imperative programming model; and it runs (often vector) instructions from
program memory that operate on registers, with data occasionally exchanged
between registers and RAM addresses. $100 says this will remain true for the
next 20 years.

The thing farthest from a von Neumann machine that is commercially
significant is the FPGA, and it is nothing like what people deriding von
Neumann machines have in mind. It's imperative, has registers and RAM blocks,
and it will cause nausea in the typical functional programming supremacist.
(Except those that choose to squander its performance potential on
implementing a wasteful machine like the Reduceron because graph reduction is
just better than an imperative low-level programming model.)

Programmers need to be liberated from misconceptions wrt von Neumann machines.

~~~
DigitalJack
I have to ask, what makes you say this: "and it will cause nausea in the
typical functional programming supremacist" with regard to FPGAs?

~~~
_yosefk
Not only are FPGAs as far from the ideal of "computation as reduction"
expressed in the paper as they could possibly be, they're also the hardest
machine to fit into this sort of paradigm.

The typical CPU will run functional code fairly efficiently if enough work is
put into the compiler. (To some functional programmers this sounds bad,
because hardware should make writing functional language compilers easy.
People who know about hardware draw a different conclusion: that the hardware
is just fine. As Ken Thompson said, "the PDP-11 is a fine Lisp machine, it's
not a language special enough to warrant a specialized machine.")

The typical accelerator is a harder target for a general-purpose functional
programming language (it's a poor target for a general-purpose imperative
language to begin with).

The FPGA, with its explicit management of every imaginable resource, is the
worst target for a general-purpose functional programming language.

In general, the farther a commercially significant machine is from the
classical von Neumann architecture, the worse a target it is for functional
languages. That has never prevented FP aficionados from talking about "the
end of Moore's law bringing about the long-awaited demise of the von Neumann
machine", a "demise" that largely doesn't happen, and where it does, it does
not mean what they think it means.

~~~
chillingeffect
> The FPGA, with its explicit management of every imaginable resource, is the
> worst target for a general-purpose functional programming language.

That's a great way to put it...

but perhaps there's a more suitable FPGA cell for functional programming (than
block lookup tables with state)? Maybe something with a local stack, recursion
assistance, an iterator?

Also, a good deal of the FPGA resource management is focused on
optimization... What if the optimization-at-all-costs constraint were relaxed
somewhat, or supplanted with meta-information to indicate the completion of a
group of cells to allow a message to propagate?

With relaxed constraints, partial, on-the-fly, and self-modifications could
become regular features.

~~~
adwn
> _Maybe something with a local stack, recursion assistance, an iterator?
> [...] supplanted with meta-information to indicate the completion of a group
> of cells to allow a message to propagate?_

What you're describing sounds a lot more like many CPU cores connected by a
network than like an FPGA. The point of an FPGA is that it gives you very
precise and flexible control over things like timing and interconnects.
Without that, FPGAs are actually pretty slow and you'd better use a CPU.

> _With relaxed constraints, partial, on-the-fly, and self-modifications
> could become regular features._

There's a reason why self-modifying software is restricted to niche use cases:
it's hard to reason about. Dynamic partial reconfiguration is slowly becoming
a thing in FPGA design, but it's typically used to work around resource
limitations.

------
segmondy
It already has; unfortunately it's not too popular. This is why Prolog gives
most folks a headache. My moment of enlightenment while studying Prolog
happened when I realized that I wasn't being logical, but trying to solve the
problem in von Neumann style. I was humbled to see simple logic solutions in
place of my overly complicated solutions. This has made me a much better
developer; I keep my code as declarative as possible.

~~~
Animats
Oh, Prolog. I once wrote a hardware configuration in Prolog. (Dependency
resolution - rules like "An A requires a B or a C".) The configuration part
was fine. Trying to write a menu system in pure Prolog, though...

Most of the troubles with functional programming come when you actually have
to _do_ something. The imperative/functional boundary remains a problem.
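The "an A requires a B or a C" style of rule resolves naturally by
declarative search, much as Prolog would do it. A toy Python sketch (the part
names and rules are invented for illustration) does the same by brute force:

```python
from itertools import product

# Rules: each installed part requires at least one part from each listed set.
requires = {
    "A": [{"B", "C"}],          # an A requires a B or a C
    "B": [{"PSU"}],             # a B requires a PSU
    "C": [{"PSU"}, {"FAN"}],    # a C requires a PSU and a FAN
}
parts = ["A", "B", "C", "PSU", "FAN"]

def valid(config):
    # Declarative check: every rule of every installed part is satisfied.
    return all(any(p in config for p in alt)
               for part in config
               for alt in requires.get(part, []))

# Enumerate every configuration containing an "A" that satisfies the rules.
configs = [set(c for c, used in zip(parts, mask) if used)
           for mask in product([0, 1], repeat=len(parts))]
solutions = [c for c in configs if "A" in c and valid(c)]
```

The configuration logic is just the `valid` predicate; all the "how" lives in
the generic search, which is what makes the Prolog version feel so clean.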

~~~
tome
> Most of the troubles with functional programming come when you actually have
> to do something. The imperative/functional boundary remains a problem.

How does it remain a problem? I've been writing commercial Haskell programs
that _do_ something for years now.

~~~
Animats
Monads are kind of a hack.[1]

[1] [http://research.microsoft.com/en-
us/um/people/simonpj/papers...](http://research.microsoft.com/en-
us/um/people/simonpj/papers/marktoberdorf/mark.pdf)
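Whether or not it's a hack, the plumbing the linked paper describes can be
sketched in Python with a toy state monad (the names `get`, `put`, `pure`,
and `bind` mirror the idea, not any real library's API):

```python
# A "stateful action" is a pure function: state -> (result, new_state).
def get():                  # action that reads the current state
    return lambda s: (s, s)

def put(x):                 # action that replaces the state
    return lambda s: (None, x)

def pure(v):                # action that just returns a value
    return lambda s: (v, s)

def bind(action, f):        # sequencing: run action, feed its result to f
    def composed(s):
        result, s2 = action(s)
        return f(result)(s2)
    return composed

# "Increment the counter and return its old value", written purely:
tick = bind(get(), lambda n: bind(put(n + 1), lambda _: pure(n)))

old, new_state = tick(41)   # run the pipeline with initial state 41
```

No state is mutated anywhere; sequencing is ordinary function composition,
which is the whole trick.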

~~~
dllthomas
"X is a solution to problem Y" doesn't imply that X is a hack. You'll need to
be more verbose if you were making a more specific argument.

------
gavinpc
Backus and von Neumann have a history, from their time at IBM.

When Backus proposed the idea of Fortran to his boss, he first "had to
overcome the persuasive arguments of von Neumann, who was then a consultant
to IBM." In Backus' words:

> He didn't see programming as a big problem. I think one of his major
> objections is that you wouldn't know what you were getting with floating
> point calculations. You at least knew where trouble was with fixed point if
> there was trouble. But he wasn't sensitive to the issue of the cost of
> programming. He really felt that Fortran was a wasted effort.[0]

Backus and von Neumann had made innovations in the use of floating-point and
fixed-point numbers by computers, respectively.

From what I can gather, von Neumann was an incredible genius who just
continuously pushed new ideas. He introduced "scaling factors" (fixed point)
to solve a scientific problem (in nuclear physics) and didn't worry about
improving its "usability."

Backus, on the other hand, cared about the difficulty of programming. Both
Fortran and his work on floating-point, as well as his later efforts in
functional programming, manifested that.

After leaving computing altogether, Backus took to meditation and, as he calls
it, "living."

[0] Quotes are from chapter 1 of _Out of Their Minds: the Lives and
Discoveries of 15 Great Computer Scientists_ , by Shasha and Lazere.

------
nemo1618
I think abandonment of the Von Neumann paradigm will be crucial for AI. For
decades we have labored under the misguided assumption that the brain is
isomorphic to the Von Neumann architecture, with "memory banks" and
"processing units." This approach has failed.

Admittedly, Lisp was once the first choice for AI research, and it did not
readily lead us into the promised land either. Perhaps a functional approach
is not radical enough :)

~~~
runeks

> _For decades we have labored under the misguided assumption that the brain
> is isomorphic to the Von Neumann architecture, with "memory banks" and
> "processing units."_

Who has been working under this assumption for decades? It seems rather
obvious to me that this is very far from reality, so I'd be surprised if the
AI community thought the brain was a CPU for several decades.

~~~
riboflava
He seems to be under several misconceptions... This one may just be a
mutation of the widely accepted idea that we can in principle emulate a brain
using our standard computer models, even if it takes n more decades of
hardware advances: the idea that the brain is discretely computable.

------
Razengan
I guess one way to approach questions like this, that challenge our
fundamentals of Doing Things, is to ask this:

Would an alien race be doing the same things? Using electricity, decimals,
binary, registers, memory etc. the same way we do?

As opposed to, say, photonic or biochemical computers, trinary or quaternary
number systems, belt architectures, and so on..

How much of our world has been shaped by simply the ORDER in which we
discovered things?

If there is only one Right Way of doing things then all the intelligent
civilizations out there will eventually use the exact same architectures and
have the exact same technology and, personally, that would make for a very
boring universe.

~~~
stuxnet79
It's a bit silly, but that's exactly the way I tend to approach such
questions, and my conclusion after musing on the subject for years is that we
are hamstrung by the historical ordering of past breakthroughs. Further, it
is very difficult to determine the extent to which our current ways of
thinking are limiting us.

------
dharmatech
Furry Paws is an "optimizing whole-program compiler and runtime-system for a
dialect of John Backus' FP language":

[http://www.call-with-current-continuation.org/fp/](http://www.call-with-
current-continuation.org/fp/)

~~~
abecedarius
Here's a much smaller and less ambitious interpreter in Python as a hackable
taste of the language:
[https://github.com/darius/parson/blob/master/eg_fp.py](https://github.com/darius/parson/blob/master/eg_fp.py)
\-- or rather another dialect, since everyone who implements FP makes up their
own concrete syntax. Mine reads left to right instead of right to left, and
leaves out the noisy compose operator (it's implicit by juxtaposition), making
it read kind of like Forth or Joy.

I also wrote an optimizing compiler to Scheme from this dialect back around
1990, but it's lost in the sands of time.
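Composition by juxtaposition is small enough to sketch in a few lines of
Python (a toy of the idea, not the linked interpreter's actual semantics):

```python
# A point-free "program" is a sequence of functions; juxtaposition means
# composition, read left to right as in Forth or Joy.
def run(program, value):
    for f in program:
        value = f(value)
    return value

double = lambda x: x * 2
inc = lambda x: x + 1

# "double inc inc": no named arguments, no explicit compose operator.
result = run([double, inc, inc], 5)   # (5 * 2) + 1 + 1 = 12
```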

------
topkekz
A review of the 1977 Turing Award Lecture by John Backus, by Edsger W. Dijkstra

[https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/EWD692.html)

~~~
chewyshine
This guy's arrogance takes your breath away! Correspondence between Dijkstra
and Backus on the Turing award. [https://medium.com/@acidflask/this-guys-
arrogance-takes-your...](https://medium.com/@acidflask/this-guys-arrogance-
takes-your-breath-away-5b903624ca5f#.5nc69kp8u)

------
ozy
No. I always wanted to say that to the title of this paper ;)

The complaint about von Neumann programming languages, as the paper describes
it, is that they can only be extended as a framework. Today's imperative
languages are much more powerful: either they are untyped (implicitly
generic), or they include generics, or at least they have objects with
polymorphism.

The paper puts forward applicative (functional) programming as a start,
because it can be extended much better. The main point: functions can be put
together, because you don't have to name, and thus don't have to give types
to, the arguments.

To make the system complete (useful), he adds an applicative state transition
system, i.e. mutable state. Think IO monad.

I have a different take. You want systems where everything can be abstracted
over. That is, put inside of a function (abstraction). And whatever the system
does is truly hidden, unobservable to the outside. That might sound like fp,
but it has problems when you want to hide state, or when the abstraction is
about input/output. Example:

    
    
      function downloadAndVerify(url):
        data = http.get(url)
        hash = http.get(url + ".md5")
        return data, md5(data) == hash
    

In my opinion, the above expresses what I want. If I need async, my callers
need async. If I need the IO monad, my callers need the IO monad. Neither can
hide that. If it blocks, that can be solved the same way as CPU-intensive
operations: do it in the background. So processes need to be easy.

You want a language where any async api can be converted into a blocking api,
and any blocking api into an async api. (Using processes, most likely.)
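Both directions of that conversion can be sketched with Python's standard
concurrency tools, with a thread pool standing in for the easy processes the
comment asks for:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_fetch(x):
    time.sleep(0.01)              # stands in for blocking IO
    return x * 2

pool = ThreadPoolExecutor()

# Blocking -> async: submit to the background and get a future immediately;
# the caller is free to do other work in the meantime.
future = pool.submit(blocking_fetch, 21)

# Async -> blocking: just wait for the future's result.
answer = future.result()          # blocks until the background work is done
pool.shutdown()
```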

------
adgasf
I find functional languages great for modelling application logic, but when
working with graphics cards, printers, web APIs, etc., procedural code is
more intuitive. Maybe it's a limitation of my thinking, but I find an impure
functional language (C# and F# are good examples) the best fit for most
problems.

~~~
dmreedy
I would argue that this is largely because the systems you mention do not have
the requisite (hardware or software) functional abstractions required to make
them mesh nicely with functional or declarative control* code. Backus here is
writing from a hypothetical world in which that is not a problem.

\---

* a dangerous word in and of itself in such a conversation, because of its imperative significance.

~~~
nickpeterson
Wouldn't that just place the burden of providing those abstractions on the
hardware implementer? In essence, aren't we just pushing the problem down the
stack?

~~~
dmreedy
Absolutely; the problem has to go somewhere. It's just a question of how much
semantic weight belongs to the encoder vs. the decoder.

------
elwell
Here's an interesting talk that covers some of these topics:

"Erlang Factory SF 2016 - Keynote - John Hughes - Why Functional Programming
Matters"
[https://www.youtube.com/watch?v=Z35Tt87pIpg](https://www.youtube.com/watch?v=Z35Tt87pIpg)

------
Maultasche
When I saw this, I wondered whether this was the same Von Neumann who gave his
name to Von Neumann probes, which are hypothetical self-replicating spacecraft
that can fill the galaxy within a relatively short timespan.

It turns out that this is the same von Neumann, who appears to have been very
influential in the early days of computing. In reading about him, I was
impressed with all the work he'd done. The people who came up with the concept
of Von Neumann probes named them after Von Neumann because of his theoretical
work regarding how computing machines can self-replicate.

Fermi also used the concept of von Neumann probes, and the lack of alien von
Neumann spacecraft, as an argument that aliens (or at least interstellar
spacefaring aliens) don't exist.

~~~
bbctol
Von Neumann was often considered by colleagues to be the smartest person
they'd ever met. There's an old story that goes something like this:

Von Neumann was asked a mathematical puzzle: two bicyclists start 20 miles
apart and head towards each other at a steady speed of 10 mph each. At the
same time, a fly that travels at a steady 15 mph starts from the front wheel
of the southbound bicycle and heads to the front wheel of the northbound one,
then immediately turns around and heads to the wheel of the southbound one,
and continues until the bicycles meet and he is crushed. What total distance
did the fly travel?

The trick is that it seems you'd need to compute an infinite series, figuring
out the distance traveled on the first journey, then the second, then the
third and sum them all. In fact, you can just realize that two bicycles
starting 20 miles away, each moving at 10 miles an hour, will meet in 1 hour;
if the fly's traveling at 15 mph constantly during that time, it travels 15
miles.
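Both routes to the answer are easy to check numerically; here is a sketch
with the geometric series actually summed leg by leg:

```python
# Easy way: the bikes close a 20-mile gap at 10 + 10 = 20 mph, so they
# meet in 1 hour; the fly covers fly-speed * time = 15 miles.
easy = 15.0 * (20.0 / (10.0 + 10.0))

# Hard way: sum the legs. On each leg the fly and the oncoming bike close
# at 15 + 10 = 25 mph, so a leg takes gap/25 hours and covers 15*gap/25
# miles; meanwhile the bikes shrink the gap by 20*gap/25, leaving gap/5.
gap, hard = 20.0, 0.0
for _ in range(50):               # the geometric series converges quickly
    hard += 15.0 * gap / 25.0
    gap /= 5.0

# Both give 15 miles.
```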

Von Neumann provided the solution immediately. The man who posed it to him
said "Oh, you must have heard the trick already!" "What trick?" said von
Neumann. "I just summed the geometric series!"

~~~
agumonkey
Are there more anecdotes of this kind ?

~~~
Jach
Fermi and von Neumann overlapped. They collaborated on problems of Taylor
instabilities and they wrote a report. When Fermi went back to Chicago after
that work he called in his very close collaborator, namely Herbert Anderson, a
young Ph.D. student at Columbia, a collaboration that began from Fermi's very
first days at Columbia and lasted up until the very last moment. Herb was an
experimental physicist. (If you want to know about Fermi in great detail, you
would do well to interview Herbert Anderson.) But, at any rate, when Fermi got
back he called in Herb Anderson to his office and he said, "You know, Herb,
how much faster I am in thinking than you are. That is how much faster von
Neumann is compared to me."

From [http://infoproc.blogspot.com/2012/03/differences-are-
enormou...](http://infoproc.blogspot.com/2012/03/differences-are-
enormous.html)

~~~
agumonkey
Cool quote and article. To be honest, not summing the series on the fly, I
found the problem ridiculously simple.

I'm very curious about his love for algebra.

ps: Fr/De
[https://www.youtube.com/watch?v=c9pL_3tTW2c](https://www.youtube.com/watch?v=c9pL_3tTW2c)
Us
[https://www.youtube.com/watch?v=VTS9O0CoVng](https://www.youtube.com/watch?v=VTS9O0CoVng)

------
qznc
Wikipedia argues [0] that we actually use a "Modified Harvard Architecture"
instead of a "Von Neumann Architecture" today, since we have separate data
and instruction caches. Also, various embedded chips truly separate
instructions and data.

[0]
[https://en.wikipedia.org/wiki/Modified_Harvard_architecture](https://en.wikipedia.org/wiki/Modified_Harvard_architecture)

------
carapace
One very important aspect of this paper that I seldom see given proper
emphasis is the ability to treat programs algebraically.

Also, check out Manfred von Thun's Joy.

------
DonHopkins
Not all of von Neumann's architectures were von Neumann architectures [1].

That is to say, not all of the architectures he invented [2] ended up being
given the name "von Neumann Architecture". His 29 state cellular automata rule
[3] for implementing a universal constructor [2] was "non von Neumann" in the
sense of [1].

It's just so amazing what he accomplished with his mind, a pencil, and a piece
of paper, without the use of computers, which he also helped invent.

[1]
[https://en.wikipedia.org/wiki/Von_Neumann_architecture](https://en.wikipedia.org/wiki/Von_Neumann_architecture)

[2]
[https://en.wikipedia.org/wiki/Von_Neumann_universal_construc...](https://en.wikipedia.org/wiki/Von_Neumann_universal_constructor)

[3]
[https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton](https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton)

------
mtrn
Kanerva has discussed alternative computing styles as well. In particular, he
asks why it is so hard to encode intelligence in such a rigid setup.

> The disparity in architecture between brains and computers is matched by
> disparity in performance. Notably, computers excel in routine tasks that we
> - our brains - accomplish with effort, such as calculation, whereas they are
> yet to be programmed for universal human traits such as flexible learning,
> language use, and understanding.

He focuses on the representation problem and develops a more flexible
representation of concepts with what he calls hyperdimensional computing[1].

[1]
[http://redwood.berkeley.edu/pkanerva/papers/kanerva09-hyperd...](http://redwood.berkeley.edu/pkanerva/papers/kanerva09-hyperdimensional.pdf)
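The flavor of the representation can be sketched with random bipolar
hypervectors. This is only a toy of the bind/bundle operations from the
paper; the dimension and the concept names are chosen arbitrarily:

```python
import random

D = 10000                          # dimensionality: arbitrary, just "high"
rng = random.Random(0)

def hypervector():                 # random bipolar vector
    return [rng.choice((-1, 1)) for _ in range(D)]

def bind(a, b):                    # elementwise product: role * filler
    return [x * y for x, y in zip(a, b)]

def similarity(a, b):              # normalized dot product, ~0 if unrelated
    return sum(x * y for x, y in zip(a, b)) / D

color, shape, red, circle = (hypervector() for _ in range(4))

# A composite concept: (color=red) bundled with (shape=circle).
concept = [x + y for x, y in zip(bind(color, red), bind(shape, circle))]

# Binding the concept with the "color" role approximately recovers "red",
# and stays nearly orthogonal to everything else.
sim_red = similarity(bind(concept, color), red)        # close to 1
sim_circle = similarity(bind(concept, color), circle)  # close to 0
```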

------
ankurdhama
You can always come up with some really elegant and expressive formal systems
to model programming/computing, and that's the mathematics side of things,
BUT people often ignore the physical/engineering side. We need to engineer
actual physical systems that can implement those so-called elegant formal
systems of mathematics. This is where the problem is: engineering a physical
system is a very complex problem with lots of constraints, and is not as
simple as coming up with a bunch of symbols and rules for a formal system.

Physics is about reality and mathematics is about representations; you can
create representations from thin air, but you can't do that with reality :)

------
jefffoster
"Out of the Tarpit"
([http://shaffner.us/cs/papers/tarpit.pdf](http://shaffner.us/cs/papers/tarpit.pdf))
is a paper with a similar bent arguing that we can eliminate complexity with a
functional/relational approach to designing software.

------
sroussey
Reminds me that we put so much work into making things "digital" that we
forget how many layers of abstraction there are. All that work to add
numbers, when superposition is instantaneous.

It's not just the tyranny of memory, but tyranny of the clock! It will be
interesting when we get rid of both...

------
kristianp
By the way, Bret Victor's worrydream website references this paper from this
page on the talk "The Future of Programming":

[1] [http://worrydream.com/dbx/](http://worrydream.com/dbx/)

------
pklausler
It's not about liberating programming, it's about liberating programmers by
enabling them with more powerful abstractions. And pure lazy statically-typed
FP is one hell of a powerful abstraction.

~~~
david927
You can only say the first sentence if you limit yourself to the second
sentence.

If we go beyond FP, I think you'll find we're going to end up liberating
programming.

------
dmreedy
Note: in the time it took me to write this ramble (and also the code review I
had to do between writing and posting), others like catnaroek have addressed
similar themes.

Speculating wildly here, but part of why I think there is continued (albeit
unconscious) resistance to other paradigms is that the procedural/imperative
paradigm seems to be the most `hackable' of the bunch.

I'm not using `hackable' in a positive sense here. Procedural code is bare-
level, almost the machine code of programming models. "First do this thing,
then do this thing, then do this thing", etc etc. There's a relatively small
implicit underlying `physics'[1] in a procedural system. In some sense, every
procedural program involves the creation of some higher-order logic (a model)
as a part of its construction, in order to both define the state machine, and
dictate the manner in which state flows through it.

The trick, of course, is that every model has a bias, encodes some manner of
thinking. An aspect of that is that it is easier to represent some things and
harder to represent others[2]. In a procedural program, when the model you've
(consciously or not) developed fails, it's trivial to 'fall out' of the model
and revert to writing base-level imperative code, hacking your way to a
solution.

Functional and other Declarative paradigms, on the other hand, have a
stronger, more complex `physics' to them. In the case of declarative, for
example, a user is asked only to declare the state machine; the execution of
it is left to the `physics' of the system. This can mean that a well-written
program in a functional or declarative language appears to be simpler, more
elegant. In reality, this is largely because a large set of assumptions that
would need to be explicitly declared in a procedural language have been
encoded directly into the language itself in the functional/declarative
case[3].

This means that when you're operating within the paradigm that the
functional/declarative language affords, everything is smooth, beautiful,
elegant, and verifiable according to the physics of the system. However, it's
much harder to 'fall out' of the assumed model, because the granularity
required to do so isn't at the native resolution of the language.

\---

[1] By physics, I mean a set of assumptions one needs to make about the
behavior of the system that are not directly encoded by the user's input

[2] Think of how easy it is to, say, pick up a piece of steak with a fork, and
how hard it is to scoop soup with the same implement. Different models have
different affordances, recommend different problems, or different solutions to
the same problem.

[3] As the Church-Turing thesis shows us, these systems -are- equivalent in
power, so those semantics still have to be somewhere. To paraphrase
Hofstadter, there are two kinds of music-playing systems (with a continuum in
between). On one end, a system with a different record for each song and a
single record player capable of playing all records; on the other, a system
with a single record and a different record player for each song. The
difference is how much semantic weight you put on the encoding and how much
you put on the decoding (but they still need to sum to 1).

------
mLuby
Is functional programming the only other style?

~~~
PeterisP
There are a bunch of other paradigms; e.g., declarative programming was
considered the thing some decades ago. But in general, nowadays most domains
are best handled by the tools and approaches we have from direct imperative
simplicity, the OOP camp, or the functional mindset.

~~~
wtetzner
I think declarative is still preferred, if you can make it work. I think the
real issue is coming up with a declarative language that is general purpose
and efficient enough.

For specific domains, however, declarative languages have done quite well,
e.g. SQL, regexps, XPath, etc.

~~~
PeterisP
I believe that what we're seeing is declarative paradigm used as subsystems /
domain specific languages that do a particular core task accurately and
efficiently, but surrounded with a general purpose "scripting environment" in
which you do the messy interfaces with the surrounding world and users.

Besides the obvious SQL example, the current ML systems such as Tensorflow are
a great illustration; you declare a particular computation graph and then the
system "just executes it" over multiple GPUs.
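That declare-then-execute split can be sketched with a minimal expression
graph in Python (nothing TensorFlow-specific, just the shape of the idea):

```python
# Declarative phase: build a graph of operations, computing nothing yet.
class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs
    def __add__(self, other): return Node("add", self, other)
    def __mul__(self, other): return Node("mul", self, other)

def const(v):
    n = Node("const")
    n.value = v
    return n

graph = const(3) * const(4) + const(2)   # a description, not a result

# Execution phase: a separate "runtime" walks the graph. It is free to
# reorder, fuse, or distribute the work; the declaration doesn't care.
def run(node):
    if node.op == "const":
        return node.value
    args = [run(i) for i in node.inputs]
    return args[0] + args[1] if node.op == "add" else args[0] * args[1]

result = run(graph)                      # 3*4 + 2 = 14
```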

------
hellofunk
Well, isn't all the new trendiness in functional programming some evidence of
an answer to this?

~~~
meira
New to who?

~~~
maxxxxx
Pretty much everybody in the industry? Even new college grads don't seem to
know much about FP.

~~~
pjmlp
I graduated in the late 90's.

We got to use Caml Light, Lisp and Prolog across a few semesters.

There were also some references to Miranda and Objective Caml.

I guess it reflects the quality of the university more than anything else.

~~~
jghn
I graduated in the mid-90s. I remember using Lisp and ML in multiple classes.
We also used Prolog, which is not an FP language but is certainly
non-traditional.

------
PaulHoule
Yes

