
Computers are made of metal, not category theory - stakent
http://www.chrisstucchio.com/blog/2014/computers_are_made_of_metal.html
======
exDM69
If you take a look at how the GHC Haskell compiler (a "Sufficiently Smart
Compiler(tm)", in my opinion) works, for example, it is not naively
allocating objects, creating thunks and emitting trampoline code.

Instead, it analyzes the program's structure as a graph and, in the end, emits
rather efficient machine code. Not too dissimilar from what your native-code-
emitting C compiler does.

If you look at the machine code for something as "stupid" as the Haskell
example below, the output object code does not resemble the structure of the
source program at all. (It's not quite as efficient as the output of a C
compiler, but it still proves the point.)

    
    
        {-# LANGUAGE ForeignFunctionInterface #-}
        module Fac where
    
        foreign export ccall fac :: Int -> Int
        fac :: Int -> Int
        fac n = foldl (*) 1 . take n $ [1..]
    
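
For illustration, with -O2 GHC effectively reduces that pipeline to something
like the strict accumulator loop below (my own sketch of the end result, not
GHC's actual output):

    
    
        {-# LANGUAGE BangPatterns #-}
        -- What the foldl/take/[1..] pipeline boils down to after
        -- fusion and strictness analysis: no list is ever built.
        fac' :: Int -> Int
        fac' n = go 1 1
          where
            go !acc i
              | i > n     = acc
              | otherwise = go (acc * i) (i + 1)
    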

Compiler and programming language research is a very important topic that
yields real performance benefits as well as better programmer productivity.
That includes using Category theory to reason about program correctness.

If you're interested in how the Haskell compiler works, "The Implementation of
Functional Programming Languages" is a good (albeit a bit outdated) starting
point. The whole book is freely available here:
[http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/](http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/)

I do agree with the title a bit, though. Some of our computer programming
environments are just ridiculously slow. Being slow also means "consumes a lot
of power", which is important when more and more computers are powered by
batteries.

~~~
0xdeadbeefbabe
Correct me if I'm wrong, but teaching a computer to produce good low level
code from high level code seems very similar to the machine translation
problem (translating English to X). Why are people so hopeful about solving
the former problem when the latter problem is so hard? Or are these two
problems so different that the failure so far to solve one has no impact on
the other?

~~~
comex
I'd say those two things are only superficially similar. One is trying to
correctly interpret a syntax which

- is tailored to the characteristics of extremely parallel computers which
spend years unconsciously learning somewhat pointless statistical micro-rules,
and then layer a formal grammar on top of that (both layers need to be
correctly maintained); and

- sometimes, due to ambiguity, can't be parsed without understanding the
nature of what is being referred to (which can be anything one can imagine).

The other is trying to transform expressions which are already formally
specified (with semantics to a large extent based on computers' whims rather
than humans', even in very high-level languages) into a more efficient form.
Doing this as well as humans can could require AI-ish "reasoning" about parts
of the program, but it's basically a different task. There's no need to
understand everything in the world to know what a Haskell program does.

------
nine_k
Aircraft are made of metal, not fluid dynamics. Rockets are made of metal, not
ballistics equations. Nuclear bombs are made of metal, not quantum theory.

Well, metal _is_ important, but to put it to any use other than very naïve
fiddling with it, good abstractions are indispensable.

Flamebait titles, on the other hand, don't help one bit.

------
dxbydt
When you first wrote about how hadoop is a waste of time if you don't have
multi-TB worth of data
([http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html](http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html)
), I thought that was classic linkbait. And then I actually started seeing
real benefits of not doing map-reduce for smaller datasets (small = 1GB-1TB)
& just sticking to plain old Scala (POSO as opposed to POJO :) Similarly,
this article seems linkbait on the surface but makes a lot of sense if you do
anything performance intensive. I recently tried implementing a multi-layer
neural net in Scala - I eventually ended up rewriting the mappers as while
loops & introducing some mutables, because at some point, all this FP
prettiness is nice but not very performant. It looks good as textbook code for
toy examples, but takes too long to execute. Am still a huge fan of FP, but
nowadays I don't mind the occasional side effect & sprinkling some vars around
tight loops. It's just much faster, & sometimes that is important too.

~~~
codygman
I'd like to add that "sprinkling side effects & vars around tight loops"
doesn't generalize to all languages. For instance, GHC can optimize pure code
MUCH more than impure code.

Meaning that with purity you get increased performance in many cases.
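
A small illustration of the kind of rewrite purity licenses (a sketch; the
function names are mine, and the exact results depend on GHC version and
flags):

    
    
        import Data.IORef
        
        -- Pure: with -O2, foldr/build fusion collapses this into a
        -- single allocation-free counting loop.
        sumSquares :: Int -> Int
        sumSquares n = sum (map (^ 2) [1 .. n])
        
        -- Same semantics, but the IORef reads and writes sequence
        -- the loop and block most of those rewrites.
        sumSquaresIO :: Int -> IO Int
        sumSquaresIO n = do
          acc <- newIORef 0
          mapM_ (\i -> modifyIORef' acc (+ i ^ 2)) [1 .. n]
          readIORef acc
    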

------
rollo
What I took from this post is that we should keep working on compilers. They
could provably optimize away most of the performance issues and move us closer
to doing category theory instead of dealing with the metal.

~~~
kens
I'd much rather deal with the metal than with category theory.

(I'm not trying to be snarky here, but I've studied category theory and it
seems like a whole lot of effort for very little benefit. On the other hand,
it's fun to figure out the metal. (Although silicon's not exactly a metal...))

~~~
wpietri
Yes. To oversimplify, it seems like there are two schools of software: the
car-mechanic school and the math-professor school. I happen to be of the car
mechanic variety. That doesn't mean I'm not borrowing energetically from
people pursuing the math-professor approach, but by now I'm pretty sure where
my strengths lie. And given that computers (and cars) are mainly used for
practical outcomes, I'm totally ok with that bias.

~~~
dxbydt
> the car-mechanic school and the math-professor school.

Heh heh. Having worked mostly with math professors, I would say the code they
write almost exclusively belongs to the car-mechanic school. Math professors
never deck up their code with CT. The majority of the time a math prof has to
write code, it's to do heavy-duty applied math and engineering stuff (solve
PDEs, fluid mechanics, EE math, Fourier transforms, gradients, etc.) or to
display some cool visualization to the students (here's how a vector space
looks! here is what happens when you apply a linear transformation!! here are
all the elements of a quotient group!!!) - those tend to be done in
Matlab/Mathematica/Maple/GAP... and these tools have zero CT. The applied
stuff tends to use netlib, GAMS, GSL, Colt, Apache Math... mostly a lot of GSL
these days... again, no CT there. Most of them are very happy to declare giant
matrices & happily mutate away.

~~~
wpietri
Heh. Yeah, a lot of the actual PhD-types I know are amateur coders on a
mission, and their code matches that.

I'm sure you get it, but just for the record, by math-professor I meant "very
focused on things that to most people seem incredibly abstract", with a side
of "passionate about solving intellectual challenges for their own sake".

------
dons
Title is misleading. This is about some Scala microbenchmarks. There is no CT
or metal here.

------
chriswarbo
The metal is also made of category theory:

[http://conal.net/blog/posts/circuits-as-a-bicartesian-closed-category](http://conal.net/blog/posts/circuits-as-a-bicartesian-closed-category)

------
chriswarbo
Related links:

Cost semantics: [http://lambda-the-ultimate.org/node/5021](http://lambda-the-ultimate.org/node/5021)

Recent efforts to bring together "logical" CS (lambda calculus, type theory,
etc.) and "combinatorial" CS (machine models, big-O, etc.):
[http://existentialtype.wordpress.com/2014/09/28/structure-and-efficiency-of-computer-programs/](http://existentialtype.wordpress.com/2014/09/28/structure-and-efficiency-of-computer-programs/)

------
yzzxy
I'm no expert on these matters, but it seems a bit ridiculous to call out
entire swaths of the programming/computer science world while referencing the
JVM as a benchmark.

~~~
yummyfajitas
(Author here.) I'm not "calling them out" - most people doing research in FP
are well aware of these issues. This is aimed at people like me who merely
implement stuff in those languages.

Also, most of the Scala code I'm describing compiles down to very
straightforward instructions that would be the same on almost any compiler. If
you can show me any other language which will handle a linked list as rapidly
as the JVM handles an array, I'll be extremely impressed.

~~~
dons
> handle a linked list as rapidly as the JVM handles an array

A linked list is not an array. Particular operations will have different
complexities. One data structure is not "more rapid" than another.

~~~
kasey_junk
This is precisely the kind of thinking people need to be careful about. Some
data structures are more rapid than others even when their operations have
worse asymptotic complexity, because they conform better to the reality of the
underlying hardware.

~~~
codygman
Can you please provide some examples? It would be much appreciated.

~~~
noblethrasher
[http://youtu.be/YQs6IC-vgmo](http://youtu.be/YQs6IC-vgmo)

Stroustrup describes why, on modern hardware, vectors are faster than linked
lists for insertion and deletion.
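
The same effect is easy to demonstrate off the JVM too. A Haskell sketch using
the vector package from Hackage (my example, not from the talk): traversing an
unboxed vector is one linear scan of contiguous memory, while traversing a
cons list chases a pointer, and risks a cache miss, at every node:

    
    
        {-# LANGUAGE BangPatterns #-}
        import qualified Data.Vector.Unboxed as U
        
        -- Contiguous, unboxed storage: cache-friendly linear scan.
        sumVec :: U.Vector Int -> Int
        sumVec = U.sum
        
        -- One heap object per cons cell: pointer chasing per element.
        sumList :: [Int] -> Int
        sumList = go 0
          where
            go !acc []       = acc
            go !acc (x : xs) = go (acc + x) xs
    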

------
cantankerous
Good read. Minor quibble: I'm not sure calling Julia a functional language is
really a fair statement. Yes it's LISP-like, but if you want to go there, Ruby
is a LISP-like language in the same ways Julia is. I don't really encounter
many folks making the claim that Ruby is a functional language.

~~~
dragonwriter
> I don't really encounter many folks making the claim that Ruby is a
> functional language.

Ruby is a multiparadigm language, and functional is one of the important
paradigms it supports. A rather popular essay on Ruby from nearly a decade ago
(surprising for me to discover that it was that long ago!) was "Why Ruby is an
acceptable LISP" [1].

It's true that much of the recent focus on functional languages has been on
_statically-typed_ and/or _pure_ functional languages, which neither Ruby nor
Lisp is...

[1] [http://www.randomhacks.net/2005/12/03/why-ruby-is-an-acceptable-lisp/](http://www.randomhacks.net/2005/12/03/why-ruby-is-an-acceptable-lisp/)

------
minimax
The most important thing here is to measure. Always measure. It doesn't matter
what you think is going to be fast or not. The combination of super smart (or
not so smart) compilers, multi-level hierarchical caches, pipelining, branch
(mis)prediction, etc. means you can't just look at a piece of code in a high-
level language and know how fast it will be. You always have to run the code
and measure how long it takes. For anything where you really care about
latency, measure.
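
In Haskell, for instance, a minimal harness using the criterion library (a
sketch; the two functions are hypothetical stand-ins for whatever you are
actually comparing):

    
    
        {-# LANGUAGE BangPatterns #-}
        import Criterion.Main
        
        -- Two candidates whose relative speed is hard to eyeball.
        sumIdiomatic :: Int -> Int
        sumIdiomatic n = sum [1 .. n]
        
        sumManual :: Int -> Int
        sumManual n = go 0 1
          where
            go !acc i
              | i > n     = acc
              | otherwise = go (acc + i) (i + 1)
        
        -- criterion runs each benchmark repeatedly and reports
        -- statistics, instead of trusting a single timing.
        main :: IO ()
        main = defaultMain
          [ bench "sum [1..n]"  $ whnf sumIdiomatic 1000000
          , bench "manual loop" $ whnf sumManual    1000000
          ]
    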

~~~
TheLoneWolfling
...And then have something in an entirely unrelated part of the code change
and make the code you oh-so-carefully benchmarked half as fast. Because you
just caused cache-line aliasing, or you pushed your function code across a
boundary, or you're now killing your branch predictor, or, or...

At this point, benchmarking parts of your code in any optimized language is
practically useless for fine-grained optimization. There are some things you
will pick up on, sure, but trying to squeeze the last drop of performance out
of your code? Good luck. Too many global interactions. I mean: even GCC
doesn't generate statistically significantly faster code at O3 versus O2.

------
kyllo
Computers are metal but programming languages are for humans. The purpose of
functional abstractions is primarily to help humans produce correct code, not
necessarily to optimize for speed. Referentially transparent functions,
immutable data and type checking are meant to provide guarantees that reduce
complexity, make a program easier to reason about and help stop the programmer
from introducing costly and dangerous bugs. It doesn't matter how fast your
"fugly imperative code" is, if it's incorrect. And the lesson we've learned
the hard way over and over again is that humans are just not smart enough to
reliably write correct fugly imperative code.

~~~
kasey_junk
I'm not sure I buy your argument, but even if I did, there is no evidence as
of yet that says that fugly functional abstractions provide for lower error
rates.

Further, performance is part of correctness for nearly any real-world example.

~~~
DanWaterworth
> there is no evidence as of yet that says that fugly functional abstractions
> provide for lower error rates.

Sure there is. Compare the incidence of buffer overflows in C programs vs
Haskell ones.

> performance is part of correctness for nearly any real-world example.

No it's not. See the definition of correctness [1].

[1]
[http://en.wikipedia.org/wiki/Correctness_%28computer_science%29](http://en.wikipedia.org/wiki/Correctness_%28computer_science%29)

~~~
kasey_junk
"Compare the incidence of buffer overflows in C programs vs Haskell ones."

It's true that one particular functional language removes one particular
class of problems found in another, non-functional language. But there is no
evidence that error rates are lower in general, because other classes of
errors creep in. For instance, space leaks due to laziness are a class of
error you are unlikely to encounter in C, but they are common in Haskell.
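
The canonical example, as a sketch (GHC's strictness analysis often rescues
the leaky version at -O2, which is part of why these leaks are easy to miss in
testing and only bite in production):

    
    
        import Data.List (foldl')
        
        -- Space leak: lazy foldl builds a million-deep chain of (+)
        -- thunks before forcing any of them.
        leaky :: Int
        leaky = foldl (+) 0 [1 .. 1000000]
        
        -- foldl' forces the accumulator at each step and runs in
        -- constant space.
        fine :: Int
        fine = foldl' (+) 0 [1 .. 1000000]
    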

~~~
kyllo
Sure, but in C you get memory leaks due to simply forgetting to call free...

------
chriswarbo
Quick question: when the article calls a pair of mutually-recursive functions
"corecursive", is this a commonly-used meaning of the term, or just a mistake
(since they're certainly not corecursive in the coinductive sense)?
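
For context, a sketch of the distinction I have in mind:

    
    
        -- Mutually recursive: defined in terms of each other,
        -- consuming finite input.
        isEven, isOdd :: Int -> Bool
        isEven 0 = True
        isEven n = isOdd (n - 1)
        isOdd  0 = False
        isOdd  n = isEven (n - 1)
        
        -- Corecursive in the coinductive sense: productively
        -- building an infinite structure.
        fibs :: [Integer]
        fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
    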

~~~
recursive
That's the meaning I use.

------
Rusky
Computers _are_ "made of metal", and category theory does often lead far away
from that reality. But that doesn't mean function calls are slow, or that we
need Sufficiently Smart Compilers just to use map efficiently.

What it means is that abstractions should be designed with both the use and
the implementation in mind. One way to do that is "zero-cost abstractions" a
la C++, where the abstractions are pretty minimal. Another way is things like
stream fusion and TCO, where it's easy to accidentally stray out of the
efficiently-representable subset.

But there are a lot of ways to get abstractions that are both higher-level and
"zero-cost" (generating the code you would have written if you hadn't used
them). For example, Python generators and C# iterators (coroutine-looking
functions internally transformed into state machines) look a lot like Haskell
higher-order functions, but the benefits of laziness, stream fusion, and TCO
are built into the abstraction itself, rather than being optimizations that
may or may not happen depending on the language/compiler. They also turn out
to be more flexible, since the iterator state is reified.

Another example is entity-component systems in game engines. You still get a
nice abstraction of game-world objects composed from behaviors, like you might
see in an OO hierarchy, but the cache behavior is vastly improved and the
behaviors are again more flexible.

------
mrwilliamchang
Love the title.

A more common mistake I notice people making is writing code that makes more
memory allocations than necessary.

    
    
      # Bad: Makes an extra instantiation of a list with 1 in
      # it which then needs to be read 
      x = set([1])
    
      # Good
      x = set()
      x.add(1)
    
    

Overall, I think it is important to remember that when you write a program,
every step translates to a set of operations. And this applies to all kinds of
programming, not just functional programming.

~~~
TheLoneWolfling
Any half-decent optimizing compiler will turn those two implementations into
the exact same assembly.

That being said, that looks like Python source code - and as the default
Python interpreter doesn't do much in the way of optimization, you are correct
in that case. (Although I wonder about PyPy.)

------
sitharus
When I started programming it was ZX Spectrum BASIC. No abstractions other
than some symbols. Then I moved on to Perl, then PHP and eventually C# - all
increasing the abstraction from the hardware, but all mutable state, based on
how the hardware works.

Recently I've started moving on to F# and Haskell, and it's really opened my
eyes.

While computers keep getting faster, the humans who write the programs do
not. And while programming is about getting a computer to do things, the most
important part is making it do what you want it to do. Anything that helps
humans reason about what the computer will do - rather than exactly how - is a
good thing in my book.

------
sklogic
Stalin and MLton are sufficiently smart. Probably too smart in some cases.

------
ffn
One can just as easily make the counter-argument that people understand
theory, not computer instructions. So the cleaner and more functional your
code is, the easier (and faster) it becomes for other people to build on top
of / around it.

But that counter-argument only works if the application actually requires tons
of people to work on your code. Just as the article's argument only works if
you actually need to squeeze out those extra ms on your one machine.

My only takeaway from this is "use the right tools for the right job".

------
mcguire
I'm a little confused: the author is speaking of "metal" but isn't writing
assembly.

(And I'm a little snarky today.)

------
Tloewald
Surely the point is to use an effective language (for your particular
definition of effective) and then optimize based on performance testing.
Otherwise you lose time writing fast low level code that doesn't need to be
fast or low level, and possibly never get to the important stuff.

~~~
tjradcliffe
This. "No optimization without quantification" should be pasted up beside
Knuth's laws of optimization. Performance testing and profiling will tell you
where the code is really spending its time. Fix those problems.

The past 20 years have shown us that naive optimization, which was never all
that good, breaks completely with modern compilers, which often emit code that
is optimized for the underlying architecture in ways that are difficult for
the programmer to anticipate. Like everyone who used to do a lot of hand
optimization, in the late 90s I started seeing more and more cases where my
clever tricks were making things worse, not better. So you can't just optimize
by rote: you have to understand the specific problem you're dealing with.

------
serve_yay
Well, OK. I have never seen anyone argue that functional programming is
beneficial for performance reasons.

~~~
marcosdumay
Well, OK.

Haskell (or any other pure functional language) can often make it viable for
you to write code that is much more performant than what you'd have written in
C (or any other mainly imperative language). Mainly so if the problem is
parallelizable and can use asynchronous IO.

Done. Now you've seen somebody argue that functional programming is beneficial
for performance reasons.

Of course, the previous statement is not true if you have an unlimited budget
and unlimited time. At least not yet.

~~~
serve_yay
TFA serves as the rejoinder to the argument you just made :)

------
tlarkworthy
Silicon is not really metal [1], it's rock. It's the most abundant element on
earth & in the universe, which I find interesting. Most of our planet and
universe could be turned into a giant computer.

[1] it's a metalloid

~~~
ars
Silicon is not rock, rock is Silicon Dioxide, which is not the same thing.

It's not the most abundant element on earth - iron is. It's not even the most
abundant in the earth's crust - oxygen is. (Silicon is in second place.)

And it's certainly not the most abundant in the universe - hydrogen is.
Silicon is 8th.

~~~
tlarkworthy
Yes, I should check my factoids before regurgitating them. In atom counts
there is more silicon than iron, though, but more oxygen than either. Still
interesting that silicon is so plentiful. Doesn't matter if it's bonded to an
oxygen. I suppose the bonding explains why the oxygen count on earth is so
high also. Oxygen is also (part of) rock.

~~~
ars
> Still interesting though that silicon is so plentiful.

It makes sense if you look at
[https://en.wikipedia.org/wiki/Stellar_nucleosynthesis](https://en.wikipedia.org/wiki/Stellar_nucleosynthesis)
since carbon, silicon, oxygen and iron are some of the main results.

What's more odd is just how useful silicon dioxide and iron are for building
planets! Some of the most plentiful elements just so happen to be perfect for
making planets.

And carbon, with its amazing array of compounds no other element can match, is
also plentiful.

Then you have iron, the end result of this process, which just so happens to
be perfect for building man-made things. It's strong, yet unlike the other
strong elements, iron is also easy to work.

And water - oxygen and hydrogen, very common elements. And water expands when
it freezes; if not for that small, non-obvious thing, life could not exist.

The most abundant elements, the ones that don't need a supernova, just so
happen to also be the most useful ones. Well, it's pretty obvious to me this
universe was designed specifically for us.

------
dschiptsov

       - There is nothing absolute in the world.
       - Metal rails are.

------
jcromartie
He keeps using the word "pointer" ...

------
bkeroack
"No, don't take away my beautiful abstractions!"

