
Auto-Threading Compilers Are Here - jacquesm
http://developinthecloud.drdobbs.com/author.asp?section_id=2284&doc_id=255275
======
chucknelson
The author should really remove this statement:

"The power of a single CPU core has plateaued and won't ever get faster. CPUs
are only getting more cores."

Pretty sure that isn't true. It may be getting more difficult to make CPUs
faster, but advancements in CPU architecture continue.

~~~
wwwtyro
Additionally, his premise that CPUs will continue to get more and more cores
is probably also flawed. Diminishing returns from extra cores, per Amdahl's
law, will make them less and less economically viable.

~~~
marshray
Intel and AMD want to continue making processors that cost more than $150. If
feature size continues to shrink, what else are they going to do with the
extra room on the chip? (Other than slash their own prices while paying for
ever-more-expensive fab equipment.)

In single threaded code, the unused cores can be powered off.

~~~
rpledge
Larger caches are an option for using the extra die area. RAM access speeds
haven't improved at anywhere near the rate that CPU power has.

~~~
Joeri
Why have caches at all? Couldn't you move the entire RAM on-die?

~~~
JohnBooty
It would be prohibitively expensive. Several _megabytes_ of cache already take
up about _half_ of the CPU die size.
[http://techreport.com/review/21987/intel-core-i7-3960x-proce...](http://techreport.com/review/21987/intel-core-i7-3960x-processor)

Think about how much larger a CPU would have to be to include an entire
gigabyte of cache, much less multiple gigabytes.
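
As a rough back-of-envelope sketch (numbers taken loosely from the linked
review: ~15 MB of L3 on a ~435 mm² die; treat them as approximations):

    -- Illustrative arithmetic: if ~half of a 435 mm^2 die holds 15 MB
    -- of cache, how much silicon would 1 GB need?
    areaPerMB, gigabyteAreaMm2 :: Double
    areaPerMB       = (435 / 2) / 15     -- ~14.5 mm^2 per MB of SRAM
    gigabyteAreaMm2 = 1024 * areaPerMB   -- ~14,850 mm^2, dozens of dies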

------
m0th87
I call bullshit. Researchers have been trying to make auto-parallelizing
compilers for decades without anything substantive making it to industry.

Even in functional languages, auto-parallelization hasn't worked well because
of the coarseness issue: it's difficult for compilers to figure out what to
run in multiple threads and what to run single-threaded, given the tradeoffs
of inter-thread communication.

~~~
7952
Surely auto-parallelization is exactly what CPU pipelining and branch
prediction do quite effectively?

~~~
benzor
Well, sort of, but the moment you run into hazards [1], bad branch
predictions [2] or any other problems, the CPU will either stall the pipeline
for a few cycles or just flush the whole thing, so it's not a magic solution
just waiting to be adapted.

[1] <http://en.wikipedia.org/wiki/Pipeline_hazard> [2]
<http://en.wikipedia.org/wiki/Branch_predictor>

------
zackzackzack
The author seems to think that because a language supports functional
programming, it cannot support mutable state in any form. It would be
worthwhile to actually look into what sort of support a language like Clojure
has for various kinds of state (<http://clojure.org/state>). There are at
least 3 ways off the top of my head that I can change the value of a variable
in a Clojure process at runtime. It's obviously not the sort of control over
state that you have in C++, but it's a far cry from "define a variable once
and that's all you can do."

~~~
nicktelford
The reinforcement of the fallacy that FP shuns mutable state really put me
off this article.

~~~
Locke1689
Functional programming does shun mutable state. It just doesn't make it
impossible.

Functional programming is a style of programming that uses the lambda
calculus as its base model of computation. Just because you can subvert that
at times doesn't mean that idiomatic code approves of breaking referential
transparency.

~~~
jmillikin
Functional programming doesn't shun mutable state. It shuns _global_ state,
which is not the same thing. Mutability is well-supported even in very
stringently functional languages such as Haskell.
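
A minimal sketch of that support, assuming GHC's Data.IORef:

    import Data.IORef

    -- Illustrative sketch: create a mutable cell, bump it twice,
    -- and read it back.
    main :: IO ()
    main = do
      counter <- newIORef (0 :: Int)
      modifyIORef' counter (+ 1)
      modifyIORef' counter (+ 1)
      readIORef counter >>= print  -- prints 2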

~~~
Locke1689
No. It shuns mutable variables. It provides outs for performance or
architectural reasons, but it's never a "good thing."

The ST monad is there for many reasons but if you can avoid it you probably
should. If profiling says it's necessary, then add it.

Look at Coq.

~~~
jmillikin
That's a frankly silly position to hold.

There's no reason to avoid ST, and in many cases it makes the code clearer and
shorter than trying to figure out some pointfree contortionism to reach the
same goal.
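
A minimal sketch of the point, assuming GHC's Control.Monad.ST and Data.STRef
(the mutation inside runST is invisible to callers):

    import Control.Monad.ST
    import Data.STRef

    -- Illustrative sketch: sumST mutates an STRef internally, but runST
    -- seals the mutation off, so callers see an ordinary pure function.
    sumST :: Num a => [a] -> a
    sumST xs = runST $ do
      acc <- newSTRef 0
      mapM_ (\x -> modifySTRef' acc (+ x)) xs
      readSTRef acc

    main :: IO ()
    main = print (sumST [1 .. 100 :: Int])  -- 5050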

I'll look at Coq when it has support for binding to C libraries, opening
sockets, or doing anything else that a programming language needs to support.

~~~
eurleif
You seem to be setting up a false dichotomy between using the ST monad and
writing pointfree code ("functions as a composition of other functions, never
mentioning the actual arguments they will be applied to"). But you can write
non-pointfree code that doesn't use the ST monad, and you can write pointfree
code that does use it. They're unrelated.
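
A tiny illustrative sketch of the distinction, in Haskell: the same pure
function written pointfully and pointfree, neither of which involves ST.

    -- Pointful: the argument is named explicitly.
    countWordsPointful :: String -> Int
    countWordsPointful s = length (words s)

    -- Pointfree: pure composition, no argument mentioned.
    countWordsPointfree :: String -> Int
    countWordsPointfree = length . words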

------
malkia
Here is a good overview of the 13 isolated computing problems ("dwarfs") when
it comes to multi-threading.

It's really not about compilers or FP vs. non-FP so much as about what kinds
of tasks are easily multi-threaded. As I replied in another post, things like
state machines are very hard to parallelize (one of the dwarfs the article
below talks about), while others are very easy, like a raytracer (up to a
point, since a raytracer has to share data across all computing units - CPUs).

(An article from 2008)

<http://www.cs.nyu.edu/srg/talks/BerkeleyView.pdf>

~~~
scott_s
Don't confuse "multithreading" with "parallelism." Multithreaded code is a
particular kind of parallelism, but not the only kind.

~~~
martinced
"Multithreaded code is a particular kind of parallelism"

No.

Parallelism implies that there are multiple CPUs and/or multiple cores at work
(or in older use of "parallelism", at least some instructions that are
parallelized).

Multithreading doesn't imply that at all: you can have a multi-threaded
program (like, say, a Java program using multiple threads + its GC thread, EDT
thread, etc.) running on a single machine with one CPU that has a single core
and doesn't do any kind of parallelism.

To me you're totally wrong in saying that multithreaded code is a particular
kind of parallelism.

Multithreading doesn't imply parallelism.

~~~
scott_s
You are correct, I over-simplified. Multithreading is a way of implementing
parallelism, but it will not always lead to parallelism. Multithreading always
implies concurrency, but only yields parallelism if the underlying hardware
allows it. (Concurrency allows for time-sharing the processor so they are not
actually running simultaneously; parallelism means they are running
simultaneously.)

However, I was not primarily concerned with this distinction, but with the
distinction between obtaining parallelism through multithreading, versus
parallelism through message passing. Multithreading implies that parallel
threads will communicate implicitly through shared memory, which does not
scale past a single machine.
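
A minimal sketch of that implicit sharing, assuming GHC's Control.Concurrent:
both threads mutate the same heap object, and whether they actually run in
parallel depends on the core count given to the runtime (+RTS -N1 yields
concurrency only; -N2 and up allows parallelism).

    import Control.Concurrent

    -- Illustrative sketch: two forkIO threads share one MVar implicitly.
    main :: IO ()
    main = do
      shared <- newMVar (0 :: Int)      -- state shared through memory
      doneA  <- newEmptyMVar
      doneB  <- newEmptyMVar
      _ <- forkIO (modifyMVar_ shared (pure . (+ 1)) >> putMVar doneA ())
      _ <- forkIO (modifyMVar_ shared (pure . (+ 1)) >> putMVar doneB ())
      takeMVar doneA >> takeMVar doneB  -- wait for both threads
      readMVar shared >>= print         -- prints 2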

------
PaulHoule
My favorite parallelizing compiler is Pig.

Pig can turn a combination of relational operators into a series of Map and
Reduce operations that can be done on a Hadoop cluster. This is all stuff I
could code up by hand, but most of the things I might do with a shell script
or SQL statements I can parallelize in a way that's scalable in both
directions. Because I can write parallel code easily and quickly, I can use it
for little jobs. As for big jobs, my home cluster handles terabytes, and I can
rent any level of power from AWS.

------
octo_t
This is task-based parallelism and does require significant (and verbose)
annotations of the source code.

~~~
profquail
The paper referenced by the article, "Uniqueness and Reference Immutability
for Safe Parallelism", says:

    The group has written several million lines of code, including: core
    libraries (including collections with polymorphism over element
    permissions and data-parallel operations when safe), a webserver, a
    high level optimizing compiler, and an MPEG decoder.

So it also handles data-parallelism.

~~~
kibwen
Paper available here:

<http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf>

Incidentally, I hear this paper is quite similar to one that the Rust[1] folk
recently authored regarding their "borrowed pointer" system, due to be
possibly published next year. If this sort of thing interests you, I suggest
you take a look at Rust (though keep in mind that it's still very pre-alpha).

<http://www.rust-lang.org/>

~~~
stcredzero
I was just thinking that this cries out for language support.

------
georgeorwell
This is just a fluff article pushing some MSR C# thing. Why not submit a link
to their actual paper from OOPSLA?

[https://research.microsoft.com/pubs/170528/msr-tr-2012-79.pd...](https://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf) (this is the TR linked from the Dr. Dobbs thing.)

Automatic parallelization has been the holy grail of performance-oriented
computing since the 1980s or earlier, and if auto-threading compilers had
arrived, we would all know about it.

The fact remains that for general-purpose code, automatic parallelization is
an unsolved and exceedingly difficult problem. So difficult that PG claimed it
as one of his highly ambitious startup ideas.

------
gilini
The author states that "we made the fastest CPUs that physics allows," linking
to an article that doesn't exactly say that.

Assuming that he's merely talking about CPU frequency, does that hold? And if
so, why?

------
stephengillie
Hopefully we'll see improvements in videogames, which I understand are among
the harder types of programs to multithread.

~~~
RKearney
Doubtful. Most games are GPU-bound, not CPU-bound, so CPU threading really
isn't the issue.

~~~
stephengillie
Has the pendulum swung back? I'm pretty sure GPUs overtook CPUs in the past 5
years, and before the i7 was released CPUs were the bottleneck.

~~~
anonymous
Depends on the game, really. Take a look at Minecraft and Dwarf Fortress -
they are both CPU-bound due to:

1. Simulating a large world using complex entities and voxels

2. Being single-threaded with no clear way to make them multi-threaded

Minecraft in particular is a pretty interesting problem: since it only
simulates the part of the world within a radius of each player, it would make
sense to have each player's machine simulate their own part of the world and
have the server somehow merge those together.

~~~
cousin_it
Simulating a large world seems like a textbook example of an "embarrassingly
parallel" problem, doesn't it? At least that's true for cellular automata.

~~~
MichaelGG
I think the problem comes from the fact that the world is a large mass of
interconnected mutable state. You don't know whether updating a particular
object will update another.

Think of a player crouching on a plank that another player just phasered. If
you continue simulating the first player's "crawl forward" movement by
itself, how do you integrate the result of the plank being disintegrated,
causing the player to fall? The first player's simulation outcome depends on
multiple inputs, but they aren't findable directly from that player's
perspective. You have to first simulate the phaser beam to know the plank is
gone, to know the player is now falling, not crawling.

And then imagine that with far more complex rules and a few hundred thousand
objects having similar interactions, each one possibly modifying any other
one.

~~~
Evbn
If you limit communication to "speed of light", then each tick involves only
local message passing, which is easily parallelizable.

------
malkia
In the meantime, I'm still waiting for the Octopiler...

------
michaelochurch
The not-always-pure FP that sometimes uses state is what real-world functional
programmers actually use. We _manage_ state. It's neither possible nor
desirable to _eliminate_ it outright.

It wasn't until recently (a couple weeks ago, when giving a presentation on
FP) that I realized _why_ stateful programming lends itself so easily to evil.
If a program is a serial collection of possibly unrelated stateful actions,
everyone can add new intermediate behaviors to a function to satisfy some
dipshit requirement, and the API doesn't change. Writ large, this allows
silent complexity creep.

I think a major reason why FP is better is that changing a purely
referentially transparent function requires an API change: more parameters or
a different return type. If nothing else, this tends to break tests. It's hard
to change functions that already exist, and it _should be_ hard. You should be
writing new ones instead. Also, if there's only one way to combine programs
(function composition) it's easier to break them up. So you don't get the
500-line monsters that plague enterprise codebases.
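
A small sketch of that point (hypothetical names): to feed a pure function
new information, its type has to change, and the change is visible everywhere.

    -- Version 1: price with tax only.
    price :: Double -> Double
    price base = base * 1.08

    -- New requirement: a discount. The pure version can't quietly read
    -- it from ambient state; the signature must change, breaking (and
    -- thereby flagging) every caller and test.
    priceWithDiscount :: Double -> Double -> Double
    priceWithDiscount discount base = base * (1 - discount) * 1.08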

That said, the worst thing about OOP isn't state. It's _inheritance_ , which
is the 21st-century goto.

~~~
MatthewPhillips
You had me until you knocked goto.

~~~
michaelochurch
Goto is fine in self-contained uses but, like inheritance, turns catastrophic
in large enterprise codebases.

[http://michaelochurch.wordpress.com/2012/08/15/what-is-spagh...](http://michaelochurch.wordpress.com/2012/08/15/what-is-spaghetti-code/)

~~~
akkartik
I'll include part of my comment to that post:

 _"Are goto’s always bad eventually? Who cares? There’s far bigger problems
with gradual change in codebases, let’s think harder about them. My peer
reviewers, hold off review until you’ve used what I’ve built and found
problems with it. Managers and programmers, wrestle with judgement everyday,
the stuff not easily put into rules. Leave the comforting shallows and engage
with the abyss."_

[http://michaelochurch.wordpress.com/2012/08/15/what-is-spagh...](http://michaelochurch.wordpress.com/2012/08/15/what-is-spaghetti-code/#comment-836)

------
DanWaterworth
> Eliminating state is usually possible, but it makes programming
> exceptionally hard.

In the words of a Wikipedian, "citation needed".

EDIT: To the downvoters, care to comment?

