

Auto-Threading Is Actually About Mutability And Safety - tkellogg
http://developinthecloud.drdobbs.com/author.asp?section_id=2284&doc_id=256017&f_src=developinthecloud_sitedefault

======
tikhonj
This article's author seems very dismissive of Haskell, which is a little odd
because, at least as described in the article (I haven't had time to look at
the paper yet), the approach taken by this compiler is actually very similar
to what is already done in Haskell.

In particular, Haskell already supports initializing data structures mutably
and then "freezing" them. I know this technique is used in the Vector library,
for example.
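As a minimal sketch of that idiom (using the array package that ships with GHC; the vector package's `create` works analogously):

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST)
import Data.Array (Array, elems)
import Data.Array.ST (STArray, newArray, runSTArray, writeArray)

-- Build an array of squares imperatively, then return it frozen.
-- runSTArray freezes the mutable array without copying, which is
-- safe because the mutable reference cannot escape the ST action.
squares :: Int -> Array Int Int
squares n = runSTArray build
  where
    build :: ST s (STArray s Int Int)
    build = do
      arr <- newArray (0, n - 1) 0
      forM_ [0 .. n - 1] $ \i -> writeArray arr i (i * i)
      return arr
```

The caller only ever sees an immutable `Array`; all the mutation happened during initialization.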

Additionally, "pockets of imperative mutability" perfectly describes Haskell's
"State Threads" (ST). Programming in Haskell already allows you to have
arbitrary state constrained to a particular scope and controlled by the type
system. Using things like ST with immutability by default is just tracking
state in the type system. And--as long as you don't worry about how it's
implemented under the hood--ST isn't terribly difficult to use.
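For instance, here is a pure function whose implementation uses mutable state internally; the type of `runST` guarantees the mutation cannot leak out of the scope:

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- An imperative-style sum: pure from the outside, mutable inside.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0                        -- local mutable cell
  forM_ xs $ \x -> modifySTRef' ref (+ x)  -- strict in-place update
  readSTRef ref                            -- the value escapes; the STRef cannot
```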

I'm sure the auto-threading compiler is very novel and interesting research,
and I'm sure it's very useful. However, I think that you shouldn't be this
dismissive of Haskell, especially because Haskell already does essentially
everything described in the article.

~~~
bunderbunder
I don't feel like the post was being dismissive. The key sentence is toward
the beginning: "The genius of the compiler created for the paper is how it
offers safety and allows auto-threading with much less effort than previous
attempts."

For example, I'm not terribly skilled in Haskell but I believe the ST monad
has to be managed manually by the programmer. You do get a pocket of
mutability that allows you to constrain state to a particular scope, but that
pocket has to be manually defined by the programmer, and internal to that
state the programmer's still responsible for ensuring that any concurrency
relating to the mutable data is handled in a thread-safe manner.

As far as freezing/thawing, again I'm no expert but I believe it doesn't work
quite the same way as what's being described in the paper. Freezing an array
in Haskell doesn't involve taking an existing array and declaring that its
contents can no longer be modified; instead the freeze operation creates and
returns an immutable copy of the array. (Same for thaw.) There are also
unsafeFreeze and unsafeThaw which don't do that, but they really are unsafe.
The unsafeFreeze operation doesn't take away your MArray, and that MArray
still points to the same storage as the resulting Array. Unless there's
something else going on in there that I don't know about, that's a situation
for which the word 'unsafe' is an understatement.
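A small sketch of the safe freeze behavior, again using the array package: the frozen snapshot is a full copy, so later writes to the mutable original can't touch it (unsafeFreeze, by contrast, would alias the same storage):

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array ((!))
import Data.Array.ST (STArray, freeze, newListArray, readArray, writeArray)

-- freeze makes an independent immutable copy, so mutating the
-- original afterwards cannot change the snapshot.
freezeDemo :: (Int, Int)
freezeDemo = runST $ do
  m <- newListArray (0, 2) [1, 2, 3] :: ST s (STArray s Int Int)
  snapshot <- freeze m        -- independent immutable copy
  writeArray m 0 99           -- mutate the original after freezing
  now <- readArray m 0
  return (snapshot ! 0, now)  -- (1, 99): the snapshot is untouched
```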

Perhaps there are other facilities in Haskell I don't understand which are
able to do what this C# extension does, and not just doing different things
that happen to be describable using similar language. If so I'd be really
interested to see them. And really curious to know why there hasn't been more
interest in them thus far.

------
bunderbunder
As far as the practical value of an auto-parallelizing compiler goes, I'd like
to see it before I believe it. I've got no doubt that the compiler is pulling
it off. But with how young the technology is, I'd be downright amazed if
they've got something that can reliably outdo what a knowledgeable human can
manage with a reasonable amount of effort. Not that I wouldn't like to see
things get to that point.

For now, though, what's really interesting to me is the idea of language-level
support for managing mutability. I agree with the author that that's something
that both functional and imperative languages have generally failed to provide
much help with. And I think there really is a need for language-level support
for it. It's always possible to pass in a mutable subtype or a non-pure
function as an argument to a procedure at run-time. So without some way to
say, "This parameter requires pure arguments," it may not even be possible for
a programmer to verify that a module can, e.g., cope with concurrency in code
that uses any degree of inversion of control.
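To make the wish concrete, here's roughly what such a constraint looks like where function types already carry purity, as in Haskell (names below are made up for illustration):

```haskell
-- `transform` demands a pure Int -> Int. An effectful
-- Int -> IO Int argument is rejected at compile time, so a
-- caller cannot smuggle in hidden state or side effects.
transform :: (Int -> Int) -> [Int] -> [Int]
transform f = map f

pureDouble :: Int -> Int
pureDouble = (* 2)

-- transform pureDouble [1, 2, 3] is fine;
-- transform (\x -> print x >> return x) would not typecheck.
```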

~~~
spectre
A programmer is always capable of writing better parallel code than such a
compiler, but writing safe and correct parallel code is beyond the skills of
many programmers. A good analogy is that a good programmer can hand-write
assembly that outperforms compiled code, but that is too hard for most
programmers to achieve.

There is some interesting research into languages that explicitly control
mutability and object reachability (a set of objects need not be immutable if
we can guarantee that only one thread can reach them).

[http://ecs.victoria.ac.nz/foswiki/pub/Main/TechnicalReportSe...](http://ecs.victoria.ac.nz/foswiki/pub/Main/TechnicalReportSeries/ECSTR12-18.pdf)

------
lucian1900
It's what people have been saying for a while now: mutable state is
problematic.

Very cool, of course.

Interestingly, Rust is moving to a similar model, where the (im)mutability of
a data structure is "inherited" from the (im)mutability of the thing that
references it. So it's trivial to determine something like "for this section,
this object is immutable and not referenced from anywhere else".

~~~
cobrausn
Problematic but absolutely necessary. I'm reminded of a quote from John
Carmack, from an article in which he was advocating FP in C++.

 _In almost all cases, directly mutating blocks of memory is the speed-of-
light optimal case, and avoiding this is spending some performance. Most of
the time this is of only theoretical interest; we trade performance for
productivity all the time.

Programming with pure functions will involve more copying of data, and in some
cases this clearly makes it the incorrect implementation strategy due to
performance considerations. As an extreme example, you can write a pure
DrawTriangle() function that takes a framebuffer as a parameter and returns a
completely new framebuffer with the triangle drawn into it as a result. Don’t
do that._

[http://www.altdevblogaday.com/2012/04/26/functional-programming-in-c/](http://www.altdevblogaday.com/2012/04/26/functional-programming-in-c/)

~~~
lucian1900
Not _absolutely_ necessary. One could use a persistent data structure, where
the copy shares structure with the original. Of course even this has its cost,
but it makes mutable state much, much less necessary.
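A sketch with Data.Map from the containers package: an "update" returns a new map that shares most of its structure with the original, which stays unchanged:

```haskell
import qualified Data.Map as M

-- Persistent update: m2 is a new map, but internally it shares
-- all unchanged subtrees with m1; only the O(log n) path from
-- the root down to the new key is rebuilt.
m1, m2 :: M.Map Int String
m1 = M.fromList [(1, "a"), (2, "b")]
m2 = M.insert 3 "c" m1

-- M.lookup 3 m1 == Nothing   -- original untouched
-- M.lookup 3 m2 == Just "c"
```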

~~~
bunderbunder
For simple structures or situations with low performance needs, yes
definitely.

But the classic example of a data type for which persistent structures have a
hard time cutting the mustard is an associative array. With mutability you get
hash tables, which I assume need no introduction. With persistent data
structures you're generally stuck with some sort of balanced tree that has
O(log n) lookup time in the best case, O(log n) insert time (and memory
allocations) in the average case, a lot of pointer overhead, and absolutely
depressing locality of reference.

~~~
dllthomas
> absolutely depressing locality of reference.

If a hashtable gives you good locality of reference, your hash function isn't
working :-P

But yes, obviously you're hitting more nodes in the tree typically, so you're
paying a larger price for being cache-unfriendly.

------
ef4
''The ability to have "pockets of imperative mutability"... connected by a
"functional tissue," is not only clarifying, but works quite well in practice
for building large and complex concurrent systems.''

That's always been a winning combination, and if you look for it you'll see
that many examples of well-written code hold to the pattern: large swaths of
code that are referentially transparent, with smaller chunks of imperative
code piping them together.

This can be done in almost any language, but of course support from the
toolchain is nice to have.

------
rbanffy
I remember something like this in a Sun compiler. I don't remember exactly
what was being optimized, but it gave a significant boost to single-threaded
programs.

------
rgbrgb
Does anybody know of experiments with using auto-threading in the Haskell
compiler? Is there work underway to do this?

~~~
marshray
To some degree "it just works" today. For example, see the end of
[http://gisli.hamstur.is/2012/12/network-programming-in-haskell/](http://gisli.hamstur.is/2012/12/network-programming-in-haskell/)

 _By using Haskell you get multi-core support for free, so if you want to
distribute the request handling over the cores in your machine you simply
execute the server like so. Substitute x in -Nx with the number of cores you
want to use. $ ./network-server +RTS -Nx_

I don't think this gets you the same kind of fine-grained parallelism as the
research compiler. But I could be wrong.

~~~
jfoutz
If you only have one thread, you'll only use one core (maybe two, with GC). A
network application is a great use case: usually you'll have one thread per
connection.

In a more "normal" application you'll need to create threads yourself. One
very easy way to do that is pmap: map a function over data and spark a thread
for each function invocation. But then you need to worry about the cost of
creating threads versus the amount of work done, etc. There are always
tradeoffs.
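A minimal base-only sketch of the idea (real code would more likely use `parMap` from the parallel package, whose sparks are far cheaper than the threads forked here; `pmap` below is a hypothetical helper for illustration):

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)

-- Fork one thread per element, then collect the results in order.
-- The per-element work must outweigh the thread-creation overhead
-- for this to be a win.
pmap :: (a -> b) -> [a] -> IO [b]
pmap f xs = do
  vars <- mapM spawn xs
  mapM takeMVar vars
  where
    spawn x = do
      v <- newEmptyMVar
      _ <- forkIO (putMVar v $! f x)  -- force the result in the worker
      return v
```

Compiled with -threaded and run with +RTS -N, the forked threads can be scheduled across multiple cores.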

------
dschiptsov
Mutability? Safety? Ah, it is drdobbs.com..)

