
Mio: A High-Performance Multicore IO Manager for GHC [pdf] - dons
http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask035-voellmy.pdf?haskellworkshop
======
fauigerzigerk
I've been wondering about garbage collection in pure functional languages
lately. It is often said that statically typed pure functional languages allow
the compiler to do a lot more correctness checking. Shouldn't the same formal
reasoning features also enable much better garbage collection algorithms? Or
would that be asking for a solution to the halting problem?

~~~
jallmann
With a purely linear type system, memory management could be handled
statically, negating the need for traditional GC. The underlying behavior
would be closer to C-style just-in-time {de-}allocation, or C++ RAII. LinearML
is an attempt at this:
[https://github.com/pikatchu/LinearML](https://github.com/pikatchu/LinearML)
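
For a rough feel of what "linear" means at the type level, here is a minimal sketch in Haskell rather than LinearML. It assumes the `LinearTypes` extension, which only exists in GHC 9.0 and later, and the names `swapL` and `dupL` are made up for illustration. GHC still garbage-collects these values; the sketch only shows the "consumed exactly once" rule that a static memory manager could build on.

```haskell
{-# LANGUAGE LinearTypes #-}

-- A linear function: the argument pair must be consumed exactly once.
swapL :: (a, b) %1 -> (b, a)
swapL (x, y) = (y, x)

-- The type checker rejects this, because x would be used twice:
-- dupL :: a %1 -> (a, a)
-- dupL x = (x, x)

main :: IO ()
main = print (swapL (1 :: Int, "one"))
```

Since the checker knows the pair is used exactly once, a compiler built around this rule could in principle free it at the point of consumption instead of leaving it for a collector.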

~~~
ithkuil
That's awesome! Any good pointers to interesting articles about that? I found
this:
[http://www.eg.bucknell.edu/~lwittie/research/space04.pdf](http://www.eg.bucknell.edu/~lwittie/research/space04.pdf)

~~~
jallmann
Don't know of any articles, but there are a number of papers on linear logic
and linear type theory. Most are pretty dense, but the first few pages of
Wadler's "Linear types can change the world!" give a very readable overview of
the area.
[http://homepages.inf.ed.ac.uk/wadler/papers/linear/linear.ps](http://homepages.inf.ed.ac.uk/wadler/papers/linear/linear.ps)

------
quchen
The paper also states that "Mio will be released as part of GHC 7.8.1", which
is scheduled for around October if I recall correctly. Neat.

~~~
gnuvince
Painful to hear about such wonderful developments when you are a Debian Stable
user :(

~~~
quchen
Installing GHC manually is not hard; the website even provides pre-compiled
binaries. I started doing it for different reasons, but was surprised how
painless the process was. After doing it multiple times, I wrote up a
file that guides me through the process, which you can find here:
[https://github.com/quchen/articles/blob/master/install_haske...](https://github.com/quchen/articles/blob/master/install_haskell_platform_manually.md)

~~~
gnuvince
Thanks for that nice guide! I was actually expecting to install GHC by hand
eventually (for now, 7.4 is sufficient).

~~~
SkyMarshal
Since you're on Debian you can use the awesome Update Alternatives to install
Haskell, which is an even better way in my opinion:

[https://github.com/byrongibson/scripts/blob/master/install/h...](https://github.com/byrongibson/scripts/blob/master/install/haskell/README.md)

Tons of advantages to doing it that way, including the ability to easily
maintain multiple versions of both GHC and Platform on the same system and
swap between them with a single command. Also doesn't clutter up
_/usr/local/_ with binaries not managed by apt or dpkg.

More generally, _update-alternatives_ is a godsend for Stable users. It's like
RVM or rbenv in the Ruby world: it lets you install and manage multiple versions
of the same platform and easily swap between them with a single command,
including the system version from repos if you want. It frees you from
out-of-date software in the repos while still providing a system to manage the
complexities of it all.

------
bsaul
Does anyone know why functional programming languages are so much favored by
research teams over other types of PL? Every time I hear of someone doing PL
research, it's always on some kind of functional programming language such as
Haskell or ML.

~~~
tomp
It's not about functional programming languages per se; it's just that a lot
of core/academic PL research often focuses much more on _concepts_ than on
_implementation_, and languages with strong type systems (all of which are
functional, in part also because OO/subtyping is hard to reason about
formally) are ideal for encoding these concepts.

There is, however, a lot of research about _implementation_ going on using
other languages as well. Most GC research, for example, is done using JVM and
Java programs. For this particular paper, it would be really hard to choose a
different platform, though, because GHC is one of the rare industrial runtimes
that offer lightweight threads.
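
To make "lightweight threads" concrete, here is a minimal sketch using `forkIO` and `MVar` from `base`. The helper name `spawnAll` and the thread count are just illustrative choices, not anything from the paper.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork n lightweight threads; each hands its result back through its own MVar.
spawnAll :: Int -> IO [Int]
spawnAll n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (putMVar box (i * i))
    return box
  -- Block until every thread has delivered its result.
  mapM takeMVar boxes

main :: IO ()
main = do
  results <- spawnAll 10000
  print (length results)
```

Ten thousand `forkIO` threads is a small load for GHC. To spread them over several cores, compile with `ghc -threaded -rtsopts` and run with `+RTS -N`.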

~~~
colanderman
> _(all of which are functional, in part also because OO/subtyping is hard to
> reason about formally)_

OO/subtyping is _NOT_ incompatible with functional programming, nor is it
inherently at odds with formal reasoning. OCaml has both imperative and
functional objects; Haskell, Mercury, and Coq (a formal logic language) all
support typeclasses, which subsume most of OO and provide subtyping.

Additionally, strong typing has little to do with formal reasoning either
(see: Lisp/Scheme/the lambda calculus). What makes functional languages
amenable to formal methods is that they are _functional_ and therefore
_referentially transparent_. Mathematical proofs carry no implicit concept of
"state"; hence if you are trying to prove anything about code in, say, C, you
need to augment the code with explicit state and remove all non-local effects.
(See: the Why language, which attempts to bridge this gap.)

Algebraic/generalized-algebraic type constructors used by most functional
languages don't hurt either, as they allow programs to construct complex terms
without relying on lower-level stateful abstractions such as memory
allocation.
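
A minimal sketch of both points, with a made-up `Expr` type and made-up `eval` and `simplify` functions: terms are built purely from constructors, and because `eval` is referentially transparent, "simplifying does not change the value" is a plain equation you can check or prove by induction.

```haskell
-- A small expression language as an algebraic data type.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Show, Eq)

-- Pure evaluator: the result depends only on the term itself.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

-- Remove additions of 0 and multiplications by 1, bottom-up.
simplify :: Expr -> Expr
simplify (Add a b) = case (simplify a, simplify b) of
  (Lit 0, b') -> b'
  (a', Lit 0) -> a'
  (a', b')    -> Add a' b'
simplify (Mul a b) = case (simplify a, simplify b) of
  (Lit 1, b') -> b'
  (a', Lit 1) -> a'
  (a', b')    -> Mul a' b'
simplify e = e

main :: IO ()
main = do
  let e = Add (Mul (Lit 1) (Lit 7)) (Lit 0)
  print (simplify e)                   -- prints: Lit 7
  print (eval e == eval (simplify e))  -- prints: True
```

No allocation or mutation appears anywhere in the source, so the property `eval (simplify e) == eval e` can be argued case by case on the constructors.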

~~~
tomp
Not inherently, no, but practically, yes. The type system of, say, Java, isn't
even statically sound: arrays are covariant, so a store that passes the type
checker can still fail at run time with an ArrayStoreException, which would
not be permissible in a proper subtype-based type system such as OCaml's.

~~~
colanderman
Ah yes, I forgot about Java's failings in this area.

------
DennisP
It's going to be interesting to revisit techempower's benchmarks when this
comes out.
[http://www.techempower.com/benchmarks/](http://www.techempower.com/benchmarks/)

~~~
est
First of all you have to write a C20M-capable client to replace ab.

And high concurrency tends to break MySQL easily. DB is always the bottleneck
here, always.

~~~
IanChiles
So maybe it's time for a SQL database written in Haskell to take advantage of
these new developments? (both Mio and Intel's HRC)

~~~
AaronFriel
Probably not. The limiting factor for most database systems, even of the plain
old boring scale-up not scale-out SQL kind, is disk IO. Before SSDs and
virtualized storage, it was all about having cabinets full of disks (usually
in expensive SANs) for fast database applications. Enterprise hard disk drives
only gave you low triple-digit IOPS; I think the highest I've ever read for a
2.5" HDD is about 450 IOPS.

If you wanted data resiliency, you used RAID60 or RAID10 or some exotic
variant that would usually give you on the order of a 30-50% reduction in
capacity and sometimes as much as a 75% reduction in write IOPS - it's always
been the writes that have been killer, that's the stuff you need to make sure
persists. RAID6 with a crappy controller or on an overloaded good controller
would make you pay the write hole tax.

Commodity SSDs give you up to, if you're willing to take peak numbers, on the
order of 45,000 IOPS. To get to that kind of performance in the last decade,
you'd need a cabinet with between 200 and 400 2.5" HDDs, depending on
everything involved.
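
The drive-count arithmetic can be sketched as below. The 450 IOPS figure is the peak mentioned above; the 225 and 113 IOPS figures are assumptions picked only to show what sustained per-drive rate the 200 to 400 drive range corresponds to, and `drivesNeeded` is a made-up helper.

```haskell
-- Number of drives needed to reach a target IOPS figure, rounding up.
drivesNeeded :: Int -> Int -> Int
drivesNeeded target perDrive = (target + perDrive - 1) `div` perDrive

main :: IO ()
main = mapM_ report [450, 225, 113]
  where
    -- 45,000 IOPS is the peak commodity-SSD figure quoted above.
    report perDrive =
      putStrLn (show perDrive ++ " IOPS/drive -> "
                ++ show (drivesNeeded 45000 perDrive) ++ " drives")
```

At the 450 IOPS peak this gives 100 drives; at the assumed 225 and 113 IOPS it gives 200 and 399, which matches the range above and ignores RAID overhead entirely.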

Now, that's all 2000s era stuff. The future is bright, thanks to virtualized
storage arrays turning commodity storage into massive virtual SANs with tiered
storage on consumer SSDs being used like race tires, consumables replaced
periodically to maintain top performance, and so on. Even so, the biggest
benchmarks I've seen for enterprise systems with multiple 10GbE adapters
running raw disk IO have been on the order of 1 million IOPS. I'm sure exotic
systems will put out higher numbers, after all there are machines that support
over 100 physical CPUs, but they aren't attainable and certainly aren't a
broad market.

All of that is to say that databases have almost always been constrained by
disk IO. The SDN controller in the paper? Little, if any, disk IO. A SQL
database? It probably could be written in naive, pre-java.nio Java and IO
limitations would still dwarf the overhead of the language. (There are
exceptions, I'm ignoring certain overheads, etc.)

Maybe a NoSQL solution in Haskell would scream with this new IO manager, but I
suspect it's going to be correctness rather than performance that drives
anyone to implement their database in Haskell right now.

------
alperakgun
Is this a research paper, or a currently available library that I can
download somewhere?

~~~
tazjin
The GHC release containing this is scheduled for 1-2 months from now. You
can already build it from source though.

