
Haskell in the Large [pdf] - kryptiskt
http://code.haskell.org/~dons/talks/dons-google-2015-01-27.pdf
======
spopejoy
_“Make illegal states unrepresentable”_

 _Types pay off the most on large systems. Architectural requirements captured
formally._

This more than ANYTHING else is why I want to move enterprisey app code to
Haskell. Having worked on numerous ginormous enterprisey systems -- which are
usually doing pretty straightforward things, just at scale, and needing to be
maintained by non-brilliant developers -- I can say pretty securely that north
of 95% of the invariants could be lifted into the type system, making run-time
errors a thing of the past.
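
As a minimal sketch of what lifting an invariant into the types can look like (the `Payment` and `Basket` names are hypothetical, invented only for illustration), each state below carries exactly the data that is valid for it, and an empty basket cannot be constructed at all:

```haskell
module Main where

import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Hypothetical domain type: a payment is in exactly one state, and a
-- settlement reference exists only when the payment is actually settled.
data Payment
  = Pending
  | Settled Integer   -- settlement reference
  | Rejected String   -- reason for rejection

-- Hypothetical invariant: a basket always holds at least one item.
newtype Basket = Basket (NonEmpty String)

describe :: Payment -> String
describe Pending        = "pending"
describe (Settled ref)  = "settled as #" ++ show ref
describe (Rejected why) = "rejected: " ++ why

-- Total function: no runtime check for emptiness is needed.
firstItem :: Basket -> String
firstItem (Basket items) = NE.head items

main :: IO ()
main = do
  putStrLn (describe (Settled 42))
  putStrLn (firstItem (Basket ("apple" :| ["pear"])))
```

There is no way to write "a pending payment with a settlement reference" or "a basket with zero items" here, so no code path has to defend against them.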

Also, in most frameworky big systems, you WANT to stop devs from "just doing
IO" or pretty much doing anything without a strong contract around it. Monad
stacks do wonders for circumscribing your computational context in an app.
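
A rough sketch of that idea, using a type class constraint in place of a full transformer stack so that it needs nothing beyond `base` (the `MonadAudit` class, `Recorded` type and `applyFee` function are all hypothetical names): the business logic can write audit lines and do nothing else, and the same code runs against IO or against a pure recorder.

```haskell
module Main where

-- Hypothetical capability: code with this constraint may write audit
-- lines and perform no other effect.
class Monad m => MonadAudit m where
  audit :: String -> m ()

-- Production interpretation: audit lines go to stdout.
instance MonadAudit IO where
  audit = putStrLn

-- Pure interpretation for tests: audit lines are collected in a list.
newtype Recorded a = Recorded { runRecorded :: (a, [String]) }

instance Functor Recorded where
  fmap f (Recorded (a, w)) = Recorded (f a, w)

instance Applicative Recorded where
  pure a = Recorded (a, [])
  Recorded (f, w1) <*> Recorded (a, w2) = Recorded (f a, w1 ++ w2)

instance Monad Recorded where
  Recorded (a, w1) >>= k =
    let Recorded (b, w2) = k a in Recorded (b, w1 ++ w2)

instance MonadAudit Recorded where
  audit msg = Recorded ((), [msg])

-- Business logic: the signature rules out file access, network calls
-- and any other effect that is not part of MonadAudit.
applyFee :: MonadAudit m => Int -> m Int
applyFee amount = do
  audit ("charging fee on " ++ show amount)
  pure (amount - 1)

main :: IO ()
main = do
  net <- applyFee 100            -- runs in IO, prints the audit line
  print net
  print (runRecorded (applyFee 100))  -- runs purely, returns the log
```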

I wonder why they rewrote Aeson though ... perhaps before it was mature?
Aeson's pretty awesome.

~~~
marmaduke
_invariants could be lifted into the type system_

Is there a reason this is only possible in Haskell, or does Haskell just make
it super convenient / idiomatic?

~~~
yummyfajitas
Haskell/Scala/etc makes it easy to nudge other developers into avoiding
mistakes. A concrete example:

I had a Scala system (built on Scalaz, which is a library providing Haskell
for Scala), and one of our core types was DBTransaction[_] (a monad). A
developer (a skeptic of the type system) was complaining to me about all the
excess work he needed to go through, and how he couldn't properly construct a
LazyStream[Foo] as a result of this.

He wanted to construct a DBTransaction[LazyStream[Foo]], call runTransaction
on it, and get the LazyStream[Foo] out. Then he was going to call
f(lazyStream). The compiler just wouldn't allow this, so his "workaround" was
to instead call lazyStream.map(f).

Turns out this workaround prevented a runtime error. If he did get his
LazyStream[Foo] out, generating the next element in the stream would have
called resultSet.next() _after closing the connection_.

This sort of thing happened quite often. People would complain that the type
system made it harder for them to do what they wanted. They'd ask the FP
"guru" types how to fix it and the "guru" would point out that what they
wanted to do was fundamentally unsafe.
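
For readers who want to see the shape of this in Haskell, here is a minimal sketch. It is not the Scala code described above; it swaps in the rank-2 "region" trick used by `ST`, and every name in it (`Tx`, `Cursor`, `query`, `mapCursor`, `runTx`) is hypothetical. The "connection" is faked as a list of rows so the example is self-contained.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

-- Hypothetical transaction type. The phantom 's' names one particular
-- transaction; the fake "connection" is just a list of rows.
newtype Tx s a = Tx { unTx :: [Int] -> a }

instance Functor (Tx s) where
  fmap f (Tx g) = Tx (f . g)

instance Applicative (Tx s) where
  pure x = Tx (const x)
  Tx f <*> Tx g = Tx (\conn -> f conn (g conn))

instance Monad (Tx s) where
  Tx g >>= k = Tx (\conn -> unTx (k (g conn)) conn)

-- A cursor is tagged with the transaction it belongs to.
newtype Cursor s = Cursor [Int]

query :: Tx s (Cursor s)
query = Tx Cursor

-- The only way to consume a cursor is inside its own transaction.
mapCursor :: (Int -> b) -> Cursor s -> Tx s [b]
mapCursor f (Cursor rows) = pure (map f rows)

-- Because 's' is quantified inside the argument, the result type 'a'
-- cannot mention it, so a Cursor can never be returned.
runTx :: [Int] -> (forall s. Tx s a) -> a
runTx conn tx = unTx tx conn

main :: IO ()
main = do
  print (runTx [1, 2, 3] (query >>= mapCursor (* 2)))
  -- Rejected by the type checker, because the cursor would escape:
  --   runTx [1, 2, 3] query
```

The commented-out line is the analogue of pulling the LazyStream out of the transaction; mapping over the cursor inside the transaction is the analogue of the "workaround".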

~~~
marmaduke
So if I understand correctly, the real point is instead of programming
defensively at runtime, you can do it in the type system.

My main question is the extent to which this is possible in more conventional
languages.

~~~
tel
Just imagine that every static constraint you want to encode has to be written
in the language of the types, a subset of your chosen language.

C's language of types is incredibly primitive. Haskell's is quite nice.
Agda/Idris/Coq's type language is technically equivalent to its value language
so you can encode incredible things.

It turns out that due to people's general desire for compilation to always
terminate that the type language behaves quite differently from the value
language. It also turns out that the type language operates differently
because we're more interested in logical constraints than actual evaluation.

The major thing that the above changes is that you no longer have the "it's
always a Turing complete language" excuse. Type languages can actually
significantly and meaningfully differ in power.

So it's meaningful to say that there are constraints that can be encoded
statically in Haskell but cannot in C (no matter how you try). And it's also
true that Coq can encode constraints that cannot be encoded in Haskell (though
Haskell keeps making its type language stronger!).
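
One small illustration of a constraint that GHC's type language can state, sketched with the `DataKinds` and `GADTs` extensions (the `Vec` and `vhead` names are just the conventional ones for this example, not from any particular library): the length of a vector is tracked in its type, so taking the head of an empty vector is a compile-time error rather than a runtime one.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

-- Natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Only accepts vectors with at least one element, so the function is
-- total and needs no case for VNil.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

main :: IO ()
main = do
  print (vhead (VCons (1 :: Int) VNil))
  -- Rejected by the type checker:
  --   vhead VNil
```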

~~~
AnimalMuppet
"It turns out that due to people's general desire for compilation to always
terminate that the type language behaves quite differently from the value
language. It also turns out that the type language operates differently
because we're more interested in logical constraints than actual evaluation."

If I understand you correctly, if the type system behaves the same as the
value system, then it is possible for the compilation to never terminate. Do I
have that right?

~~~
codygman
Unrelated but you don't have contact information and this is the most related
thread I could find. Here is context:

"I've been trying to get my mind around this kind of stuff for maybe a year.
And I have to say that monads don't look like the solution to any of the
problems that I actually face or have ever faced, in a thirty-year career." -
[https://news.ycombinator.com/item?id=8815973](https://news.ycombinator.com/item?id=8815973)

You might be interested in this link:

[http://haskellrescue.blogspot.com/2011/03/cooking-delicious-...](http://haskellrescue.blogspot.com/2011/03/cooking-delicious-fish.html)

If you respond quickly enough and I can delete this, I will ;)

------
saosebastiao
I've toyed with Haskell, ultimately moving on to the Ocaml/F# camp. There are
only two things that I miss from Haskell without an appropriate equivalent or
easy workaround: Type Classes, and Higher Kinded Types.
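
To make the second item concrete, here is a minimal sketch of what higher-kinded types buy you (the `Container` class and `Queue` type are hypothetical, made up for this example): the class parameter `f` ranges over type constructors such as `[]`, not over plain types, so one function can build any container generically.

```haskell
module Main where

-- 'f' is a type constructor (kind * -> *), not a plain type.
-- Abstracting over it is what requires higher-kinded types.
class Container f where
  empty  :: f a
  insert :: a -> f a -> f a
  toL    :: f a -> [a]

-- Lists insert at the front.
instance Container [] where
  empty  = []
  insert = (:)
  toL    = id

-- Hypothetical queue that inserts at the back.
newtype Queue a = Queue [a]

instance Container Queue where
  empty               = Queue []
  insert x (Queue xs) = Queue (xs ++ [x])
  toL (Queue xs)      = xs

-- Written once, works for every Container.
fromItems :: Container f => [a] -> f a
fromItems = foldr insert empty

main :: IO ()
main = do
  print (toL (fromItems [1, 2, 3] :: [Int]))
  print (toL (fromItems [1, 2, 3] :: Queue Int))
```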

This big roadblock that everyone claims with Haskell, Monads, didn't give me
any problems at all...even if it didn't make any sense to resort to them to do
something as trivial as IO. What really turned me off more than anything was
the combination of the community's academic focus and this weird culture that
I could only describe as a cleverness competition.

I'm a pragmatist, and it is obvious that real software can be created with
Haskell, but it doesn't really come through in the tutorials or books. There
is a lot of "Look, this is really cool!", but rarely a follow up with "and
this is why it matters!". Everything seemed to revolve around cool tricks with
no practical concern behind them, and then in the comments you inevitably find
comments from other Haskellers claiming to do it just a little bit more
cleverly using <<lenses, GADTs, zippers, continuations, or some other obscure
abstraction>>.

Ultimately it was Ocaml and F# that taught me all of the really important
lessons of the ML family, despite the relative lack of learning resources out
there. It was there that I found the benefits of making illegal states
unrepresentable, expressive pattern matching, type inference, composition over
inheritance, etc.

It is a shame, because as languages and runtimes, they probably are inferior
to Haskell. Ocaml does fine with concurrency, but parallelism is a disaster.
Its class system, its one major distinction from SML, is mostly considered a
code smell. Its stance on operator overloading requires me to keep a table of
operators handy, instead of the usual intuitive ones. And F# has the sane
operators and parallelism story but doesn't have SML/Ocaml Functors, and is
still a Windows-first ecosystem (Mono has gotten a lot better, but it's still a
kludge).

~~~
Ironballs
The Windows-first status of F# is changing. There have been major improvements
in this recently, most notably, the whole compiler being open sourced and its
development moved to github[1], and .NET is going to be available on Linux[2].
This is not reality yet, but it will be during this year.

[1]
[https://github.com/microsoft/visualfsharp](https://github.com/microsoft/visualfsharp)
[2]
[http://www.hanselman.com/blog/AnnouncingNET2015NETasOpenSour...](http://www.hanselman.com/blog/AnnouncingNET2015NETasOpenSourceNETonMacandLinuxandVisualStudioCommunity.aspx)

~~~
lucian1900
It is indeed not bad on unix. The only major flaw left is the lack of higher-
kinded types.

------
melling
Has anyone used both Haskell and OCaml (or F#?)? How do they compare in
practice? I've been wanting to put some time into a functional language and
I've been debating between Haskell and OCaml. Because of F#, OCaml seems like
it might be the more practical language (i.e. direct job opportunities).
However, excluding F#, Haskell does seem to be much more popular than OCaml.

~~~
michaelochurch
I tend to think of OCaml as a functional C. It's strict and not purely
functional. You can write for-loops and while-loops if you want (but you
shouldn't) and use refs to have mutable state (but you shouldn't). It has a
better module system than Haskell, but doesn't have type classes. It has
functors, which fill a roughly equivalent role and are better in some ways and
worse in others. (OCaml's functor is an operation over modules and only very
loosely connected, through type theory that isn't essential to being
proficient in either language, to Haskell's Functor type class.)

Haskell is more expressive and has a much more powerful type system, but it's
probably harder to reason about performance.

OCaml's biggest issue (note: I may be out of date on this, since I haven't
heavily used it since the late 2000s) is the GIL. This probably limits your
ability to use it for multithreaded programming, but it can compile down to
extremely fast single-threaded executables.

~~~
sea6ear
This book actually teaches C in relationship to previous assumed knowledge of
Standard ML (not quite OCaml, but close).

[http://eprints.eemcs.utwente.nl/1077/02/book.pdf](http://eprints.eemcs.utwente.nl/1077/02/book.pdf)

I'm not sure I agree with all of the conventions, but it's interesting to see
a deliberately functional approach applied to C code.

------
dkarapetyan
I see. So all you need are compiler/interpreter experts that can turn any
problem into an interpreter/compiler problem and you're golden.

I'm not saying the approach is not worthwhile but how exactly does this
generalize to other workplaces where there is no critical mass of such
experts? I mean they have their own compiler for Haskell for Pete's sake. I
would also like to know how many of the core team members have PhDs and MScs.
We can check off Don and Lennart. Maybe Don is really measuring the effects of
PhDs in language/compiler design on how to manage complexity?

Alternatively, Facebook has been experimenting with Haskell and OCaml. Seeing
their case studies would be valuable as well.

~~~
eru
> I would also like to know how many of the core team members have PhDs and
> MScs.

Don't forget the MDs! (Not joking, one of the people on Don's team has an MD.)

I used to work for Don at Standard Chartered. Getting good Haskellers seemed
way easier than the hiring efforts of my current employer (Google) focussing
on more traditional languages.

But I guess, that's mostly a function of pent up demand for Haskell jobs.

~~~
dkarapetyan
Where is all this pent up demand? I mean I get what you're saying. Most
academics know Haskell and Standard Chartered hires academics but there is a
bit of a circular thing going on there.

~~~
eru
Oh, there are quite a lot of programmers who work with more conventional
languages in their day job, but use Haskell as a hobby. They are easy to
hire with the lure of Haskell.

Since moving away from Standard Chartered I have turned into one of them.

------
boothead
This

    
    
        An awful lot of data mining and analysis is best done with relational algebra
    

Is an interesting statement. Would anyone care to elaborate?

~~~
tome
I think it means that a plethora of operations you perform on sets, lists,
maps, hash maps and indeed collections of all kinds are just specific
implementations of the general "relational algebra" operations.
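
A small sketch of that correspondence, using plain Haskell lists (the `employees` and `depts` tables and the `select`, `project` and `joinOn` helpers are invented for illustration; lists give bag semantics, so unlike true relational projection nothing here removes duplicates):

```haskell
module Main where

-- Hypothetical tables, represented as lists of tuples.
employees :: [(String, Int)]        -- (name, department id)
employees = [("ann", 1), ("bob", 2), ("cy", 1)]

depts :: [(Int, String)]            -- (department id, department name)
depts = [(1, "rates"), (2, "fx")]

-- Selection is filter.
select :: (a -> Bool) -> [a] -> [a]
select = filter

-- Projection is map.
project :: (a -> b) -> [a] -> [b]
project = map

-- An equi-join is a list comprehension over both inputs.
joinOn :: Eq k => (a -> k) -> (b -> k) -> [a] -> [b] -> [(a, b)]
joinOn keyA keyB as bs = [(a, b) | a <- as, b <- bs, keyA a == keyB b]

-- "Names and department names of everyone except bob."
report :: [(String, String)]
report =
  project (\((name, _), (_, dept)) -> (name, dept))
    (joinOn snd fst (select ((/= "bob") . fst) employees) depts)

main :: IO ()
main = print report
```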

------
olenhad
I wonder what's the primary reason for their "Mu" compiler adopting a "strict-
ish" evaluation strategy.

~~~
jlarocco
I don't know, but in my limited experience, lazy evaluation makes memory use
worse (usually not much), but more importantly makes performance (time and
memory) harder to reason about, because you don't easily know when something
will actually evaluate.

Besides that, there's also not much practical gain from it, IMO. One commonly
cited benefit is a function that doesn't use all of its arguments, therefore
saving computation time when they're not evaluated. But realistically, an
unused parameter should probably be removed.
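
The usual textbook illustration of the "harder to reason about" point is a left fold. This is a sketch only: whether the lazy version actually builds up a long chain of unevaluated thunks depends on the GHC version and optimisation level, since strictness analysis can remove the problem.

```haskell
module Main where

import Data.List (foldl')

-- Lazy accumulator: without optimisation this may build the expression
-- (((0 + 1) + 2) + ...) in memory before evaluating any of it.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- Strict accumulator: each partial sum is forced as the fold proceeds,
-- so memory use stays constant.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = do
  print (lazySum [1 .. 100000])
  print (strictSum [1 .. 100000])
```

Both calls return the same number; the difference is only in when the additions happen and how much memory is held in the meantime.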

~~~
the_af
> But realistically, an unused parameter should probably be removed.

Think about the function _if-then-else_ , which you may be familiar with from
your favorite language :)

    
    
        ifThenElse :: Bool -> a -> a -> a
        ifThenElse True  branch1 _       = branch1
        ifThenElse False _       branch2 = branch2
    

Obviously you don't want both branches evaluated in any given invocation, and
obviously you cannot remove the unused parameter. Note that the purpose is not
to "save computation", since for example _branch1_ may be undefined if the
condition is false!

When using a language with support for lazy evaluation, you encounter this
kind of function _all the time_.
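
A minimal runnable sketch of the point (the `safeDiv` helper is hypothetical, added only to have a branch that must not be evaluated): under lazy evaluation the untaken branch is never forced, so a division by zero or an `error` call sitting in it is harmless.

```haskell
module Main where

-- An ordinary function, not built-in syntax.
ifThenElse :: Bool -> a -> a -> a
ifThenElse True  t _ = t
ifThenElse False _ e = e

-- The division is only evaluated when the divisor is non-zero.
safeDiv :: Int -> Int -> Int
safeDiv n d = ifThenElse (d == 0) 0 (n `div` d)

main :: IO ()
main = do
  print (safeDiv 10 2)
  print (safeDiv 10 0)
  -- The error branch is passed as an argument but never forced.
  putStrLn (ifThenElse True "ok" (error "never forced"))
```

In a strict language the same definition would evaluate both arguments before the call, and `safeDiv 10 0` would fail.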

~~~
tel
Of course, in OCaml that's just

    
    
        (* t and e are thunks: functions from unit that delay evaluation *)
        let if_then_else c t e =
          match c with
          | true  -> t ()
          | false -> e ()
       

So the question sort of becomes one of how painful thunking or anonymous
function syntax is.

------
thu
Lazy vs strict evaluation: Lennart, the author of the Mu compiler, has written
about it:
[http://augustss.blogspot.co.uk/2011/05/more-points-for-lazy-...](http://augustss.blogspot.co.uk/2011/05/more-points-for-lazy-evaluation-in.html)

------
jongraehl
I wonder why they don't just use GHC with some customization to make the
default eval strategy less lazy. I understand why someone might prefer more
predictable (strict) evals by default.

~~~
tome
They were targeting an already existing runtime for an older in-house
functional language, and the runtime only supported strict evaluation.

------
taeric
"Your choice of programming language can have a real effect on these results
over time." Probably more accurate to say "Choice of programming team can have
a real effect on these results over time."

I want to like Haskell. However, I have given up on wanting to dislike Java.
To the point that it is painful to read such things as "you can write Java in
any language."

~~~
the_af
To be fair, that's one sentence in the second to last slide of 42 slides. The
presentation, at least what I can see in the slides, is hardly Java bashing :)
There are 3 mentions of Java in the whole presentation, only one of which has
negative connotations: the one you quoted.

~~~
taeric
Yeah, the Java bashing I could (and, admittedly, should) have ignored. It is
more the crediting of the language for the success that I would rather focus on.
It is a fairly strong assertion at the end, that I feel needs more support.
Back when fewer teams were writing Java applications, I feel similar
advantages were felt for it.

Of course, I still pine for lisp, so I can not claim to have no biases. (Well,
that and MMIX)

~~~
kyllo
The entire Java community pines for lisp, whether they know it or not. That's
why Apache Ant was invented, it's Java's manifestation of Greenspun's Tenth
Rule.

~~~
taeric
Is it confined to the Java community? Seems a bit more widespread, honestly.

~~~
kyllo
Well yeah, that's why the original Greenspun's tenth rule cited "Any
sufficiently complicated C or Fortran program..."

Those who do not learn from history are doomed to repeat it. That is the only
logical explanation for why XML even exists.

------
michaelochurch
I'm really happy to hear about Standard Chartered's success with this, and I
want to know more. This is really promising stuff.

My current company is looking into our "next generation" platform for when our
datasets exceed what we can do in R. R may not be the best language, but it's
great for exploratory data science, has the best or the only library out there
for some ML purposes, and we've done a lot of things to make it production-
worthy (on small- and medium-data). We'll probably need to involve something
else, down the road, when our data sets get larger than what fits on one box.
The leading candidates are Haskell, Clojure, and Scala (Scala because of
Spark). I'll have to evaluate the languages fairly and relative to our needs,
but I hope Haskell wins for a number of reasons, including the fact that
Chicago + Haskell is an unfilled niche and we'd attract a ton of talent.

For those who've taken Haskell this far into production: have you encountered
any negatives? Are there any times when you think it might be better not to
have a strong type system?

To me, the biggest drawback of Haskell isn't anything intrinsic to the
language, but the amount of stuff it forces a person to learn. For me, that's
a fun challenge... but trying to convince 110 programmers to use a language
that forces I/O into a set of types (loosely) called a monad seems like an
epic task. Clojure has the advantage of being simple and beautiful once you
get past the parentheses. Haskell is demanding and frustrating for the first 6
months (and pays off handsomely later on, but this can make it a hard sell).

Also, how does the Any type in Mu (if anyone familiar with it is here) differ
from Data.Dynamic?
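
For anyone unfamiliar with the GHC side of that comparison, this is roughly how `Data.Dynamic` from `base` behaves (a sketch of the standard library only; it says nothing about how Mu's Any type works): values of different types are wrapped together with a runtime type tag, and unwrapping at the wrong type gives `Nothing` instead of crashing.

```haskell
module Main where

import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Maybe (mapMaybe)

-- A heterogeneous list: each element remembers its own type at runtime.
bag :: [Dynamic]
bag = [toDyn (1 :: Int), toDyn "two", toDyn (3 :: Int)]

-- fromDynamic returns Nothing for elements that are not Ints,
-- so mapMaybe keeps only the Ints.
ints :: [Int]
ints = mapMaybe fromDynamic bag

-- The same bag viewed at another type.
strings :: [String]
strings = mapMaybe fromDynamic bag

main :: IO ()
main = do
  print ints
  print strings
```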

~~~
dasmoth
Not something I've used myself yet, but is there any particular reason Julia
isn't on that list?

~~~
yrlson
I haven't checked on Julia for like a year, but back then Julia didn't offer
anything above and beyond R when it comes to "big" (really just not fitting
into memory) data. As it is Spark with Scala is both faster and more
convenient than R or Julia.

~~~
Jcol2
Spark has a python API and python has a bigger ecosystem for exploratory data
analysis than spark.

Also python has a big data interface with out of core matrices and tables as
well as a compiler that can speed up numerical code, with just a function
decorator, to C like speeds. The former is an interface to the latter.

[http://blaze.pydata.org/docs/dev/index.html](http://blaze.pydata.org/docs/dev/index.html)
[https://github.com/numba/numba](https://github.com/numba/numba)

