Haskell in the Large [pdf] (haskell.org)
187 points by kryptiskt on Jan 28, 2015 | 138 comments

“Make illegal states unrepresentable”

Types pay off the most on large systems. Architectural requirements captured formally.

This more than ANYTHING else is why I want to move enterprisey app code to Haskell. Having worked on numerous ginormous enterprisey systems -- which are usually doing pretty straightforward things, just at scale, and needing to be maintained by non-brilliant developers -- I can say pretty securely that north of 95% of the invariants could be lifted into the type system, making run-time errors a thing of the past.
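A minimal sketch of what lifting an invariant into a type can look like. The order domain and every name here (Order, PaymentRef, TrackingNo, ship) are made up for illustration, not taken from the slides:

```haskell
-- Hypothetical example: an order is a draft, paid, or shipped.
-- Each constructor carries only the fields that exist in that state,
-- so "shipped but never paid" cannot be constructed at all.
newtype PaymentRef = PaymentRef String deriving (Eq, Show)
newtype TrackingNo = TrackingNo String deriving (Eq, Show)

data Order
  = Draft [String]                          -- items only
  | Paid [String] PaymentRef                -- items plus proof of payment
  | Shipped [String] PaymentRef TrackingNo  -- a payment is required to get here
  deriving (Eq, Show)

-- Shipping is only meaningful for paid orders; the other cases are
-- handled explicitly, and the compiler can warn if one is forgotten.
ship :: TrackingNo -> Order -> Either String Order
ship t (Paid items p) = Right (Shipped items p t)
ship _ (Draft _)      = Left "cannot ship an unpaid draft"
ship _ (Shipped {})   = Left "already shipped"
```

Compare this with a record holding an isPaid flag and a nullable tracking number, where the bad combinations have to be ruled out by run-time checks.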

Also, in most frameworky big systems, you WANT to stop devs from "just doing IO" or pretty much doing anything without a strong contract around it. Monad stacks do wonders for circumscribing your computational context in an app.
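One way to sketch the "no arbitrary IO" contract without a full monad stack is a small capability class. MonadLog, PureLog, and applyDiscount are hypothetical names, and this is only one of several ways to get the effect:

```haskell
-- Hypothetical capability class: code written against MonadLog can log
-- and nothing else. Arbitrary IO is not reachable from it.
class Monad m => MonadLog m where
  logMsg :: String -> m ()

-- Business logic is polymorphic in the monad, so the signature is the
-- whole contract: it may log, and that is all.
applyDiscount :: MonadLog m => Int -> Int -> m Int
applyDiscount percent price = do
  logMsg ("applying " ++ show percent ++ "% discount")
  pure (price - price * percent `div` 100)

-- Production interpretation: log to stdout.
instance MonadLog IO where
  logMsg = putStrLn

-- Pure interpretation for tests: collect the messages in a list.
newtype PureLog a = PureLog { runPureLog :: ([String], a) }

instance Functor PureLog where
  fmap f (PureLog (w, a)) = PureLog (w, f a)

instance Applicative PureLog where
  pure a = PureLog ([], a)
  PureLog (w1, f) <*> PureLog (w2, a) = PureLog (w1 ++ w2, f a)

instance Monad PureLog where
  PureLog (w1, a) >>= k =
    let PureLog (w2, b) = k a in PureLog (w1 ++ w2, b)

instance MonadLog PureLog where
  logMsg s = PureLog ([s], ())
```

Here runPureLog (applyDiscount 10 200) evaluates to (["applying 10% discount"], 180), with no IO involved.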

I wonder why they rewrote Aeson though ... perhaps before it was mature? Aeson's pretty awesome.

It might have to do with the way the default generics for Aeson's FromJSON and ToJSON instances for record types are defined. Or maybe the lack of incremental streaming read/write support in the Aeson data model? Or maybe they wanted better date or integer support? It'd be pretty easy to ask Don or Lennart, I'd imagine.

I have to say that I definitely enjoy using Haskell when faced with dirty complicated data, which I'm told happens a lot in larger organizations.

> Monad stacks do wonders for circumscribing your computational context in an app.

Wait until you discover the power of Applicative Functors. They are more restricted in their operations than Monads, so they compose better and allow for analysis.

Mathematically speaking applicatives are less restrictive since more things can be applicatives than monads. A monad is a "very special" kind of functor and that is why a composition of monads often fails to satisfy the monad laws. The least restricted thing in this hierarchy is the functor since composition and other kinds of operations always give you back another functor.

I'm curious, what do you mean they compose better and allow for analysis [better than monads]?

You can determine a lot more about a computation described only in terms of an applicative functor than you can about a monadic one, without running any of its effects (i.e. "statically").

It's easy to see why when you consider the opaqueness of the function in the right-hand side of a bind. Check out http://gergo.erdi.hu/blog/2012-12-01-static_analysis_with_ap... for an example.
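A small sketch of the idea, separate from the linked post: a hypothetical config parser that carries the list of keys it needs next to the function that reads them. ConfigParser, key, and DbConfig are made-up names:

```haskell
-- Hypothetical config parser: the keys it needs (static part) sit next
-- to the function that actually reads them (dynamic part).
data ConfigParser a = ConfigParser
  { keysNeeded :: [String]
  , runParser  :: [(String, String)] -> Maybe a
  }

instance Functor ConfigParser where
  fmap f (ConfigParser ks run) = ConfigParser ks (fmap f . run)

instance Applicative ConfigParser where
  pure a = ConfigParser [] (const (Just a))
  ConfigParser ks1 rf <*> ConfigParser ks2 ra =
    ConfigParser (ks1 ++ ks2) (\env -> rf env <*> ra env)

-- Primitive parser: read a single key.
key :: String -> ConfigParser String
key k = ConfigParser [k] (lookup k)

data DbConfig = DbConfig { host :: String, port :: String }
  deriving (Eq, Show)

dbConfig :: ConfigParser DbConfig
dbConfig = DbConfig <$> key "host" <*> key "port"
```

keysNeeded dbConfig gives ["host", "port"] without reading any environment. A Monad instance could not keep that list honest, since the parser on the right of a bind is only known once the left-hand value exists.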

gergoerdi covered analysis, but composition is simple. For any two Applicatives F and G and value type A the following 4 values are all Applicative values

    F A
    G A
    F (G A)
    G (F A)
But for general monads F and G, only the first two are guaranteed to be monads. So, Applicatives compose better!

And to get F (G A) with Monads, F needs to be a Monad transformer.

In addition to composition, products of Applicatives are also Applicatives:

(F A, G A)
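Both constructions ship with base as Data.Functor.Compose and Data.Functor.Product. A quick sketch with F = Maybe and G = [] (the names nested and paired are just for this example):

```haskell
import Data.Functor.Compose (Compose (..))
import Data.Functor.Product (Product (..))

-- F (G A) with F = Maybe and G = []: Compose supplies the Applicative.
nested :: Compose Maybe [] Int
nested = (+) <$> Compose (Just [1, 2]) <*> Compose (Just [10, 20])
-- getCompose nested == Just [11, 21, 12, 22]

-- (F A, G A) with the same F and G: Product is Applicative as well.
paired :: Product Maybe [] Int
paired = (+) <$> Pair (Just 1) [1, 2] <*> Pair (Just 10) [10, 20]
-- paired == Pair (Just 11) [11, 21, 12, 22]
```

Product also has a Monad instance, while Compose has none.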

Even for monad transformers it's not necessarily the case that direct composition leads to a monad. For instance, ContT looks like

    newtype ContT r m a = ContT ((a -> m r) -> m r)
so ContT M A is not the same as Cont (M A).

Monads also compose under products as two parallel monadic computations

    data (f * g) a = Pair (f a) (g a)

    p1 :: (f * g) a -> f a
    p1 (Pair fa _) = fa

    p2 :: (f * g) a -> g a
    p2 (Pair _ ga) = ga

    instance (Monad f, Monad g) => Monad (f * g) where
      return a = Pair (return a) (return a)
      Pair fa ga >>= k = Pair (fa >>= p1 . k) (ga >>= p2 . k)
You can also talk about composition under sums

    data (f + g) a = Inl (f a) | Inr (g a)
and you'll get compositions of applicatives here so long as you have a notion of natural transformation

    class Natural f g where
      phi :: f a -> g a

    instance (Applicative f, Applicative g, Natural f g) => Applicative (f + g) where
      pure a = Inl (pure a)
      Inl ff <*> Inl fa = Inl (    ff <*>     fa)
      Inr gf <*> Inr ga = Inr (    gf <*>     ga)
      Inl ff <*> Inr ga = Inr (phi ff <*>     ga)
      Inr gf <*> Inl fa = Inr (    gf <*> phi fa)
but there's no way to get a similar monad. It turns out that you have to "know what branch to go down before you start" in exactly the kind of way applicatives allow and monads preclude.

A sidenote: There is nothing in Java/C++ etc that would stop one from making only legal states representable. It's just easier and more obvious in ML family...

That's only true in the most trivial Turing Tarpit sense, though. Ease of use and defaults matter in practice.

E.g. you can do compile-time exhaustiveness-checked pattern matching using the visitor pattern in Java/C++, but try doing that in practice and the amount of boilerplate just becomes unbearable -- and actually obscures (rather than illuminate) the essence of the data structure.

There's also the lack of higher-kinded types. (Or just plain syntactic pain of using HKTs e.g. in Scala.)

Additionally, the Turing Tarpit doesn't apply at the level of the type language. These languages actually differ significantly in power since they're not Turing complete.

Thus, there are certainly non-trivial and interesting static invariants which you can encode in Haskell and cannot in Java.

The most critical and strong common one is probably parametricity.
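A rough illustration of what parametricity buys, using two made-up functions (same and shuffle):

```haskell
-- The type alone forces this to be the identity (ignoring bottom):
-- the only way to produce an `a` is to return the one that was passed in.
same :: a -> a
same x = x

-- A function of this type can drop, duplicate, or reorder elements, but
-- it cannot inspect or invent them. So it has to commute with `map`:
--   map f (shuffle xs) == shuffle (map f xs)
shuffle :: [a] -> [a]
shuffle = reverse . take 3
```

The property in the comment holds for any function of type [a] -> [a], not just this one. In Java, a generic method can still call equals or hashCode on its arguments, so the signature promises less.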

Oh, certainly. I was just alluding to the futility of the comparison :).

Interestingly[1], the power of the type system seems to conflict somewhat with type inference. Or, at least, that a Turing Complete type-level language like in Shen or Idris does require you to spell out increasing amounts of type information. Obviously, this has to do with undecidability of TC languages, but I kind of think it's interesting that e.g. Haskell seems to occupy a kind of "sweet spot" in terms of inference and being able to tell the programmer when it needs a type ascription. AFAICT that would be impossible in a language whose type system is a priori undecidable, correct? (Just to spell it out: In Haskell, the compiler knows that if you want feature X, Y and Z, then constructs A, B and C become undecidable... and can tell you so.)

[1] Well, I found it interesting, but I'm a layman at best.

Absolutely! The more information a type system can express the less likely it is to guess correctly at what you want just based on your value-level representation of intent.

Which is really fascinating when you think about it. It basically expresses (what we all know) that your code captures only a fragment of the intent of your efforts. Good types can capture much more intent.

So to that end, people explore interactive development. In this scenario, you provide a (potentially partial) type corresponding to the programmatic intent you desire and the compiler conducts proof search to find programs which satisfy that type!

The types are actually more informative than the programs and it provides some evidence that we'd really rather be writing at higher-level typed languages and letting the implementations fall out naturally rather than writing the implementations and hoping that inference can build the proper types.

Or something like that.

Yes, I've found some of the published Idris videos quite instructive in this regard. Sometimes it can even infer the program fragment that you were supposed to write (given unification and uniqueness constraints), but it seems limited only to demo-level code and frankly at this point I'd be wondering if I got the types wrong and let the compiler infer bad code! (As opposed to getting the program wrong!)

Programming is weird.

Agda's emacs-mode is probably the most developed form of this available today, although Coq's tactics are doing roughly the same thing.

Types can't be wrong in the same way that values can be. Rather, to be more specific, in any circumstance where program derivation could even remotely succeed you will have had to have been specific enough with your types that they cannot easily produce the wrong code. It simply would fail to typecheck---at least eventually.

I wouldn't say that program derivation is limited to demo-level code, I would just say that you cannot expect proof search to achieve large fragments of code. It's just too large of a space!

Instead, it's better to think of it as an interactive game. You describe types as best as you can leaving holes where you haven't figured out what you want. The compiler responds telling you information and guesses about those holes. Furthermore, if you can nail down what you must prove sufficiently well to have confidence that it's the right thing then the compiler can probably do the trivial final details for you.

If you look through Conor McBride's (admittedly vast) literature then you'll find many things discussing this idea with respect to his (now defunct) experimental dependently typed language called Epigram.

I really like this analogy to interactive games, thanks!

I kind of see, logically, that the types can't be wrong in quite as many ways as the code/values/terms, but that's still a (possibly) infinite amount of wrong to contend with! Yes, we can, and frequently do, narrow it down, but still... :)

I've been aware of Mr. McBride's literature for some time, but I haven't ventured in beyond a cursory look at that ridiculously-well-punned Sinatra one. ("Do be do"... something?). It's a bit beyond what I can manage, in theory terms, at the moment.

EDIT: Sorry to sound like an excited puppy, but I don't think the time/place was quite right for me to have gone into or explored this field as I would have liked to as an undergraduate.

The thing about being wrong with types is that the end result is an immediate compiler failure. It can be difficult to figure out the right types, but once you do they're certain to behave properly. Thus, the penalty for wrongness is nearly nothing at all.

You face the infinite amount of wrong by working really hard to understand the right.

If you're wrong in values it'll probably still work for a while until it suddenly doesn't for rather mysterious reasons. This can be way down the line, undetected for a long time if not forever. Thus, the pain of being wrong is potentially infinite as well.

So, programming in types is no easier. It's probably even harder today since so few people do it. But the consequences are amazingly different.

And yeah, with Conor's work it's "come for the puns, stay for the mind blowing computer science"

> the penalty for wrongness is nearly nothing at all.

Well, I think Edwin Brady showed a couple of C++-template-errmsg-like-things during a couple of his talks, but having actually programmed semi-advanced C++-template things I think I can agree on the general thrust that type-level problems are in some sense "better problems to have".

Btw, thanks for the exchange. (This is getting too off-topic, so I'll leave it at that.)

Very true. I usually avoid complex constructs in C++ - especially in production code and strive to strike a balance between KISS and YAGNI. KISS comes first unless there is some obvious boilerplate that can be avoided...

Encoding only valid states in subareas of code can be done within C++ without extravagant constructs. I.e., instead of passing around a Car with a Car::isDriving member, implement a DrivingCar class and use that where this state of the car matters. It's not as neat as ADTs in ML, but it provides the same Lego-block static composability with a minimum of moving parts.
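For comparison, the same DrivingCar idea in Haskell can be sketched with a phantom type parameter. All names here (Car, Parked, Driving, start, stop, speedKmh) are made up for illustration:

```haskell
-- Hypothetical sketch: the state lives in the type, not in a Bool field.
data Parked
data Driving

newtype Car state = Car { plate :: String } deriving (Eq, Show)

newCar :: String -> Car Parked
newCar = Car

start :: Car Parked -> Car Driving
start (Car p) = Car p

stop :: Car Driving -> Car Parked
stop (Car p) = Car p

-- Only a driving car can report a speed; passing a parked car is a
-- compile-time error rather than a run-time check on a flag.
speedKmh :: Car Driving -> Int
speedKmh _ = 50  -- placeholder value for the sketch
```

speedKmh (start (newCar "ABC-123")) compiles, while speedKmh (newCar "ABC-123") is rejected by the type checker.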

I would not use pattern matching as such in C++.

> invariants could be lifted into the type system

Is there a reason this is only possible in Haskell, or does Haskell just make it super convenient / idiomatic?

Haskell/Scala/etc makes it easy to nudge other developers into avoiding mistakes. A concrete example:

I had a Scala system (built on Scalaz, which is a library providing Haskell for Scala), and one of our core types was DBTransaction[_] (a monad). A developer (a skeptic of the type system) was complaining to me about all the excess work he needed to go through, how he couldn't properly construct a LazyStream[Foo] as a result of this.

He wanted to construct a DBTransaction[LazyStream[Foo]], call runTransaction on it, and get the LazyStream[Foo] out. Then he was going to call f(lazyStream). The compiler just wouldn't allow this, so his "workaround" was to instead call lazyStream.map(f).

Turns out this workaround prevented a runtime error. If he did get his LazyStream[Foo] out, generating the next element in the stream would have called resultSet.next() after closing the connection.

This sort of thing happened quite often. People would complain that the type system made it harder for them to do what they wanted. They'd ask the FP "guru" types how to fix it and the "guru" would point out that what they wanted to do was fundamentally unsafe.
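A rough Haskell analogue of that situation, not the Scalaz code from the story. It borrows the rank-2 trick that runST uses so that a cursor cannot leave its transaction; Tx, Cursor, runTx, openCursor, next, and sumRows are all hypothetical names, and an IORef stands in for the database:

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- A transaction tagged with a scope `s`. Resources opened inside it carry
-- the same tag, which ties them to that transaction.
newtype Tx s a = Tx (IO a)

instance Functor (Tx s) where
  fmap f (Tx io) = Tx (fmap f io)

instance Applicative (Tx s) where
  pure = Tx . pure
  Tx f <*> Tx a = Tx (f <*> a)

instance Monad (Tx s) where
  Tx io >>= k = Tx (io >>= \a -> case k a of Tx io' -> io')

-- Stand-in for a database cursor over rows of Ints.
newtype Cursor s = Cursor (IORef [Int])

openCursor :: [Int] -> Tx s (Cursor s)
openCursor rows = Tx (Cursor <$> newIORef rows)

next :: Cursor s -> Tx s (Maybe Int)
next (Cursor ref) = Tx $ do
  rows <- readIORef ref
  case rows of
    []     -> pure Nothing
    r : rs -> writeIORef ref rs >> pure (Just r)

-- The result type `a` must not mention `s`, so a Cursor cannot escape.
runTx :: (forall s. Tx s a) -> IO a
runTx tx = case tx of Tx io -> io

-- Consume the cursor inside the transaction and return a plain value.
sumRows :: Cursor s -> Tx s Int
sumRows c = go 0
  where
    go acc = do
      r <- next c
      case r of
        Nothing -> pure acc
        Just x  -> go (acc + x)

total :: IO Int
total = runTx (openCursor [1, 2, 3] >>= sumRows)

-- Rejected by the type checker, because the result type mentions `s`:
-- leaked = runTx (openCursor [1, 2, 3])
```

As in the story, the only thing that compiles is doing the work inside the transaction and returning a plain value.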

So if I understand correctly, the real point is instead of programming defensively at runtime, you can do it in the type system.

My main question is the extent to which this is possible in more conventional languages.

Just imagine that every static constraint you want to encode has to be written in the language of the types, a subset of your chosen language.

C's language of types is incredibly primitive. Haskell's is quite nice. Agda/Idris/Coq's type language is technically equivalent to its value language so you can encode incredible things.

It turns out that due to people's general desire for compilation to always terminate that the type language behaves quite different from the value language. It also turns out that the type language operates differently because we're more interested in logical constraints than actual evaluation.

The major thing the above changes is that you no longer have the "it's always a Turing complete language" excuse. Type languages can actually significantly and meaningfully differ in power.

So it's meaningful to say that there are constraints that can be encoded statically in Haskell but cannot in C (no matter how you try). And it's also true that Coq can encode constraints that cannot be encoded in Haskell (though Haskell keeps making its type language stronger!).

Two asides, which I don't think you missed but I just wanted to expand on...

First, "Turing complete" means that you can compute anything anyone else can compute. It doesn't necessarily mean you can do it with a reasonable encoding, or do with it what you need to do with it. At the boundary between systems, encoding matters quite a lot!

Second, it's certainly true that there are constraints you can express in Haskell and not in C, but I was surprised by some of the things I could enforce in C with a little creativity.

I bring up the lack of turing completeness because in the space of type languages you can make even more powerful arguments. What you mention about reasonable encodings is sufficient, but we can get more strength.

And, yeah... even a type system as primitive as C can catch interesting invariants. This is an important counterpoint to the general idea that "types never say anything interesting".

"I bring up the lack of turing completeness because in the space of type languages you can make even more powerful arguments. What you mention about reasonable encodings is sufficient, but we can get more strength."

Certainly. I had every confidence you understood that. I just wanted to be sure we avoided strengthening the "Turing complete means it can do anything any other language can" misconception.

Ah ah, fair---certainly never worth reinforcing the turing tar pit argument!

"It turns out that due to people's general desire for compilation to always terminate that the type language behaves quite different from the value language. It also turns out that the type language operates differently because we're more interested in logical constraints than actual evaluation."

If I understand you correctly, if the type system behaves the same as the value system, then it is possible for the compilation to never terminate. Do I have that right?

Unrelated but you don't have contact information and this is the most related thread I could find. Here is context:

" I've been trying to get my mind around this kind of stuff for maybe a year. And I have to say that monads don't look like the solution to any of the problems that I actually face or have ever faced, in a thirty-year career." - https://news.ycombinator.com/item?id=8815973

You might be interested in this link:


If you respond quickly enough and I can delete this, I will ;)

It's possible, but typically what instead happens is that people sacrifice Turing Completeness in the value language. It turns out that this isn't so painful (strong types help a lot) and then you have guaranteed termination all around!

Would you then accept a worthless value language if the type language was perfect?

Anyway, this was quite an enlightening comment for me: Agda/Idris/Coq's type language is technically equivalent to its value language.

Given a sufficiently interesting type language, could we statically type the problem I posed in another comment here



I'm not sure anyone has developed a language with a great type level language and a worthless value level one. But that said, theorem provers (like Coq) essentially consider value-level computation as vestigial: nobody cares about running Coq programs, just checking them.

To encode what you're looking for requires type-level natural numbers and a reasonable notion of type-level equality. These are fairly non-trivial. You can encode this in Haskell, but type-level equality is a bit hard to work with so you may suffer some pains (I know, I've done it several times!). Agda and Idris would be where I would look to see this sort of thing materialize.

That depends. Can I get the results directly from the compiler, or must I run the actual program?

About your example, it's not idiomatic Haskell. If you try to code the same exact algorithm in Haskell, you'll have the same exact problem (GHC has a warning for it); if you recode it in fmap or list pattern matching that are idiomatic to Haskell, you'll avoid the problem.

Yep - that's probably a fair way to describe it. It's of course possible in other statically typed languages, it's just not as easy.

There are also some things that I'm not sure are reasonable at all in something like Java/C++ without writing a lot of code. For example: https://www.chrisstucchio.com/blog/2014/type_safe_vector_add...

I would say that it's not always possible. It's probably "possible" if you put about half the invariant in the type and half the invariant in documentation, of course, but that's a much weaker claim!

Java is in an interesting place in that its type system is rudimentary but not unpowerful. C is a more useful didactic target---there really do exist invariants which can be encoded in Haskell types but could never be encoded in C types!

Another huge target is parametricity. You can write Java code which is "maybe sort of parametric" but it really isn't and subsequently you can never trust it because of the existence of things like global reference equality and hash codes (in the least).

Finally, there are expressivity concerns. Java cannot encode as many things in its type language as something like Agda (to make a clear comparison). You can simply note that Java's type equality is nowhere nearly as developed as Agda's!

As an example of using Java to indicate a fairly non-trivial invariant and comparing that implementation to Scala, Haskell, and C# consider Tony Morris' challenge


Thanks for the link. I've been wondering how to lift N-D operations into the type system. If I write with NumPy

  x = r_[:n].reshape((3, -1, n//15))[:, :2].sum(axis=-1)
a compiler could figure out that the shape of x is (3, 2) and allow

  y = x + randn(m, 1, 2)
while forbidding

  z = y - r_[:7]
With NumPy you have to wait for a runtime error, and only if the shapes can't be broadcast. We're not even talking correct use of dimensions, etc.

I suppose this may have been tackled in Haskell in the Repa library, but I'd need to read papers like


to know. Hence my question if we could lift this into the type system in a conventional language, but I don't think it's possible in a general way.

I'm not sure what the notation you're using represents, but it looks like you're talking about matrices of a known size. What that's an example of is dependent types, where types can be parameterized by values, and not just types. For example, a (2D) matrix is parameterized by what it contains, and its dimensions. The former is a type but the latter are values. Without dependent types (or some subset of their capability) you can't represent this matrix type; you'd just have to check at runtime that no out-of-bounds access was occurring, so no; I don't think you could put this in a conventional language (without some crazy gymnastics, perhaps involving macros or similar). However, with dependent types, this is absolutely possible to statically guarantee that no out-of-bounds access will occur at runtime.
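Haskell is not dependently typed, but GHC's DataKinds and GADTs extensions cover a useful subset. A minimal sketch for one-dimensional vectors only (no reshaping or broadcasting), with made-up names Vec, vadd, vhead, and vtoList:

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}

-- Peano naturals, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Both arguments must have the same length n, checked at compile time.
vadd :: Num a => Vec n a -> Vec n a -> Vec n a
vadd VNil         VNil         = VNil
vadd (VCons x xs) (VCons y ys) = VCons (x + y) (vadd xs ys)

-- No Maybe needed: the empty vector is ruled out by the type.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- Forget the length again when handing data to ordinary list code.
vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs
```

Adding a two-element vector to a three-element one is a type error here, as is vhead VNil.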

Another way to think about it is kinda related to pg's stratified design. At the lowest level, you have things like system services, files, sockets, perhaps other processes. Regular old code may use those abstractions - or not - and build new stuff. For example, parsers, matrices, HTTP responses, database connections.

Just like regular old code helps you manage the relationships between an input file and an output file, the type system helps you manage the relationship between libraries, or sets of functionality in your code. I don't care what type of data you pull from the database, but that type must agree with the type of matrix you're constructing. The parser may return a syntactically correct tree, or an error. One use of the type system is to ensure that all possible error conditions are handled in a meaningful way.

You can do this in Java, but Haskell's type system is a bit more expressive, so it's easier to enforce higher level constraints.

Another way to look at it: the type system is like the algebra axioms you want your system to follow. If equality is symmetric and a == b, then also b == a. At that level we don't care if a is 5 or "hello" or @TcpConnection(0x1342341a).

"My main question is the extent to which this is possible in more conventional languages."

It is substantially possible in more conventional languages, however the conventional languages have holes in them which can not be filled in by libraries.

For instance, a bog-standard approach in any OO language is to hide the raw constructor and give only mediated access via some other class method (or local equivalent construct). This can allow you to easily enforce constraints like "A CreditCardNumber either had its parity validated OR does not exist". If you then put methods on an object that allow you to only manipulate the object in certain ways that maintain the constraints, there will exist no (direct) way to violate the basic constraints of the object.

This is reasonably powerful, and mastery of this technique is something that I would consider to be core to considering yourself at least a mid-tier professional developer.
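The hidden-constructor approach looks much the same in Haskell, where it usually goes by "smart constructor". A sketch, assuming the validation meant here is the Luhn checksum; luhnValid, mkCreditCardNumber, and cardDigits are names made up for the example:

```haskell
import Data.Char (digitToInt, isDigit)

-- In a real module the export list would hide the constructor, e.g.
--   module CreditCard (CreditCardNumber, mkCreditCardNumber, cardDigits) where
-- so that mkCreditCardNumber is the only way to obtain a value.
newtype CreditCardNumber = CreditCardNumber String
  deriving (Eq, Show)

-- Luhn checksum (assumed to be the check in question): from the right,
-- double every second digit, subtract 9 from results above 9, and
-- require the total to be divisible by 10.
luhnValid :: String -> Bool
luhnValid s = total `mod` 10 == 0
  where
    total = sum (zipWith step (cycle [False, True]) (reverse (map digitToInt s)))
    step False d = d
    step True  d = let e = 2 * d in if e > 9 then e - 9 else e

-- The only sanctioned way in: a value either passed the check or does not exist.
mkCreditCardNumber :: String -> Maybe CreditCardNumber
mkCreditCardNumber s
  | not (null s) && all isDigit s && luhnValid s = Just (CreditCardNumber s)
  | otherwise                                    = Nothing

-- Read-only access to the validated digits.
cardDigits :: CreditCardNumber -> String
cardDigits (CreditCardNumber s) = s
```

With the constructor unexported, any function that takes a CreditCardNumber can rely on the check having already happened.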

However, there are many constraints you can not enforce in conventional languages. You can not enforce via type whether or not a given call will do IO or access global variables in unexpected manners. You can not enforce via type that a given method will only be called in a certain context (key in things like transactional memory, where you'd better be doing your STM things inside an actual transaction or the whole thing breaks down). You can not enforce whether an object will or will not be shared across thread boundaries. And so on, for a wide variety of additional guarantees that can be provided by a stronger type system. In the exciting-but-experimental world of dependent typing, you can enforce at the type system that a number is even or odd and stuff like that.
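The STM case is easy to show concretely in Haskell. A small sketch using Control.Concurrent.STM from the stm package that ships with GHC (transfer and demo are made-up names):

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- The STM type marks code that may only run inside a transaction.
transfer :: Int -> TVar Int -> TVar Int -> STM ()
transfer amount from to = do
  a <- readTVar from
  b <- readTVar to
  writeTVar from (a - amount)
  writeTVar to (b + amount)

-- `atomically` is the bridge from STM to IO. Calling `transfer` directly
-- in IO, or doing IO such as putStrLn inside `transfer`, does not type-check.
demo :: IO (Int, Int)
demo = do
  from <- newTVarIO 100
  to   <- newTVarIO 0
  atomically (transfer 30 from to)
  (,) <$> readTVarIO from <*> readTVarIO to
```

demo returns (70, 30). The point is less the arithmetic than the fact that "only inside a transaction" is stated in the type of transfer.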

There are many different additional constraints being explored right now across a wide variety of languages, and while functional languages are leading the way, and there are some solid reasons for that, it isn't just functional languages that can use this... for instance, see Rust, which isn't functional at all in the modern sense but can still guarantee some of the things I said above.

(Rust, for instance, brings a question to the fore that I've been interested in for a while, but haven't had a major language to check it with: Is the important thing about functional programming immutability, or is it controlling mutation carefully? If in practice the latter is really what's important, then it is possible to see a set of "immutable" languages and see them successfully control mutation but accidentally attribute that to the "immutability", because that's the mechanism we happened to be using to accomplish the mutation control. I've already previously explained why I believe Erlang definitely had this problem: https://news.ycombinator.com/item?id=7744109 but with Rust we can explore whether Haskell has the right of the argument, or if Rust-style mutation control will turn out in practice to be sufficient. But it will be years before we can even start forming a decent answer... Rust needs some large scale programs and a body of people with experience writing large-scale Rust programs before we can even start forming a solid answer.)

For what it's worth, the more I have used Rust, the more I have started to think it really is control of mutation ("effective referential transparency") that is important. But given how restrictive Rust is compared to an immutable garbage collected language like Haskell, I'm not sure that will end up being more than an academic question unless you can't afford GC. That is, I think it's easier to just do "immutable by default" but not have to worry about aliasing or lifetimes, than it is to strictly control aliasing.

A toy example would be the following C code

    #include <math.h>
    double sine(double x){
      return sin(x);
    }
which can be easily translated to Haskell

    sine :: Double -> IO Double
    sine x = do
      return (sin x)
So we can write the same stupid function in both languages. The bit that's specific to Haskell, as opposed to just any typed language, is that sine and sin have different types. sin is (Double -> Double) while sine is (Double -> IO Double), so swapping in sine for sin causes the code to fail to compile. Comparatively, you can drop in a random sine anywhere in your C code and the compiler will happily chug right along. While I've seen ways of implementing various invariant catching techniques like the Maybe type in C, I haven't seen anything that would catch the sine function.

Now, as I said, this is a toy example. The real solution would be to fire whoever wrote the sine function. However, in larger systems, it can be translated into invariants such as declaring that a function cannot access the database or send packets on the network.

The way I see it, immutability and purity are hard requirements for actually enforcing the contract, and a very robust type system is then a requirement to go from code that makes you want to shoot yourself in the head, to code that's incredibly expressive and easy to reason about. Haskell just happens to sit in that exact niche.

> immutability and purity are hard requirements

I agree this is important. I have just started to move a project to Java from Python and see that immutability and purity are attainable for certain parts of the system but appear to cost a lot (leaning hard on use of interfaces, for example) in terms of readability.

Another thing is contract programming, but it is really hard to sell to managers where shipping cheap outweighs quality.

I don't envision FP or contract based programming in enterprise offshore code.

Imagine a super-charged version of the method where Java Architects make class hierarchies and interfaces for offshore/contract programmers.

I think FP (and contract based programming) would do a much better job than Java for varying qualities of programmers on very large projects.

I've toyed with Haskell, ultimately moving on to the OCaml/F# camp. There are only two things that I miss from Haskell without an appropriate equivalent or easy workaround: Type Classes, and Higher Kinded Types.

This big roadblock that everyone claims with Haskell, Monads, didn't give me any problems at all...even if it didn't make any sense to resort to them to do something as trivial as IO. What really turned me off more than anything was the combination of the academic focus of the community combined with this weird culture that I could only describe as a cleverness competition.

I'm a pragmatist, and it is obvious that real software can be created with Haskell, but it doesn't really come through in the tutorials or books. There is a lot of "Look, this is really cool!", but rarely a follow up with "and this is why it matters!". Everything seemed to revolve around cool tricks with no practical concern behind them, and then in the comments you inevitably find comments from other Haskellers claiming to do it just a little bit more cleverly using <<lenses, GADTs, zippers, continuations, or some other obscure abstraction>>.

Ultimately it was OCaml and F# that taught me all of the really important lessons of the ML family, despite the relative lack of learning resources out there. It was there that I found the benefits of making illegal states unrepresentable, expressive pattern matching, type inference, composition over inheritance, etc.

It is a shame, because as languages and runtimes, they probably are inferior to Haskell. OCaml does fine with concurrency, but parallelism is a disaster. Its class system, its one major distinction from SML, is mostly considered a code smell. Its stance on operator overloading requires me to keep a table of operators handy, instead of the usual intuitive ones. And F# has the sane operators and parallelism story but doesn't have SML/OCaml Functors, and is still a Windows-first ecosystem (Mono has gotten a lot better, but it's still a kludge).

The "and this is why it matters" is often left to be inferred by the reader. ;) General kinds of reasons include "preventing (whole classes of) mistakes" or "writing even more reusable code" or "making things fast".

I know the kind of thread you mean -- it happened to me last weekend https://www.reddit.com/r/haskell/comments/2tl8v1/making_prop... . But I really appreciated those suggestions about even more clever ways to make my code better, even though I have so far not incorporated any of them.

It helped me learn more about the shape of a space I'm only starting to explore, and since I'm exploring it for real-world reasons, the very abstract and hifaluting suggestions were easy to relate back to real-world concerns. Which is a great way to learn.

And I suspect several of the commenters there were commenting for exactly that reason -- if there's one thing the Haskell community excels at, it's noticing when someone is in a receptive state (or coaxing them into one), and helping them learn.

The Windows-first status of F# is changing. There have been major improvements in this recently, most notably, the whole compiler being open sourced and its development moved to GitHub[1], and .NET is going to be available on Linux[2]. This is not reality yet, but it will be during this year.

[1] https://github.com/microsoft/visualfsharp [2] http://www.hanselman.com/blog/AnnouncingNET2015NETasOpenSour...

It is indeed not bad on unix. The only major flaw left is the lack of higher-kinded types.

> This big roadblock that everyone claims with Haskell, Monads, didn't give me any problems at all...even if it didn't make any sense to resort to them to do something as trivial as IO.

That's a bit uncharitable. You're likely going to end up working with Lwt (monadic concurrency/IO library) as soon as you need to do something interesting with IO in OCaml.

> Everything seemed to revolve around cool tricks with no practical concern behind them, and then in the comments you inevitably find comments from other Haskellers claiming to do it just a little bit more cleverly using <<lenses, GADTs, zippers, continuations, or some other obscure abstraction>>.

While I agree that showering code in mathematical abstractions doesn't make for readable code, a zipper is not "cleverness for cleverness' sake". It lets you easily manipulate an immutable tree (and it doesn't decrease readability).
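For anyone who hasn't met one, here is a minimal sketch of a binary-tree zipper. All the names (`Crumb`, `fromTree`, `modify`, `toTree` and so on) are made up for illustration and don't come from any particular library. The idea is to keep the focused subtree together with a list of "breadcrumbs" describing how to rebuild everything above it.

```haskell
-- A minimal binary-tree zipper (illustrative names, not from any library).
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq)

-- One step of the path back to the root: which way we went, plus the
-- parent's value and the sibling subtree we did not enter.
data Crumb a
  = WentLeft a (Tree a)   -- parent value, right sibling
  | WentRight (Tree a) a  -- left sibling, parent value
  deriving (Show, Eq)

-- The focused subtree together with the path back up.
type Zipper a = (Tree a, [Crumb a])

fromTree :: Tree a -> Zipper a
fromTree t = (t, [])

left, right, up :: Zipper a -> Maybe (Zipper a)
left (Node l x r, bs) = Just (l, WentLeft x r : bs)
left _                = Nothing

right (Node l x r, bs) = Just (r, WentRight l x : bs)
right _                = Nothing

up (t, WentLeft x r : bs)  = Just (Node t x r, bs)
up (t, WentRight l x : bs) = Just (Node l x t, bs)
up (_, [])                 = Nothing

-- Change the value at the focus; a Leaf has no value, so it is left alone.
modify :: (a -> a) -> Zipper a -> Zipper a
modify f (Node l x r, bs) = (Node l (f x) r, bs)
modify _ z                = z

-- Walk back to the root and return the rebuilt tree.
toTree :: Zipper a -> Tree a
toTree z = maybe (fst z) toTree (up z)
```

Moving the focus and editing are both cheap, and the original tree is never mutated: `toTree` just reassembles a new tree that shares every untouched subtree with the old one.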

> Ocaml does fine with concurrency, but parallelism is a disaster.

Hopefully the multicore runtime will turn into something concrete eventually :)

It's also lacking real type classes, at least until modular implicits are ready.

Has anyone used both Haskell and OCaml (or F#?)? How do they compare in practice? I've been wanting to put some time into a functional language and I've been debating between Haskell and OCaml. Because of F#, OCaml seems like it might be the more practical language (i.e. direct job opportunities). However, excluding F#, Haskell does seem to be much more popular than OCaml.

Either one is great. Short of it is that OCaml and Haskell will both teach you a lot of the same things. OCaml has better modularity features (namely, genuine modules which are replicated almost nowhere else) and Haskell has better purity features (namely... purity).

It turns out that sophisticated functional programming essentially stresses that good modularity and good purity are both killer features and techniques and they should be in heavy use all of the time. In OCaml you'll have an easy time expressing the exact modularity concerns you find important, but will struggle a bit to manage purity explicitly. In Haskell you'll find it beautiful to express pure and typed-effectful computation as much as needed, but will struggle a bit with expressing modularity as you'd like.

Haskell also has the whole typeclass system which is really, really interesting and valuable for many kinds of expressiveness. It's sort of an oddball feature, but interesting to see the impact of.

Personally, I find strict purity more important than expressive modularity. But if anyone could really put them together in a way that worked it'd be great. I'm not so sure such a language exists today, though.

> Personally, I find strict purity more important than expressive modularity. But if anyone could really put them together in a way that worked it'd be great. I'm not so sure such a language exists today, though.

There is Ur, which has both ML-style modules and Haskell-style type classes. It's also pure and strict. And it has advanced type-level programming features and record types. There's a lot to like.

There are a few problems with it, though. It's mostly a one-man project. The compiler is immense, complex and almost entirely devoid of documentation (it's pretty amazing actually, like 1 comment per 1000 lines or so), so contributing is hard. It's very difficult, if not impossible, to use the language for anything but web development: there's no way that I know to execute a program directly, rather than as a sub-program of the ur/web framework. The standard library is tiny and doesn't even include `print`. But even with those warts it has enough good things about it that it's worth checking out.

Oh! I've used Ur a bit and spoken to Adam about it briefly. He gave a talk on Ur at Haskell Boston. I wasn't actually aware that it had typeclasses (though it surely has modules).

Honestly, Ur is completely ingenious but there remains an enormous challenge getting it away from being entirely Adam's project. It's completely web-only right now (although Ur is supposedly a more generalized language) as the only compiler is Ur/Web.

By the way, Adam is apparently going to be a lecturer at OPLSS this summer. That kind of event seems right up your alley; have you ever gone? Any plans to go this summer? I'm hoping to attend.

I watch all the lectures! I really really really want to go but can never afford it timewise. :(

Yeah, it has type classes, although the syntax is a little different than how Haskell does it.

Agreed to all of that. I've tried to get into it several times, but I have almost no interest in web development. I'd really like to use it just as a language. It seems like it should be possible to decouple it, but it seems like the only way that will happen is if Adam gets more people in on it. And that's not likely to happen with the scanty documentation and completely impenetrable compiler code. :(

My experience is that Haskell syntax is a dream compared to Ocaml, with 'where' clauses, beautifully simple lambda syntax, do notation support, . and $ composition, typeclass operator overloading.

Ocaml is more practical in a hard to explain way. Much more predictable in terms of its memory and cpu use with non-lazy default evaluation, and I found it much better performing on general code.

Funny, I have the opposite experience. Where clauses force you to look down and then up again (bonus if you managed to use both let and where in the same function!). The lambda syntax is a matter of debate, writing "fun" instead of "\" doesn't bother me and is, IMHO, clearer. OCaml has an equivalent of $ with @@, but I haven't seen it used very much, since chaining functions with |> is so convenient.

OCaml does not have generic ways of doing monadic stuff (do notation or generic >>=), which means in practice that monadic code looks like "fun1 x >>= (fun r -> )" which is kind of awkward. It also doesn't have typeclasses (yet).
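To make the do-notation point concrete, here is a small Haskell sketch of the same computation written three ways. `addBind`, `addDo` and `addM` are illustrative names; `readMaybe` is the real function from `Text.Read` in base.

```haskell
import Text.Read (readMaybe)

-- Explicit bind with nested lambdas: roughly the shape that monadic code
-- takes when there is no do notation.
addBind :: String -> String -> Maybe Int
addBind a b =
  readMaybe a >>= (\x ->
  readMaybe b >>= (\y ->
  return (x + y)))

-- The same computation with do notation.
addDo :: String -> String -> Maybe Int
addDo a b = do
  x <- readMaybe a
  y <- readMaybe b
  return (x + y)

-- Because >>= comes from the Monad type class, one definition serves
-- every monad (Maybe, lists, IO, ...).
addM :: Monad m => m Int -> m Int -> m Int
addM mx my = do
  x <- mx
  y <- my
  return (x + y)
```

`addBind "1" "2"` and `addDo "1" "2"` both give `Just 3`, and `addM` works unchanged on `Maybe` values and on lists.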

However, one thing you're unlikely to find in OCaml and which is unfortunately not uncommon in Haskell is functions with a long list of positional parameters (since you have named arguments) or different versions of the same function with a "_" suffix, depending on various defaults (since you have named arguments). It makes a tremendous difference of readability when dealing with complex code.

Also, interface files (.mli), while a bit of a pain to maintain at times, give a very clear idea of what interface a given module exposes.

All in all, I find OCaml code much easier to read, not to mention less "clever" than equivalent Haskell code.

> However, one thing you're unlikely to find in OCaml and which is unfortunately not uncommon in Haskell is functions with a long list of positional parameters (since you have named arguments) or different versions of the same function with a "_" suffix, depending on various defaults (since you have named arguments). It makes a tremendous difference of readability when dealing with complex code.

Can you give an example? I'm interested in seeing what you mean but having trouble figuring out what an example would look like.

I have used all three languages professionally—Haskell & some OCaml at Facebook, and some F# at Xamarin—and would recommend learning Haskell.

There’s a strong community around Haskell, with a decent library ecosystem and high-quality implementations, and learning it will make it easy to pick up F# or OCaml—the reverse is not necessarily true.

F# is a decent choice if you want to write functional code in .NET land, but Visual Studio and MonoDevelop support for F# aren’t as good as for C#.

OCaml is a good language, but I would recommend against it. Like F#, it has some syntactic and semantic conveniences that Haskell lacks. However, the community is small, there are relatively few libraries, and the runtime falls down when you need to do anything with large arrays or floating-point math.

> OCaml is a good language, but I would recommend against it. Like F#, it has some syntactic and semantic conveniences that Haskell lacks.

I haven't had a chance to use OCaml or F# yet, what syntactic and semantic conveniences are you talking about?

For example, OCaml has an object system (the “O”), polymorphic variants (somewhere between ADTs and symbols), parameterised/first-class modules, and named/optional function parameters. A lot of these things have suitable replacements in Haskell, they’re just not in the language directly. And of course, Haskell has plenty of features that OCaml and F# lack, such as pervasive purity, typeclasses, and an array of useful advanced type system features. It’s tradeoffs all the way down. :)
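As a rough illustration of one such replacement: named and optional parameters are commonly approximated in Haskell with an options record plus a default value, overriding fields with record-update syntax. Everything below (`FetchOpts`, `defaultFetchOpts`, `describeFetch`) is hypothetical and only sketches the pattern.

```haskell
-- Hypothetical options record standing in for OCaml-style labelled,
-- optional arguments.
data FetchOpts = FetchOpts
  { timeoutSecs :: Int
  , retries     :: Int
  , verbose     :: Bool
  } deriving (Show, Eq)

-- The defaults live in one place; callers override only what they need.
defaultFetchOpts :: FetchOpts
defaultFetchOpts = FetchOpts { timeoutSecs = 30, retries = 3, verbose = False }

-- Pure stand-in for a real fetch: it only describes what it would do.
describeFetch :: FetchOpts -> String -> String
describeFetch opts url =
  "GET " ++ url
    ++ " timeout=" ++ show (timeoutSecs opts)
    ++ " retries=" ++ show (retries opts)
    ++ (if verbose opts then " (verbose)" else "")

-- Record update binds tighter than application, so no parentheses are needed.
example :: String
example = describeFetch defaultFetchOpts { retries = 5 } "http://example.com"
```

The call site names the one argument it changes, which gets most of the readability benefit, at the cost of defining a record per function family.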

I used both Haskell and OCaml professionally.

OCaml's strictness makes some things easier to reason about. Haskell is a nicer language and has more momentum.

OCaml has, IMHO, strong syntactic advantages over Haskell which have a major impact on readability:

- the way you access records (which also means it's not going to clutter your namespace)

- named arguments

- default values

- not an OCaml-the-language property, but there is much less of a race in the OCaml ecosystem to write the shortest variable name possible and accumulate the most ASCII operators

Also, the great module system, with easy abstract types or "public" types with private constructors. By contrast, having to export accessors (if using a lens library) is embarrassing.

The modules are great in OCaml.

The short variable names in Haskell follow the convention that the more polymorphic your variable is, the shorter its name should be. Eg in

map f (x:xs) = f x : map f xs; map _ [] = []

f, x and xs are very polymorphic, so there's really no better longer name there.

First, people are not shy about using short names in general, however polymorphic the variable is.

And you could easily write:

    map func (item:items)

The point is typically something like "generic english names don't provide much benefit over conventional short names".

When I'm writing map, parametricity literally ensures that I cannot care about the elements of the list. Giving them priority by putting any thought toward their names is actually taking the focus off the list structure where it belongs.

That's nice, but when reading non-trivial code, having "generic English names" makes the programmer's intention easier to decipher than a random sprinkling of 5 different one-letter variables - though it's fine to have ecosystem-wide conventions for the most common cases.

I would argue that in non-trivial code when variables represent meaningful concepts it is Haskell tradition to use long, descriptive variable names. Sometimes these are even longer than in other languages since that's a tool used so rarely.

The trick is that once you're exposed to short names at the right places you realize that there are relatively few times that variables are meaningful as much more than function-wiring notation. Especially in pure code!

As soon as you add in mutability this all goes out the window really fast. Subsequently, you see "meaningful" variable names all the time in IO or ST code.

    map fn (head:tail) = (fn head):(map fn tail)
    map _ [] = []

Your version, while not very different, is not easier to read than that of the parent post.

A few weeks into learning Haskell, reading x:xs will become second nature, so much so that "head" and "tail" become noise. (Also, "head" and "tail" are function names in the Prelude, but I'll assume you meant to write "hd" and "tl" or something like that).

    (Also, "head" and "tail" are function names in the Prelude, but I'll assume you meant to write "hd" and "tl" or something like that).
No, I overlooked that. Writing "hd" and "tl" would be opposite to my point! I think that "head:tail" is much more readable and much more informative than "x:xs".

Shadowing built-in names comes with its own set of problems. (At least that's the opinion of the compiler writers that enable warnings about shadows by default.)

Actually, (some form of) default values are easy to add to Haskell. We did so at Standard Chartered.

How do they compare in practice might be too broad a question, I'm willing to bet you'd get more responses if you asked for comparisons in certain use cases.

I tend to think of OCaml as a functional C. It's strict and not purely functional. You can write for-loops and while-loops if you want (but you shouldn't) and use refs to have mutable state (but you shouldn't). It has a better module system than Haskell, but doesn't have type classes. It has functors, which are equivalent-- OCaml's functor is an operation over modules and only very loosely connected (through type theory, which isn't essential to being proficient in either language) to Haskell's Functor type class-- and better in some ways and worse in others.

Haskell is more expressive and has a much more powerful type system, but it's probably harder to reason about performance.

OCaml's biggest issue (note: I may be out of date on this, since I haven't heavily used it since the late 2000s) is the GIL. This probably limits your ability to use it for multithreaded programming, but it can compile down to extremely fast single-threaded executables.

This book actually teaches C in relationship to previous assumed knowledge of Standard ML (not quite OCaml, but close).


I'm not sure I agree with all of the conventions, but it's interesting to see a deliberately functional approach applied to C code.

Can you compare that to Scala? Scala also has non-pure constructs built in, but most people tend to opt in to its purity. And I don't know much about OCaml, but it seems like Scala may have better type system parity with Haskell's type level features.

I'd really love to be able to think of ATS as "functional C"!

(Not that I actually do. Maybe someday).

I see. So all you need are compiler/interpreter experts that can turn any problem into an interpreter/compiler problem and you're golden.

I'm not saying the approach is not worthwhile but how exactly does this generalize to other workplaces where there is no critical mass of such experts? I mean they have their own compiler for Haskell for Pete's sake. I would also like to know how many of the core team members have PhDs and MScs. We can check off Don and Lennart. Maybe Don is really measuring the effects of PhDs in language/compiler design on how to manage complexity?

Alternatively, Facebook has been experimenting with Haskell and OCaml. Seeing their case studies would be valuable as well.

I think "interpreter/compiler" problems arise way more frequently than people expect. It's just they're often not called that until they're being tackled by interpreter/compiler experts.

In particular, you can think of things like message passing, interpreter/command patterns, and anything which uses reflection as being at least a little like an AST/interpreter pattern if you look at it in the right light. Then, if you have a team of compiler writers they can make a big win there.

I completely agree. When viewed from the right angle almost everything is a combination of some "VM bytecode" and a "compiler" targeting that bytecode. Greenspun's tenth rule applies to anything large enough. That's not what I'm getting at.

What I'm getting at is that when you have a bunch of problem solvers well versed in programming language design and theory the programming language at that point is no longer relevant. Standard Chartered doesn't have all that software because Haskell somehow gave them superpowers. Standard Chartered has all that software because a bunch of PhDs chose Haskell to write it in and along the way Lennart wrote another compiler because why not, the man is good at it. So instead of being a case study in how to apply a functional language to solve problems this is really a presentation about what kind of people you want to hire to build systems, i.e. PhDs, MScs, and MDs (apparently).

Imagine if the title of this post was "PHP in the large". Would anyone take it seriously? Does anyone really believe PHP is a great language to build large systems in? No, everyone would immediately jump on the fact that all of it is running on HHVM and there are a bunch of smart folks optimizing the hell out of it. Same here.

The metapoint here is really a quote by Rich Hickey from a David Nolen talk/blog post - "Not everything is awesome". There are a few hidden gems in the presentation when concessions are made by including an Any type and re-inventing a lot of Erlang style process management. I wonder why more trading companies don't leverage Erlang and Dialyzer?

The relevant post for the quote: http://swannodette.github.io/2015/01/09/life-with-dynamic-ty....

I'm certain that it's impossible to separate out the PhD Effect from the Language Effect entirely. I'm also certain that the Language Effect is non-zero. I'm finally pretty confident that there's an interaction, the Language-Lets-You-Hire-PhDs-More-Easily Effect, that's in play.

I don't think PHP is an apt comparison because much of this presentation is actually honing in on very particular points of Haskell-the-language which are directly benefitting SC in practice. SC's use of Mu doesn't detract from this either---both Mu and GHC implement Haskell-the-language (to a large degree) and the core benefits they speak of are exactly those within Haskell-the-language.

Now, SC needed some extra advantages: portable code seems to be a big one as well as dramatic cross-compatibility. These are compiler features more than Haskell-the-language features and so they wrote Mu.

I think the point about Erlang is awesome. I think Haskell could learn a lot from Erlang... and frankly also improve upon its safety and readability tremendously using types. That's a personal bet.

I don't think Dialyzer is even a slightly acceptable replacement for the language advantages of Haskell, however. They are not even playing the same game.

> So all you need are compiler/interpreter experts that can turn any problem into an interpreter/compiler problem and you're golden.

Any text processing or symbolic data structure manipulation can be reduced to a compiler phase; it is just that apparently many CS degrees don't teach it properly.

> Alternatively, Facebook has been experimenting with Haskell and OCaml. Seeing their case studies would be valuable as well.

Microsoft Research also does a lot of FP related work.

Microsoft as well. I don't know what kind of production systems are built with F# but maybe there are a few.

Microsoft recently released Bond, a message encoding library, which was written in Haskell.

They don't only have one modern FP iron in the fire.

> I would also like to know how many of the core team members have PhDs and MScs.

Don't forget the MDs! (Not joking, one of the people on Don's team has an MD.)

I used to work for Don at Standard Chartered. Getting good Haskellers seemed way easier than the hiring efforts of my current employer (Google) focussing on more traditional languages.

But I guess, that's mostly a function of pent up demand for Haskell jobs.

I second this. It's embarrassingly easy to hire Haskellers (especially if you can hire remote or are in a tech centre like London/Chicago/NY etc)!

Where is all this pent up demand? I mean I get what you're saying. Most academics know Haskell and Standard Chartered hires academics but there is a bit of a circular thing going on there.

Oh, there are quite a lot of programmers working with more conventional languages in their day job, but are using Haskell as a hobby. They are easy to hire with the lure of Haskell.

Since moving away from Standard Chartered I have turned into one of them.

"You're golden."

Indeed you are: you can use Haskell and GHC and reap the benefits of programming language and compiler experts. You can read the slides again and see that many things explained there are possible with GHC and don't require Mu.

I guess you can make a similar comment about PHP and Facebook. Facebook do have experts in compiler/interpreters to change the language/implementation to make it good at their scale.


    An awful lot of data mining and analysis is best done with relational algebra
Is an interesting statement. Would anyone care to elaborate?

I think it means that a plethora of operations you perform on sets, lists, maps, hash maps and indeed collections of all kinds are just specific implementations of the general "relational algebra" operations.
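A small sketch of that reading, using plain Haskell lists as relations. The tables and the names `select`, `project` and `joinDept` are invented for illustration; the point is only that selection, projection and join are the familiar `filter`, `map` and list comprehension.

```haskell
import Data.List (nub)

-- Hypothetical relations, represented as lists of tuples.
type Employee = (String, Int)   -- (name, department id)
type Dept     = (Int, String)   -- (department id, department name)

employees :: [Employee]
employees = [("ann", 1), ("bob", 2), ("cat", 1)]

depts :: [Dept]
depts = [(1, "rates"), (2, "fx")]

-- Selection: keep the rows satisfying a predicate (this is just filter).
select :: (row -> Bool) -> [row] -> [row]
select = filter

-- Projection: keep some columns and drop duplicate rows (map, then nub).
project :: Eq col => (row -> col) -> [row] -> [col]
project f = nub . map f

-- Join on the department id, written as a list comprehension.
joinDept :: [Employee] -> [Dept] -> [(String, String)]
joinDept es ds = [ (name, dname) | (name, d) <- es, (d', dname) <- ds, d == d' ]
```

For example, `project snd employees` gives `[1,2]`, and `joinDept employees depts` pairs each employee name with a department name.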

I wonder what's the primary reason for their "Mu" compiler adopting a "strict-ish" evaluation strategy.

A pretty common idea in Haskell-land is that "the next Haskell would be strict". Laziness was once a dreamy ideal execution strategy, but thanks to Haskell the practical tradeoff is better understood.

At the same time (as the slides note) the strict/lazy divide is hardly decided! As many people who would rather Haskell be strict are willing to defend laziness to the death for being the key reason Haskell is so composable. [0]

So Lennart Augustsson, the author of the first standards-compliant Haskell compiler, I believe, wrote up Mu and made it strict. There is probably a justification in SC's case for why it was an experiment worth trying.

[0] At the very least it's indisputable that laziness forced Haskell's hand with purity and that ended up being an enormous win in everyone's opinion.

This is interesting, because the well-known paper "Why functional programming matters" identifies two key aspects of FP: higher-order functions and lazy evaluation. I wonder if John Hughes has reviewed his opinion on this, or if the FP community thinks the paper is no longer an accurate insight into FP...

In particular, I'm thinking of the last paragraph of Hughes' conclusions:

"It is also relevant to the present controversy over lazy evaluation. Some believe that functional languages should be lazy, others believe they should not. Some compromise and provide only lazy lists, with a special syntax for constructing them (as, for example, in SCHEME [AS86]). This paper provides further evidence that lazy evaluation is too important to be relegated to second-class citizenship. It is perhaps the most powerful glue functional programmers possess. One should not obstruct access to such a vital tool."

Maybe it turned out that practical evidence has shown that lazy evaluation wasn't as important for modularity as Hughes thought, or at least that its drawbacks have been found unacceptable in practice?

I think that there is no cut and dried answer. Laziness appears to dramatically improve modularity, but it's unclear whether all of the tradeoffs are worth it. It's difficult to analyze the downsides still since (a) more research is needed and (b) a lot of it can be shrugged off as "weirdness", but it's clear that there are reasons to prefer strictness as well as to prefer laziness.

I've grown to be of the opinion that neither is best and that languages ought to be developed which allow free and clear choice between evaluation strategies throughout. Lazy defaults at the right times and clear strictness types might be a way forward, but it's hardly anything I have expertise in.

I tend to feel this is one area where perhaps Scheme had the right idea (if you ignore set!). Now, I don't know a whole lot about Haskell, but one of (IMO) the elegant parts of Scheme is that you can choose when and where to use streams instead of lists.

The main benefits of streams vs. lists is usually composability, but also that delayed computation can save you from computing something you'll never need. In this way, I think that being able to choose when and where you apply laziness is a huge gain. Now, of course this is where people will complain that streams are either a bad abstraction, or that doing stream-cons vs cons is annoying, but I like to think "what if it was the other way around?". Namely, if you had streams used by default (a la laziness in Clojure), but could switch to lists when it was really necessary or made reasoning about your code easier.

Is there a language that does this smoothly? From what I understand, to work around laziness, you typically have to bend over backwards in Haskell, but if there was a good way to just transform lazy data structures into strict structures similar to the relationship between [stream]-map or [stream]-cons/car/cdr and the list equivalents, I think it'd be pretty exciting. Though to be honest, I'm not sure how this would interact with Monads / Type Classes / etc, especially if you have Type Classes relying on both lazy and non-lazy structures.

The typical idea is that you almost always want "spine lazy" data structures. So, streams are almost always more valuable than strict lists.

In my mind, I tend to think of this as being true up to the point where the size of the structure is statically known. Thus for fixed-sized vectors (or arrays), spine strictness is very important: it gives you better control over memory usage at the very least.

Unfortunately, no language I know of has a good concept of "strictness polymorphism" so in Haskell you end up with duplicate implementations of Strict and Lazy data structures. This ends up being not so much duplicated code, but a whole hell of a lot of duplicated API.

And I think the stream-cons/cons distinction is trivial. It's a lack of proper polymorphism that you're dealing with there and that can be easily implemented in any language with good polymorphism. In Haskell we have a very nice (very nice if you tolerate lenses, anyway, which you should) interface called Cons which is an elaboration of

    class Cons s where
      type Elem s
      _Cons :: Prism s (Elem s, s)
which works like this

    instance Cons [a] where
      type Elem [a] = a
      _Cons = ... -- more complex than worth explaining

    instance Cons (Stream a) where
      type Elem (Stream a) = a
      _Cons = ...
and then gives you

    cons   :: Cons s => Elem s -> s -> s
    uncons :: Cons s => s -> Maybe (Elem s, s)
which are generic in Stream a and [a].
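For readers who want something they can load into GHCi without lenses, here is a simplified, self-contained variant. It is an assumption-laden sketch, not the lens library's actual `Cons` class: the prism is dropped and `cons`/`uncons` are plain class methods, and `Stream`, `takeN` and `iterateS` are names invented here.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- An always-infinite stream, as a stand-in for the Stream type above.
data Stream a = a :> Stream a
infixr 5 :>

-- Simplified interface: cons and uncons as methods instead of a prism.
class Cons s where
  type Elem s
  cons   :: Elem s -> s -> s
  uncons :: s -> Maybe (Elem s, s)

instance Cons [a] where
  type Elem [a] = a
  cons = (:)
  uncons []       = Nothing
  uncons (x : xs) = Just (x, xs)

instance Cons (Stream a) where
  type Elem (Stream a) = a
  cons = (:>)
  uncons (x :> xs) = Just (x, xs)

-- Written once against the class, usable with either structure.
takeN :: Cons s => Int -> s -> [Elem s]
takeN n s
  | n <= 0    = []
  | otherwise = case uncons s of
      Nothing      -> []
      Just (x, s') -> x : takeN (n - 1) s'

-- Hypothetical helper: the stream x, f x, f (f x), ...
iterateS :: (a -> a) -> a -> Stream a
iterateS f x = x :> iterateS f (f x)
```

With this, `takeN 3 (cons 0 [1, 2, 3])` and `takeN 3 (cons 0 (iterateS (+ 1) 1))` both evaluate to `[0,1,2]`, so the stream-cons versus cons distinction disappears at the call site.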

It's so weird to me that in a world of more asynchronicity than ever we want to bring Haskell to strict-land.

On the opposite end, there is so much boilerplate optimisation out there to get around the strictness of other programming languages that would be solved with a non-strict mode

Strictness can always embed laziness---this is sometimes an argument for the natural superiority of strictness---so long as you have lightweight lambdas. Thus, in OCaml you'll see a lot of

    let thunk () = long_computation
effectively. Is that syntactic noise enough to disable the advantages of laziness? Actually, maybe!

> Strictness can always embed laziness

And vice versa, although I think embedding strictness in laziness is probably more syntactically heavyweight.


> Is that syntactic noise enough to disable the advantages of laziness? Actually, maybe!

Hmm, if that's the case then it seems that

    thunk = return long_computation
is enough to disable the advantages of monads! :)

But note that achieving the strictness type requires the use of `seq` which is perhaps, arguably, a more arbitrary language feature than function abstraction and unit are! In particular, it's been a big debate as to what the proper semantics for seq are—the dust is technically unsettled, despite the long history of seq in Haskell.

And w.r.t. monads, for a lot of people it is! Most use of monads is done implicitly, right? ;) The only reason Haskellers tolerate the extra syntax is because it's statically required (I claim).

There's an advantage here, of course, in statically ensuring that people do something more explicitly than they would like to. Perhaps the same advantage applies to thunking.

Honestly, I'd rather not comment. I don't know that I or nearly anyone has enough information to make strong, confident opinions about the "right way" to do lazy/strict. I'm hoping that the research into total languages will provide answers!

I don't know, but in my limited experience, lazy evaluation makes memory use worse (usually not much), but more importantly makes performance (time and memory) harder to reason about, because you don't easily know when something will actually evaluate.

Besides that, there's also not much practical gain from it, IMO. One commonly cited benefit is a function that doesn't use all of its arguments, therefore saving computation time when they're not evaluated. But realistically, an unused parameter should probably be removed.

Generally the argument is never around saved computation but instead around composability. Lazy languages ensure that everything behaves like a value and in that domain operations compose much more effortlessly. You can't reason about operation as easily, so you don't, and the language can cope with making that work more or less correctly.

Which is definitely suboptimal in some cases!

I think honestly the goal should be reasoning about evaluation order statically instead of trying to find some clever argument such that laziness or strictness is clearly "correct".

> But realistically, an unused parameter should probably be removed.

Think about the function if-then-else, which you may be familiar with from your favorite language :)

    ifThenElse True  branch1 _       = branch1
    ifThenElse False _       branch2 = branch2
Obviously you don't want both branches evaluated in any given invocation, and obviously you cannot remove the unused parameter. Note that the purpose is not to "save computation", since for example branch1 may be undefined if the condition is false!

When using a language with support for lazy evaluation, you encounter this kind of functions all the time.
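Two everyday examples of such functions, as a hedged sketch. `fromMaybe` and `(&&)` are ordinary library functions from base; `portFor` and `meanAbove` are made-up names for illustration.

```haskell
import Data.Maybe (fromMaybe)

-- The default is only evaluated when the lookup fails, so passing a
-- crashing expression is safe as long as the key is present.
portFor :: String -> [(String, Int)] -> Int
portFor name table =
  fromMaybe (error ("no port for " ++ name)) (lookup name table)

-- (&&) is an ordinary function, yet its right operand is never touched
-- when the left one is False, so the division by zero never happens
-- for an empty list.
meanAbove :: Int -> [Int] -> Bool
meanAbove limit xs = not (null xs) && sum xs `div` length xs > limit
```

In a strict language both would need special syntax or explicit thunks; here they are plain functions whose arguments happen not to be demanded on some paths.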

Of course, in OCaml that's just

    let if_then_else c t e =
      match c with
      | true  -> t ()
      | false -> e ()
So the question sort of becomes one of how painful thunking or anonymous function syntax is.

> One commonly cited benefit is a function that doesn't use all of its arguments, therefore saving computation time when they're not evaluated. But realistically, an unused parameter should probably be removed.

How do you do that when the values of the other arguments determine whether or not that argument will be used?

Some level of laziness is necessary for Haskell to work. For example, you can do:

    a = 1 : a
    main = print $ head a
With lazy evaluation, this will print "1", but with strict evaluation, this program will never terminate as it attempts to fully evaluate "a = 1 : a", which creates an infinite list.

> But realistically, an unused parameter should probably be removed.

You also have cases where a parameter is only used some of the time.

> an unused parameter should probably be removed.

It doesn't have to be completely unused to be avoided. In a case like:

if a then b + c else c + d

a and c are always evaluated, but only one of b and d is evaluated, never both. Removal isn't an option because b and d may be used.

It may be that their applications have high memory usage, which lazy evaluation would make slightly worse. Alternatively, they may have found that the applications that they tend to run benefit from a slightly stricter evaluation strategy, and have thus changed the compiler to better reflect that...

Not sure, but having encountered laziness in Clojure and Haskell, it can be non-intuitive and it can be a bitch to debug. It allows for some conceptual beauty, though, and there are certainly some use cases in which laziness is the right behavior. The question is what should be the default; both ought to be allowed. In Haskell, they are, but you start using bangs a lot (e.g. Point !Int !Int and the ($!) operator instead of ($)) and there are also shallow vs. deep considerations, because forcing a thunk only evaluates it one constructor-level deep-- to "weak head normal form".
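A small sketch of the shallow-versus-deep point, using only the Prelude. `Point` is the strict-field type mentioned above; `LazyPoint`, `shallow`, `spineOnly` and `sumStrict` are names invented for this example.

```haskell
-- Strict fields: evaluating a Point forces both coordinates, so
-- evaluating `Point undefined 0` would crash.
data Point = Point !Int !Int
  deriving (Show, Eq)

-- Lazy fields, for contrast.
data LazyPoint = LazyPoint Int Int

-- seq stops at the outermost constructor (weak head normal form), so the
-- undefined fields are never looked at and this is just "ok".
shallow :: String
shallow = LazyPoint undefined undefined `seq` "ok"

-- ($!) forces its argument to WHNF before the call. For a list that is
-- only the first cons cell, so the undefined elements survive and
-- length still works.
spineOnly :: Int
spineOnly = length $! [undefined, undefined, undefined :: Int]

-- A strict left fold written with seq: the accumulator is forced at each
-- step instead of building up a chain of thunks.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go acc []       = acc
    go acc (x : xs) = let acc' = acc + x in acc' `seq` go acc' xs
```

For `Int` accumulators WHNF is the whole value, which is why one `seq` per step is enough here; for nested structures a single `seq` or bang would only reach the outermost constructor.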

That said, I much prefer Haskell's laziness or Clojure's laziness in seqs over the broken laziness in other languages. There's a lot to like about R's libraries but... fuck this:

    > Map(function(x) (function(y) x + y), 1:5)[[3]](0)
    [1] 5
Python can be tricked into the same evil if you build closures in a loop. Haskell doesn't have that, thankfully.
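For comparison, a rough Haskell counterpart of that R expression (`adders` and `third` are illustrative names). Each closure captures its own immutable `x`, so there is no shared loop variable to be surprised by.

```haskell
-- One closure per element; each captures its own x.
adders :: [Int -> Int]
adders = map (\x -> \y -> x + y) [1 .. 5]

-- The third closure applied to 0 gives 3, not 5.
third :: Int
third = (adders !! 2) 0
```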

Laziness in data structures has the biggest benefit in the spine. Leaf laziness is just more surface area to hide unexpected thunks. If you really want that, do something like

    data Box a = Box a

    type Lazier a = Tree (Box a)
Generally, I find that a little habit around leaf strictness ends up eliminating laziness concerns entirely until you get to explicit concurrent programming and need to think carefully about what thread is forcing what execution.
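For anyone who wants to see the spine/leaf split spelled out, a small sketch. The `Tree` definition and `size` are assumptions for illustration, since the snippet above leaves `Tree` undefined:

```haskell
-- Hypothetical tree: lazy in the subtrees (the spine), strict in the element.
data Tree a = Leaf | Node (Tree a) !a (Tree a)

-- Box reintroduces a lazy leaf: the strict field only forces the Box
-- constructor, not the value inside it.
data Box a = Box a

type Lazier a = Tree (Box a)

-- Counts elements by walking the spine only; leaves are never inspected.
size :: Tree a -> Int
size Leaf = 0
size (Node l _ r) = size l + 1 + size r

main :: IO ()
main = do
  let lazier = Node Leaf (Box (error "never forced")) Leaf :: Lazier Int
  print (size lazier)  -- prints 1
```

Without the `Box`, building `Node Leaf (error "...") Leaf` would fail as soon as the node was evaluated, which is the point of leaf strictness: the thunk cannot hide.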

Lazy vs strict evaluation: Lennart, the author of the Mu compiler, has written about it: http://augustss.blogspot.co.uk/2011/05/more-points-for-lazy-...

I wonder why they don't just use GHC with some customization to make the default eval strategy less lazy. I understand why someone might prefer more predictable (strict) evals by default.

They were targeting an already existing runtime for an older in-house functional language, and the runtime only supported strict evaluation.

Lennart Augustsson is a skilled compiler writer, especially for Haskell, to say the least. Mu does more than just strictify Haskell—the code portability stuff is pretty unique as well, for instance.

Starting from scratch, building on GHC might have made sense. There were historical reasons for going with their own compiler. (Not sure how much I can say.)

And Lennart pulled it off.

"Your choice of programming language can have a real effect on these results over time." Probably more accurate to say "Choice of programming team can have a real effect on these results over time."

I want to like Haskell. However, I have given up on wanting to dislike Java. To the point that it is painful to read such things as "you can write Java in any language."

To be fair, that's one sentence in the second to last slide of 42 slides. The presentation, at least what I can see in the slides, is hardly Java bashing :) There are 3 mentions of Java in the whole presentation, only one of which has negative connotations: the one you quoted.

And to be extra fair "you can write Java in any language" is much more a dig at some of the bad patterns which Java has become known for culturally than anything about the language itself!

Yeah, the Java bashing I could (and, admittedly, should) have ignored. It is more the crediting the language for the success that I would rather focus on. It is a fairly strong assertion at the end, that I feel needs more support. Back when fewer teams were writing Java applications, I feel similar advantages were felt for it.

Of course, I still pine for lisp, so I can not claim to have no biases. (Well, that and MMIX)

The entire Java community pines for lisp, whether they know it or not. That's why Apache Ant was invented, it's Java's manifestation of Greenspun's Tenth Rule.

Is it confined to the Java community? Seems a bit more widespread, honestly.

Well yeah, that's why the original Greenspun's tenth rule cited "Any sufficiently complicated C or Fortran program..."

Those who do not learn from history are doomed to repeat it. That is the only logical explanation for why XML even exists.

I read that rather differently. I see it as a poke at management that wants to make the move because of hype rather than a real understanding of what it will take, and that won't put effort into training/hiring devs who, as he puts it, "have taste and guidance".

I'm really happy to hear about Standard Chartered's success with this, and I want to know more. This is really promising stuff.

My current company is looking into our "next generation" platform for when our datasets exceed what we can do in R. R may not be the best language, but it's great for exploratory data science, has the best or the only library out there for some ML purposes, and we've done a lot of things to make it production-worthy (on small- and medium-data). We'll probably need to involve something else, down the road, when our data sets get larger than what fits on one box. The leading candidates are Haskell, Clojure, and Scala (Scala because of Spark). I'll have to evaluate the languages fairly and relative to our needs, but I hope Haskell wins for a number of reasons, including the fact that Chicago + Haskell is an unfilled niche and we'd attract a ton of talent.

For those who've taken Haskell this far into production: have you encountered any negatives? Are there any times when you think it might be better not to have a strong type system?

To me, the biggest drawback of Haskell isn't anything intrinsic to the language, but the amount of stuff it forces a person to learn. For me, that's a fun challenge... but trying to convince 110 programmers to use a language that forces I/O into a set of types (loosely) called a monad seems like an epic task. Clojure has the advantage of being simple and beautiful once you get past the parentheses. Haskell is demanding and frustrating for the first 6 months (and pays off handsomely later on, but this can make it a hard sell).

Also, how does the Any type in Mu (if anyone familiar with it is here) differ from Data.Dynamic?

I've put a large Haskell app in production, and prior to that I saw the research process at a couple of hedge funds. I'd have a couple of suggestions:

* You don't need everyone to be a ninja Haskeller. Get a couple of early ninjas to flesh out the architecture and you'll find that you'll be able to "fill in the gaps" around them with less experienced people (a bit like yummyfajitas' experience above). FWIW I very quickly became one of the fill-in-the-gaps people ;-)

* If you can move to stream-based abstractions, you'll be onto a winner. Event streams are inherently immutable, and force you into a much purer data model. See the "unified log" LinkedIn blog post or read [1] for more.

* I don't think that you'll be able to retrain 110 programmers in Haskell. While I think that anyone with the right mindset can learn it, there will be a significant portion of any team who lack that mindset.

Hope that helps. You know where to find me if you'd like anyone to help you sell adopting Haskell to your management. :-)

[1] http://manning.com/dean/

I'm sorry - I love Haskell, and it's a great choice for a lot of domains, but it's a poor choice for numeric/scientific computation. The ecosystem just isn't there yet.

Here's a recent r/haskell thread discussing this: http://www.reddit.com/r/haskell/comments/2rsxrb/is_haskell_a...

Exploratory data analysis isn't killer in Haskell. I think types can help a lot here (god knows the amount of pain that R/Numpy spaghetti has inflicted) but the jury is still out on how to do it well. Row types seem like an important technology which gets a little bolted on to Haskell, which is both exciting and unclear at the moment.


On the other hand, if you've got a solid data pipeline you're scaling then there are grand tools for streaming data in Haskell. You're in the complete sweet spot for strong typing. You can also drop down into C easily when speed is an issue. Matrix libraries exist for the intrepid user. HMatrix is BSD now so you can at least touch LAPACK.

I haven't encountered any negatives, by which I mean I never wished I was using another language. In fact if I'd been using any other language I'm confident I would have wished to use Haskell. (This is for reporting software using Opaleye to generate multi-hundred line Postgres queries in a composable and type safe manner.)

Not something I've used myself yet, but is there any particular reason Julia isn't on that list?

I haven't checked on Julia for like a year, but back then Julia didn't offer anything above and beyond R when it comes to "big" (really just not fitting into memory) data. As it is Spark with Scala is both faster and more convenient than R or Julia.

Spark has a Python API and Python has a bigger ecosystem for exploratory data analysis than Spark.

Also Python has a big data interface with out-of-core matrices and tables, as well as a compiler that can speed up numerical code, with just a function decorator, to C-like speeds. The former is an interface to the latter.

http://blaze.pydata.org/docs/dev/index.html https://github.com/numba/numba
