Hacker News new | comments | show | ask | jobs | submit login
When Haskell Is Not faster than C (jacquesmattheij.com)
250 points by DanielRibeiro on Jan 21, 2013 | hide | past | web | favorite | 219 comments

This reminds me that when I was interested in learning haskell I was completely put off by their "introduction" page on their homepage. It was a couple of years ago but it doesn't seem to have changed much:


It looks like a very pretentious sales pitch to me. You have quotes like "Writing large software systems that work is difficult and expensive. [...] Functional programming languages, such as Haskell, can make it easier and cheaper." and a whole bunch of very vague and poorly backed up assertions.

After a couple of paragraphs brainwashing you about how good functional programming is, you finally get to some code. The first code you get is (of all things) a quicksort which is not used as an introduction to the language but as a way to boast that haskell is so much better and more expressive than C. I mean, look at that C code just after it, disgusting isn't it.

What they don't say is how much slower their implementation is, of course. They also give a link to a more direct (and faster) C-to-haskell version of the quicksort[1]. Hey, it's actually longer than the C version! And it uses 3 imports. And it's still slower than the original C. Oops. I wonder why they didn't put that one in the tutorial. I went back to common lisp after that, I have yet to write 1 line of haskell.

TL;DR: when I want an introduction to your language, I want to know what I can do with it and how I can do it. Give me some code to chew on, show off some features. Don't spend 10 pages telling me you're the best of the best and everything else is crap. This is the kind of article I expect: http://perldoc.perl.org/perlintro.html

[1] http://www.haskell.org/haskellwiki/Introduction/Direct_Trans...

That wiki link to an actual implementation of quicksort in Haskell is pure comedy gold (1):

"Unfortunately none of the above "real" quicksorts seems to compile as given"

"The program below is working very very slowly. It's probably slowsort."

"A more specific/direct translation (neither this nor the C version is polymorphic) is offered by Daniel Fischer, who reports that this version runs within 2x of the C version"

(and it's a completely unreadable mess that is clearly a line by line copy of the C code given)

What a complete and utter trainwreck.

[1]: http://www.haskell.org/haskellwiki/Introduction/Direct_Trans...

To be fair, QuickSort is particularly suitable for imperative languages that allow in-place modification of arrays. For functional languages, merge sort is a much more natural solution that is simple to implement, guaranteed to be efficient, and probably reasonably fast (though still unlikely to beat a QuickSort implementation in C).

To be fair, QuickSort is particularly suitable for actual computation hardware which has hardware features like numerically addressed, stateful random access memory. Imperative arrays happen to feature "in-place modification of arrays" not because it's a useful language feature but because it's a straightforward abstraction of the actual machine being programmed.

That distinction is really important, and something I think many language nerds completely fail to understand. Sure, often all that's "really" important is the abstraction of the "Human Design Space" idea being implemented, and that's a regime where high level language abstractions have lots of value.

But sometimes you want to actually direct the hardware. And there, as everywhere, it helps to use a language with the proper abstractions. C has them, Haskell doesn't. There's absolutely no shame in that!

But somehow people get fooled into thinking that somehow Haskell (or whatever language is involved in the argument, but it seems like Haskell people fall into this hole more than most) doesn't need those abstractions to be "as fast a C". And that's just insane.

To be fair to mergesort, it's particularly suitable for actual computation hardware too. It's mostly linear reads and writes, which almost all i/o systems heavily favor. GNU sort uses mergesort (in C).

GNU sort is designed to sort on disk though. If you actually watch it working on a giant file, you'll see it generating lots and lots of files in /tmp containing the merge lists.

This is because quicksort obviously requires fast random access to the whole arrays, and (at least when it was written) /usr/bin/sort would be expected to be able to quickly sort data that can't fit in memory.

So... if anything I think this is evidence to my point: in the performance regime, you have to match the algorithm choice to the hardware. Specifically here: sort is making up for the lack of memory by exploiting the comparatively (relative to virtual memory) fast streaming behavior of storage media.

Notably, stateful filesystem I/O is also an area where Haskell tends to fall down. I'm not at all sure a Haskell merge sort of the form used in GNU sort would do well at all, though I'm willing to be proven wrong.

> GNU sort is designed to sort on disk though. If you actually watch it working on a giant file, you'll see it generating lots and lots of files in /tmp containing the merge lists.

The number of files created is just an implementation detail; you could easily do everything in one file if you wanted to, or at most 2N files if you're not allowed to seek or overwrite files at all, and you want to do N-way merges.

Maybe GNU sort is just trying to avoid creating files larger than 2 GB (which is a problem on legacy file systems).

> Specifically here: sort is making up for the lack of memory by exploiting the comparatively (relative to virtual memory) fast streaming behavior of storage media.

Yes, also the fact that GNU sort can sort multi-terabyte files on 32-bit platforms, which otherwise wouldn't fit in memory. (Though systems with smaller process memory space than disk storage space are an oddity that will soon be outdated, I suppose.)

> I'm not at all sure a Haskell merge sort of the form used in GNU sort would do well at all

Well, even the on-disk sort starts by sorting chunks in-memory, which we have already established Haskell is bad at. However, the merge sorting part of the algorithm may well be I/O bound in practice, in which case Haskell should be able to do it just fine.

The only good reason I can think of why C could be faster, is that if you can merge more efficiently, you can choose a higher value of N for your N-way merge,

> stateful filesystem I/O is also an area where Haskell tends to fall down

What do you mean by this? AFAIK Haskell can read/write files just fine through the I/O monad.

> Notably, stateful filesystem I/O is also an area where Haskell tends to fall down.

Can you give an example? I do not understand what you mean. The filesystem is stateful, but I do not see that impeding Haskell.

> and it's a completely unreadable mess that is clearly a line by line copy of the C code given

I didn't find it unreadable, and I think the point was to copy the C implementation. I'm not sure what you were expecting.

In any case the unfortunate state of parts of the haskell wiki isn't a good reason to feel one way or the other about the language.

The haskell analogue to perlintro.html is http://www.haskell.org/tutorial/index.html

Some code to chew on, some features shown off: http://lambda-the-ultimate.org/node/2427

and a great writeup of what happens behind the curtain: http://research.microsoft.com/en-us/um/people/simonpj/papers...

Probably the reason Haskell's introduction has more explanation is because for most, functional programming language is unusual.

Evaluating a language by just reading the introduction is like reading the introduction of a book and saying that you didn't like it. Read it! :)

It seems we are focusing too much in a possible weak point and forgeting about several strong points Haskell has.

It is linked from the main page to answer the question "What is Haskell?" It is not supposed to an example driven page at all. The title of the wiki page could be changed to fix this though. The very next link is http://tryhaskell.org/ to learn Haskell in a example driven way.

I agree that I'm nitpicking a bit but I do think it's significant. It says something about the mentality of the community. First impressions matter.

But perhaps you could recommend me and others who'd like to try haskell a better introduction to the language? I'm still curious to try it.

I highly recommend (http://learnyouahaskell.com/)

Yes! That's a great introduction, but you must have a sense of humor. I love that book.

Curious to know what you think of http://ocaml.org, specifically the code example on the front page and the further examples it links to (http://ocaml.org/taste.html)

Looks like what I would expect from a language introduction. Explains the syntax, shows off some features, gives you snippets of code and tells you how to feed them into a REPL. It's a good way to start and get people interested in the language I think.

Funny, I've never seen that page, I always land on Inria's page(http://caml.inria.fr/).

Me too. Quite delighted to see ocaml having a refreshed page and active community.

Plus quicksort is a classic morass- by the time you harden it enough to deal with real world data you could have written a proper merge sort with similar average case performance and much better worst case guarantee. My rant on quicksort: http://www.win-vector.com/blog/2008/04/sorting-in-anger/

Interesting read, thanks for linking it. When I need to write a sorting algorithm, I usually for Merge Sort, because I can never remember the exact details of swapping/partitioning of QuickSort. It seems much more natural to say "split your data in two, sort them, merge them together". And when I'm feeling damn fancy, I use an InsertSort when the size of the dataset is less than 7-8, another algorithm that I find difficult to forget (at least, the recursive definition; I usually need some paper and pencil to do the in-place version).

That is not the best introduction to the programming language by means of practical examples, but it does not look like it is supposed to be either. At least from Haskell.org it links from "What is Haskell?" Which is a very different question then what can I do with Haskell.

> After a couple of paragraphs brainwashing you about how good functional programming is, you finally get to some code.

Whenever I see a page about a new programming language I usually skip all the marketing speech at first and look for a code fragment to examine first. After that I'll start reading the actual text.

"They also give a link to a more direct (and faster) C-to-haskell version of the quicksort[1]. Hey, it's actually longer than the C version! And it uses 3 imports. And it's still slower than the original C. Oops" that made me laugh a few years (?) ago and it still does :)

Is there any language intro page that shows a good sized working program?

Yes. Racket has a great intro-page and download links etc. No BS.


the question was "a good sized working program", the racket page is a great page but only has a few haikus, so "no" should be your first word, even if the rest of the comment is correct :)

> This article is in response to an earlier one comparing Haskell and C, which made the claim that Haskell beats out C when it comes to speed.

Perhaps my reading comprehension of the original post is different from Jacques' (or I'm just wrong), but I don't think that the original article made such a claim. Here's the TL;DR of the original article:

> TL;DR: Conventional wisdom is wrong. Nothing can beat highly micro-optimised C, but real everyday C code is not like that, and is often several times slower than the micro-optimised version would be. Meanwhile the high level of Haskell means that the compiler has lots of scope for doing micro-optimisation of its own. As a result it is quite common for everyday Haskell code to run faster than everyday C. Not always, of course, but enough to make the speed difference moot unless you actually plan on doing lots of micro-optimisation.

From this, I understood that in a larger program, most programmers wouldn't be doing the kind of micro-optimizations that they do for the Benchmarks Game. I figure that most code would be written following this pattern:

* Write code to follow the spec (usually without thinking too much about performance)

* Run the code, and evaluate its performance

* If the performance is good enough, you're done

* If the performance needs to be better, run a profiler

* Find the biggest bottleneck and optimize it

* Re-run the code and re-evaluate its performance

The original article took a micro-benchmark (a mistake in my opinion, because it's easy to micro-optimize every single detail of that code) and showed that in the case of Haskell, the first version was too slow, but that with the help of a profiler, finding the cause was easy, and the fix was trivial, while in C the profiler didn't show any problem with the code of the user, so it must be a problem in the standard library's code, and to fix it required changing the design of the program and making it more different than the spec. And I felt this was the author's real point; that to get the maximum performance from a C program, you'll need to code it not like a human would imagine it, but like a computer would, and it makes the code harder to maintain later on.

> Perhaps my reading comprehension of the original post is different from Jacques' (or I'm just wrong), but I don't think that the original article made such a claim. Here's the TL;DR of the original article:

The TL;DR is not representative of the whole article. It would be nice if it were but it isn't, so just reading the TL;DR is not enough, better do the short-enough;did-read version of that. Another commenter in this thread said more or less the same thing, I've responded to him here:


The whole article didn't make such a claim either. Your hate boner is clouding your judgement. Your article is far more biased and far less useful than the original. Especially all the random off-topic nonsense like "you should totally add error checking" and "lets change the style for no reason because I want to pad the length of this".

> Your hate boner is clouding your judgement.

Should be a button or T-shirt.

I can't parse your comment, please rephrase.

IMO you can make C as fast or slow as you want it to be. If you write terrible C it's going to be slower than good Haskell.

If you write common C (and by common I mean what you see in well known daemons, kernels, etc) its generally at the very least on par. That's mostly because most of the functions come from libraries that have been optimized, in both cases.

>From this, I understood that in a larger program, most programmers wouldn't be doing the kind of micro-optimizations that they do for the Benchmarks Game.

The TL;DR; is wrong too. It doesn't require micro-optimizations for a large real world C program to be faster than an equivalent Haskell program.

All those abstractions and laziness in Haskell add up, and you get lower performance overall compared to ANY decent C implementation of a real world program, not just "highly optimized" ones.

By the same token, the lack of higher level abstractions in C also add up in time and mental effort, and you're more likely to end up, in my experience, writing a dumber and more dangerous algorithm than you would in a language that has better safety features and more powerful abstractions.

Even without trying to optimize.

But I think the main issue is that "new" people don't realize most of their beloved languages aren't self hosted and are C compiled and/or that they're calling the C library for almost everything.

Thus, knowing C matters a whole lot. Using it is another matter, and albeit the flexibility of C makes programs a lot more efficient every single time (when you have time and skill), it's understandable to use other languages.

Still, some tasks should still be done in C instead of starting one of these countless runtime of the new language on the block. I'm slightly tired of the huge interpreted programs running extremely slower than they should these days. Even thus the hardware has gone way faster we still manage to reproduce 10y old programs that run slower than they did when we first made them.

I still wonder to this day why Haskell programmers want their language to be loved so much.

At times it feels like that kid at the playground that spends half his time telling everyone how he's the best thing since sliced bread and cries himself to sleep at night wondering why no one will play with him and his monads.

Don't get me wrong, Haskell looks like a great language with obvious qualities and I don't knock anyone for using it, to each his own, it's just the never ending publicity and proselytism that really rubs me wrong. People adopt new languages, not the other way around.

I'd say Haskell does one percent of the proselytising that Ruby does.

You hear about it a lot on HN because people write interesting articles about it. Remember, most articles that are submitted aren't being upvoted. There's probably someone writing about how their ImmutableSet implementation for Java is the best thing ever, but it's not being upvoted because the community doesn't consider it interesting.

I am a Ruby programmer and let it be clear, the holier-than-thou attitude can indeed be pervasive in the Ruby community.

Yet, most of the proselytism int the Ruby community are different shades of "Programming with this language makes me very happy, I'd like everyone to be happy as well". Of course it's not always as clear cut and there is sometimes much to be desired in terms of behavior. But that's where the Haskell in MY PERSONAL OPINION (so to take with a pinch of salt) is slightly different in how it seems to be variations of "Haskell is a superior language, everyone that doesn't use it has not reached enlightenment"

TL;DR: Ruby people can be annoying like hippies can be annoying, Haskell enthusiasts I've found to be more akin to the Jehovah's witnesses of programming.

And just so we're clear, I am going on a slightly provocative/trollish bender here, one that I hope will not offend too much, it's all in good fun.

I think you're imagining the holier-than-thou attitude. While people are certainly excited to have figured it out, nobody is asking them to be excited about it.

If you disagree, how about citing some sources? You're good at coming up with analogies, but coming up with analogies is like being Hitler. What?

I wonder if 4 comments to Godwin is a new record for Hacker News.

Rubyists aren't like hippies (hippies have no taste and they dislike drama), more like hipster art critics if anything.

Haskellers being compared to Jehovah's witnesses is even more wrong. They're more like scientists, maybe of the climate change variety: most of them keep their heads down with a dedication to improving the state of the art, but when they go public it's with good reason and people should listen rather rejecting out of hand anything that challenges their ingrained worldview.

I love how easily us nerds get sucked into debating analogies and whether they're right or not and which analogy is the best one, even when it has no bearing on anything. It's a really funny phenomenon to watch.

That's not holier-than-thou at all.

Okay then, cut off your nose to spite your face.

Haskell has not uncovered an impending disaster which everyone ignores at their own peril. It is a programming language on the same basic level as many other popular and useful programming languages.

We're not worthy! We're not worthy!

as a rubyist who has dabbled a bit in haskell, i think the two camps are exactly the same in terms of proselytising. it's just frustrating to see people using "less-capable" languages and imagining how much happier/more productive/safer they'd be if they only adopted yours.

The difference there is that Haskell actually does offer some functionality and concepts that aren't really present in most other programming languages.

Ruby, on the other hand, is pretty unremarkable. It doesn't really offer anything beyond what older languages like Perl and Python, for example, offer.

I can mostly agree, Haskell's additions aren't even in the same ballpark. I also prefer Python overall to Ruby. But Ruby has a huge advantage of anonymous code block parameters.

That facilitates dsls and I believe it's important enough to note as significant.

in my experience, ruby has a lot of small improvements over perl and python that add up into a far nicer programming experience. it's subjective, perhaps, but i've used a lot of languages and i find that ruby hits the sweet spot for developer productivity.

What prompted this is not that I'm not fine with proselytizing, I just think that you should be even handed when you do so. False claims serve neither language or their communities and potential users.

Two things I love about Haskell.

1. It is very terse/expressive. I can look at code I haven't touched in 6 months and still understand what it's doing. Perhaps because my code is newbie code and thus somewhat simple.

2. It is the only language that brings me back to the days of Turbo Pascal when the joy of coding came from coding itself, not from building cool things. A small but vital distinction.

That said, I rarely get to use Haskell. For most of my tasks javascript is better suited, but haskell has affected how I write javascript to a great extent.

Maybe it is because I'm not a Haskell Guru, but although I love Haskell's terseness, it tends to make it very hard for me to read code I've written a while ago. I tend to get excited about cool Haskell features like Arrows, using them whenever I can, then I forget about them, and when I read my code 6 months later, it's like gibberish..

When I find myself using arrows, usually I realise that I got lazy and used tuples because my data model wasn’t good enough. Good code flows from good data.

But really, you should use the “tricks” when they make your code more expressive, not just because they’re cool. It’s the same in any language.

You have expressed something I have been worrying about with all these new APIs and frameworks - there is a joy to coding not necessarily connected to delivering business value.

Joy is relative. What gives one person Joy might not give another joy. There just might be people who derive joy using frameworks to deliver business value who might not find joy in coding just for coding sake. For some programming is there hobby and doubles as work, so they derive joy in coding for coding sake. For other's coding is just a means to an end in their work and they have other hobbies.

So every man to what gives him joy and no need to worry about those using Api's and framework to deliver value and pay bills.

have you tried http://roy.brianmckenna.org/ ? its sort of haskell like, but compiles down to javascript, so you could use it in the browser.

When you find a tool that in many ways is far better than every other tool used by the mainstream (and of course in some ways worse) you might get excited about it so you want to share the knowledge you found.

Or maybe you prefer the tool you like to get more adoption so you could use it in more situations.

I advocate Haskell because of those two reasons. I think that like me, many others can get enormous educational and practical benefits from learning Haskell and I'd really rather be using Haskell than e.g Python.

Tons of other languages and programming styles can provide the same benefits. It doesn't have to be Haskell against Python or C++ against Erlang. We can pick, mix and match and let the best of the bunch naturally emerge.

There aren't any other languages that can provide the exact same group of benefits.

What you do is you weigh the benefits and disadvantages of the languages you are considering.

Haskellers just want the benefits of their language known so that it isn't overlooked due to being too difficult/unfamiliar/uncommon etc.

This article missed the point of the original article it was addressing, which is that haskell is often fast enough for the job with a minimal amount of optimisation.

The problem is that the original article author chose a shootout example, where speed was the objective. Idiomatic Haskell is rarely going to win on speed.

The C code in this article is by the authors admission not very robust against invalid input. A robust Haskell implementation would likely take less effort to produce than a robust C implementation. Which is better depends on the goal.

I don't think other languages have the same benefits of Haskell. They have other benefits but not Haskell's.

Python/Ruby: very easy to learn and be productive quickly.

C: high level of control over resources and easy to get good performance.

Haskell: very good static guarantees about correctness. Relatively easy to get decent and good performance. Extremely educational and mind expanding, far more than say Lisp. Allows very high levels of abstraction. Great concurrency and parallelism support.

Lisp: easy to extend syntax with macros and manipulate programs programmatically.

All these languages have benefits, but not quite the same ones.

What about Objective Caml? Seems to fit the bill for all those metrics as well, yet for a reason that eludes me to this day, it never quite reached the kind of "street rep" that Haskell now enjoys.

I don't know OCaml very well, but I do know that without laziness, it does not support as much high-level/reusability[1].

Also, it lacks much of the mind-expanding stuff in Haskell (The class hierarchy explained by the Typeclassopedia).

AFAIK, GHC had surpassed OCaml's compilers' performance, concurrency support, etc. Compiler rewrite rules are also a very nice feature that other languages cannot immitate due to lack of purity.

Many of the parallelism benefits are much a result of purity, which OCaml lacks. Also stuff like software-transactional-memory rely on purity for their guarantees, which OCaml cannot provide. Reasoning about code is also much easier with purity.

1. http://augustss.blogspot.co.il/2011/05/more-points-for-lazy-...

Your're showing your ignorance.

OCaml has full support for laziness. It's just default strict with optional laziness via a keyword (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Lazy.html), rather than the opposite.

The only thing Caml lacks that haskell has is typeclasses, so doing non-integer math is ugly, and non-int/float math is REALLY ugly, like something out java or something, since there is _no_ operator overloading.

I suspect the main barriers to Caml gaining wider acceptance had more to do with a cultural barrier - both the documentation and error messages in English were (and mostly continue to be) fairly poor.

Caml actually did attain a measure of popularity for a while in the early 2000s - the winning ICFP content entry was in Ocaml for something like 4 years running at one point.

Unfortunately, OCaml doesn't have enforced purity outside the IO monad. This removes one of the main benefits of Haskell, which is to force people who are terrible at functional programming to actually write code in a functional style.

I actually think the main thing OCaml lacks is the separation of pure from non-pure code via the type system. I have really become addicted to the way this ends up affecting the architecture of the code, and the guarantees it provides when using code I did not write.

You aren't speaking to the principal point in the post Peaker links, which is accepted by Harper in the notes "As you know, in the eager world we tend to write out our own recursive functions, rather than use combinators", which is of course all any anyone cares about; with explicit recursion the user's IQ falls 50 points immediately. The whole discussion presupposes a mechanism for 'opting out' of default strictness or default laziness, which exists in many languages.

Optional laziness is not good enough -- read the blog post I linked to. You can't re-use and compose existing functions from the standard library if they all tend to be strict.

OCaml isn't purely functional, so it misses some of the most important benefits of Haskell. It's also not entirely honest (in the fact that you can do things that the type system doesn't tell you about.)

With unsafePerformIO, Haskell is also not "entirely honest"...

That's certainly true and always be true given the mutability underlying architectures and operating systems. However, in Haskell, unsafePerformIO is normally only used when it can be shown that a function is referentially transparent. When this is not the case, a function ought to return an IO value.

The general rule is: don't use unsafePerformIO unless your name is Simon ;).

Haskell succeeded in isolating side effects to a great extend. It has its advantages, referential transparency leads to predictable and understandable code. And disadvantages - you have to keep state via argument passing (which can be hidden nicely using the State monad) and when implementing inherently mutable algorithms you usually end up using the ST monad. Still, on the outside, a function will be pure and pretty :).

This is all correct, and I do think it's an important difference between Haskell and OCaml, I just didn't like the way it was phrased ("lies") because that's not really the difference. The difference is that Haskell has a place for you to document (in the type system) whether functions have certain types of effects (and if you cooperate just a little, the compiler will make sure this documentation is correct). The cost is the (conceptual) overhead of actually doing this - which is small when you know what you're building but can require some re-plumbing of a chunk of your code when things change.

This difference isn't inherently a win for Haskell - that OCaml doesn't bear the cost of significant restructuring because you find you need to do some IO based on results internal to a function several layers down is a big win in the short term. Whether it's a win in the long term, and how much the short matters versus the long, would seem to depend very much on the particular problems you're solving.

Having come to Haskell from C by way of OCaml, my personal preference is to get as much out of the type system as one can - but one should always be aware of the trade-offs.

With `{-#LANGUAGE Safe#-}` it is.

unsafePerformIO for an IO action that is not actually pure is considered a bug, and the compiler makes sure you can't abuse unsafePerformIO without being bitten very very hard.

The core of Objective Caml is awesome, but there is a severe lack of libraries, which is guess is in good part due to the limited number of users, and also to the limited advocacy/marketing done by them (compared to, say, Ruby or Haskell).

A lot of the OCaml community is French-speaking.

OCaml led to F#, which is getting some level of adoption, but perhaps limited to folks OK running Windows, as not everyone knows/likes Mono. The fact Microsoft ships a very complete set of tools for F# is pretty cool.

Wrong. There are three main things I want from a programming language: first class functions, strong static types, and purity. Haskell is literally the only language out there today that has these three characteristics and is mature enough for production systems.

It's different. It's like you were listening to one music genre all your life and there's suddenly something new.

If, say, Python was the only language with "for(each)" loops, you'd see many blog posts about that too. After I saw this, the old style of iterating by index feels so antiquated. Haskell gives the same feeling many times. It has unique features when it comes to abstraction and I feel they are the right way to program.

strangely enough, I think one of the interesting features (type classes) only ended up reappearing in Go with its interfaces (granted, only in a very limited fashion).

I think one of the most important features of typeclasses is the ability to be polymorphic on just the return type of an expression. For example, there is a typeclass called Read which comes with a function called read:

    read :: Read r => String -> r
That is, you get a function from a String to whatever type is in the typeclass. It's the opposite of toString. This is also used in a whole bunch of other contexts like numbers--numeric literals are polymorphic, letting you add any numeric type you want and still use literals for it.

As far as I know, Go cannot do anything of the sort.

There are some other nice features of typeclasses, like the ability to have multiple types. That is, you could have a typeclass for multiplication that allowed you to multiply two numbers, two matrices or a number with a matrix. You can even have typeclasses that are recursive, allowing you to define them for an infinite amount of types: you could have a typeclass for functions of any number of arguments, for example.

I think Go interfaces do none of that as well.

Now Rust, on the other hand, actually* has typeclasses.

I'm personally on a quest to "get" Haskell right now, but I really don't see what's so special about that or unique to Haskell. Languages have long accomplished what `read` does by simply casting/coercing the string into the desired type.

In python, for example, I'd call int() on the input strings I want to turn into integers.

The point is that you use a single function for any type. So in Python you'd have to do int() or float() or customType()... In Haskell, all of these would be just `read'. The type system can figure out which instance to use for you, without having to specify it. Moreover, it's also trivial to write a function that works on any readable type, something hard (although not entirely impossible) to do in Python.

This makes it much easier to use: whenever you want to get any type from a string, you just read it. This is just like being able to print values of any type, except for parsing.

This can also be used with constants rather than functions. So maxBound is the maximum value for any bounded type. In Python, the closest you can get to that is something like float.maxBound. (Except, apparently, it's actually sys.float_info.max.)

As I mentioned, this also lets you define new numeric types that can still use the same literals. For my most recent project, I needed 18-bit words. I could do this and still write expressions like `x + 1` using the Word18 type. Moreover, it would be very easy to make my code generic over the exact type of number used--this would make it possible to use numbers of different sizes or even something more exotic like random variables. (It happens to be tricky because some of the semantics I was working with rely on having exactly 18 bits, but that's an issue with the domain and not with Haskell.)

In another language, I would either have to use the normal int type and make sure to always keep track of the overflow myself or I would have to wrap every literal in a function that turned into an 18-bit word.

So the special quality is being able to dispatch on the return type of an expression rather than on the types of the arguments. I think this is very special indeed and extremely useful. I hope this clarifies everything.

That does make more sense, and I appreciate the more thorough explanation. It seems a bit ironic though, that Python makes you more specific and certain about the output type than Haskell!

It is only ironic until you realize static typing can often save you from writing code, and not just require more code.

In general, sure. Doesn't change that the "read" function seems to break the spirit of that.

How so? Inferring which implementation of "read" to use based on the types is exactly the same spirit of inferring which implementation of "length" to use. Haskell can maintain consistency in both, but dynamically typed languages must revert to explicitly choosing the implementation in the former.

Yes, it goes nicely with that spirit. But it doesn't go nicely with "have a well-specified return type for all functions that you know in advance", as "read" can return anything. It's great for languages to pick the right implementation of "length" on the fly, but the point is, they all return integers. "read"? Who knows what type you'll get back in Haskell, the "you must specify [or at least plan out the] output type" language? There, it seems to break the trend.

Read in Haskell returns a well defined type, it just happens to be polymorphic.

By that broad a definition of "a well defined polymorphic type", so does every function in every language, at least those that can say "everything's an object/function/data/etc!"

the type is statically checked, not dynamically checked. Unlike something that just returns an unknown existential "object" type.

Please explain more thoroughly and with fewer shortcuts.

These two types in Haskell are very similar:

   show :: Show a => a -> String
   read :: Read a => String -> a
And there's also a law that connects their behavior:

   read . show = id
   show . read = id
There's nothing special about the position of the polymorphic value, it can be the argument or result. In either case, it is fully statically typed, and you never need to "downcast" it to actually use it. There is complete type safety.

Note that this is based on a correction of an OO languages' mistake: it separates the passing of the function table parameter from passing the values. This allows passing a function table for any position in the type, and not just for the first-argument-type.

Having something like:

    Object read(String x) { .. }
Is entirely different for two reasons:

* The implementation isn't chosen by the type

* You will need to eventually convert the "Object" type to the specific type you need.

In Haskell, when you write:

  read :: Read a => String -> a
It is actually a shorthand form for:

  read :: forall a. Read a => String -> a
The "forall" is called "universal quantification" and here it is like an implicit parameter "a" (think, template <a> parameter). Caller gets to pass in/choose any "a" they want, and as long as there is a valid Read instance for that type, it will work.

However, in you could also (in pseudo-syntax):

  read :: exists a. Read a => String -> a
This is called an "existential quantification", and it is not like an extra parameter "a", but like a tuple: (a, Read a => String -> a). It means: "There exists a type 'a' such that this will be the result type of read". i.e: The caller does not get to choose which type "a" it is, but instead the caller gets a value of an arbitrary, unknown type that happens to have a Read instance.

The only way to make any use of a value whose type is existentially quantified like that is to unsafely "cast" it to the type you want, and hope this is the correct type actually returned by "read" here.

This is of course nonsense in Haskell, and nobody would ever do that. However, it is very typical OO/Java code.

Whenever a Java function returns an Object, it is basically returning an existentially quantified variable, and the caller needs to correctly "cast it down" to the correct type.

Instead of repeating the same thing over again with more words, perhaps you can explain why it's "in the spirit" of Haskell not to know the return type (except with the vagueness of "Object") of a function in this one case.

I explained how you do know the return type, it is of some known, universally quantified type variable.

A type variable is a compile-time known type.

There's a difference between a type like Object and a polymorphic type. This is easy to see with generics in a Java-like pseudocode:

    public foo(Object o) { ... }
is very different from

    public foo<A>(A o) { ... }
In Haskell, you always use the second style of function--there is no sub-typing, so there is no real equivalent to the Object type in Java.

We can imagine something similar for read. If we didn't know anything about the type, it would look something like this:

    public Object read(String s) { ... }
instead, it's actually something like this:

    public A read<A>(String s) { ... }
So whenever you use read, you would be specifying the type it returns. I imagine it would look something like this:

    read<Integer>("10") + read<Integer>("11")
This is exactly how read works in Haskell. The important difference, however, is that the type system can infer what type read is supposed to be. So the above code snippet would look like this:

    read "10" + read "11"
If you made the types explicity, it would look like this:

    (read "10" :: Integer) + (read "11" :: Integer)
So you always know what type read has when you use it. But what is the type of read itself? The Javaish version looked something like A read<A>(String str). The important part is the generic A: it's a polymorphic type variable. In Haskell, the type is similarly polymorphic: String -> a.

Of course, the type isn't quite this general: you can only read things that you have a parser for. In the Java-like language, it would probably look roughly like:

    public A read<A extends Read>(String str) { ... }
In Haskell, we do not have any concept of "extending" a type: there is no sub-typing of any sort. Instead, we have typeclasses which serve the same role, so the type ultimately looks like this: read :: Read a => String -> a.

Hopefully this clarifies how you know what type read has. Really, it's no different from any other class like Show. There is a very clear parallel between show and read:

    read :: Read a => String -> a
    show :: Show a => a -> String
Being able to take advantage of this sort of symmetry in the language is extremely useful. My favorite example is with numeric literals, which are polymorphic:

    1 :: Num n => n
In my previous Java pseudocode, this would look something like:

and would be used like:

    1<Integer> + 1<Integer>
    2<Double> * 3<Double>
Of course, this is hideous, which is why type inference is so important.

Thanks, that second link looks really interesting!

Aren't Rust "type-classes" also limited and use single-dispatch?

I haven't used Rust very much yet, but my impression is that they can dispatch on the return type at the very least. So they can have a Num typeclass. Perhaps they're limited in some other ways, but it sounds like a good start.

Rust has them too, it's described in this very interesting blog post that I haven't had time to fully read yet:


"We also manage to combine traditional class-oriented OOP with Haskell’s type classes in a way that feels seamless to me."

Go's interfaces are not like Haskell's typeclasses. Just try and write a Go interface that allows you to add two of the same things together.

What do you mean add two of the same things together exactly? If I'm not mistaken, what you are talking about is possible but I don't totally understand what you are saying. Could you provide an example?

The types of polymorphism are different. Haskell's parametric polymorphism is compile-time polymorphism, Go's interfaces are run-time polymorphism.

If we draw a parallel to C++, Haskell's polymorphism is like templates, Go's interfaces are like abstract base classes (except that they use compile-time duck typing).

For example, consider the (+) function in Haskell:

  (+) :: Num n -> n -> n -> n
If you use an Integer as the first argument, the second argument must also be an integer, as well as the return type. You can observe this by currying, binding only the first argument:

  Prelude> :type (+) (1 :: Integer)
  (+) (1 :: Integer) :: Integer -> Integer
  Prelude> :type (+) (1 :: Double)
  (+) (1 :: Double) :: Double -> Double
In Go, you could write an interface Num and define a function:

  func foo(a Num, b Num) Num
However, if float64 and int both implement the Num interface, you could also foo a float64 and int. And it would return something that conforms to the Num interface, but what type is actually constructed depends on the implementation of foo.

In other words, type classes have relatively little to do with Go interfaces, aside that they both implement a (different) form of polymorphism.

It's possible to make something akin to Go's polymorphism in Haskell, but you'd need to hide the type parameter of the typeclass using existential quantification.

  data NumBox = forall n. Num n => MkNumBox n
You can now use the NumBox for runtime polymorphism:

  Prelude> :type [1::Int, 4.4::Double]
     Couldn't match expected type `Int' with actual type `Double'
  Prelude> :type [MkNumBox (32 :: Int), MkNumBox (22.32 :: Double)]
  [MkNumBox (32 :: Int), MkNumBox (22.32 :: Double)] :: [NumBox]

> If we draw a parallel to C++, Haskell's polymorphism is like templates

It's important to note that even if Haskell's classes feel like C++ templates they are actually implemented more like abstract classes. In C++ you essentially create a new function for each template instantiation. In Haskell the function is passed an additional pointer telling the function how to implement the class. Kind of like a vtable pointer except not bound to the actual object. This has performance implications, so it's an important point to remember when comparing the languages.

GHC is pretty good at specializing type-class-using functions for particular instances, e.g. if you have

  fac :: (Num a, Eq a) => a -> a
  fac 0 = 1
  fac n = n*fac (n-1)

  main = do
    x <- readLn :: IO Int
    print (fac x)
it will create a specialization (without any indirect calls)

  fac_Int :: Int -> Int
and call that (in fact, in this case fac_Int will even be implemented by a worker function of type Int# -> Int# (unboxed ints)).

If you don't want to rely on this automatic optimization, you can always add a pragma {-# SPECIALIZE fac :: Int -> Int #-}.

(I used a recursive example because a non-recursive function would usually simply be inlined, avoiding any indirect calls too, assuming the call site is monomorphic).

That's a good point, however, the important aspect of typeclasses is that they're type-checked, not that they're compiled via monomorphization. Also, I believe that the way that typeclasses are compiled is at the discretion of the compiler, so actually typeclasses may be compiled in the same way as C++ templates. Another point is that monomorphisation isn't always possible, because a function could be run with an infinite number of types.

He's talking about the inability to specify an interface that specifies functions that take two arguments of the same type. Go can't do that.

You can sometimes work around it. For instance consider sorting. The natural approach is to specify a type that allows comparisons. Go can't do that. But in go you can have a collection which has a function Less(i, j int) bool that takes in two integers and compares the objects at those positions (which in turn just happen to be of the same type).

But this is a limited work around, and there are plenty of cases where you can want something more flexible.

>I still wonder to this day why Haskell programmers want their language to be loved so much.

Libraries. It's the reason why anyone would want their chosen platform loved.

I'd say a healthy job market is also something anyone would want for their platform, and the more teams pick it up the more likely it would be for you to find a good job working with some stack you love.

Many programmers find it easier to understand something by trying to explain it. Haskell has a lot of foreign concepts, so you see a lot more articles from people trying to convey their "aha!" moment. Many of these are erroneous or overly exuberant but it's just because people get excited when they first understand something new. It probably does come across as proselytization--that may even be the intent--but if it helps you cope with it, know that the majority of these articles are first impressions from beginners getting their minds blown, and most of them don't stick it out.

> wondering why no one will play with him

That's kind of a strong statement about a language which has approx. 1000 users in it's IRC chatroom.

The number of people present in the IRC channel is more of a sign that they have a very engaged and active community, which is a good sign for the language itself, for sure, but doesn't necessarily correlate to the general adoption of Haskell as a whole.

A measure like http://www.tiobe.com/index.php/content/paperinfo/tpci/index.... (by no means a perfect one, quite like speed benchmarks ironically) seems to indicate that Haskell still has a long way to go before reaching "mainstream adoption". But you're right about that statement, it was intended more as a humorous jab than a statement of absolute truth :)

I don't think Haskellers particularly want mainstream adoption. Adoption among library writers would suffice... :)

I've seen far more people advocate Python than Haskell, and look at it now!

Of course, since Python is already popular and widely used, this has died down a bit since. That said, even on HN, where it seems almost everybody is using Python anyhow, I still see at least as many comments promoting Python as Haskell.

I get exactly this vibe from a certain other language, but never Haskell.

"You want to play? Great! The rules? Perhaps it's easiest if you understand category theory first..."

I still didn't take the time to learn CT (after 4-5 years of Haskell). It might be time to do so, though, because I keep seeing the awesome things people with a CT background apply it to.

However, you can get very very far in Haskell without ever doing or touching any CT.

Yep, but mentioning it is extremely offputting to new users. And like you say it's simply not true.

>I still wonder to this day why Haskell programmers want their language to be loved so much.

I don't. I just talk about it so that some of the brighter people out there will try it out. This increases the pool of people for me to hire as haskell programmers that can get started right away instead of having to learn first.

Where do you work and what are you hiring for?

I ran into similar issue few years ago where lack of knowledge combined with large amount of fanboyism clouded someones mind. In a popular language comparisons site there was a threading benchmark. It was simple, just start 256 threads. Haskell beat C by far.

I did strace. The C benchmark actually spawned 256 threads and the Haskell one spawned 4. The response I got was that it's a builtin feature in Haskell that it uses lightweight threads internally. And because pthread doesn't do that you cannot do it in C and therefore the comparison is valid. Even though C doesn't have the concept of threading and everything must be done with an external library in anycase.

I could dig the old thread up but I've given up with those. I have nothing against high level languages, I use them a lot. However I find it very weird that some true fans feel the need of creating enormously biased benchmarks and then being proud that those benchmarks show that their favourite language is the best.

I've never worked on a project where my job was to create 256 threads. Instead, I would be tasked with processing lots of requests or datasets in parallel. If Haskell provides convenient and idiomatic ways to do this with lightweight threads that C does not, then C is effectively slower. If C has a commonly used and available library to implement this same approach - then maybe its faster. The whole point is we are comparing implementations of a programming task.

C doesn't impose a specific threading model. So a C threading benchmark is specific to the library you are using. And if you choose a heavyweight thread library to compare to a language which is not using heavyweight threads, it is not an apples to apples comparison. Arguably it could be, if C imposed a specific threading model, which it does not.

You are comparing implementations, not languages

Yes and that was my point. The benchmark is a comparison of implementations - compilers, run times and libraries. I am not sure how you would benchmark a language.

is there a nice lightweight threading library in c that distributes _many_ small threads on _few_ (not one) posix threads? if so: please tell me. i am genuinely interested.

You could use green threads in C but you'd also need asynchronous io to work with these green threads, and an ecosystem of libraries around it. Ghc has it therefore it makes sense to use its by default green threads. In c there is no standard ecosystem around green threads therefore it is harder to write the benchmarks that way.

This misses the point on why green threads are, and can be the default behavior in Haskell. It's because the purity allows it to be.

Sure you can make "green threads" in C, but you as the programmer must be very careful how you use these threads. You can't go accessing some global state (or performing many different side-effects) from them freely - so you must design your code to be as "purely functional" as possible to make any sense of them. The problem is when you come to use somebody elses code - how do you know it doesn't cause side-effects, if say, you only have the object and header files?

You don't, so at best you could "trial and error" until you find out that they're safe enough to use in green threads, but some bug might come back to bite you in the long run.

Haskell prevents this from ever being the case, because any function which does have side-effects must clearly express the fact in it's type signature. A green thread API can therefore specify the limitations on side-effects that may occur in it.

Purity has nothing to do with green threads. Many languages that doesn't have the concept of purity have green threads, like Erlang or Go. In fact, green threads are not a part of Haskell, but a part of the runtime. Purity is just a nice thing to have when working with threading or concurrent programs.

well. but a no-shared-writeable-data approach (as with erlang) certainly helps. and that comes for free with purity.

i don't know how go handles that though. i'd be surprised if this was as efficient as in erlang and haskell.

I disagree. Threads and green threads are not different in any way that is related to purity.

Haskell makes threads much more pleasant due to Haskell code generally being much more orthogonal. But even in C, it would make more sense to create green threads rather than actual threads (at least any thread beyond the first thread-per-core) in most settings you'd actually use threads for.

"Just start 256 threads" means you're testing the OS (or /maybe/ your runtime libraries) not the language.

I think that if you were comparing, say, erlang threads to Haskell threads then you would be comparing the languages because both of them have their own internal threads.

Because C has no such concept, and can only use hardware/OS threads, the comparison is moot.

Could everyone on HN just take a course in languages theory so we can all stop with these stupid trolls about the best languages which have been emerging for a week.

Hopefully it would allow everyone to realize that a language is just some syntax and semantic and that a compiler is just a program like another. Nothing sacred here. Hell you can even do imperative programming in Haskell if you wish. Coding an interpreter to do so is not particularly difficult.

With a bit of luck, everyone would understand at the same time that the compiler is building what is executed therefore the expression"speed of a language" is utter nonsense. You can only speak of the speed of the implementation which, yes, vary widely amongst compilers for the same language.

So, yes, Haskell semantics encourage functional programming, C semantics imperative programming, both are Turing complete and yes GCC is currently better at optimising C code than any Haskell compiler, nothing new under the sun. Can we all go back to a normal activity now ?

> So, yes, Haskell semantics encourage functional programming, C semantics imperative programming, both are Turing complete and yes GCC is currently better at optimising C code than any Haskell compiler

No. That is not why C is winning. C is winning because of fundamental differences between its semantics and Haskell's semantics that makes it possible to implement C with less overhead on actual CPUs.

If the lesson you took from your language theory course is that all Turing-complete languages are equivalent, you should ask for your money back. While it is true that they are all capable of expressing any algorithm, there are fundamental differences that deeply affect the efficiency of implementing them: http://news.ycombinator.com/item?id=5082609

I'm trying to translate what you are saying. All languages are Turing complete, so it doesn't matter? Do you program in assembly language then? I mean, if it doesn't matter....

> All languages are Turing complete, so it doesn't matter?

Precisely. The whole "my language is better" argument is completely void. Syntax is mostly a matter of preference. Semantic will make the structure of your program different but in the end there is no actual difference on what you can do only on how you will do it. Once again it mostly boils down to preference.

If you really need to argue about something go for my compiler/optimisation pass is better that at least does make sense. Note that there is no theoretical bound preventing an Haskell compiler to generate code equally fast than the one of a C compiler for any program.

> Do you program in assembly language then?

I did when I had to (quite a long time ago). While languages don't matter, compilers do when you need efficiency. While enjoying ML-based languages more, I also did some C. As I stated before, C compilers are better. Well, I even used my own little syntax-extension of C for a moment (if you don't have to share the code, you can do whatever you want with it).

> Syntax is mostly a matter of preference.

And to paraphrase Feynman, mathematics is mostly the invention of better syntax. One thing I'd say which distinguishes Haskellers is that they see their programming language as a piece of mathematics. A Haskell program is a bunch of equations, and the semantics are given by domain theory.

> Note that there is no theoretical bound preventing an Haskell compiler to generate code equally fast than the one of a C compiler for any program.

And there's no theoretical bound saying that C is faster than Brainfuck. But no Brainfuck implementation will ever catch C for, say, parsing, and I'd be willing to put money on that.

This "all Turing-complete languages are on equal performance footing" is nonsense. Just because you can't write a proof that shows a result mathematically does not mean that a difference does not exist.

>>"Precisely. The whole "my language is better" argument is completely void."

You say later that the difference is "only on how you will do it". If you stop to think for a second, you come to realise that syntax, abstractions and the big elephant in the room: how "tuned" a language is for the myriad of computing platforms/operating systems/domains (scientific, business, web) -among many other factors- play a HUGE part in how you can develop software and how performant it can/will be. So languages does matter.

>>"Note that there is no theoretical bound preventing an Haskell compiler to generate code equally fast than the one of a C compiler for any program."

But it won't, because at the end of the day, a machine has to execute the programs written in a language and C was designed for a machine with certain characteristics in mind while Haskell/Lisp and other higher level languages are an exploration in abstractions.

When Abelson & Sussman says: "Programs must be written for people to read, and only incidentally for machines to execute.", there is an implicit understanding that trade-offs are being made and its not the same as C and other lower level programming languages. Until a machine appears that isn't limited by the constrains of the present, making trade-offs is inevitable and that too, does matter.

@jacquesm: Great write-up. Enjoyed every line of it.

P.S: I make no distinction between a language and its implementation here.

> Syntax is mostly a matter of preference.

While I halfways agree here, syntax follows semantics.

If a language gives you a syntatical sugar form of something that can only be expressed through many lines of code in another, then it makes a great deal of difference.

If one language cannot be better than another, why is it so hard to name examples of BF programs that paid their own development costs and decades of research costs? I can think of a language for which there is such an example, by the way:


>Precisely. The whole "my language is better" argument is completely void. Syntax is mostly a matter of preference. Semantic will make the structure of your program different but in the end there is no actual difference on what you can do only on how you will do it.

That is extraordinarily wrong.


1) affects the structure of a program,

2) affects how we think about it,

3) affects what is easy to do and what is not,

4) adds or removes mental strain on reasoning about the program,

5) forbids or enables certain kinds of errors.

And much more.

The idea that syntax doesn't matter and is "a matter of preference" is completely bogus. The human mind has certain limitations and capabilities. Some syntaxes cater to that, some do not.

No one using Brainfuck will never be as productive as Python, to take an extreme case.

> No one using Brainfuck will never be as productive as Python, to take an extreme case.

Isn't it a good thing then, that no one prefers using Brainfuck to using Python at work? Leaving things up to preference doesn't mean that all the choices are the same. It means that the languages and the use cases are so varied that it isn't a good idea to restrict oneself blindly to a language before evaluating the use case. It doesn't seem like you are really in disagreement.

>It doesn't seem like you are really in disagreement.

Oh, I'm very, very much in disagreement.

It's not just about "preference" as if preference is arbitrary.

Superficial syntactic preference IS arbitrary.

Substancial syntax issues are not.

Brainfuck is not just left alone because people don't "prefer" its syntax. That's reading it in reverse. People don't prefer Brainfuck's syntax because it is objectively, and for reasons related to human psychology, cognition, etc bad.

E.g it makes it measurably difficult to discern different program states.

Preference isn't arbitrary, people make reasoned judgments to determine what tools they will use to get their work done.

No one spends 40 hours a week with a language without knowing its strengths and weaknesses.

> No one spends 40 hours a week with a language without knowing its strengths and weaknesses.

I wouldn't take that bet.

It amazes me how bad developers seem to be prepared nowadays in compiler design and language theory.

Back in the day we were able to understand that language and implementation are separate concepts. As well as why certain languages had a specific implementation as the default one.

Now young developers seems to think language and implementation are the same thing.

Many languages go so far as to specify a virtual machine to run the compiled program, so your statement doesn't make much sense in that context. Languages, machines, libraries and implementation are all deeply interconnected. Try writing C without a stack. Try writing Java without java.lang.

Thank you for brilliantly proving his point. What you wrote just prove that you completely mixed what is a language and what is its implementation.

Of course C semantic uses a calling stack in its definition. Does that mean you need a physical stack to implement it ? Not at all, you can perfectly implement C in anything as long as you can simulate a stack. Just look at the JVM and Dalvik. Dalvik is register based while the JVM is stack-based and they both run Java.

java.lang is a standard library. It's only part of the language definition in a really broad sense. It's actually written in Java. Any complete Java compiler can and will compile it (with some modification for the purely JVM oriented functions). There is a lot of projects out there trying to compile Java on a target which is not the JVM.

You just prove my point.

Of course languages define virtual machines, or in languages like C the abstract machine model, but that is only how the developer sees the machine through the language semantics.

This does not change in any way whatsoever how the language is actually implemented.

This discussion is derailing. From the top-level comment:

Hopefully it would allow everyone to realize that a language is just some syntax and semantic and that a compiler is just a program like another.

Which is true. However, this glosses over the fact that it is prohibitively hard to write efficient compilers for some languages. For instance, Prolog's semantics are so far removed from our machines that it is terribly hard to make it efficient for general purpose programs.

Semantics do make some languages harder to compile to efficient machine code than others.

I think it is usually easy to determine by context when someone is referring to a language and when someone is referring to the typical implementation, libraries, etc that come along with a language. In the context of performance benchmarking it is clear (to me at least) that the benchmark would be of a particular implementation, of which there is usually an obvious choice. I don't think there is really any misunderstanding of that.

This is totally wrong. Different languages have different levels of power. Sure, you could always write an interpreter but that's exactly the difference between a power language vs. a weak one: to do X will I need to write a whole interpreter here or can the language just do it?

Lisp is so powerful because you can do things like delayed evaluation of arguments, via macros. With Haskell you don't even need this due to the lazy evaluation.

Some people like bickering. But I think the point of the article is that C is, by nature of the language, easier to optimize than Haskell, since it is lower level.

> C is, by nature of the language, easier to optimize than Haskell, since it is lower level.

Depends what level you mean - if you want to do all the optimisations yourself, C is better; but if you want to write code and let the compiler do the details, I would think higher-level is better (you can just write doWhatIMean() and the compiler will automatically choose the optimal implementation for your current problem and platform - where if you'd specified the implementation details yourself, you'd be sub-optimal in many cases)

> "(you can just write doWhatIMean() and the compiler will automatically choose the optimal implementation for your current problem and platform - where if you'd specified the implementation details yourself, you'd be sub-optimal in many cases)"

How far are we with declarative programming in sense of being expressive and expecting the compiler to understand what we mean? I'd believe that without strong AI human will always be better off with imperative programming than a compiler with declarative, for anythig beyond relatively simple isolated pieces of code/algorithms. In general we have so much more experience and knowledge about the target system and environment than the compiler does, this is less true for JIT compilers, but they come with their own overhead.

I'd really love to see superoptimization[1] to be done as a service for compilers. Say you have a function with certain semantics the compiler is fully certain about. The compiler fingerprints this function and uploads a fingerprint + behavioral analysis to <somewhere in cloud> where it's being exhaustively optimized by bruteforcing all the meaningful instruction sequences which conform to the semantics of the function. After the most optimal piece of instructions is being found, it's associated with the finerprint and stored in a database and then returned to the compiler. Now, when ever someone else writes the exact same function(or code with exact same semantics) the compiler queries the <some cloud service> for the optimal piece of code. Of course, a system like this would need more information about the actual target of the code(CPU architecture, cost of things like memory access, cache miss, branch mispreciction etc.).

[1]: https://en.wikipedia.org/wiki/Superoptimization (The system I roughly described is far more extensively analyzed in some of the papers. Really great read!)

I disagree. Have you read any articles on stream fusion in Haskell? You can tell the compiler that certain actions are equivalent and suddenly some code that makes two passes down a tree makes just one pass. Doing the same thing in C would be a lot more work precisely because you're working at such a low level.

"Third, a comparison on speed between Haskell and C is almost meaningless, speed is not Haskells strongest point and it would be folly to bench it against C for that particular reason."

I disagree wholeheartedly. If I am choosing Haskell over C, I am giving up some (possibility of) performance. The question of how much is an important piece of information, entirely relevant to that decision.

As I observed in a comment on that other post, the actual performance of both the Haskell and the C depends on the amount of effort expended to make them faster (first in general, and then possibly on a particular architecture). At the limit, the C beats the Haskell by some margin - the size of that margin is informative; but that's also not the whole story - what the margin is earlier also matters, and for a particular project, it might matter more.

This is not to say that the particular benchmarks in the earlier article were good choices - I don't have a clear position on that.

Nice refutation. The problem about doing language comparisons for speed is that they generally require a non-trivial example, so you have to be an excellent programmer in all the languages you've compared.

Of course, language speed charts are generally as useful as PC spec charts and YouTube videos of Nürburgring laptimes when you're buying a new car.

well, what about an existing piece of software, and compare on more than just speed, but ease of maintenance (meaning, how quickly you can add features or fix a bug)?

There seems to be a myriad of window managers and web frameworks. Someone should do an unbiased comparison.

Same thing again, it depends entirely on really specific domain logic. How easy it is to add a new feature on a C based web browser versus one built on Haskell is mostly useless, it just tells you how quickly one person added one feature on one specific thing.

Languages aren't inherently comparable, you can get useful benchmarks but they shouldn't be taken as gospel.

This article starts by totally misrepresenting the original article (http://paulspontifications.blogspot.co.uk/2013/01/when-haske...) by saying that it said "Haskell beats out C when it comes to speed".

If you bother to read even the TLDR of the original article you 'll find it says something completely different, which makes this article pretty poor as a response.

From TFA:

"So here is the bottom line. If you really need your code to run as fast as possible, and you are planning to spend half your development time doing profiling and optimisation, then go ahead and write it in C because no other language (except possibly Assembler) will give you the level of control you need. But if you aren't planning to do a significant amount of micro-optimisation then you don't need C. Haskell will give you the same performance, or better, and cut your development and maintenance costs by 75 to 90 percent as well."

Note the 'same performance or better' in there.

Maybe you missed that bit in the original article?

This wasn't a large effort by any stretch of the imagination and a factor of 5 difference compared to the Haskell code isn't even in the same ballpark as "the same performance", and about a factor 10 difference with the C code listed in the original article. You'll notice no micro optimizations were done in the re-write, just some global re-arranging of the way data flows through the program.

The rest is in response to the misleading title, that Haskell is 'faster' than C, faster to program, faster to debug, easier to maintain and so on while making claims about performance that are not born out by real world comparisons.

Speed was the wrong thing to compare on here, and should not have been part of the original article without a serious effort to become equally proficient in both languages.

Compared to the amount of optimisation that many people typically do (i.e. zero or close to zero), it was a large amount of extra work to get that performance boost. The global re-arrangements of the way data flows through the code are more difficult than many micro-optimisations, not less, so I don't understand how you can claim.

And you don't have to be an expert in C to make the claim he is making, because most people are not experts in C, but might turn to it because "it's faster". It's precisely this kind of person who needs to understand that with their level of knowledge, and the amount of effort that they might put into optimisations, Haskell can be just as good or better.

> it was a large amount of extra work to get that performance boost.

Not really, actually. It took a lot of time to document the transformations, but actually doing the coding was short, much less than an hour total.

> The global re-arrangements of the way data flows through the code are more difficult than many micro-optimisations, not less, so I don't understand how you can claim.

Maybe more difficult for you, but not for me. Micro-optimizations are actually harder for me, there are infinite possibilities there, flow has really only one (or very few) 'proper' mapping(s) from problem space to solution space.

> And you don't have to be an expert in C to make the claim he is making, because most people are not experts in C, but might turn to it because "it's faster".

Every tool has its uses. If you pick C then probably you're doing so because you are convinced that you need it. You should then study how accomplished users of that tool use it, not stop at the first naive use that you come up with yourself.

> It's precisely this kind of person who needs to understand that with their level of knowledge, and the amount of effort that they might put into optimisations, Haskell can be just as good or better.

That's very well possible, again, I do not know Haskell so I can't comment on it. But I don't see any meaningful comparison between the two languages here, speed certainly isn't it and you failed to acknowledge that the article in fact did make that claim.

If Haskell, Lisp and friends are so great for building large software systems then surely people are going to flock to them and do exactly that. And if that didn't happen, of course the proponents of these languages would sit down and try to figure out why this wasn't happening. Right?

That's assuming people are rational and willing to put up with large up-front costs on the promise of future returns.

Haskell is different, and learning Haskell is closer to learning programming all over again than just picking up another algolish language. Most programmers don't even seem willing to do the latter, much less learn something truly novel.

The people who do use Haskell professionally (e.g. Standard Charter or some number of hedge funds) seem to be very happy with it.

Now, there are some core problems--largely with education and ecosystem maturity--that make Haskell harder to adopt. Recently, people have actually started working on both of these[1]. But neither is a quality of the language itself.

[1]: http://fpcomplete.com/

Education is a big part of the problem. When I attended university, we weren't even told that functional programming exists. Only later on, through one of my friends it was brought to my attention. And it's hard to change that mindset, particularly at the university I was. They are set to produce a constant stream of Java/.NET developers for the companies around them (large banks, in particular).

Is this common? I'm taking the "Introduction To Programming Languages" at Coursera and we're using SML straight out of the gate. It's really fun and totally different from the Java and before that the basic AS/JS I experimented with. Do universities not tend to offer these types of courses?

Is this an advantage of the free courses (relatively unpopular or obscure languages being taught for learning's sake) which paid courses can't or won't offer?

Which university was this, and which program?

The university in Lucerne, Switzerland. Hence the reference to large banks ;)

Haskellers are aware of various downsides and disadvantages to using Haskell and are actively working to improve on those.

Also, the FP complete company was created to answer that exact question and cater to commercial companies who want to use Haskell. They provide training, and survey them to see what is holding them back from using Haskell, and providing them with solutions.

Not necessarily. Other languages will borrow ideas from these languages and that way many strengths of functional programming can be applied in everyday code. I don't think there are many people who would outright claim that functional programming doesn't have benefits at all which could be used in imperative languages too.

Check out GNU Emacs some time. Last I looked the standard distribution came with a million lines of Lisp code.

>If Haskell, Lisp and friends are so great for building large software systems then surely people are going to flock to them and do exactly that

No, I have seen no evidence to suggest that a majority of people will do the best thing when given the option. Popularity does not correlate well with quality. Or are you seriously suggesting "Transformers 19: BANG BANG EXPLOSIONS!!1" is the pinnacle of human artistic achievement? Those who realize how good haskell is are using it. Why would we stop and try to figure out why the majority are still watching Michael Bay sequels instead?

The original article did mention that switching to use lines instead of characters would probably speed things up considerably, but argued that this makes the program harder to understand, while Haskell makes the same optimizations automatically on the character-based program.

And that's a fine point to make, but I do think it was a little misleading and handwavy. The proper thing to do would be to make the change and show the difference both in code and performance, and let the readers draw their own conclusions.

The 'line reader' version of the code (not shown in the article) was actually a lot easier to read than either one of these but the article was already over long. I do have that version, if anybody is interested I can add it.

i'm sure you have all your intermediate steps as well. i really liked an example i've seen a while ago that described an optimization process in git history. you'd simply check out the right step and have a look for yourself.

Yawn. It is not possible to directly compare languages wrt speed: we only can compare the speed of their implementations and reason about the features that make a language amenable to optimizations.

And we all know this.

So why, despite this knowledge, do we engage in "language speed flamewars" like this?

That's a valid question, and it belongs in the field of psychology of hackerhood. Does anyone have a reply to wager?

> So why, despite this knowledge, do we engage in "language speed flamewars" like this?

That was exactly the point.

The reason why this happens is because people are - just like me - invested in their tools, they want to tell themselves they made the right choice. And then they want to tell the world the same thing.

For me it doesn't matter which language is faster in a project. The parameter space that decides which language is 'right' is rarely determined by raw speed. But for those cases where raw speed is the determining factor you could do a lot worse than using C, and if you have latency requirements it is almost unavoidable on most ordinary platforms that we use.

I for one find things like these extremely interesting. The more there's analysis about performance implications of certain ways of doing the task the better. The only thing I'm left missing here is to actually look into the profiling information(branches, caches, instructions per cycle etc.), especially on the compiler generated assembly to see what exact instruction sequences are to blame, and if there are ways to save cycles here or there.

If only there were more things like this in the internets with deep analysis about performance of a given piece of code... :)

It was pretty long as it was... Tacking on a code generation section to show why certain things are inefficient (and the associated detour to how the standard library is implemented which would be required reading to understand all the implications) would make the article three times as long. I promised I'd have it done by Monday.

> So why, despite this knowledge, do we engage in "language speed flamewars" like this?

Because here in the real world, speed sometimes matters? And many languages are fairly closely associated with given implementations that live in the real world, and are more or less fast.

These days, I mostly write Ruby because time to market is more important than having code run fast, but in its current form, it's generally a slower language than C. There, I said it! Yes, of course I really mean implementation, but realistically, there are no Ruby implementations that are faster than compiled C code.

I submit that these are really always about implementation speed. And this matters because at the end of the day we will be using an implementation, not a pure language without implementation.

Excellent. Documenting successive transformations on code is much more instructive than showing a final snapshot. This is one reason I really enjoyed doing live-coding when teaching programming.

This is a classic pattern: when advocating for one of several competing approaches, you spend more care optimizing your favourite approach than the competitors.

Prof. Geoffrey Hinton specifically calls this out as a pitfall in research. It's better to compare your promising new approach versus another group's approach (not your own implementation thereof). This, assuming that the other group has deep knowledge in tuning their competing approach.

I think it's only human. The standard should be that you present your case for your 'favorite' and then put out an open challenge to supporters of the opposition to do their worst.

The problem with fringe languages has always been the "all talk - no walk" nature they all seem to have.

Stop telling me why you are so awesome. Just show me with shipped product.

Until then it's just an intellectual circle jerk.

Talk less. Ship more.

The old appeal to popularity. The only argument that is worse is the appeal to faith.

An intellectual circle jerk occurs when one repeats the same old mantra without doing any actual research. There are plenty of "shipping" products in languages like Haskell. I'm no expert on Haskell, but this wasn't hard to find:


A better tagline might be: Think more and don't settle for the status quo.

That page has generated some heat in the past: http://flyingfrogblog.blogspot.ch/2010/05/why-is-haskell-use... (admittedly this guy seems to have some personal grievances with people in the Haskell community) I don't know how much the page has changed since this rant was written, but some of the entries seem to be the same, e.g. 'Ansemond LLC' which is a company with a single mac app that hasn't been updated since a beta in 2009 and only mentions Haskell in an old blog post. The site for that company is so charmingly dated that I feel kind of bad for highlighting it, but if this is the kind of 'industry success story' they think worth mentioning then I'm not sure the wiki page is of any value.

>admittedly this guy seems to have some personal grievances with people in the Haskell community

No, he's a troll. He trolled lisp, then haskell, then ocaml. 2010 fell into his haskell phase, that's the entire reason for that particular pile of bullshit.

The thing is, there is some truth in among the bullshit. That 'Haskell in Industry' page really does have links to non existent products or long dead projects or things that don't seem to be using Haskell at all. And even after that blog post was widely circulated 2 years ago the page remains unchanged, and people are still citing it as evidence for the widespread use of Haskell in industry.

That page is just sad when compared to C/C++/Java.

When Haskell is pushing trillions (downloads/cash/views/ads) every year - call me.


> Bad science. The Haskell community generate an unprecedented amount of bad science, setting out to draw the conclusion that Haskell is great and using every trick in the book to make that conclusion seem feasible. In many cases, it requires quite some effort to uncover the truth. For example, Saynte published results for "naively" parallelized ray tracers and concluded that Haskell was the easiest language to parallelize efficiently, requiring only a single line change. This deception was perpetrated by starting with a "serial" version that had already been extensively reworked in order to make it amenable to parallelization and then disabling optimizations only for the competitors (and even adding their compile times in one case!) in order to make Haskell appear competitive. Moreover, that reworking could only have been done on the basis of extensive benchmarking and development. Even the academics publishing Haskell research are up to it. For example, in the recent paper Regular Shape-polymorphic parallel arrays Manuel Chakravarty et al. artificially close the gap between Haskell and C by benchmarking only cache ignorant algorithms (that spend all of their time stalled on unnecessary cache misses), incorrectly states that such algorithms are "widely used", describes the one-line change required to parallelize the C code as "considerable additional effort" despite the fact that both the serial and parallel C solutions are substantially more concise than their Haskell counterparts, refused to provide me with their C code so that I could try to verify their findings (until I published this blog post, their code is now here) and cherry picked Haskell's highest performing results as a function of the number of threads (which peaks at an unpredictable number before seeing performance degradation). When I voiced my concerns, Don Stewart abused his moderator status on the forum by deleting my criticisms. To date, these flaws in the paper have not been addressed.

Source: http://flyingfrogblog.blogspot.ch/2010/05/why-is-haskell-use...

Guess Haskell knows a thing or two about misrepresentation.

>When Haskell is pushing trillions (downloads/cash/views/ads) every year - call me.

Did you not notice how many banks are on that page?

>Source: http://flyingfrogblog.blogspot.ch/

Are you seriously referencing Harrop to support your argument? You do realize he is a troll right? Not like, people disagree with him, an actual, original definition troll. He jumps from language to language as people learn to ignore him, and has gone through trolling every major functional programming language except F#, which is the one he currently says is perfect and ocaml is shit. Before that, ocaml was perfect and haskell was shit. Before that, haskell was perfect and lisp was shit. Google his name, there's literally thousands of messages from him in mailing list archives demonstrating this.

Those banks do not have significant Haskell investment or units - so your point is void.

Yes, they do. Does posting your made up nonsense online help you believe it?

There is truth in this. Certainly shipping more is good. But the situation is somewhat analogous to the perennial debate about commenting code, with "talk" playing the comment role and "ship[ped code]" playing the code role, so that your "talk less ship more" position becomes analogous to the perennial "code should be self-documenting" position. There is truth in that: to an important extent, well-designed code can be "self documenting" and thus speak for itself. Analogously, to an important extent, shipping code can work so well that its demonstrated excellence tells you something about the techniques that it uses, and thus it speaks for itself. However, it's hard to use that kind of speaking-for-itself effect to communicate some abstract issues , such as the issue of why something was done that way. Text explaining "here's why such-and-such [inheritance/functional/macro/declarative/whatever] techniques are important here" can be hard to replace with "look at the code". This problem can be quite severe for languages with extremely unusual paradigms, like the Haskell or Prolog examples that people have mentioned.

Beyond that, many important issues only become significant in sizable programs, to the extent that they are imperceptible in examples less than a few pages. (This is not only an issue for language features: books like _Refactoring_ and _Large-Scale C++ Software Design_ are devoted to issues that become important with scale, and would probably be very hard to appreciate for someone who's never constructed or maintained a piece of software that required more than 5 hours of programmer time.) So if you get your mind around an entire application domain and a software system for it which is a good fit to an exotic programming language --- as, e.g., the HOL Light machine-checkable proof assistant seems to be a good fit to OCaml --- then you will very likely learn something about the strengths and weaknesses of the language and its paradigm. But some fraction of people, even perhaps some fraction of HN readers, will treat a software system like HOL Light as "tl;dr", so a 5-minute or 30-minute pitch on how the language fits the problem may be more effective in practice.

Just a quick point. He quotes me:

    You can't make a computer do stuff faster,
    but you can make it do less work.
That's not original to me, but I don't have (and haven't been able to find) a good, definitive, original source.

Is there a reason "Is" and "Not" are capitalized but "faster" and "than" aren't, because it's just been bugging all day whenever I refresh the front page and look over the list for anything interesting.

This is cool. However, in the spirit of contributing, comparing one byte at a time is not optimally efficient. It's possible to write clever optimizations by hand, but I'd be surprised if just using https://github.com/rbdixon/glibc/blob/master/string/memchr.c doesn't cause a meaningful speedup.

The inner loop of those comparisons is indeed the spot where you can still speed up as noted in the last part of the post, the kind of optimizations that you describe are extremely effective but qualify as 'micro optimizations' and I expressly left those out because they impact readability considerably. But, you're right, if that's what it takes then so be it and then readability would have to suffer in deference to the last couple of % of speed. Maximum gain from this optimization relative to the final runtime is about 20% by my estimation. (Inner loop will step 8 bytes at the time, but will have more instructions).

I concur that doing this directly in your code is extremely ugly; but note that you can get this speedup by just dropping in a call to glibc's memchr(), hiding the ugliness behind a well-known interface.

Good point about memchr, I missed an opportunity there.

edit: because it's clearer and faster, that's a no-brainer, updating the blog post with a remark to that effect and a link to parent.

There is a bug that causes invalid output when the input file is larger then the BLOCKLEN. Set BLOCKLEN to 4096 and run the program on the test input file [1] (compare with output file [2]), to see the problem on a smaller input file.

The bug happens when read_sequence() is called with partial data saved from the last sequence, since (size == read) the first fread() will be asked for 0 bytes which causes the read loop to end early (n == 0).

1. http://benchmarksgame.alioth.debian.org/u32/iofile.php?test=...

2. http://benchmarksgame.alioth.debian.org/u32/iofile.php?test=...

That was a good catch! Serves me right for testing with the small file for correctness and testing for speed with the larger one.

What do you guys think I should learn next? I've been studying javascript/ruby and I would like to diversify what I can make. I don't want to rush learning new languages, but it's interesting to learn about paradigms that certain languages enforce.

What's important to me(not in order):

1) Being an omni-platform developer 2) I only know two high-level languages, so learning a low level language will benefit me. 3) I want to get into Natural Language Processing and deep neural networks/AI, machine learning etc. 4) I'm a self-taught programmer, so the amount of good resources online/amazon is a big factor. 5) I need a fucking job. 6) I like making web applications/web games, and I would like to make native applications/games as well.

I'd love to be able to reason like that about the program I'm writing. Really cool article.

Perfection is achieved not when there is nothing more to add, but when there is nothing more to remove.)

When one programs a computer (in a way it suggested in TAOCP) and you know your data, your memory layout and your CPU instruction set nothing could beat it.

Of course, there are tasks so massive, that a decent compiler could save lots and lots of man hours, but it is a completely different story.

No Haskell (leave alone J* ) could beat a code by people who know what they do close to the metal.

In some languages the code could be shorter much more readable and elegant that C, but of course, it is not Haskell, or J* .

Thoughtfully crafted Common Lisp could be close to an ideal.)

Lisp's homoiconicity comes with the cost of some extra syntactic weight that Haskell doesn't have to bear. (It's also nice that "lambda" is considered so important that it's given a single character in Haskell).

I'm not familiar with J*, and it seems to be difficult to search for. Do you have a reference so I can go learn about it?



I prefer Erlang.


And your illustration is particularly good for showing how much more "sane" (actually, in many ways) Erlang is. And how elegantly different Python could be.

J* is my own abbreviation for Java and Javascript.)

I prefer Erlang.

Just out of curiosity, why is that? (With reference to the linked example, I mean.) Clearly they're in the same league, and so it's a close call. From my perspective, I like that Haskell reserved the single vertical bar for the list comprehension notation, so that it would be as close to familiar set-comprehension notation as possible. I also like how Haskell takes advantage of how the cons operator already implies in the pattern-matching that you're working with a list, and so the square brackets become unnecessary, whereas Erlang needs the noisy extra layer of brackets on top of the parenthesis needed for the pattern.

But once languages get this elegant, it's a close comparison. They're both beautiful to my eyes.

I think the sytnax of Erlang is much more elegant, because it was build around the fundamental idea of pattern matching as a core feature of the language.

When you have a few good ideas put together, you could get an elegant solution. When you just stuff everything inside (like Clojure) or went to extremes (like Haskell) all you got is just a mess.

Erlang syntax, however, is noisy due to all those punctuation, but it is elegant nevertheless.

  map(Fun, [H|T]) -> [Fun(H)|map(Fun, T)];
  map(Fun, []) -> [].
  member(H, [H|T]) -> true;
  member(H, [_|T] -> member(H, T);
  member(H, []) -> false.
This is intuitive, readable, but noisy.)

I'm not sure I follow you. Is the pattern-matching in Haskell somehow not 'core' enough? Both qsort examples linked in this thread use pattern matching.

All I'm trying to say that Erlang syntax is much more intuitive and readable, being derived from Prolog.

I honestly can't see why should I use Haskell (except for being so clever) when I have Erlang, or at least one real advantage.

well, let's look.

map f (h:t) = f h:map f t

map f [] = []

member x (h:t) = x == h || member x t

member _ [] = False

I think the only difference is that you can't match for equality directly in the pattern.

If you're going to do a response blog post link to the original article at the beginning when you mention the original.

I read the original but don't assume everyone has.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact