I think it's surprisingly fast because it's usually much faster than things like SBCL, Racket (last time I checked), and modern JS, while languages like Java and Golang are much less expressive.
I had SBCL around Haskell/Go/Java in my "mental model of language performance", but I don't know much about it and I'm not sure how to evaluate it. From what I understand, Racket uses a bytecode VM and/or a JIT, so Haskell being faster wouldn't surprise me.
For Java and Go, that's true, though in both cases that's mostly a matter of decisions rather than tradeoffs. Go's primitive type system isn't there to make the language faster; it's mostly a design choice. Same thing for Java. I'd say you can get a good part of what makes Haskell good by using an ML as the basis for your language. Lack of ad-hoc polymorphism is a common criticism of ML and its descendants, but Go and Java both lack it too, and both added generics about 10 years after the language was created, whereas ML has supported them from the beginning. Rust seems to have achieved both parametric and ad-hoc polymorphism, but I don't know how easy or hard it would be to "extract" that "part" of Rust (without the borrow checker) and basically stick a GC on it. I also don't know how performant it would be. Lastly, Go, Java, and OCaml all share fast compilation times; Haskell doesn't really, and Rust is known for being slow.
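To make the parametric-polymorphism point concrete, here's a minimal sketch in Go 1.18+ generics terms (names are illustrative, not from any project discussed): one definition that works for any element type, which is the kind of thing ML has had since the beginning and Go only gained recently.

```go
package main

import "fmt"

// Head is parametrically polymorphic: a single definition works for
// any element type. ML-family languages have had this from the start;
// Go only added it with generics in 1.18.
func Head[T any](xs []T) (T, bool) {
	var zero T
	if len(xs) == 0 {
		return zero, false
	}
	return xs[0], true
}

func main() {
	n, ok := Head([]int{1, 2, 3})
	fmt.Println(n, ok) // 1 true
	s, _ := Head([]string{"a", "b"})
	fmt.Println(s) // a
}
```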
Again, I'm far from an expert on all of that, but I think Haskell is in the right place in terms of performance and expressivity. It's one of those cases where designing a language for performance in the first place, instead of retrofitting it, makes things much easier, I think.
> Racket uses a bytecode VM and/or a JIT, so Haskell being faster wouldn't surprise me.
They support AOT native-code compilation too, but they say the JIT generally performs better: https://download.racket-lang.org/docs/5.1/html/raco/ext.html. My experience in the past, though, was that it didn't perform that well. Maybe it's improved.
> Go doesn't have a primitive type system because it makes it faster, but mostly because that's a design choice. Same thing for Java.
I agree: a primary goal for both languages was clarity, even at the expense of some generality. Java was very slow for a decade (HotSpot wasn't released until 01999 and wasn't very good until about 02002), but a few billion dollars and a Fortune-500 bankruptcy later, it's really pretty zippy! (As long as you don't care about startup time.) Golang, by contrast, was designed with higher performance goals, which it fell somewhat short of.
> Lack of ad-hoc polymorphism is usually a criticism of ML and its descendants, but Go and Java both lack it too
Ad-hoc polymorphism is absolutely fundamental to Java; one of its big differences from C++ was that nearly every callsite implied ad-hoc polymorphism, and that's where a lot of the optimization effort went. Golang isn't quite so radical about it, but its interface mechanism exists for almost nothing but ad-hoc polymorphism.
OCaml has ad-hoc polymorphism, FWIW, but people don't use it much.
> both added generics 10 years after the language was created
Golang was released in 02009 and still doesn't have generics, but it will this year. Not far from 10 years.
> Haskell ... It's one of these cases where designing a language for performance in the first place instead of retrofitting it makes things much easier, I think.
I wouldn't have listed Haskell as a language designed for performance. When it came out in 01990 the conventional wisdom was that garbage collection was inherently super slow, especially if you couldn't avoid allocating in your inner loops (by mutating instead).
Allen Wirfs-Brock says Tektronix Smalltalk for the 4405/4406 "may have been the first shipping commercial product, running on an off-the-shelf processor, to use a generational GC" http://www.wirfs-brock.com/allen/things/smalltalk-things/tek... in 01986, only four years before Haskell was designed. That was the key advance that made efficient pure functional languages feasible, and it was still too early to be sure how it would shake out. That's one reason Java (still Oak in 01992) entirely lacked implicit heap allocations: the conventional wisdom was still that heap allocation was very expensive because it required the GC to run more, so they didn't want to impose that on programmers. It was sort of a self-fulfilling prophecy, because even today the way to make your Java (or Golang) go fast is to avoid allocations in your inner loops. Not a big deal in SBCL, LuaJIT, OCaml, or pretty much anything else with a GC heavily tuned for allocation.
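To make the "avoid allocations in your inner loops" trick concrete, a hedged Go sketch (function names are made up for illustration): the second version takes a caller-owned buffer so a hot loop can call it repeatedly without generating garbage for the collector.

```go
package main

import "fmt"

// SumSquaresAllocating allocates a fresh slice on every call,
// generating garbage the GC must later reclaim.
func SumSquaresAllocating(n int) int {
	squares := make([]int, n) // heap allocation per call
	for i := range squares {
		squares[i] = i * i
	}
	total := 0
	for _, v := range squares {
		total += v
	}
	return total
}

// SumSquaresReusing takes a caller-owned buffer instead, so a hot
// loop can call it repeatedly with zero allocations -- the standard
// way to keep the GC out of Java or Go inner loops.
func SumSquaresReusing(n int, buf []int) int {
	buf = buf[:0]
	for i := 0; i < n; i++ {
		buf = append(buf, i*i)
	}
	total := 0
	for _, v := range buf {
		total += v
	}
	return total
}

func main() {
	buf := make([]int, 0, 1024) // allocated once, reused below
	total := 0
	for iter := 0; iter < 3; iter++ {
		total += SumSquaresReusing(10, buf)
	}
	fmt.Println(total) // 855
}
```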
Other aspects of Haskell also involved a lot of performance research. Simon Peyton Jones invented the spineless tagless G-machine in the process of writing GHC, published in a paper in 01992, which was sort of a followup to his 01987 book on efficient implementation of pure functional languages. Even so, through the 01990s, Haskell still wasn't very fast. I think a lot of the stuff that makes Haskell reasonably fast is stuff that's been discovered in the last 25 years, after the basic language design was decided.
I'm also far from an expert on this stuff, though, so my picture of what happened might be missing some important aspects. I still know very little Haskell.