Rust as a gateway drug to Haskell (xion.io)
358 points by miqkt on June 14, 2017 | 212 comments



Something that I found about going from Haskell to Rust was that Rust provides many powerful abstractions without compromising on performance. It was very frustrating to try reasoning about performance in Haskell due to its laziness. Every language has quirks about how to write performant code, but Haskell is notorious in my mind for being quite easy to write obscenely slow code in.


>Every language has quirks about how to write performant code, but Haskell is notorious in my mind for being quite easy to write obscenely slow code in.

+1

One approach that occurred to me was to use Haskell and Hackage for prototyping a design and then a Rust implementation for production.

It also reminds me of this from several years ago: GHC honcho Simon Peyton Jones saying the next Haskell will be strict, but still pure, with effects only via monads. [1], [2].

Following on, and presumably related to that, PureScript is strict. [3]

[1] https://news.ycombinator.com/item?id=1924094

[2] http://www.cs.nott.ac.uk/%7Egmh/appsem-slides/peytonjones.pp... (PowerPoint) Slide 40

[3] https://github.com/purescript/purescript/wiki/Differences-fr...


SPJ also noted that the original success of Haskell was due to their pursuit of laziness. Because laziness has no place in a system with side effects, their pursuit of laziness and functional programming is what led them to discover the usefulness of monads for describing effects in the first place. If they had been pursuing a strict language instead, they would have been allowed to take shortcuts, because sneaking in side effects is a lot easier. Yes, maybe the next Haskell will be strict, but getting to the point of making that decision required laziness.

SPJ gave a great lecture about this. It is very easy to follow and a fun insight into the academic history of modern functional languages: https://www.youtube.com/watch?v=re96UgMk6GQ


I'm under the impression that SPJ made that comment thinking of adoption (for a future "regrettably popular" Haskell), since the difficulty of reasoning about laziness does put people off. I might be wrong though: maybe he was arguing that laziness doesn't warrant the user and compiler-author cost it brings. I don't remember where I picked this up from.

Idris is strict too, but optional laziness is built into the compiler with the use of a special-cased Lazy wrapper of the form

  data Lazy a = Delay a
This is kind of the opposite of strictness annotations (bang-patterns) from Haskell, except one doesn't need to pattern-match on Delay or use it to wrap arguments to functions expecting Lazy values. The Idris compiler can do something like "laziness analysis" (admittedly trivial in comparison to GHC's strictness analyser) to introduce and eliminate Delay constructors automatically. The laziness information is carried around in the types:

  force : Lazy a -> a
  force a = a
https://github.com/idris-lang/Idris-dev/wiki/Unofficial-FAQ#...

(purescript-lazy is also a thing, but I think it's not as seamless because of the "no compiler magic" position.)

In any case, what about -XStrict? I've never used it (or -XStrictData either), but isn't it a solution to the laziness problem, albeit a nuke-ish one at that?
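For reference, a minimal sketch of what these pragmas buy you (the Point type is just a made-up example):

    {-# LANGUAGE StrictData #-}

    -- With StrictData every constructor field is strict by default,
    -- as if each one carried a bang:
    data Point = Point Double Double

    -- Without the pragma you would write the bangs yourself:
    -- data Point = Point !Double !Double

The full -XStrict extension goes further and also makes bindings and pattern matches in the module strict by default.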

And while we're thinking ahead, let's talk about the project to get linear types (something similar to Rust lifetimes, but weaker) into GHC :)

https://ghc.haskell.org/trac/ghc/wiki/LinearTypes


>something similar to Rust lifetimes, but weaker

Why weaker? My understanding is that linear types provide a stronger guarantee than Rust's affine types: linear types guarantee that a value is used once, while affine types only guarantee that a value is used no more than once.


Lifetimes are not the same as linear (or affine) types... they're related to "fractional permissions" in the upcoming formalism (and also related to region types found in many other systems, which can be partially emulated with monads).


>GHC honcho Simon-Peyton Jones saying the next Haskell will be strict,

I think all he meant was that strictness was an interesting paradigm that would differentiate a language enough to justify its existence alongside Haskell.

>Following on and presumably related to that PureScript is strict

Partly also to distinguish it from GHCJS. Implementing laziness on top of a strict target language is really hard, and PureScript generates semi-readable JavaScript.


This is a very overstated concern, in my opinion. Based on my experience, once you learn the common gotchas (lazy folds, WHNF, etc.) it's rarely an issue. Real-world programs (web servers, databases, etc.) typically end up using frameworks that deal with strictness concerns for you anyway. For maybe the first couple weeks I was learning Haskell, this was a source of confusion, but I don't seem to run into it anymore.
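(The classic lazy-fold gotcha, for anyone who hasn't hit it yet, as a tiny sketch:)

    import Data.List (foldl')

    -- foldl is lazy in its accumulator, so this builds ten million nested
    -- (+) thunks before forcing any of them - easy to blow the heap,
    -- especially in GHCi or without optimisation:
    slowSum :: Int
    slowSum = foldl (+) 0 [1 .. 10000000]

    -- foldl' forces the accumulator at every step and runs in constant space:
    fastSum :: Int
    fastSum = foldl' (+) 0 [1 .. 10000000]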

If it's really a struggle for some reason, you can always take the nuclear option and enable strict mode on your code.

Haskell is very good for writing extremely fast code that's also extremely composable, which most languages (even Rust, to a substantial degree) really struggle with. Haskell's pure non-strict semantics and support for fusion/rewrite systems means that you can write complex operations on vectors, byte arrays, text, etc. that span multiple modules but get compiled down to extremely tight in-place assembly code. Another way of looking at it is that the economies of scale for Haskell are very different. There's a higher up-front cost because you have to learn about how strictness works, but as a consequence you can write vastly more complicated programs that are, say, 80% as fast as hand-optimized C for only 10% of the effort of hand-optimized C.
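To give a concrete flavour of the fusion point, a minimal sketch with the vector package (assuming -O2 so the fusion rules fire):

    import qualified Data.Vector.Unboxed as V

    -- Three conceptual passes over the data, written compositionally;
    -- the vector library's stream fusion compiles this down to a single
    -- loop with no intermediate vectors allocated.
    pipeline :: V.Vector Double -> Double
    pipeline = V.sum . V.map sqrt . V.filter (> 0)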


> .. you can write vastly more complicated programs that are, say, 80% as fast as hand-optimized C for only 10% of the effort of hand-optimized C.

Computer Language Benchmarks of GHC versus C don't seem to be close to matching your 80% as fast claim [1]. Also, on the 10% of the effort of hand-optimized C bit - I seem to recall there was a caveat: "if you happen to be Don Stewart" [2].

The following is just my vague, uninformed impression, but Haskell and Go seem to be opposite ends of a spectrum, with Rust somewhere in between:

Haskell: good for expert individuals and small teams. The language is highly sophisticated and the compiler is rather slow. At heart, it's a research language, so the direction might not always suit production use. If you contract out to a Haskell development shop, who do you turn to with your Haskell code-base if things don't work out?

Go: designed by Google for their use-case - large teams of commodity programmers. It's very quick to onboard programmers - they can be productive and non-dangerous quickly. The language is very simple and the compiler is very quick - which is again good for scaling large teams.

[1] https://benchmarksgame.alioth.debian.org/u64q/compare.php?la...

[2] https://donsbot.wordpress.com/2008/05/06/write-haskell-as-fa...


The benchmarks game falls squarely under the umbrella of "short, simple programs", not the sort of real-world stuff where you would benefit from having a language that scales well. That said, Haskell performs pretty well in the toy benchmarks anyway.

Your comparison between Haskell and Go is about market size, not language design. I suspect you would have a similarly hard time finding Rust devs as Haskell devs (i.e. mildly less convenient than a popular language like Python or Go).


> .. you can write vastly more complicated programs that are, say, 80% as fast as hand-optimized C for only 10% of the effort of hand-optimized C.

I don't mean to be personal, but your quantitative claim smells a little like unsubstantiated BS. Do you have any data to back it up?

> That said, Haskell performs pretty well in the toy benchmarks anyway.

Currently, "pretty well" looks like GHC performance is 50% to 10% of hand-crafted C using perhaps 5 times as much memory. (If the GHC program builds at all that is.)

fannkuch-redux: Haskell GHC 16.70 sec, 7,628 mem; C gcc 8.97 sec, 1,660 mem

spectral-norm: Haskell GHC 4.06 sec, 9,880 mem; C gcc 1.99 sec, 1,824 mem

n-body: Haskell GHC 24.48 sec, 6,468 mem; C gcc 9.96 sec, 1,016 mem

reverse-complement: Haskell GHC 1.39 sec, 132,176 mem; C gcc 0.42 sec, 143,948 mem

mandelbrot: Haskell GHC 11.64 sec, 41,180 mem; C gcc 1.65 sec, 32,684 mem

fasta: Haskell GHC 14.68 sec, 455,088 mem; C gcc 1.33 sec, 2,856 mem

binary-trees: Haskell GHC 26.28 sec, 511,444 mem; C gcc 2.38 sec, 131,728 mem

k-nucleotide: Haskell GHC Make Error; C gcc 0.06 sec, ? mem

pidigits: Haskell GHC Make Error; C gcc 0.06 sec, ? mem

regex-redux: Haskell GHC Make Error; C gcc 0.02 sec, ? mem

Haskell GHC: The Glorious Glasgow Haskell Compilation System, version 8.0.2

C gcc: gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406


The make errors are a bit weird because the programs build on my machine no problem.

In part, Haskell's slowness seems to be because no one cares enough to work on it. Can't say for sure, though, so maybe I should just try a task and implement it to see how fast it'd be without amazing optimization foo.

Although of course these benchmarks aren't really representative of the real world, because they'd just end up as a C FFI call in most languages if they actually were that performance critical.


I agree with all that, particularly the bit about the C FFI call. It's more relevant to compare Rust with C and C++, as Rust is a language that could reasonably displace C for the FFI call.

But then remember that @wyager is claiming

>80% as fast as hand-optimized C for only 10% of the effort of hand-optimized C

If it's really only 10% of the effort then one shouldn't really have "to work on it" to get semi-decent performance.

A more modest claim of say:

50% as fast as hand-optimized C for only 20% of the effort of hand-optimized C

would have been more credible and still a big win for many use cases.


Tried my hand at the knucleotide[1] one. Lifted the algorithm from the fastest Rust solution. As optimizations I dropped in some UNPACK pragmas and strictness annotations, used Data.Hashtable.Cuckoo for in-place updating hash tables because that seemed necessary for the task, and added naive parallelization via

   let outputs = map (exec content) tasks `using` parList rseq
That got it to around 24 seconds, which seems acceptable for a fairly straightforward solution that is around half the size of the Rust one I copied. Might try to optimize it later.

[1]: http://benchmarksgame.alioth.debian.org/u64q/knucleotide.htm...



That pi-digits program does not "build on [your] machine no problem".

It cannot.

The line "main = putStr.pidgits.read.head =<< getArgs" has been commented out because the program author has optimised-away some of the work that must be done.


Fair enough, I only checked the two knucleotide programs which both failed because of missing dependencies.


>> The make errors are a bit weird because the programs build on my machine no problem. <<

My fault (except for pi-digits and regex-redux).

I neglected to work-through the force-reinstalls that inevitably follow my update of the GHC compiler.

Done.


Language design choice pretty obviously relates to market size as my comment makes clear.

I referenced Rust because that was what the article was about.

Swift and F# are languages that are not as basic as Go, but for which it's easier to find devs than for Haskell.

That said, Rust has a lot going for it, even if there are a lot fewer devs than for Swift.


"mildly" less convenient? Maybe in SV. I'll bet if I tried to hire a semi-experienced Rust developer here in Melbourne I'd be looking until Christmas at least haha.


>Computer Language Benchmarks of GHC versus C don't seem to be close to matching your 80% as fast claim [1]. Also, the 10% of the effort of hand-optimized C bit - seem to recall there was a caveat - "if you happen to Don Stewart" [2].

That's a really bad benchmark. The power of Haskell is not that it's fast for generating the Mandelbrot set, it's that it allows you to do things you couldn't do in any other language while still being fast enough to write e.g. compilers (Idris) or web services (Yesod).


Really bad benchmark? Inappropriate choice of benchmark?


I have a hard time seeing the value in a Mandelbrot benchmark. I absolutely don't have a problem with benchmarks that make Haskell look bad (e.g. directory traversal, regex), but I would like a benchmark that provides insight into the code you'd feasibly be writing in that language.


Isn't Haskell intended to be a general purpose programming language?


> commodity programmers

Heh


Some real world programs have a very clear request-response cycle that Haskell deals with very well (although a more predictable garbage collector would be great).

Other real world programs have a strict "always push things into the world, react to inputs" cycle that Haskell has trouble supporting.


I had the same experience: I really enjoyed the expressiveness and power of Haskell, as well as the strong type system. But Rust gives me all those things, as well as iterators for laziness when desired, without the pervasive laziness that had me constantly chasing down space leaks and adding strictness annotations.


>But Rust gives me all those things, as well as iterators for laziness when desired, without the pervasive laziness that had me constantly chasing down space leaks

Well, yeah, Rust has manual memory management. Of course you don't get space leaks.

Anyways, Rust's laziness is not on par with Haskell. Iterators are great but it's not at all the same as e.g. GHC's lazy IO.


> Well, yeah, Rust has manual memory management. Of course you don't get space leaks.

I wouldn't call Rust's memory management "manual". It's the only language I know of that I'd describe as having automatic memory management without garbage collection.

> Anyways, Rust's laziness is not on par with Haskell. Iterators are great but it's not at all the same as e.g. GHC's lazy IO.

Iterators, futures, and macros (for creating new language constructs) tend to provide the laziness I want. Beyond that, I've tended to find laziness as much a source of bugs as features. (Speaking from a perspective of having written and maintained large programs in Haskell, and in Rust.)


Comparing it against Rust doesn't really seem fair. If I wasn't using Haskell, I'd be writing Erlang, and last I checked GHC is much faster than Erlang/OTP. Certainly faster than CPython, Ruby, and most Node.js.

Trying to compete with Rust or C++ performance in Haskell is certainly frustrating to reason about :)


I would argue that at least for client-server apps Erlang is faster. Here is why: in most cases the perceived performance of a server is constrained by maximum (or 99%, whatever) latency, not by throughput. Moreover, you can scale throughput horizontally, while scaling for latency is more or less impossible. Haskell's GC can give you horrendous pauses (see [1][2]), ruining perceived performance. Erlang, on the other hand, is thoroughly optimized for minimal latency.

TL;DR: "fast" isn't well-defined, and for a lot (or even most) applications "fast" is about latency, not throughput, and Erlang is "faster" in that sense.

[1] https://making.pusher.com/latency-working-set-ghc-gc-pick-tw...

[2] they eventually abandoned Haskell for Go https://making.pusher.com/golangs-real-time-gc-in-theory-and...


Yeah, the worst case of GHC's garbage collection is awful and can dwarf the actual work being done. That's the main driving factor behind adding linear types to Haskell - they let long-lived and large data live on a separate non-GCed heap.

The other big part of this is that it can make stream fusion predictable. Currently, Haskell programs can become 100x slower or faster after some innocuous change, so linear types will make it a lot easier to write fast Haskell in general.

http://blog.tweag.io/posts/2017-03-13-linear-types.html


While "GHC is faster than CPython" is a fairly accurate generalization, I still think python performance is easier to reason about. Haskell can give you fast code, but sometimes it will be super slow for no obvious reason.


super slow for no obvious reason

Obviousness is hard to pin down. What may be completely obscure to a Haskell beginner is obvious to a veteran. Is it really fair to count people's strict-language preconceptions against Haskell? I don't know.

GHC provides a wealth of profiling tools that an expert Haskell user can employ to find space leaks. Heck, experts generally know where to look (excessive use of lists, spine-nonstrict data structures) so that the profiling tools become a last resort.


A sufficiently smart [compiler | developer] can [make it so that | think] anything is easy.

Since performance is exactly a measure of the number of imperative steps that need to happen in a program, low-level imperative languages are going to win in performance-obviousness every time. (Rust probably beating C in terms of avoiding memory problems, C probably beating Rust in terms of not supporting language constructs that can make a single line worse than O(1).)

"Accidentally" asking the language to do something time-expensive is as impossible in C as forgetting a free() in Haskell.


> "Accidentally" asking the language to do something time-expensive is as impossible in C as forgetting a free() in Haskell.

Unless you use GCC's cleanup extension (which runs an arbitrary function when leaving a scope, similar to Rust's Drop trait). Or evil macros which turn innocent-looking code into something else (a well-known project, in some compilation modes, redefines the "if" keyword to track how many times each branch is taken). Or both at the same time (like hiding GCC's cleanup extension with a macro).


Since performance is exactly a measure of the number of imperative steps that need to happen in a program

This is an oversimplification. Writing fast code for modern processors is all about how much time you spend waiting for memory.


This is an oversimplification. Some compute intensive parts can get significant speed ups through careful use of data parallelism (such as SIMD). Others needs parallelism at the thread level. Etc. Optimizing for memory patterns is important as is trying to keep the working set within a near-level cache.


Cache locality will never beat algorithmic complexity as a performance factor. Don't exaggerate its importance.


It can, when the constant portion of the algorithm chosen is high and the number of items being pushed through the algorithm is low.

e.g. bubblesort will beat quicksort for tiny lists or large lists which are already sorted.

Or, put another way, 1000 operations against 1ns latency memory (L1) has the same performance as 10 operations against 100ns latency memory (RAM).


Moreover, CPython is an interpreter without a JIT while GHC compiles to machine code. This is totally different. IMHO, it is much more correct to compare a JIT-enabled interpreter to GHC.


Why not SML/CML, where you basically get Haskell syntax and types, but get modules, no laziness, and you can mutate things when it's more efficient to do so?


I couldn't stand to work without HKT now that I'm used to them. (Hell, I'm struggling with Scala's lack of polykinded types; you don't need them until you do, but then you really need them).


MLs have higher kinder types.


Which ones? OCaml is planning to add them via its "modular implicits" work but that hasn't happened yet; none of the SML implementations I used mentioned them.


I think you're confusing higher-kinded types with type classes. You're right that neither has type classes.


The difference in both syntax and type system is vast, unless you are coming from, say Fortran or APL.


Scala would like to have a word with you.


Something I've always wondered about Haskell is whether there is, in practice, rope for the compiler to optimize things away from the theoretical underlying implementation. For instance a C compiler is free to turn

  for (int i = 0; i < N; i++) {
    a[i] = 2*b[i];
  }
  for (int i = 0; i < N; i++) {
    c[i] = 2*d[i];
  }
into

  for (int i = 0; i < N; i+=2) {
    a[i] = 2*b[i];
    a[i+1] = 2*b[i+1];
    c[i] = 2*d[i];
    c[i+1] = 2*d[i+1];
  }
It seems like GHC ought to similarly be able to turn some

  foo :: [Int] -> [Int]
into

  foo :: [(Int, Int)] -> [(Int, Int)]
under the covers for a similar speedup as long as it can, as with the C compiler, handle the case of an odd length with a bit of extra code.

There are limits to how much you can restructure a list that's exported outside the compilation unit (my Haskell knowledge is fuzzy here so that's probably not the right terminology). And if the list is being re-stitched in odd ways it might be a bad idea. But as far as I can tell it ought to be doable and useful in most common cases. Of course maybe it's already doing this in those cases where it can prove that it works and that's why Haskell performance generally is good but fragile.


Haskell performs a very cool optimization called "stream fusion" to deal with a whole class of traversal optimization problems; stream fusion removes intermediate list results completely, and bundles a bunch of things that look like maps/reduces/filters into a single operation. Your example doesn't quite hit it, but a similar bit of code like:

    for (int i = 0; i < N; i++)
        a[i] = 2 * b[i];
    for (int i = 0; i < N; i++)
        c[i] = 2 * a[i] + b[i];
    t = 0;
    for (int i = 0; i < N; i++)
        if (b[i] % 2 == 0) t += c[i];
Would be rewritten to the equivalent of:

    t = 0;
    for (int i = 0; i < N; i++)
        if (b[i] % 2 == 0) t += 5 * b[i];
Lots more info on the GHC optimizations page: https://wiki.haskell.org/GHC_optimisations#Fusion
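For comparison, the Haskell-side version of that computation might look like the sketch below (my own names; how completely the intermediate structures disappear depends on which rewrite rules fire, and vector/bytestring are more reliable about this than plain lists):

    -- b plays the role of the input array; a and c are the intermediates
    compute :: [Int] -> Int
    compute b = sum [ ci | (bi, ci) <- zip b c, even bi ]
      where
        a = map (2 *) b
        c = zipWith (\x y -> 2 * x + y) a b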


Yes, stream fusion is super nifty. But can Haskell alter the size of the chunks it's sending through the stream? There was a recent HN thread about how GNU 'yes' is so fast thanks to buffering. Obviously too large a buffer would blow out the cache and stream fusion avoids that but a moderate amount of buffering so that you're neither working array by array nor element by element but cachable chunk by cachable chunk should be a happy middle. And with Haskell lists reduce pointer chasing as well.


I think that the issue you're describing about `yes` was around building a buffer and issuing one big write() syscall instead of a bunch of small ones. I'm not sure about the actual implementation of the IO monad, and whether or not it coalesces calls to write(), but it would definitely be possible. Also, idiomatic Haskell would make large writes, because usually (at least in my experience) you create a large string/Text representing your output, and then putStrLn or print it all at once.

In regards to assembling the stream: when the list is actually materialized, I assume it's done one element at a time, though the LLVM backend that GHC can use might be doing some loop unrolling; I'm not sure.

Stream fusion doesn't require that all of the streams be the same type, though! Elements at different points in the stream can have wildly different sizes. For example, this would be fused:

    sum . map length . map double $ ["hello", "Symmetry"]
        where double x = concat [x, x]
That will never materialize a list of Strings, even though at two points in the stream you had a stream of strings.


GHC doesn't actually perform stream fusion. Not by default, anyway: The default (by that I mean "ships with base", not that it's baked into the compiler) is something weaker, called build/foldr fusion, described by this rule:

  foldr k z (build g) = g k z
Foldr is the usual catamorphism for the inductive list type, and build is an "abstracted constructor", of sorts:

  build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
  build f = f (:) []
Note that build's argument is rank-2 polymorphic, and also the same as foldr's type.


Haskell's laziness is an issue for you? Then how do strictness annotations (and/or module flags) not solve this problem?

Haskell also has inline-C FFI support for "type safe" performance escape hatches (in the sense that you can manually bring C constructs into Haskell's type system): https://github.com/fpco/inline-c

The only way that Haskell inescapably suffers from performance hits (that I'm aware of) is not in its laziness or lack of imperative programming support, but in its periodic GC pauses. In such cases where GC pauses are an issue, I still think you can -- in principle -- write more and more performance critical pieces of your application in a non-GC language (in fact, you could do this in Rust just as well as C/C++). You can also manually trigger a GC pass, although you cannot -- to my knowledge -- manually stop and start GC passes in your code.

This is the approach I'm currently taking in a performance critical VR application where GC pauses could absolutely be an issue (using as much Haskell as possible, with C and GC triggers as an escape hatch). And if someone could point out a flaw in my reasoning (in particular, how GC pauses might still be unavoidable even with the C FFI), I would gladly pay them money, because it's so important I not get this one wrong :).
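For anyone curious, the escape hatches I mean look roughly like this (a sketch based on my reading of the inline-c README and System.Mem; treat the details as assumptions rather than a recipe):

    {-# LANGUAGE QuasiQuotes     #-}
    {-# LANGUAGE TemplateHaskell #-}

    import qualified Language.C.Inline as C
    import           System.Mem        (performGC)

    C.include "<math.h>"

    main :: IO ()
    main = do
      -- drop into C for the hot path
      x <- [C.exp| double { cos(1) } |]
      print x
      -- trigger a major collection at a convenient moment (e.g. between
      -- frames) rather than letting it happen mid-frame
      performGC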


It is also possible to avoid GC pauses causing problems in your application with a load balancer. Basically shifting load between haskell processes when a big GC is imminent.

Read about that technique from Simon Marlow of Facebook's Haskell team. Of course it would be awesome to not have to worry about that.


Nice -- this technique never came up when I searched /r/haskell for GC pause workarounds. And yes -- it would be awesome not to have to worry about this.


> Then how do strictness annotations (and/or module flags) not solve this problem?

Strictness annotations (and associated pragmas) only change the strictness of constructors and functions defined in my modules. I still have to reason about laziness as I use almost anything from the Haskell ecosystem.


Good point. I wonder if there are Strict versions of the Prelude that could help out?


Hence a reason why so many people are sticking to ML despite the lack of social momentum.


ocaml is a great language in its own right too; the only haskell features I find myself really missing are do notation and operator overloading.


OCaml seems like the ideal language for me on the surface, but the onboarding process is... suboptimal IMO. There's a lack of good soft/hard documentation (AKA the "no, type signatures are not sufficient documentation" problem), but my biggest turn-off is the multiple standard libraries. Some might see that as a plus, but when I'm just trying to get a small learning project off the ground, finding out there's a critical library I need that's incompatible with the stdlib I'm using is a huge bummer.

Not saying these are insurmountable, unforgivable problems, just really annoying.


I feel like the biggest thing missing from OCaml is a Cargo-like build tool (probably using OPAM under the hood for dependency management). It would also be nice if OPAM had better cross platform support. The only way I could get it working on Windows was through one of the distributions that bundled Cygwin.


F# does have operator overloading.


but F# doesn't have OCaml's performance, which I believe was the point being made for ML over Haskell.

I honestly don't know how F# compares to Haskell on performance


F# is essentially just C#, which should be reasonably performant? It's possible the persistent data structures aren't optimized I guess?

Pity something's busted in the benchmarks game: http://benchmarksgame.alioth.debian.org/u64q/fsharp.html


It wasn't busted with .NET Core 1.0.1 005db40cd1.

Is F# working with .NET Core 2.0 Preview 1 for you?


Haven't tried out .NET Core 2 yet, actually.



But it's missing the module system, which is IMHO OCaml's best feature.


Also the multicore concurrency.


Some people also complain about a tiny set of syntactic issues. Other than that I like both ml and ocaml.


Having learned ML languages via Caml Light, I never understood what those complaints are about.

Maybe the only thing I would complain would be the separation between integer and floating point math.


Not even the monomorphism; I forget exactly, but there are a few warts at the syntactic level only.


A few months ago I read that monomorphism is the way to go performance-wise.

Something about people writing faster code with Flow/TypeScript than with plain JavaScript because the typing leads to less polymorphic code, which is easier for the compiler to optimize.


In this context, "monomorphism" just meant that OCaml requires you to write "+" to add integers and "+." to add floats (and yet other things to add bignums or whatever).

This has nothing to do with performance per se. It doesn't make OCaml code faster than languages that allow you to write "+" for both int+int and float+float. The compiler knows anyway.


But that's what most devs do, right? If it doesn't look like C it's ugly, haha.

Anyway, Reason seems to have a bit nicer syntax than OCaml, I think.


Reason seeks to destandardize syntax, a fascinating experiment I'd prefer to watch in a language not already held back by community fragmentation.


This is what's really nice about Nim. It writes like Python, but is fast like Rust.

Ran across Option in it the other day, though it's not used heavily. I really wish there was something like Haskell but with a runtime like Nim or Go. Perhaps that is OCaml?


Exactly. For me, Haskell was a gateway to Nim. I wanted something that feels familiar, but with a strong type system and native speed.

I wish I could invest some time in working with Nim. It doesn't get as much attention as it deserves in my opinion, and it would really benefit from growing the community a bit.


Have you looked at Swift? It's pretty similar to Rust, but eschews all the lifetime stuff in favour of a runtime that does reference counting.


Swift is useful if you're an Apple developer only targeting Apple users, which seems like a bit of a niche.


Swift works reasonably well on Linux. You can even run Swift on Windows 10 through Windows Subsystem for Linux.


I tried under Arch a few months ago and found swathes of the standard library didn't appear to work, is this something which has been fixed?


AFAIK only the core language is fully ported, for the libraries ... it is a work in progress.


In other words it is useless currently.


Tell that to IBM, which uses Swift in production. See IBM's Kitura open source server framework if you are interested.


They use a language without a standard library in production?


I'm sure you can find yourself the answer if you are really interested, here is an example: http://www.computerworld.com/article/3122994/apple-mac/ibm-s...

Or this one https://developer.ibm.com/swift/


Actually Swift 4.0 just got some memory ownership support, with more to come later.


Link? Last I checked all of Swift's ownership ideas were just in-progress proposals, with implementation to be left to far-future releases.


Watch the video about the talk, what is new in Swift 4.0 at WWDC 2017.

Basically 4.0 will introduce the notion of exclusive access, with more affine type ideas to follow up on 4.x.

EDIT: This feature is described by SE-0176.

https://github.com/apple/swift-evolution/blob/master/proposa...


As a Haskell practitioner in industry - for several years we have been creating technical documentation sites for large enterprise customers, providing fast semantic search for gigabytes of content serving millions of users - I can attest that this simply isn't true. We have found that reasoning about performance in a pure functional language is not harder than for any other kind of mainstream language. Except in the sense that it is different, so not all of your skills and intuition from other languages necessarily apply.


Wasn't the selling point of laziness performance?

"It only does what is needed!"

I didn't use Haskell but Nix and at least there it seemed reasonable to "not do" everything.


Haskell can be made very fast. The problem is that at least for me, performance is often quite non-intuitive. As in, make tiny change to program, it is now a hundred times slower. Then you get to hunt down the reason why it's now doing too much work/ why stream fusion isn't working anymore/why it's leaking so much space/etc for hours.

I enjoy writing Haskell for small hobby projects, but I doubt I would choose it for real work now that F# and Rust exist.


In other languages when the compiler misses an optimization you get a constant order penalty on a small piece of code.

In Haskell when the compiler misses an optimization the performance hit is potentially unbounded, as it could think anything is a dependency. (You want the first N digits of pi? Just wait while I calculate all of them for you...)


> You want the first N digits of pi? Just wait while I calculate all of them for you...

I don't think that example could actually happen, or at least I can't see how it would happen in Haskell.


I don't think anything is the unique selling point of laziness. Wikipedia[1] says "The benefits of lazy evaluation include:

* The ability to define control flow (structures) as abstractions instead of primitives.

* The ability to define potentially infinite data structures. This allows for more straightforward implementation of some algorithms.

* Performance increases by avoiding needless calculations, and error conditions in evaluating compound expressions."

In my several-year experience as a professional Haskell developer the first and second points are at least an order of magnitude more important than the third.

[1] https://en.wikipedia.org/wiki/Lazy_evaluation
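To illustrate the first two points with something tiny:

    -- (2) a potentially infinite structure; consumers take only what they need
    fibs :: [Integer]
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

    smallFibs :: [Integer]
    smallFibs = takeWhile (< 100) fibs

    -- (1) control flow as an ordinary function: the branch not taken is
    -- simply never evaluated, no macros or special forms required
    if' :: Bool -> a -> a -> a
    if' True  t _ = t
    if' False _ e = e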


Also in some cases laziness creates loop fusion between chained functions for much better cache locality.


"the first and second points are at least an order of magnitude more important than the third."

No offense, but 'performance is not as important as...' is what basically all pro-FP devs have been saying for years, yet the rest of the world seems to think exactly the opposite.


It's hard to believe that a world where Ruby, Python and Node are massively popular cares about performance above expressiveness. Even Java became popular first and fast afterwards—not even talking about memory efficiency!

I'd agree that a lot of developers still over-index on performance though, probably because it's sexy and easy to measure.


> I'd agree that a lot of developers still over-index on performance though, probably because it's sexy and easy to measure.

To keep producing responsive compute programs as the complexity of those programs continues to rise while the single-core CPU performance remains steady, program performance must become more and more important.


You've completely missed the point.

I make no claim about the importance of performance.

My claim is that the performance gains brought by lazy evaluation are far less important than the other gains it brings (because the performance gains it brings are rather small).


Yes, you're right. I'm sorry.


That's OK!


On the contrary, the rest of the world seems to think "the performance gains given by laziness" aren't worthwhile at all.

It's true that there are also performance costs associated with laziness, but those weren't under discussion.


>It was very frustrating to try reasoning about performance in Haskell due to its laziness.

Conversely, I find it hard to achieve the same performance with strict code. I think it's likely a matter of familiarity.

>Haskell is notorious in my mind for being quite easy to write obscenely slow code in.

The standard prelude is flawed in many ways. Haskell 2020 will hopefully fix this. In the meantime: the recursion-schemes library might be of great use.


Something I found about going to Haskell from <any other language> is that Haskell provides many powerful abstractions without compromising on performance. Especially programmer's performance.

If you do not need to write notoriously slow code in Haskell, just don't do it. Use libraries (containers and mtl are especially great) and keep state strict (a handful of bang patterns).

I use Haskell's laziness with great success (look for the paper "generating power of lazy semantics"; it's fascinating; look at "the essence of functional programming" for even more fascination). I use Haskell's flexibility with great success (how about beating C speed ten times? how about parallelization to all cores with a couple of lines of code?). It is a really great language.
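The "couple of lines" I have in mind is roughly the following sketch, using the parallel package (compile with -threaded, run with +RTS -N):

    import Control.DeepSeq             (NFData)
    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- map an expensive pure function over a list, fully forcing each
    -- result in its own spark so the work spreads across all cores
    parallelMap :: NFData b => (a -> b) -> [a] -> [b]
    parallelMap = parMap rdeepseq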

BTW, Haskell's RTS reuses the thunk type field, which is needed for the implementation of laziness, for synchronization, and it is great! This means that it can provide everything that Java and/or .NET provide and get away with roughly half the overhead for it.


I don't use Haskell but didn't I read recently that the latest version of Haskell addresses this common concern by allowing you to turn off laziness altogether so everything evaluates eagerly?


There is -XStrict, see here for a discussion: https://ghc.haskell.org/trac/ghc/wiki/StrictPragma


>Haskell addresses this common concern by allowing you to turn off laziness altogether so everything evaluates eagerly

It's a common concern but it's a terrible critique. Laziness has many subtle benefits, but everyone whines about it because its pitfalls are decidedly not subtle.


Most problems arise from using linked-lists in inappropriate situations. This includes using the default String type. Both the list and string syntax have been overloadable for a while, so using alternatives when necessary shouldn't be an issue. Understand the complexity of the data structure operations you are using and you will get fast code, laziness does not change this (although space usage can be hard to reason about).
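For example, switching from String to Text is mostly a pragma plus an import (a minimal sketch, assuming the text package):

    {-# LANGUAGE OverloadedStrings #-}

    import qualified Data.Text    as T
    import qualified Data.Text.IO as TIO

    -- the literal is a packed Text, not a linked list of Char
    greeting :: T.Text
    greeting = T.toUpper "hello, world"

    main :: IO ()
    main = TIO.putStrLn greeting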

The benefits of laziness is difficult to appreciate if you are not used to it. For a great example of the benefits it brings, please read this:

http://augustss.blogspot.co.uk/2011/05/more-points-for-lazy-...


While I've found that Haskell's laziness can often be worked around with appropriately-implemented memoization, it's an absolute PITA for anything that requires frequent hops into the IO monad (unsurprisingly). Historically, this has been a bottleneck for me that's prevented me from using Haskell in any major projects.

A Haskell port of a logging library I'd originally written in Python ended up being almost 2 orders of magnitude slower than CPython 3, and almost 3 orders of magnitude slower than PyPy3.

PyPy3 resulted in performance similar to a barely-optimized C port.


Do you have source code? It sounds like you were writing unidiomatic Haskell.


I actually posted the port on the Code Review SE, where (after almost a month of silence), someone stepped up and offered a CFFI solution.

https://codereview.stackexchange.com/questions/157118/a-port...


You might have better luck with the fast-logger library, for one. I'd also suggest using ansi-wl-pprint, which would work on Windows and also probably be faster than string concatenation (which is just plain bad in Haskell).

As was pointed out, GHC is calling getCurrentTime too often. If you take advantage of GHC's laziness, the value will stay valid longer. I assume that is what fast-logger does.

https://hackage.haskell.org/package/fast-logger
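Untested, but the fast-logger usage I have in mind is roughly the sketch below (treat the exact names as assumptions and check the haddocks):

    import System.Log.FastLogger (defaultBufSize, flushLogStr,
                                  newStdoutLoggerSet, pushLogStrLn, toLogStr)

    main :: IO ()
    main = do
      logger <- newStdoutLoggerSet defaultBufSize
      -- messages accumulate in a buffer and go out as large writes,
      -- instead of one write() per log line
      mapM_ (pushLogStrLn logger . toLogStr . show) [1 .. 1000 :: Int]
      flushLogStr logger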


You might want to consider F# as well. It supports both lazy and greedy functional programming very nicely.


You might like Idris. Not nearly as mature as Haskell, but definitely as powerful, with much more reasonability when it comes to performance, thanks to eager evaluation.


It's not just laziness. Functional programming is what happens to the universe when you eliminate the concept of Time. It is a mega abstraction over the what's really going on under the hood.

However, we live in the real world, and in the real world you cannot eliminate the concept of time, even in an abstraction such as functional programming. Therefore functional programming is an utter lie.

Reasoning about the performance of a functional program is to peer through the lie and "reintroduce" time into this abstraction. This is why it is notoriously hard to reason about the performance of functional programs. Procedural programs are much better in that regard.


This is utter nonsense.

Everything you wrote is somewhere between 90° and 180° wrong.


I'm totally correct. You are completely unaware of what's going on. We can discuss this logically in the comments, you'll see. It's not going to be pretty for you, as my arguments will pick you apart piece by piece.


saying "I'm going to destroy your argument with logic" is not the same thing as actually destroying someone's argument with logic.


Logically speaking, I never equated the two statements. It's up to him whether he wants to engage in a debate. Additionally, this guy just replied saying I was 180 degrees off angle without any explanation, how polite.


>Functional programming is what happens to the universe when you eliminate the concept of Time.

I've heard this a lot but still can't find a good code example to illustrate this. Any help?


The statement makes no sense. See FRP for a take on how to handle time-varying behavior.


It does make sense. Man of course whenever the functional paradigm switches to IO things change. The functional program needs to have an exception somewhere or nothing gets done. I'm talking about the functional part of functional programming, not how the IO monad is implemented.


Let me make my statement clearer. The FRP loop has a part that is not pure (IO). This is where time enters the picture. FRP is not 100% functional, no functional programming language is... but if you look at the functional part of a functional programming language you will see that it is immutable.


Immutability is equivalent of eliminating time. To be unchanging is to be timeless.


Haskell/GHC will soon have an extension for linear types, which will bring the languages much closer: http://blog.tweag.io/posts/2017-03-13-linear-types.html


Yeah right. If only the standard library (the base package) could immediately be converted to use linear types so we could all benefit from that! (Hint: it won't. Even the fairly no-brainer AMP and BBP took an absurdly long time. This is extremely slow by Rust standards. The Haskell community might have the move-fast-and-break-things attitude, but the stewards of the standard libraries don't.)


> Even the fairly no-brainer AMP and BBP took an absurdly long time

This is because, believe it or not, Haskell is actually used in the real world and people care about their code not becoming broken.


Did rust ever have a large scale breaking change? Anyway, the linear type proposal is designed to be completely backwards compatible so that shouldn't be as big of an issue.


> Did rust ever have a large scale breaking change?

Pre 1.0: every day

Post 1.0: not really, though I guess it depends on your definition of "large scale"; we've made some soundness fixes, but made sure that they were warnings for a long time first, so most people experienced no breakage.


Yes I know it's completely backwards compatible. But in practice if the standard libraries don't change, (a) there won't be a ready source of inspiration and examples to copy from; (b) actually the feature is really geared more towards libraries than applications, so it's less useful if the base library doesn't adopt it wholeheartedly.


Fair enough, although it would be easy to have an alternative prelude with linear types that libraries could depend on. Just that the extra dependency would really suck.

Haven't sunk a lot of time into it yet to be honest, I think the biggest beneficiaries will be streaming libraries and non-gced data structures like long lived queues?


Rust uses affine types instead of linear types per some statements I've seen on the subject. I wonder why they wouldn't use affine types in this. Either way, it's good they're adding it.


The similarities are striking. Both have a problem with undocumented or experimental modules for everyday tasks, both use compiler plugins to make sure that every package is using its own superset of the language (although in Rust this only applies to nightly), and both tout lofty goals while rarely producing production-grade systems. Rust is the first toe dip into the world of blog posts heralding the "coming of the age of reason" when we all see the error of our ways for writing dangling pointer references and impure functions. To be a complete cynic, I hope Rust runs right away from over-generalized abstractions and the Haskell indifference to achieving real success by overcomplicating even the most mundane of problems.


There's nothing wrong with the ability to generalize abstractions; the problem is that when all you know is category theory, everything looks like a zygohistomorphic prepromorphism.

Case in point, TypeScript and mypy. Both are adding more abstractions, more powerful types, more aspects, because people find/have fitting use cases for those features.

That doesn't mean let blah = document.getElementById("blah") ; blah.innerHTML = "..." or import requests ; requests.post(...) will become deprecated, but it means it'll be easier to express more things (like conditions/assumptions on shape or state of data, control flow, component dependencies) without boilerplate (and copy-paste), and maybe helping other developers make sure they cross all the t-s and dot all the i-s.

Does this have a cognitive cost? Yes, especially in Rust/Haskell.

Are there escape hatches? Yes, of course, like unsafe and whatever Haskell has.

Should a developer who isn't familiar with those assumptions/conditions work on the code base anyway? No, not really.

(Is Haskell ugly? Yes. :( )


The haskell escape hatches are the unsafe family of functions, such as unsafePerformIO


> rarely producing production grade systems.

https://www.rust-lang.org/en-US/friends.html has been growing at a pretty hefty pace lately!


I think this is a little unfair on Rust. You can get a lot done using stable Rust.


Yes. Complex languages are not required to solve complex problems. This is why Go and JavaScript are so popular.


I'd say neither Go nor JavaScript are predominantly used to solve complex problems. The main use cases for both are solving mundane problems by mundane programmers. In case of Go that's an explicit design philosophy.


That's... not really true. JavaScript solves trivial problems, and it's mainly popular because it's literally the only web language around.


Although clearly presented to push some buttons, it isn't completely wrong.

Am upvoting, hoping you don't get buried.


it isn't completely wrong

It may get some of the facts right, but the tone is completely wrong. Cynicism is just a step or two away from nihilism. It's far, far easier to destroy than it is to create. The world needs more creators, not destroyers.


> Cynicism is just a step or two away from nihilism. It's far, far easier to destroy than it is to create. The world needs more creators, not destroyers.

Totally off topic:

Such a simple thing to say, but I thank you for it. I often want to do good, and spent time thinking about the problems in the world and how my path might help them, but I know that I am a massive cynic in many areas. Politics being a big one.

I have always agreed with your statement about creators vs destroyers, but nevertheless I've been a loud cynic. I need to rein that in. Thank you.


We need more people creating the right things, and more people destroying the wrong things and baggage that we've overburdened ourselves with.

Cynicism is a self-defense mechanism.


I wonder if Idris could some day take Haskell's place. It sports default eager evaluation. Seems like the tooling (package managers etc.) isn't quite there yet, though.


Is a difference in a default option really enough for a language to take the place of another? And if it's not just a change in a default option, but the removal of laziness, we lose something, which will make the choice less obvious.

I think it's more likely that, if dependent types prove really useful, Haskell will adopt these, and people will adapt their code, rather than port everything to a new language.

As far as I can see, Idris is too much like Haskell to take its place. Porting thousands of libraries to a new language is a huge effort, so a huge advantage is required, which I don't see Idris offering.


A System F type system can't be replaced by a dependent type system without affecting existing code. There is no way to turn the libraries in Haskell into libraries in dependently-typed Haskell without "porting them to a new language" and all the work that entails.


This is 100% false. The Calculus of Constructions is a superset of System F. You may be thinking of the fact that if you want your type system to be sound when interpreted as a logic, then termination must be guaranteed (which would break backwards compatibility). Dependent types and totality are often found together, but are in actuality orthogonal features.


>And if it's not just a change in a default option, but the removal of laziness, we lose something, which will make the choice less obvious.

What would we exactly lose?

>As far as I can see, Idris is too much like Haskell to take its place. Porting thousands of libraries to a new language is a huge effort, so a huge advantage is required, which I don't see Idris offering.

That is true, though. Perhaps Rust is the safer bet, even if it lacks many nice things. The sum is probably quite a lot better still.


> What would we exactly lose?

We'd lose composability and clearer code. This[1] section of the Haskell Wiki contains a good example.

In short, with something like this:

    any :: (a -> Bool) -> [a] -> Bool
    any f lst = or boolLst
      where boolLst = map f lst
the compiler can produce reasonably optimal code because of lazy evaluation: it doesn't have to convert the entire [a] to a [Bool], since as soon as or encounters the first True, the rest of map f lst is never evaluated.
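One way to see it in action (a tiny sketch; the standard any behaves the same way):

    -- non-strict evaluation means this terminates even though the
    -- input list is infinite
    firstEvenExists :: Bool
    firstEvenExists = any even [1 ..]   -- True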

[1] https://wiki.haskell.org/Non-strict_semantics#Why.3F


I don't know Idris so this syntax is probably wrong, but by reading the docs for a short while, this would be

any : (a -> Bool) -> Lazy List a -> Bool

... which would make it as performant as, and clearer than, Haskell's any, since the laziness is explicit.

Someone with actual Idris experience might tell if it actually would work like this :) I guess we'd have to make sure that the used or function is also lazy.


I was commenting on what the effects would be from removing laziness. I'm not experienced enough in Haskell to know whether making strictness the default option would make sense, although I've seen a quote from Simon Peyton Jones saying he'd do that if he could start over.

Out of curiosity, would you know whether there exist sufficient strictness pragmas in Haskell to translate your example Idris (pseudo) code into Haskell? I often hear about performance/memory usage pitfalls with Haskell's laziness, so it'd be really nice if Haskell were able to emulate Idris in this regard, by enabling enough LANGUAGE pragmas to require explicitly annotating types if laziness is desired.


That place being the cool esoteric language that everybody fools around with on the weekends, but is almost never used seriously?


> That place being the cool esoteric language that everybody fools around with on the weekends, but is almost never used seriously?

Indeed. Some people actually use Haskell, though. They even claim it's fun and productive, but having done so myself, I'm forced to be undecided on whether they're just suffering from Stockholm Syndrome.


Isn't the most significant line: "For work-related reasons, I had to recently get up to speed on programming in Haskell"?

I was somewhat disappointed not to have a followup to what work-related reasons there might be. Note also that the About page says "work at Facebook".


If you're puzzling at the connection between Facebook and Haskell, https://github.com/facebook/Haxl is the most obvious one.


I have heard the same thing about Kotlin now - https://hackernoon.com/kotlin-functors-applicatives-and-mona...


Err, Kotlin as a gateway drug to Haskell? They might wanna fix type erasure first...


maybe kotlin as a gateway drug to scala as a gateway drug to haskell


Can't fix that w/o fixing the JVM.


I have found them to be very similar too. I started learning Haskell in earnest last September, following along with the exceptional "Haskell: First Principles" which I highly recommend. I had been toying with Rust around this time and I found a confluence of ideas between the two languages. Most things I learned in Haskell I was able to carry over to Rust.

The inverse was not quite as true. But I still love Rust. Any time I need something performant, or I just want to play around, it's my first choice.


"the legacy C++"

Interesting reading but the author seems intent on pushing some buttons.


Maybe that's just his way of saying "C++ without concepts."


C++ is legacy. It was created in 1979.


Maybe the other way around.


The article references using trait objects for dynamic dispatch. FWIW, I found them extremely limited: you can't use generics, can't return Self... I wonder how often people use them.


What do you mean you can't return Self? I'm just starting out with learning Rust, but from what I've gathered you can use objects in a myriad of ways that will allow you to do what you need.

Instead of returning self, you could just pass a reference - if you need to mutate self, then just pass a mutable reference. Did I miss something here?


Returning Self in a trait object isn't possible; it's not object safe. In order to return Self, you'd need to know the concrete type of the thing, and when you have a trait object, that's erased.

Returning a trait object instead of plain Self would work though.


> More advanced type systems, however, allow to specify the generic constraints explicitly.

Well, Java allows it, even if it looks a bit uglier than the Rust example.

    <T extends Comparable<T>> T min (T a, T b) {
        if (a.compareTo(b) > 0) { return b; } else { return a; }
    }
And even without general availability of concepts (already in gcc 6.x), one can achieve it in C++ via if constexpr.
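And, to tie it back to the thread topic, the Haskell spelling of the same constraint is just a type-class context (a sketch; the primed name avoids clashing with the Prelude's min):

    -- Ord plays the role of Comparable<T> / the Rust trait bound
    min' :: Ord a => a -> a -> a
    min' a b = if a <= b then a else b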


The difference with Rust is that you can apply a trait to an extant type you don't "own" - you can't do that in Java or C#: you would need to use the decorator-pattern (boilerplate ahoy!). C++'s checks are closer to duck-typing in practice.


Actually you can do ad-hoc polymorphism in C# (and I assume Java too, although I've not used it). Here's an example of implementing an Eq and Ord 'type class' in C#, with 'class instances' of OrdInt and OrdString. Then a simple generic bubble-sort function which works on any type that has an Ord instance:

    public interface Eq<A>
    {
        bool Equals(A x, A y);
    }

    public interface Ord<A> : Eq<A>
    {
        bool GreaterThan(A x, A y);
        bool GreaterThanOrEq(A x, A y);
        bool LessThan(A x, A y);
        bool LessThanOrEq(A x, A y);
    }

    public struct OrdInt : Ord<int>
    {
        public bool Equals(int x, int y) => x == y;
        public bool GreaterThan(int x, int y) => x > y;
        public bool GreaterThanOrEq(int x, int y) => x >= y;
        public bool LessThan(int x, int y) => x < y;
        public bool LessThanOrEq(int x, int y) => x <= y;
    }

    public struct OrdString : Ord<string>
    {
        public bool Equals(string x, string y) => x == y;
        public bool GreaterThan(string x, string y) => x.CompareTo(y) > 0;
        public bool GreaterThanOrEq(string x, string y) => x.CompareTo(y) >= 0;
        public bool LessThan(string x, string y) => x.CompareTo(y) < 0;
        public bool LessThanOrEq(string x, string y) => x.CompareTo(y) <= 0;
    }

    public static class Testing
    {
        public static void Test()
        {
            var x = new[] { 3, 8, 1, 2, 10 };
            var y = new[] { "mary", "had", "a", "little", "lamb" };

            BubbleSort<OrdInt, int>(x);
            BubbleSort<OrdString, string>(y);
        }

        public static void BubbleSort<OrdA, A>(A[] values) where OrdA : struct, Ord<A>
        {
            bool swap;
            do
            {
                swap = false;

                for (var i = 0; i < values.Length - 1; i++)
                {
                    var x = values[i];
                    var y = values[i + 1];

                    if (default(OrdA).GreaterThan(x, y)) // Ad-hoc polymorphic call to GreaterThan
                    {
                        swap = true;
                        values[i] = y;
                        values[i + 1] = x;
                    }
                }
            }
            while (swap);
        }
    }


But you just proved my point: your OrdString and OrdInt types there are implementations of the Decorator pattern (just using interfaces instead of superclasses).


No they're not, they are not wrapping an object instance, they are providing ad-hoc functionality only. The generic parameters provided to BubbleSort are constraining the arguments and allowing access to the ad-hoc functionality, something that can't be done with inheritance unless you own the type.

This is exactly how traits, implicits, type-classes, class-instances, and concepts work in other languages like Scala, Haskell, etc.
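For comparison, a rough Haskell sketch of the same idea (the names merely mirror the C# example above; the Prelude already ships Eq and Ord, so this is purely illustrative):

    class MyEq a where
        eq :: a -> a -> Bool

    class MyEq a => MyOrd a where
        gt :: a -> a -> Bool

    instance MyEq Int where
        eq = (==)

    instance MyOrd Int where
        gt = (>)

    -- The MyOrd constraint plays the role of the OrdA struct parameter:
    -- the compiler passes the "instance dictionary" implicitly.
    bubblePass :: MyOrd a => [a] -> [a]
    bubblePass (x:y:rest)
        | gt x y    = y : bubblePass (x : rest)
        | otherwise = x : bubblePass (y : rest)
    bubblePass xs = xs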


I agree with you in regard to Java and C#, not so much regarding C++, because GCC has supported concepts since version 6.0, with other compilers adding support as part of the validation work to get them finally into the C++20 standard.

In any case, regardless of the fine-grained details, all of them allow one to "specify the generic constraints explicitly".


That's the most HN post title in a while


[flagged]


Comments like these are insidious on Hacker News because they're at once inflammatory and deathly boring. Please don't post like this again.

https://news.ycombinator.com/newsguidelines.html


I imagine they both provide a similar hipster code high, while being fairly hard to do something useful in or make money with.


Rust and Kotlin destroy any previous claims about practical usages of Haskell.


Kotlin isn't anywhere near being a substitute for Haskell, it's a nice set of syntactical tweaks for Java users but it doesn't even have ADTs or real pattern matching.
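For reference, the kind of ADT and pattern matching being referred to, as a trivial Haskell sketch (the names are purely illustrative):

    data Shape
        = Circle Double
        | Rect Double Double

    area :: Shape -> Double
    area (Circle r) = pi * r * r
    area (Rect w h) = w * h

With warnings enabled, the compiler also checks that every constructor is handled, which is a large part of why people miss this elsewhere.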


Could you elaborate? I'm not sure practical usages of Haskell are replaced by Rust/Kotlin, the high-level of abstraction and pure functional paradigm are practical.


Languages without HKT are horribly impractical for large codebases.
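For anyone unfamiliar with the acronym: HKT (higher-kinded types) means being able to abstract over the container itself, not just its element type. A minimal sketch (MyFunctor is just the standard Functor, renamed to avoid clashing with the Prelude):

    class MyFunctor f where          -- f has kind * -> *, not *
        myMap :: (a -> b) -> f a -> f b

    instance MyFunctor Maybe where
        myMap _ Nothing  = Nothing
        myMap g (Just x) = Just (g x)

    instance MyFunctor [] where
        myMap = map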


Yet there are not many large codebases in languages with HKT, and even those which are there (mainly big Scala projects) do not use HKT


Most large codebases are terrible, and these facts are not unrelated.


I could be wrong, but I don't think that there are enough large codebases that use HKT for us to be able to know whether most large codebases that use HKT are not terrible.

I suspect that it is closer to the truth to say that most large codebases are terrible, regardless of language, and that HKT won't save you. (Using HKT may help some. Using HKT well may help more. But the problem with large codebases is that enough programmers work on them that the talent level tends toward the average of the universe of programmers. That's... not good. It means that whatever tool or technique you choose won't be used well.)


Standard Chartered has a multimillion-line codebase in Haskell (technically an internal variant called Mu).


Rust still allows imperative programming whereas Haskell doesn't. This makes a big difference.

Rust is closer to Alan Turing than it is to Alonzo Church.


There was a recent post about "imperative Haskell": http://vaibhavsagar.com/blog/2017/05/29/imperative-haskell/

The code in this other post about implementing string distance metrics is also quite imperative: https://markkarpov.com/post/migrating-text-metrics.html


Haskell allows you to recreate standard imperative semantics purely (with the State monad, plus lenses for familiar syntax), or to use actual stateful imperative mutability ("purely" via ST or impurely via IO).
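A minimal sketch of both styles (assuming the mtl package for Control.Monad.State; the function names are illustrative):

    import Control.Monad.State (execState, modify')
    import Control.Monad.ST    (runST)
    import Data.STRef          (newSTRef, modifySTRef', readSTRef)

    -- Pure "mutation": the variable is just threaded state.
    sumPure :: Int -> Int
    sumPure n = execState (mapM_ (\i -> modify' (+ i)) [1..n]) 0

    -- Real in-place mutation, but it cannot escape runST.
    sumST :: Int -> Int
    sumST n = runST $ do
        ref <- newSTRef 0
        mapM_ (\i -> modifySTRef' ref (+ i)) [1..n]
        readSTRef ref

Both have the type Int -> Int; the caller cannot tell which one mutates.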


Haskell definitely allows imperative programming, it's just not particularly good at it.


"Haskell is, in my view, the world’s best imperative programming language, and second-best functional language" - Robert Harper (https://existentialtype.wordpress.com/2011/04/09/persistence...)


What is it lacking for imperative programming? I find it to be immensely good at it!


I tried implementing some simple imperative algorithms (like union-find) and found the Haskell code very noisy. For example, "var x = f(y)" in an imperative language might translate to "let x = f y" or "do x <- f y" in a Haskell do block, and the wrong variant won't compile. There's no general purpose idiom you can use to replace all loops, only tons of special purpose HOFs you must memorize. Writing code that mixes different types of side-effecting calls is combinator hell. And so on. Usually an imperative algorithm will be much clearer in an imperative language than in Haskell, and easier to get right on the first try.
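For readers following along, the let/<- split described here boils down to "pure value vs. monadic action"; a tiny illustration, inside a do block:

    example :: IO ()
    example = do
        let a = length "pure value"   -- plain expression: let
        b <- readLn :: IO Int         -- effectful action: <-
        print (a + b)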


> There's no general purpose idiom you can use to replace all loops, only tons of special purpose HOFs you must memorize.

Well, yes, there is: general recursion. You can recreate any loop with plain old recursion.

But stating that as a problem is a little like coming to Java and saying that "there's no general purpose idiom you can use to replace goto, only tons of special semantic branching constructs you must memorize." Sure, a valid complaint if you've been forced to only use goto in your life, but not a reflection on good programming practice.
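To make the "plain old recursion" point concrete, here is a C-style counting loop written as a recursive helper (a sketch; forLoop is not a library function):

    forLoop :: Monad m => Int -> Int -> (Int -> m ()) -> m ()
    forLoop i n body
        | i < n     = body i >> forLoop (i + 1) n body
        | otherwise = pure ()

    -- forLoop 0 10 print behaves like "for (i = 0; i < 10; i++) print(i)"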


I don't think that's true. Say you have an imperative algorithm that modifies an array in a loop. Then the Haskell encoding of that algorithm will use writeArray. But since it's monadic, it won't look like general recursion, it'll need to use a combinator like forM. You could say it's general recursion plus bind and return, but it gets worse. If the loop body needs to use both writeArray and randomness, you need more combinators. I'm not sure anyone can reliably write that stuff on a napkin, the way I can write a loop. That's not like the situation with gotos, because code with gotos is harder to write on a napkin.


What's wrong with this?

    import qualified Data.Vector.Mutable as V
    import           Control.Monad       (when)
    import qualified System.Random       as R
    
    main = do
        v <- V.replicate 10 0
        putStrLn "Initialised to zero"
        printThemAll v
    
        setIncreasing v
        putStrLn "Set with values increasing by 2"
        printThemAll v
    
        addRandomness v
        putStrLn "Added a random number to each"
        printThemAll v
    
    printThemAll v = loop 0 (V.length v)
        where loop i n = do
                when (i < n) $ do
                   e <- V.read v i
                   print e
                   loop (i + 1) n
    
    setIncreasing v = loop 0 (V.length v)
        where loop i n = do
                when (i < n) $ do
                  V.write v i (i * 2)
                  loop (i + 1) n
    
    addRandomness v = loop 0 (V.length v)
        where loop i n = do
                when (i < n) $ do
                  r <- R.randomIO
                  e <- V.read v i
                  V.write v i (e + r)
                  loop (i + 1) n


That's indeed nice, and made me change my mind somewhat. I wonder if you can write a napkin example that mixes ST with randomness, instead of using IO for everything?


It's not too bad to manually thread a seed around.

    import qualified Data.Vector.Mutable as V
    import           Control.Monad       (when)
    import qualified System.Random       as R
    import           Control.Monad.ST
    
    st = runST $ do
        v <- V.replicate 10 0
    
        setIncreasing v
    
        addRandomness v
    
    printThemAll v = loop 0 (V.length v)
        where loop i n = do
                when (i < n) $ do
                   e <- V.read v i
                   print e
                   loop (i + 1) n
    
    setIncreasing v = loop 0 (V.length v)
        where loop i n = do
                when (i < n) $ do
                  V.write v i (i * 2)
                  loop (i + 1) n
    
    addRandomness v = loop initial_g 0 (V.length v)
        where seed       = 1234
              initial_g  = R.mkStdGen seed
              loop g i n = do
                when (i < n) $ do
                  let (r, next_g) = R.random g
                  e <- V.read v i
                  V.write v i (e + r)
                  loop next_g (i + 1) n


I see, thanks! The imperative version would still look more direct to me, but your versions are very nice.


(continued from previous comment)

But what's really interesting is the partitioning procedure. That's where things get complicated. According to Wikipedia, it can be implemented in an imperative language like so:

    algorithm partition(A, lo, hi) is
        pivot := A[hi]
        i := lo - 1    
        for j := lo to hi - 1 do
            if A[j] ≤ pivot then
                i := i + 1
                if i ≠ j then
                    swap A[i] with A[j]
        swap A[i+1] with A[hi]
        return i + 1
This is the Haskell translation I came up with:

    partition a lo hi = do
        pivot <- Vector.read a hi
        i' <- newSTRef (lo - 1)
        j' <- newSTRef lo

        fix $ \loop -> do
            j <- readSTRef j'
            when (j < hi) $ do
                aj <- Vector.read a j
                when (aj <= pivot) $ do
                    i <- inc i'
                    when (i /= j) $
                        Vector.swap a i j
                inc j'
                loop

        i <- inc i'
        Vector.swap a i hi
        pure i
Again, I'll walk through it part by part.

    partition a lo hi = do
        pivot <- Vector.read a hi
        i' <- newSTRef (lo - 1)
        j' <- newSTRef lo
Since we're already running this as part of an ST expression, we don't need to specify "runST". First thing, we read the last element of the vector as our pivot, and we define two new references to mutable values i' and j'. (I like to indicate references with ticks like that to distinguish them from the value they contain. Ticks sort of remind me of the C pointer asterisk so it works out.)

        fix $ \loop -> do
            j <- readSTRef j'
            when (j < hi) $ do
Okay, so last time we created a named loop through the "let" keyword, which defines new variables and functions. We could do that here as well, but I think this other approach generates neater code in this case. The "fix" combinator might twist your mind the first few times you see it, but suffice it to say that it builds a recursive computation out of an anonymous function by feeding that function its own result as its argument. You'll soon see one of the reasons I preferred it in this case.

Then we read the value of the j' reference and use it as our loop condition. The loop should run as long as j is less than hi.

                aj <- Vector.read a j
                when (aj <= pivot) $ do
                    i <- inc i'
                    when (i /= j) $
                        Vector.swap a i j
We read the value under j in the array, and if it is less than the pivot we increment the value in the i' reference. If the incremented value is different from j, we swap the two in the array.

                inc j'
                loop
Regardless of how aj compared to the pivot, we increment the value under the j' reference and go back to the top again to start a new iteration.

    i <- inc i'
    Vector.swap a i hi
    pure i
When the loop has finished, we increment i', swap the pivot element back into its right place, and then return i.
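The inc helper is never spelled out in the comment; one plausible definition (an assumption, not necessarily what was on the original napkin) bumps the STRef and returns the new value:

    import Control.Monad.ST (ST)
    import Data.STRef       (STRef, modifySTRef', readSTRef)

    inc :: STRef s Int -> ST s Int
    inc ref = do
        modifySTRef' ref (+ 1)
        readSTRef ref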

There are three things of note here:

1) One of the major differences with the imperative code is that in imperative Haskell code, we sometimes need to "dereference" mutable variables in a separate statement. We (generally) cannot do that as part of an expression.

There are some structured ways around this even in Haskell, but at that point it might no longer be worth using Haskell to write imperative code. Why do I say that? Because it's actually a good thing that we need to dereference mutable variables in a separate statement. Several "safe coding standards" over the decades have evolved toward "keep expressions free from side-effects and have one statement per side effect".

2) When we defined the loop with fix we didn't have to call it separately from its definition. That's one of the benefits I was talking about.

3) I forgot what number three was.

----

Randomness, you said? That's a common optimisation to the quicksort shown above. The pivot is picked at random in the inclusive range [lo,hi] instead of fixed at hi.

It is child's play to include it by simply passing a random generator as a parameter down the call chain to partition, so I'm not going to show that. What I'm going to show instead is how trivial it is to abstract that parameter away into a state transformer wrapper.

I intentionally ignored this aspect until now to get honest results about shimming randomness in there. Here are the changes:

1.

    partition a lo hi = do
        p <- state (randomR (lo, hi))
        lift $ do
            Vector.swap a hi p
         -- pivot <- Vector.read a hi
         -- ...
I converted the partition method to a state transformer, which makes it possible to compute a random number in the [lo,hi] range and implicitly update the generator state. Then I swap the highest element and the chosen pivot, and the rest of the algorithm is the same as before.

The state transformer also means that the rest of the partition function is now lifted, but that's no biggie.

2.

    quicksort gen vec = runST . flip evalStateT gen $ do
     -- mvec <- thaw vec
     -- ...
The quicksort function needs to wrap the ST expression in a state transformer, but is otherwise exactly the same as before.

Since we need some sort of generator to start with, the quicksort function now also takes a generator as an argument and puts it in the state.

This is not super clean – the quicksort needs to receive a generator each time it is called? Well, yeah, sorta–kinda. This is probably one of those places where it's legitimate to do an "unsafePerformSomething" – the randomness does not cause any actual impurity in the code.

----

What I did not attempt, and what is a complicated subject anyway, was the issue of arbitrary pivot selection strategies. It's easy to integrate support for random pivots, but what if the user wants to specify a pivot selection strategy of their own?

That's fine if it's pure, or at least only relies on randomness, but what if it's unrestricted in its effects? What if they want to supply a strategy that calls your grandmother and asks her for a good pivot, or one that launches nuclear missiles and counts the casualties to determine a pivot?

That's clearly some serious international side effects, and I think that we do want to prevent the user from inserting strategies with arbitrary effects. But where to draw the line? And how to distinguish these kinds of effects? Haskell only throws them all into the IO bin, which is a problem.

But the problem is not that effects are controlled, it's that even in Haskell there's a certain lack of control over effects.


Nice! If that's indeed napkin code, it's very impressive.

Just a small note, randomness in quicksort can't be hidden in unsafePerformSomething, because it does lead to impurity. For example, sorting [(0,1),(0,2)] on fst will give different results depending on randomness. But that's not to detract from your main point.


If you need writeArray and randomness and you don't like loop combinators, there is absolutely no problem using general recursion. That's one of the beauties of the monadic interface – it composes very well!

----

I'm going to show you some Haskell code now. This is code that could probably use improvement, but the reason it may look sketchy in some places is that I – surprise! – reliably wrote it on a napkin.

Excepting some typos, obvious brainfarts and missed imports, this is actually the first draft of the code. Here's the quicksort algorithm as it is written on Wikipedia:

    algorithm quicksort(A, lo, hi) is
        if lo < hi then
            p := partition(A, lo, hi)
            quicksort(A, lo, p – 1)
            quicksort(A, p + 1, hi)
Its Haskell implementation will be very similar:

    quicksort vec = runST $ do
        mvec <- thaw vec
        let loop lo hi = when (lo < hi) $ do
                p <- partition mvec lo hi
                loop lo (p-1)
                loop (p+1) hi
        loop 0 (Vector.length mvec - 1)
        freeze mvec
I'll walk through it line by line, even though most of it is very similar to the Wikipedia imperative pseudocode.

    quicksort vec = runST $ do
        mvec <- thaw vec
We run this stuff as an ST expression, which is the Haskell way of saying "hey this block of code does actual mutation, be careful". The first thing we do is thaw the vector, which means making it mutable.

        let loop lo hi = when (lo < hi) $ do
We define a loop that is going to depend on two variables for iteration: lo and hi. It runs for as long as lo is less than hi, and will break when hi is equal to or less than lo.

            p <- partition mvec lo hi
            loop lo (p-1)
First, we call the partition procedure which divides the vector into two halves and returns to us the index of the pivot element between the two.

Then, since we are in a function called "loop" we can continue to the next iteration by calling "loop". The neat thing about this is how it looks like we almost defined our own keyword, which acts somewhat like a "continue" statement in imperative languages.

There is a difference, though. The "continue" statement in an imperative language would abort the current iteration, go back to the top of the loop and run another iteration. The "loop" statement in our code also goes back to the top of the loop, but it doesn't abort the current iteration. Which means we can

            loop (p+1) hi
also run a second iteration. This is in principle independent from the previous execution, so you can sort of view this as a "multi-continue" that starts two new iterations of the loop in parallel. In reality, though, the execution is sequential because the compiler doesn't have enough information to determine that they are indeed independent.

        loop 0 (Vector.length mvec - 1)
Note that until now, the loop was only defined – it was never executed. But now we execute it with the initial values for lo and hi. It may seem weird that definition and execution of a loop can be separate from each other, but I haven't been able to come up with any way that could end up badly.

        freeze mvec
After we're done, we freeze the vector to make it immutable again. It might sound like an expensive operation, but it's not. Hopefully, the compiler will understand that nobody else has simultaneous access to the array so it will perform the modifications in-place and optimise away the thawing and freezing.

(continued in second comment)


Just because it doesn't translate 1-to-1 from another imperative language doesn't mean it isn't imperative itself. Haskell is not in the C family. It's not going to accept straight translations from C, but C doesn't monopolize imperativeness either.


>There's no general purpose idiom you can use to replace all loops, only tons of special purpose HOFs you must memorize.

Ah, but there is! Recursion schemes.
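To give a flavour of what that means, here is a hand-rolled catamorphism (a generalised fold); the names are local to this sketch rather than taken from the recursion-schemes package:

    {-# LANGUAGE DeriveFunctor #-}

    newtype Fix f = Fix (f (Fix f))

    -- cata factors the shape of the recursion out of the algorithm:
    -- one combinator covers every structurally recursive "loop".
    cata :: Functor f => (f a -> a) -> Fix f -> a
    cata alg (Fix x) = alg (fmap (cata alg) x)

    -- A list, expressed as the fixed point of a base functor.
    data ListF e r = NilF | ConsF e r deriving Functor

    sumAlg :: ListF Int Int -> Int
    sumAlg NilF        = 0
    sumAlg (ConsF x r) = x + r

    -- cata sumAlg :: Fix (ListF Int) -> Int replaces an explicit loop.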

>Usually an imperative algorithm will be much clearer in an imperative language than in Haskell, and easier to get right on the first try.

Not sure what an "imperative algorithm" is, but I agree there are tasks better suited to imperative languages.


Don't know... mutable collections? A way to make sure things are evaluated without writing seq every other line?


Mutable hashtables: https://hackage.haskell.org/package/hashtables

Mutable arrays: https://hackage.haskell.org/package/vector-0.12.0.1/docs/Dat...

Strict by default: put {-# LANGUAGE Strict #-} at the top of files where necessary, but usually you'd just use BangPatterns and -funbox-strict-fields.
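A small sketch of those knobs in practice (a bang pattern on an accumulator, and strict unpacked fields written by hand, which -funbox-strict-fields would apply module-wide; names are illustrative):

    {-# LANGUAGE BangPatterns #-}

    -- The bang forces the accumulator at each step, avoiding the
    -- classic lazy-accumulator thunk build-up.
    sumStrict :: [Int] -> Int
    sumStrict = go 0
      where
        go !acc []     = acc
        go !acc (x:xs) = go (acc + x) xs

    -- Strict, unpacked fields, spelled out explicitly:
    data Point = Point {-# UNPACK #-} !Double {-# UNPACK #-} !Double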


Well, these are mutable, but only in the ST or IO monad - which is a guarantee, but at the same time it is not quite the same thing. You cannot just accumulate state by mutating some stuff, then in a separate part of the program mutate it again.

Which is intentional, of course, but quite opposite to the imperative mindset.

About {-# LANGUAGE Strict #-}, thank you - I was not aware of it.

I think the first example in this post https://markkarpov.com/post/migrating-text-metrics.html makes it clear that, while one can program in imperative style, it is not at all natural, and it is going to introduce as much boilerplate as, say, trying to program in functional style in Java 6.


Well, if you want to do full imperative programming with warts and all, you'll probably want to stick to the IO monad for your entire program anyway. You just have to be aware that you're actively handicapping your compiler by choosing to do full imperative programming. (In most other languages, the compilers come handicapped out of the box...)

I think the first example in that post is a bit unfair. The Haskell code is unusually complicated because it avoids a complicated problem that exists in the C code. (Which is that 16-bit encodings can't cover all of UCS-4 with fixed character widths.)

Besides, I don't understand why they are making such a complicated solution. In this comparison [1], the first example in that post is the "target". The best performing solution is... you guessed it:

    foldZip a b =
        List.foldl' (\r (cha, chb) -> if cha /= chb then r+1 else r) 0 (Text.zip a b)
And a pet peeve: the way you say "only in the ST monad" is a funny rhetorical device. You make it sound like everyone agrees it's a bad thing to isolate mutation to local regions. Here's the thing: it's not. And the way you can integrate local mutation in your application with the ST monad is fantastic.

By just typing "ST" a few extra times you get local mutation exactly where you want to, and nowhere you don't want it!

[1]: http://i.xkqr.org/scrot_20170615032548.png


Technically this is only correct if a and b have the same length, so you would need to do a length check first:

    foldZip a b
      | T.length a /= T.length b = Nothing
      | otherwise = Just (foldl' step 0 (T.zip a b))
      where step acc (l, r) = if l /= r then acc+1 else acc
This is slightly awkward because T.length is an O(n) operation, though. I think the optimal implementation would be a hyper-optimized variant that depends on correct byte alignment, with this as a slow fallback if that fails.


I discovered yesterday that my previous benchmarks were wrong for some reason. Zip+fold isn't actually that fast: http://i.xkqr.org/scrot_20170615223112.png

What is fast – without reaching into the Text object's internal representation – is good old tail-call recursion. And you can even integrate the length check into the traversal, making it an O(min(n,m)) operation in total.
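Along those lines, a rough sketch of such a traversal over Data.Text (hammingish is a made-up name, and this assumes T.uncons; it counts differing characters and bails out as soon as one text ends before the other):

    {-# LANGUAGE BangPatterns #-}
    import qualified Data.Text as T

    hammingish :: T.Text -> T.Text -> Maybe Int
    hammingish = go 0
      where
        go !acc a b = case (T.uncons a, T.uncons b) of
            (Nothing,      Nothing)      -> Just acc
            (Just (x, a'), Just (y, b')) -> go (if x /= y then acc + 1 else acc) a' b'
            _                            -> Nothing  -- lengths differ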



