Haskell vs. Erlang for bittorrent clients (jlouisramblings.blogspot.ca)
142 points by reirob 1417 days ago | 26 comments

This article is from 2010. There have been significant changes in the GHC runtime system since then. Most importantly, in February 2013, the parallel I/O manager was replaced with a new design.


> Our evaluations show that the new Mio manager improves realistic web server throughput by 6.5x and reduces expected web server response time by 5.7x. We also show that with Mio, McNettle (an SDN controller written in Haskell) can scale effectively to 40+ cores, reach a throughput of over 20 million new requests per second on a single machine, and hence become the fastest of all existing SDN controllers.


Also, GHC got a new, heavily optimised IO manager in version 7.0, which was released later that year[1]. This is probably the more relevant change, as there hasn't been a GHC release that includes Mio yet.

[1] http://johantibell.com/files/hask17ape-sullivan.pdf

Another interesting read about coding bittorrent is Juliusz Chroboczek's [1] Hekate bittorrent seeder [2], which is written in Continuation-Passing C [3].

[1] http://www.pps.univ-paris-diderot.fr/~jch/

[2] http://www.pps.univ-paris-diderot.fr/~jch/software/hekate/

[3] http://gabriel.kerneis.info/software/cpc/

I learned about CPS in Lisps but never in "pure" C, making [3] a very interesting read.

neat! (relatedly: GHC actually does a CPS transform towards the end of compilation phases)

Jesper has a very nice blog, well worth a read:


Lately though, I've been blogging at medium.com as well:


since I prefer the platform to blogger.

Thanks for letting me know, I always look forward to reading your posts.

I really enjoyed his presentation about writing the bittorrent client in haskell comparing it to an erlang implementation: http://www.infoq.com/presentations/Combinatorrent-Haskell-ca...

Not just Erlang, but OTP. These guys wrote "real-time media gateways" on top of it, which work in production switches.

This is very interesting and actually seemed like a fair comparison of the two languages.

Say I wanted to write my own bittorrent client for fun and to learn a new language, is there a good guide on bittorrent I can read?

https://wiki.theory.org/BitTorrentSpecification is by far the best starting point for the base protocol. Roughly, you can make certain short-cuts to get running quickly:

Don't cache anything. Write data directly to disk and use seeks. Request pieces at random and don't think about getting the rarest first. Skimp on the choking strategy and don't do anything clever, but pick a simple algorithm. Skip all the extensions until the other parts work.
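A minimal sketch of two of these short-cuts, assuming a single-file torrent and plain Python lists of booleans standing in for piece bitfields. The function names `pick_random_piece` and `write_piece` are made up for illustration and do not come from any client:

```python
import os
import random


def pick_random_piece(have, peer_has):
    """Pick any piece the peer has and we lack; no rarest-first logic."""
    wanted = [
        index
        for index, (mine, theirs) in enumerate(zip(have, peer_has))
        if theirs and not mine
    ]
    return random.choice(wanted) if wanted else None


def write_piece(path, index, piece_length, data):
    """Write one piece straight to its offset in the file, with no cache."""
    # Open for update if the file exists, otherwise create it.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as handle:
        # Seeking past the current end is fine; the gap reads back as zeros.
        handle.seek(index * piece_length)
        handle.write(data)
```

Pieces can arrive in any order this way, since each one lands at `index * piece_length` regardless of what has been written before. A real client would still need to verify the SHA-1 hash of each piece before writing it.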

Get the wire-protocol and bencoder up and running first. Then handle tracking. Then handle connections to peers. Then handle storage.
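For the bencoder step, here is a rough sketch of an encoder and decoder covering the four bencode types (integers, byte strings, lists, dictionaries). The names `bencode`, `bdecode` and `_decode_at` are invented for this example, and the decoder is deliberately lenient: it does not reject leading zeros or unsorted dictionary keys, which a stricter implementation might.

```python
def bencode(value):
    """Encode an int, bytes, list, or dict as bencoded bytes."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; bencode has no boolean type.
        raise TypeError("cannot bencode a bool")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        parts = []
        # Keys are byte strings and are emitted in sorted order.
        for key in sorted(value):
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be bytes")
            parts.append(bencode(key) + bencode(value[key]))
        return b"d" + b"".join(parts) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def bdecode(data):
    """Decode a complete bencoded byte string into Python values."""
    value, pos = _decode_at(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode_at(data, pos):
    """Decode one value starting at pos; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            result[key], pos = _decode_at(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        start = colon + 1
        end = start + int(data[pos:colon])
        if end > len(data):
            raise ValueError("byte string runs past end of input")
        return data[start:end], end
    raise ValueError("unexpected byte at position %d" % pos)
```

With this in place, `bdecode(open("some.torrent", "rb").read())` should yield the metainfo dictionary for a well-formed torrent file (the file name is hypothetical), which is enough to move on to talking to the tracker.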

Two big surprises from this article:

1. The author gives the impression that there's no big deal to fixing memory leaks in Haskell. I thought they were supposed to be its Achilles' heel.

2. Erlang lost on its home turf!

There is no clear winner. Haskell wins on having a powerful type system and producing generally efficient code. Erlang wins the maintenance race by far, since you can trace on the running system and deploy code while it is running. It also helps that Erlang is built to be robust rather than fast.

Saying that Erlang lost on its home turf would be a wild extrapolation of what I wrote. I don't really think there is a clear winner, but rather a set of trade-offs you will have to make.

Memory leaks due to laziness are not as frequent as the interwebs would tell you. It's not as if this type of code appears in every single program you write.

Memory leaks are notorious mostly due to leaks in lazy code looking very innocent when coming from strict languages. After you get comfortable with laziness the same code (usually) stands out very clearly. And even when it doesn't, these leaks aren't very subtle. GHC has a good profiler for memory usage, so tracking them down is mostly trivial and fixing them once you know what's wrong is not very hard.

The memory leaks I had to work around in combinatorrent were extremely subtle:

loop = forever x and loop = do {x; loop} had different laziness semantics in the Process monad. The former leaked, the latter did not.

I wrote all of the code in about 1.5 months, but spent the next 2-3 months fighting with memory leaks. In typical batch-processing code this is hardly a problem since a small leak goes away once the program terminates. In a long-running daemon-process, this is not as easy.

What leak-debugging tools did you use?

GHC has a very nice memory profiler in the runtime. You just enable that one and then it has rules to constrain views of allocation data. Profiling by "anything of type X" usually helps to narrow down a specific place in the code which has trouble. Also, splitting by module helps since it can tell you where in the code base you have the leak.

It won't point and scream HERE!, but it will give you enough information that you can use to narrow down where the leak happens to be.

What did Erlang lose? I thought he stated that both did reasonably well.

The author chose Haskell because of static typing, but most of his subsequent posts are about Erlang. I'd say Erlang won him over, ha.

The bias is also that I work professionally with Erlang.

> The type system of Haskell is the most advanced type system for a general purpose language in existence.

Isn't the type system of ATS more advanced? It might be hard to say with all the GHC extensions.

There is no total order on type systems. ATS is quite advanced yes, and as of late, I would put Scala in the same group of languages with highly complex and powerful type systems.

But I have a hunch that the type system of Haskell allows for more expression than the comparable ATS type system. Even though ATS has variants of dependent types and also linear (perhaps affine?) types.

Very next sentence:

The only systems which can beat it are theorem provers like Coq, and they are not general purpose programming languages (Morrisett and the YNot team might disagree though!).

ATS is dependently typed and therefore falls under theorem provers.

ATS is meant to be a general purpose programming language. The theorem proving subsystem is there to aid in programming and verification. Idris is another dependently typed programming language that is intended for general purpose programming.

Idris was not on my radar in 2010 :)
