Benchmarking level generation: Go, Rust, Haskell and D (togototo.wordpress.com)
42 points by stock_toaster 1578 days ago | 68 comments

The problem with almost every language benchmarking blog post I see is that the author is generally an expert in at most one of the languages they are benchmarking, and so they write slow or non-idiomatic code in the other languages, leading to a useless comparison.

Here's one example that I picked up. The author writes that they had to refactor part of the Haskell code from (paraphrasing)

    if (checkColl tr rsDone || checkBound tr)
      then {- branch 1 -}
      else {- branch 2 -}

to

    let noFit = {- branch 1 -}
    in if checkBound tr
         then noFit
         else if checkColl tr rsDone
           then noFit
           else {- branch 2 -}
because of "problems with lazy evaluation". But in fact the only problem is that the call to `checkBound` is fast whereas the call to `checkColl` is slow, and the || operator evaluates its left argument before deciding if it needs to evaluate its right argument (just like in C, and every other language I've ever used). So all that is required to get the speedup is to switch the order of the calls:

    if (checkBound tr || checkColl tr rsDone)
      then {- branch 1 -}
      else {- branch 2 -}

I should have made it clearer; when I said "Lazy evaluation often tricked me up", I meant that I made the stupid mistake of thinking Haskell could somehow magically evaluate the shortest branch first via lazy evaluation. I.e., lazy evaluation led me to make mistakes by expecting too much from it, not because of any inherent fault in lazy evaluation.

The Rust code was very non-idiomatic, and is dramatically disadvantaged by using the ISAAC random number generator (which is cryptographically secure, and so quite slow). By rewriting it to use more iterators (for non-bounds checked vector access), more vector-building functions (specifically, `vec::with_capacity` and `vec::from_fn`), and a non-cryptographically secure RNG[1], I get the following:

  Rust       0.34
  Rust isaac 0.57
  Rust orig  0.70
  Clang      0.42
  GCC        0.36
My rewritten version: https://github.com/huonw/levgen-benchmarks/blob/master/R.rs (I didn't bother implementing seeding of the RNG for this, so it's the same result every time. Edit: implemented now, no performance change.)

[1]: The new RNG is XorShiftRng, which is also in the standard library, and actually has far better randomness properties than most platforms' `rand()` functions anyway.

(Edit: merging two adjacent `if` statements, took it down to 0.34 from 0.36.)

Here's the compare view for an overview of the changes dbaupp applied to the original code: https://github.com/huonw/levgen-benchmarks/compare/logicchai...

edit: wrong addition, left hereafter for posterity; ignore it. (Apparently Go uses ISAAC, so the change to XorShiftRng might be unfair to Go (though it's trailing anyway). I expect C uses the standard libc generator?)

Assuming this [1] is what Go is using, it is not ISAAC. It looks like a lagged Fibonacci generator.

[1] http://golang.org/src/pkg/math/rand/rng.go

Ah yes, I misread a comment below which talked about Go and Rust, I didn't read closely enough and took the ISAAC mention to apply to Go.

(This is now the code in the article.)

    if((((rx + rw +1 ) < x) || ((rx > (x+w +1 ))))) { RoomOkay=true
    } else if((((ry + rh +1 ) < y) || ((ry > (y+h +1 ))))) {RoomOkay=true
    }else {RoomOkay=false}
    if(RoomOkay==false){ return true}		

Gaaaah, really? "go fmt" is your friend! That mess automatically cleans up into this:

    if ((rx + rw + 1) < x) || (rx > (x + w + 1)) {
        RoomOkay = true
    } else if ((ry + rh + 1) < y) || (ry > (y + h + 1)) {
        RoomOkay = true
    } else {
        RoomOkay = false
    }
    if RoomOkay == false {
        return true
    }

> Particularly ugly is

> I’m not sure why the language can’t automatically dereference the slice when the [] operator is applied to it

I don't know exactly what the Go team would have to say about this, but my take would be language simplicity. Pointers are pointers, and special-casing some of them breaks the rules that both the compiler and the programmer rely on.

It may look like C, but it isn't. There's no pointer/array equivalence here, indexing and dereferencing aren't the same thing. If I weren't coming from a C background (which I'm guessing the author is, too), I'm not sure it would ever occur to me to expect a pointer-to-slice/array to be something you could index.

Also, as a slice is basically a pointer to array plus two ints, you normally wouldn't pass a pointer to it in the first place. You only need to do that if you want to resize it.

The problem when people talk about code beauty is that beauty doesn't always equate to readability. I find Go is very much one of those languages that doesn't look particularly pretty but it's a joy to read when compared to many other languages of the same ilk.

> Hopefully in future Rust will adopt something like Go’s automatic initialisation of all values to zero, or at least have this as an option (or document it better, if it already is an option).

First, thanks for the feedback!

Unfortunately, I don't think this feature will be added to Rust, though. IMO it causes bugs, as accidentally failing to initialize a variable yields a garbage value at runtime. There's no reason that 0 is any more likely to be a valid integer value than 1, or 42, or 0xdeadbeef, and the compiler shouldn't try to guess. And the feature wouldn't work in Rust anyhow due to the lack of null pointers.

However, having an explicit zero value is convenient, and this is what the Zero trait is for. With Rust traits you can overload on the return value, like Haskell, so you can write:

    let (emptyr: Room, emptyt: Tile, emptyl: Lev) = Zero::zero();
    let mut ls = [ emptyl, ..100 ];
assuming that your types derive Zero (which you can do with #[deriving(Zero)]). In the future it should be:

    let (emptyr, emptyt, emptyl) = zero();
    let mut ls = [ emptyl, ..100 ];
This could be done by either writing a zero() wrapper in the prelude or allowing static methods to be imported, both of which are probably good ideas.

> IMO it causes bugs, as accidentally failing to initialize a variable yields a garbage value at runtime. There's no reason that 0 is any more likely to be a valid integer value than 1, or 42, or 0xdeadbeef, and the compiler shouldn't try to guess.

When a language default is to zero-initialize, it creates a strong cultural impetus within that language's user community to define data structures such that zero initialization is a valid state.

But if the language is taking a hard line against null (rather than e.g. being able to annotate locations as null-accepting as required, or vice versa), zero initialization may not be the right choice. However the combination of a requirement for definite assignment and non-nullability tends to make algorithms involving dynamically allocated arrays awkward to implement.

As alexcrichton points out on that bug, it's already implemented:

  rusti> let (a,b,c): (int, uint, float) = std::num::Zero::zero();
  rusti> a
  rusti> b 
  rusti> c
(The explicit type annotations of the tuple are purely because of how rusti works, in most cases they can be dropped as they will be inferred from the use of a, b and c.)

Right, sorry, I was confused because of the type annotation issue. Edited the post to be accurate.

Excellent, that was just the kind of thing I was looking for, thank you. It would be great to see this documented in the Rust Tutorial's section on vectors.

There's no notion of (default) constructors in Rust? How do you build complex objects in it?

The same way you do in other struct-based languages I expect, either build the struct directly or go through a static factory function.

I admit this tripped me up for a minute. After blinking a few times, I realized that any object that's difficult to initialize manually is going to have a constructor provided for it anyway. But see also the Zero trait Patrick mentioned, which can be derived.

I work almost full-time on C++ optimization, and I find most benchmarks of this type moot, and _not_ because of methodology flaws.

We need to stop just looking at numbers and investigate what these compilers allow in terms of productivity, optimization opportunities and explicit control of generated code.

If you look at x264's numbers you'll see that there is up to a 10x gap between hand-optimized assembly and native C, and I believe this gap exists in most fast programs because optimizing backends are too much of a black box: they consistently produce quite good, sometimes even amazing, results, but with little control or feedback.

The nature of highly optimized code is detailed, explicit control of what should happen. I will take a poorly optimizing compiler any day if it allows me complete control of the backend optimizations it chooses, if I can give it register hints that actually matter, if it stops compiling and asks for the annotations it needs to optimize better. C++ compilers are good at this game, but this could be taken further.

Also it's unfair to never look at language productivity at the same time as speed, since the ability to produce fast programs will be notably affected by the speed at which we can achieve that anyway.

I agree. However, how do you suggest we judge languages in terms of their productivity? Isn't that highly subjective? I have never seen anyone do a proper 'productivity benchmark', but perhaps with some rigor it could be done reliably.

I don't know. All methods I can think of are expensive and wasteful. So perhaps it's a good proxy.

I wonder how Rust is at this.

> For those who haven’t been following Rust, there are around seven different kinds of pointers: @ (garbage collected heap allocation), ~ (uniquely owned heap allocation), & (borrowed pointer to heap/stack allocation), * (raw C pointer, only useable in unsafe code), and mutable versions of the first three of those (actually, I’m not sure if there’s a mut & pointer, so maybe just six pointers in total).

I am not sure I like the sound of that. I guess having the unsafe stuff available is very useful for certain things, but it all sounds a bit messy, too.

We're moving the garbage-collected pointers (@) out into a library, so by default we'll have only a single pointer, ~ (and there's no mutable version of this). We'll still have & and &mut, but those aren't pointers, they're references (it's our own fault for calling them "borrowed pointers" in our documentation).

So depending on how you count, that's either one or three pointer types, which I'm happy with. The "unsafe" pointers exist only for C interop and have no bearing on normal code, so it's inaccurate to count them.

Being able to talk about memory allocation in a safe way is one of the key objectives of Rust. Rust is the only modern language (i.e. not C or C++) I know of that allows one to mix GC'ed and manually allocated memory, and heap and stack allocated memory.

Reasoning about mutability is necessary for safe (and fast!) concurrency.

> Rust is the only modern language (i.e. not C or C++) I know of that allows one to mix [...] heap and stack allocated memory.

Technically, I believe (to be verified) Go does that as well, but the mix is implicit and the system (not the developer) decides on its own whether a value should be stack- or heap-allocated.

C# will also allocate value-type local variables on the stack. (Technically, all local variables are stack-allocated, but for reference types it obviously stack-allocates only a reference to the heap; the end result is that, under the usual understanding of "stack allocation", only value-type local variables and parameters are stack-allocated.)

JVM HotSpot does this too, implicitly, by means of escape analysis.

By "Being able to talk about memory allocation" I meant that the programmer has control over these things.

You can do it in D as well in much the same way as in C#. (You can get manual memory management with heap allocation as well, but you lose the safety then.)

It's not nearly as bad as it seems. GC is rarely needed and you won't write a lot of *s (I think bindgen will write virtually all of the ones you do need anyway.)

That leaves ~ and & which you don't normally need to think about, you could get pretty far writing ~ for all allocations and & in all fn signatures. You could write a video game like this with the SFML bindings for example.

mut doesn't really complicate things in terms of which pointer you're picking. I've screwed up mut here and there, but I feel like it's its own subject.

A sidenote: Your blog's code formatting is awful. I would recommend that you embed one of the common solutions for code snippet syntax highlighting into your blog. I needed special motivation to continue reading your post past the first code sample.

Edit: And then of course actually use the prominent guidelines for good-looking code in each language!

My apologies, I was in a hurry; I'll try to put more effort into presentation in future. Are there prominent guidelines for good-looking code in Rust?

There's the Rust style guide[1] but it's not really that useful at the moment; however, 2 points of general advice:

- make use of functions like `vec::from_fn`, `uint::range` and `some_number.times`, rather than writing `while`-loops with a counter explicitly.

- use iterators[2,3] for traversing data structures, since it allows the data structure to optimise the accesses (e.g. the vector iterator avoids bound checks).

However, the tutorial and manual are out of date, and the documentation makes it very hard to find out about these functions; so it's definitely not the programmer's fault when their code turns out to be non-idiomatic Rust.

[1]: https://github.com/mozilla/rust/wiki/Note-style-guide [2]: http://static.rust-lang.org/doc/tutorial-container.html#iter... [3]: http://static.rust-lang.org/doc/core/iterator.html

Sorry, I really don't know much about Rust other than the occasional article I run into here on HN. Maybe someone else can point you towards a document. For Haskell, here are a few resources:

- https://github.com/tibbe/haskell-style-guide/blob/master/has...

- http://stackoverflow.com/questions/6398996/good-haskell-sour...

> A sidenote: Your blog's code formatting is awful.

FYI, it is not my blog. Just thought it was a neat post.

> The initial version must have been checking each room for collisions (a long process) before checking whether it fit within the level bounds (a short process), so manually sequencing it with if-then-else led to a spectacular 268% improvement.

Or just change `(checkColl tr rsDone || checkBound tr)` to `(checkBound tr || checkColl tr rsDone)`.

The dmd middle- and backend have gotten little attention over the last few years, especially compared to GCC and LLVM. No wonder it produces slower code.

With D it is not a bad idea to use dmd for development and GDC/LDC for release builds.

Nice to see Rust doing so well, but I'm disappointed by the performance of the Haskell code.

Why? I had (and often have) exactly the opposite sentiment: it is surprising that a language that is as high-level as Haskell can compete with low-level imperative languages.

Regardless what one thinks of Haskell, it's a testament to the work of the GHC developers.

This test isn't actually a great benchmark between languages because, as the author correctly points out, it's mostly a test of random number generation and presumably all of those languages use libc's random number generator. If you run a quick profile on the C code, you'll find the vast majority of time is spent in libc generating random numbers.

To get a fair comparison without completely changing the benchmark, you should probably subtract .3-.4 seconds from each runtime as that's the amount of time spent in random number generation (again presumably all languages use libc's).

Haskell doesn't use the libc standard number generator. The implementation is here:


Go's implementation is here and doesn't use the libc generator either:


Walter Bright has already commented on D.

> it's mostly a test of random number generation and presumably all of those languages use libc's random number generator

In fact, that would be better than benchmarking different PRNG methods, as is done now ;).

Anyway, I think the benchmark is still fair, since the programs use whatever the default PRNG implementation is in each language's standard library. Sure, the Haskell program would be faster if you used a MWC or MT implementation from Hackage. But you'd expect a standard library to come with a sensible default ;).

I would be surprised if Go used libc's random number generator; Go libraries tend to reimplement libc functionality in Go. (This is in contrast to Rust, which usually uses libc.)

Indeed it is reimplemented: http://golang.org/src/pkg/math/rand/rng.go

Rust uses a cryptographically secure random number generator (ISAAC) by default, to try to protect users from themselves. This means that the RNG is going to be slower than non-cryptographically-secure random number generators. You can get a non-cryptographic random number generator, but you have to ask for it.

I do believe that there's no reason that any modern language should not have a secure random number generator by default. But while ISAAC is likely fine for this purpose, I'd recommend Salsa20/12 instead (Salsa20/8 would be fine too, but let's be conservative).

Salsa20 is parallelizable and can skip ahead arbitrarily, has smaller state, and is fast: on recent processors the amortized time per 32-bit integer is around 8 cycles. It was also thoroughly vetted as an eSTREAM cipher.

Rust has XorShift (http://en.wikipedia.org/wiki/Xorshift) as a "secondary" RNG in the standard library, for what it's worth.

What implementations do you suggest? I'd be happy to bring this up to the developers if there's a chance of speeding up our default RNG with one that's just as safe. Note that it must also be compatible with our MIT/Apache 2.0 dual license.

SUPERCOP [1] has a bunch of public-domain implementations. [2] is a portable one. To get top speed, you need SIMD, though. The available implementations in SUPERCOP are in (x86) assembly, but they really should be converted to intrinsics to be more portable.

I don't know if Rust has intrinsics or some other kind of vector register support. I'd even volunteer to implement Salsa20 on it.

[1] http://bench.cr.yp.to/supercop.html

[2] https://github.com/floodyberry/supercop/blob/master/crypto_s...

[3] https://github.com/floodyberry/supercop/tree/master/crypto_s...

There's support for SIMD in Rust, and the compiler intrinsics that LLVM supports. Inline assembler is supported as well. However, note that the SIMD support will quite possibly change to become more first-class.

Note that I'm working on this (slowly), and things of interest can/should be put here: https://github.com/mozilla/rust/wiki/Lib-rand

Interesting, although I think that makes the benchmark even less useful because the heart of it (generating random numbers) is not using the same algorithm across languages.

D doesn't use libc's random number generator because the C standard one isn't very random.


Of course, a better random number generator takes longer to compute. The blog's author switched the D version to use libc's random number generator, and got some significant speedups.


That's the thing about Haskell, isn't it? It offers enough guarantees that the compiler can work its magic. Which makes me think it can do better than that in terms of perfs.

Every benchmark is different, especially if you use lazy lists instead of loops with unboxed parameters.

Still, it looks mostly like a test of the random number source. And also none of the nice high level optimizations are firing anyway, http://www.reddit.com/r/programming/comments/1ixnf6/benchmar...

The most recent Haskell version uses the MWC PRNG rather than Mersenne, reaching 1.05s (53% the speed of C). That's a respectable number, especially for such an expressive language. I imagine if someone could adjust it to use XorShift it would run even faster.

The reports from all of these micro-benchmarks seem to be consistent on the subject of Rust: it doesn't yet have the performance that justifies its existence. I'm guessing you could get similar numbers out of Java.

I'm hoping Rust's numbers improve. It's incredibly promising, but without C++ level performance, it's too much semantic complexity.

Achieving the performance that Rust does is amazing, given the focus so far been on language design and not performance. It's not even 1.0 yet!

I'm sure these numbers will improve.

With XorShift as the PRNG, Rust is now almost on par with C and beating the DMD and GDC D implementations. If the syntax were stable, and the standard library and documentation improved, then we'd seriously consider it as an alternative to C++ for parts of our current project.

You can write slow code in any language, that's exactly what the problem was (and still is) here. The implementations still aren't using the same random number generator.

Rust has C++ level performance if you compare the same code in both languages.

+1. The same is probably true of D as well.

No, because D has C performance now. Have you seen the blog's numbers?

As pointed out in the other comments on this thread, this benchmark mostly measures standard-library RNG performance, not the quality of the code generated by each compiler. Here you've got a much more computationally intensive benchmark, showing that D is "as fast as C" only in the sense that the same could be said about Java: http://attractivechaos.wordpress.com/2013/04/06/performance-...

It is close, but Java is just as close, too. Not a tremendous performance advantage that someone would choose one over another for and not something I'd expect from a language marketed as "performance oriented".

Ok you live in a world where Java is "as fast as C". Let's terminate this discussion.

I didn't say it. You said D was as fast as C now; then, by your definition of "as fast as C", Java must be as fast as C as well. It seems your "as fast as C" definition covers anything between 30% and 95% of C's performance, because those are the most common numbers reported for D in various benchmarks (I cited one of them, and the one in the blog also confirms this: only one D compiler managed to match the best C compiler, the others were significantly behind). The same numbers are consistently reported for Java, so I don't really understand what you're arguing about. I conclude both are in the same league, with two minor exceptions: 1. I've seen Java beating GCC many times, once quite severely by almost 2x on simple numerical array code, while I've never seen that happen for D, and 2. D has a much worse GC implementation than Java or C#.

In the real world it's not nearly as fast as C, but in a micro-benchmark like this with a bunch of loops crunching fixed-size integers it would come close.

The difference in performance between the languages is almost entirely due to differences in the implementation of the benchmark.

Any chance you could also benchmark the Nimrod programming language?

I think Nimrod is a very good contender, and it's sad to never see it benchmarked when Go, D and Rust are.

I will gladly help you write a benchmark if you wish to pursue this.

Which C compiler would you use as Nimrod lacks a proper compiler?

GCC most likely.

Is it possible to use LLVM? It generally produces faster code for me than GCC, although that may be because my LLVM version is slightly newer.

Then it would be testing how GCC compiles C code and not Nimrod, right?

I can see what you are trying to say, but it would still be testing the speed of Nimrod; whether the Nimrod compiler compiles Nimrod code to C or to assembly is irrelevant. The fact is that Nimrod is not C, and we are interested in the speed that Nimrod can achieve.
