
Go vs. Node vs. Rust vs. Swift on Ubuntu Linux 15.10 - grigio
https://grigio.org/go-vs-node-vs-rust-vs-swift/
======
pcwalton
The "use cases and opinions" section is pretty nice, and I agree with it.

I would expect Rust and Swift to have the same execution performance on
Fibonacci. In fact, I'd expect them to generate essentially identical LLVM IR.
The fact that they differ in performance leads me to believe it's an "LLVM
optimization didn't trigger because of a bug" sort of issue, perhaps in the
optimization pass that converts recursive functions to loops.
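
Concretely, the function under test is presumably the usual naive
doubly-recursive version, something like this Swift sketch (not necessarily
the repo's exact code):

    func fib(_ n: Int) -> Int {
        // Naive doubly-recursive form: exponential time, dominated by call
        // overhead rather than by the additions themselves.
        if n < 2 { return n }
        return fib(n - 1) + fib(n - 2)
    }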

I have to say, though: Beware of Fibonacci as a benchmark. I believe it's
vulnerable to a compiler that optimizes it to the closed form solution [1]. I
don't think compilers do this optimization today, but if you popularize a
Fibonacci benchmark you will create market pressure for them to implement it
and ruin your results. :)

[1]: https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression
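
For reference, the closed form in question (Binet's formula) is

    F_n = \frac{\varphi^n - \psi^n}{\sqrt{5}}, \qquad
    \varphi = \frac{1 + \sqrt{5}}{2}, \quad \psi = \frac{1 - \sqrt{5}}{2}

so a sufficiently clever compiler could, in principle, replace the entire
recursion with a constant number of arithmetic operations.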

~~~
jerf
"Beware of Fibonacci as a benchmark."

It seems to me one of the dangers of its use is that people tend to think it
measures "math speed" or "execution speed" or something. But the fib
calculation itself is just small integer additions, operations that take
about one cycle in any compiled language and that are also what I've called
"easy mode" for a JIT before. Since the calculation under test is so tiny
compared to the overhead, the benchmark is _really_ measuring the overhead,
and due to the simplicity of the test it is very vulnerable to
special-purpose optimizations skewing the results away from what you'd see in
real code.

This isn't "useless", because that overhead can matter. But unless you have a
program that is going to spend an irreducible and large amount of its time
making static function calls that do tiny amounts of work relative to the
call overhead, this information is only a quite tiny part of understanding
the performance characteristics of a language. If you have optimized even a
half-decently compiled language's program down to where function call
overhead is your biggest problem, you are to be congratulated on some fine
engineering, and I expect you'll know what to do next if you need more speed,
since you've already demonstrated significant competence.

If you want to have a bit more fun, benchmark the local equivalent of
"dynamic" dispatch. That'll spread the languages out quite a bit more. For
fun, toss in Python or something too.
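
For illustration, a minimal sketch of that kind of microbenchmark in Swift
(the names and iteration count are made up, and an optimizer may devirtualize
the "dynamic" call entirely, which is exactly the sort of special-purpose
optimization mentioned above):

    import Foundation

    protocol Op {
        func apply(_ x: Int) -> Int
    }

    struct Increment: Op {
        func apply(_ x: Int) -> Int { return x &+ 1 }
    }

    // Time a closure and print the elapsed seconds plus the result,
    // so the work can't be optimized away entirely.
    func time(_ label: String, _ body: () -> Int) {
        let start = Date()
        let result = body()
        print(label, Date().timeIntervalSince(start), result)
    }

    let iterations = 10_000_000

    // Static dispatch: the concrete type is known at the call site.
    let direct = Increment()
    time("static ") {
        var acc = 0
        for _ in 0..<iterations { acc = direct.apply(acc) }
        return acc
    }

    // Dynamic dispatch: the call goes through the protocol witness table.
    let boxed: Op = Increment()
    time("dynamic") {
        var acc = 0
        for _ in 0..<iterations { acc = boxed.apply(acc) }
        return acc
    }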

------
Recurecur
That's quite a small sampling of microbenchmarks, but still interesting.

One thing you failed to note is that only Rust and Swift aren't garbage
collected. That means that only Rust and Swift should be considered for
applications where deterministic performance is required: in other words,
soft or hard real time.

Many games, for instance, have soft real time requirements.

~~~
vardump
I believe Swift does reference counting (ARC). Isn't that a form of garbage
collection? With reference counting, the object lifetime management cost is
just spread more finely over time; with GC, the cost is paid in periodic
"time chunks".

Other than that, nothing prevents one from writing the performance-sensitive
part in C/C++/Rust/etc. and then using whatever other language for the
non-performance-sensitive portion, which typically represents the bulk of the
code.

~~~
Recurecur
Reference counting is a form of GC, the difference being that the programmer
has full control over when it runs. Most real time systems preallocate
everything and never deallocate.

One typical problem with GC systems is that common patterns and library code
cause allocations, which cause the GC to run, consuming cycles and causing
jitter even if nothing is freed.

(Note that even calling malloc() is non-deterministic and best avoided in
real-time code.)

~~~
kazinator
You certainly do not have "full control" over when reference counting runs.
You have some _illusion_ of control at best.

The fact is, in a reference counting scheme, functions have to call
pointer->drop() or whatever (maybe dropping a reference implicitly, thanks to
smart pointers or refcounting being built into the language) all over the
place, without any idea whether that action decrements 15 to 14, or whether
it decrements 1 to 0, triggering cleanup and additional decrements on objects
behind that object.

You only have control in certain well-defined circumstances, like when you
have the entire lifetime of an object (or object aggregate) in the same scope,
or a closely related set of scopes. In other words, in situations where you
could quite probably do manual memory management (and effectively _are_ doing
that, just through the refcounting calls that you are still forced to use).
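
To make the "additional decrements on objects behind that object" point
concrete, here is a small Swift sketch (hypothetical types): dropping a single
reference to the head of a chain releases the entire chain.

    final class Node {
        let value: Int
        let next: Node?
        init(_ value: Int, _ next: Node?) {
            self.value = value
            self.next = next
        }
    }

    // Build a 10,000-node chain; each node holds the only reference
    // to the next one.
    var head: Node? = (0..<10_000).reversed().reduce(nil as Node?) { Node($1, $0) }

    // One innocent-looking decrement: head's count goes from 1 to 0, which
    // recursively releases every node behind it. The cost of this single
    // assignment is proportional to the length of the chain.
    head = nil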

~~~
Recurecur
"You certainly do not have "full control" over when reference counting runs.
You have some illusion of control at best."

If I use the factory pattern with pre-allocated object pools, and keep
references to the objects in the factory, I have absolute certainty that ARC
will never reclaim the objects. Used properly, ARC will contribute negligible
overhead at runtime.
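
A sketch of that pattern in Swift (hypothetical names): the factory's strong
references pin every object's refcount above zero, so ARC never deallocates
anything on the hot path.

    final class Bullet {
        var x = 0.0, y = 0.0
        func reset() { x = 0; y = 0 }
    }

    final class BulletFactory {
        // Strong references keep every Bullet alive for the factory's lifetime.
        private let pool: [Bullet]
        private var free: [Bullet] = []

        init(capacity: Int) {
            pool = (0..<capacity).map { _ in Bullet() }  // all allocation up front
            free.reserveCapacity(capacity)
            free.append(contentsOf: pool)
        }

        func acquire() -> Bullet? { return free.popLast() }

        func release(_ b: Bullet) {
            b.reset()
            free.append(b)  // stays within reserved capacity; no new allocation
        }
    }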

"You only have control in a certain well-defined circumstances, like when you
have the entire lifetime of an object (or object aggregate) in the same scope,
or a closely related set of scopes. In other words, in situations where you
could quite probably do manual memory management (and effectively are doing
that, just through the refcounting calls that you are still forced to use)."

Wrong across the board.

~~~
vardump
> If I use the factory pattern with pre-allocated object pools, and keep
> references to the objects in the factory, I have absolute certainty that ARC
> will never reclaim the objects. Used properly, ARC will contribute
> negligible overhead at runtime.

Weird you'd say that; _exactly_ the same applies to any garbage collected
language.

~~~
Recurecur
No, the GC still runs periodically, polluting the cache, consuming cycles,
and halting the rest of the program until it's done. The only exception is if
you completely avoid allocations, which is generally next to impossible in
GC-based languages.

With ARC, there is exactly zero overhead unless a new reference is created or
destroyed. In that case there's an integer add or subtract, not a global "stop
the world" kind of event.

------
sudorandom

        Golang: to enable all the cores, you have to put in your code
        runtime.GOMAXPROCS(num of cores you want to use)

This is no longer true. With Go 1.5 and higher, GOMAXPROCS defaults to the
number of CPUs available.

~~~
grigio
Yeah, I've read somewhere that it should be the default behavior in Go 1.5,
but currently it isn't, at least not on my system.

If I don't specify it, all the computation uses just one core.

~~~
sudorandom
This hasn't been my experience... You can also set that value using an
environment variable (GOMAXPROCS). You might want to make sure that isn't set
in your environment. Otherwise, you might be experiencing a bug that you
should probably report.

Reference: [https://golang.org/doc/go1.5](https://golang.org/doc/go1.5)
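
A quick sanity check from the shell (illustrative; fib.go stands in for
whatever benchmark file you're running):

    $ echo $GOMAXPROCS    # prints nothing unless it's been set explicitly
    $ unset GOMAXPROCS    # clear any override so the Go 1.5+ default applies
    $ go run fib.go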

------
grigio
Here is the source; all the tests look quite similar:
https://github.com/grigio/bench-go-rust-swift

