
What can Rust do for astrophysics? - privong
https://arxiv.org/abs/1702.02951
======
Devid2014
[https://github.com/marblestation/benchmark-leapfrog](https://github.com/marblestation/benchmark-leapfrog)

These are not valid benchmarks at all. What is the point of running an N-body
simulation with all particles set to 0.0?

Where is the result of the simulation?

Why is it not validated at all?

To make this look like a real benchmark, one would need to use the same
starting conditions and then validate that the result is the same for all
languages.

It would be great to have C++ code using the Eigen library, for example, to
show the difference.

Also, this is a naive O(N^2) simulation that could be sped up using the Fast
Multipole Method (FMM), but of course that would be much more complicated to
do in Rust or Go than in C++.

~~~
thebooktocome
I wouldn't bet that FMM is more complicated to implement in Rust than in C++.

~~~
Devid2014
For me, Rust has a lot of potential for writing more secure code in places
where that is needed, but astrophysics may not be one of them.

I'm not sure about the language itself; my knowledge of Rust is limited. But
it is at least harder to get started, because we already have optimized
libraries that can do this in C++.

Does anyone know of such a library for Rust? It would be great to look at it.

------
ktta
It's crazy for me to see that C is much slower than Rust. I'm almost sure
that there's something wrong there.

In table 1:

    
    
       Rust        Fortran     C           Go
       0m13.660s   0m14.640s   2m32.910s   4m26.240s
    

They did have a note that C could be faster if language specific features
could be used.

Is Rust really that performant compared to C? Or did they not even bother
checking C's performance?

~~~
marblestation
We do not claim that Rust is that much more performant than C; we just showed
that Rust can be as fast as Fortran or C. And indeed, there must be something
wrong with the C implementation. Pull requests with improvements are welcome:

[https://github.com/marblestation/benchmark-leapfrog](https://github.com/marblestation/benchmark-leapfrog)

~~~
CJefferson
No, there is something wrong with the Rust version. I added
println!("{:?}",x); to the end, and now I get 134 second runtime.

Also the output is [[NaN, NaN, NaN], [NaN, NaN, NaN]], which is a bit
worrying...

~~~
one-more-minute
All the particles in the simulation start at (0, 0, 0), so gravity is
infinite and the whole computation is operating on `NaN`s.

Not sure how much this affects the benchmark, but it'd probably be smart to
randomise the starting positions.

~~~
FeepingCreature
Operating on NaNs is hugely slower than on normal floating-point numbers.
You're hitting a slow path in the processor's floating-point logic.

------
dikaiosune
A little less than a year ago I worked on a bioinformatics project in Rust,
and I think it came out very well. My perception was that bio is a field
where most practitioners rely on high-speed code written by someone else that
can be called from their Perl or Python code. I see something very empowering
in Rust, where people can write high-performance analyses without having to
shoehorn their workload into an existing toolkit or framework.

That said, while I'm really excited to see Rust go more places, astrophysics
seems like one of the very few fields where practitioners learn a
high-performance language as a matter of course, so Rust may have a less
empowering effect there. I would wager that after paying the upfront cost to
learn it, students and researchers may see improved iteration time from fewer
bugs. But Rust seems to offer an incremental improvement here, rather than
the transformative potential it has in other computationally intensive
fields.

------
dottrap
Is the code in gravity_calculate_acceleration correct? I think 'j' will
always be 0 in the inner loop, which seems to defeat the purpose of having a
j variable.

    
    
        void gravity_calculate_acceleration(int n_particles, double m[], double x[][3], double a[][3]) {
            double G = 6.6742367e-11; // m^3.kg^-1.s^-2
            for (int i=0; i<n_particles; i++){
        		a[i][0] = 0;
        		a[i][1] = 0;
        		a[i][2] = 0;
                for (int j=0; i<n_particles; i++){
                    if (j == i) {
                        continue;
                    }
                    double dx = x[i][0] - x[j][0];
                    double dy = x[i][0] - x[j][0];
                    double dz = x[i][0] - x[j][0];
                    double r = sqrt(dx*dx + dy*dy + dz*dz);
                    double prefact = -G/(r*r*r) * m[j];
                    a[i][0] += prefact * dx;
                    a[i][1] += prefact * dy;
                    a[i][2] += prefact * dz;
                }
            }
        }

~~~
michaf
This seems to have been fixed 3 hours ago:
[https://github.com/marblestation/benchmark-leapfrog/commit/c...](https://github.com/marblestation/benchmark-leapfrog/commit/c51d95dad0a83d6c3cb5f2868896da3bfa46e158)

~~~
nhatcher
Yes, but m is zero anyway.

~~~
michaf
You are correct. It seems they really should have double-checked that their
implementation produces meaningful results before running benchmarks on it...

------
srwalker101
For those who are looking at doing data analysis with Rust in astrophysics, I
have a crate, fitsio
([https://crates.io/crates/fitsio](https://crates.io/crates/fitsio)), which
wraps cfitsio for Rust, allowing the reading and writing of .fits files.

~~~
Avshalom
Nice. I mean, I didn't want to say it, but when I saw the headline my first
thought was "basically nothing if you don't have a FITS library".

------
ptero
Just my 2c -- it is a potentially interesting article, but two things have to
be fixed:

1. Compile the C version with better optimization flags, or ask an expert to
tweak it for faster performance.

2. Include a sample run on some known / meaningful data to test that the
software runs correctly (not only fast).

IMO rust doesn't have to be faster than C to make a valuable article. If it is
as fast or almost as fast it is still a good data point.

~~~
tom_mellior
> Compile C version with better optimization flags or ask an expert to tweak
> it for faster performance.

Part of the point they are trying to make is that you _don't_ need to be an
expert to reap performance benefits in certain languages. They explicitly
state that they did not do things that C experts would know how to do. This is
pretty reasonable in the context of astrophysics. The whole idea of the thing
is that if there is a language that allows you to write faster code than C
without having to "ask an expert to tweak it", you will be more productive
using that language.

That said, your point 2 is entirely valid; the paper fails to demonstrate
pretty much all of its claims, and it never makes a case for needing Rust's
safe dynamic memory allocation features in a program that doesn't use dynamic
allocation at all.

For whatever it's worth, the astrophysicists I most recently talked to wrote
their simulations in Python (with SciPy or similar) or even something called
IDL:
[https://en.wikipedia.org/wiki/IDL_(programming_language)](https://en.wikipedia.org/wiki/IDL_\(programming_language\))
and they also tend to have mad Bash skills for gluing different
data-processing programs together.

~~~
ptero
You make a good criticism of my #1. I should drop the "ask the expert" part,
but I still maintain that checking compile flags is a reasonable request when
writing numerical computation software. Leave the code as is, but still try
-O3 / -Ofast.

At that point it is not about language expertise, but about knowing your
tools.

~~~
tom_mellior
Agreed, mostly :-) The paper does state that they used -O3.

------
ced
As someone with no Rust experience...

 _a) access to invalid memory regions, b) dangling pointers and attempts to
free already freed memory, c) memory leaks and, d) race conditions._

The first three benefits also come with essentially any GC'ed language. d) is
interesting. What are the costs that come with such a guarantee? Presumably it
prohibits certain kinds of parallelism? Are the ownership/borrowing mechanics
interesting programming tools, or are they a hindrance?

People talk a lot about static typing, and I can see the benefits for
critical/user-facing applications, but not for numerical code. The nightmare
with numerical code is finding out that I forgot a minus sign somewhere, and
that it invalidates my last 6 months of published results.

~~~
eridius
> _Presumably it prohibits certain kinds of parallelism?_

Actually, it enables parallelism you couldn't do before, because the compiler
will prove that what you're doing is safe.

------
acqq
Where are the links to the sources, and is there a chance to compare the
versions (Fortran vs. Rust doing the same calculations)? I wasn't able to
find them.

Where is the discussion of library availability? Have the libraries for
Fortran been maintained and improved over the decades?

~~~
marblestation
You can find the code for the simple N-Body implemented in different languages
here (the more advanced N-Body version with tides has not been released yet):

[https://github.com/marblestation/benchmark-leapfrog](https://github.com/marblestation/benchmark-leapfrog)

And indeed, Fortran has decades of advantage in terms of libraries.

~~~
acqq
It's 64 lines of Rust, hardly enough to demonstrate anything for real use,
and it's suspiciously badly written, if I understand correctly:

The C code:

    
    
        void integrator_leapfrog_part1(int n_particles,
              double x[][3], double v[][3],
              double half_time_step){
    	for (int i=0;i<n_particles;i++){
    		x[i][0]  += half_time_step * v[i][0];
    
        ...
        int main(int argc, char* argv[]) {
            const int n_particles = 2;
        ...
            while(time <= time_limit) {
               integrator_leapfrog_part1(n_particles,
                  x, v, half_time_step); 
    

The Rust code:

    
    
        const N_PARTICLES: usize = 2;
        ...
        fn main() {
        ...
            while time <= time_limit {
            integrator_leapfrog_part1(N_PARTICLES,
                   &mut x, &v, half_time_step); 
        ...
        fn integrator_leapfrog_part1(n_particles: usize,
                     x: &mut [[f64; 3]; N_PARTICLES], 
                     v: &[[f64; 3]; N_PARTICLES],
                     half_time_step: f64) {
            for i in 0..n_particles {
                x[i][0]  += half_time_step * v[i][0]; 
    
    

The way I understand it, with the C code the compiler doesn't know the size
of the array while compiling the function, whereas with the Rust code it
does, since it is explicitly written? I refer to the difference between the
capital-letter _constant_ and the plain _variable_. What would happen if the
C compiler knew that much too, that is, if N_PARTICLES were present in all
the declarations? I can also imagine that just adding the proper compiler and
linker options for C, not used in the makefile, could give the benefit of
that constant propagation in this particular case; I mean, what happens when
"-flto" is added?

The N-Body on this site, with other implementations, has clearly faster C than
Rust:

[https://benchmarksgame.alioth.debian.org/u64q/performance.ph...](https://benchmarksgame.alioth.debian.org/u64q/performance.php?test=nbody)

~~~
kibwen
_> The N-Body on this site_

As noted elsewhere in this thread, the C implementation is using SSE, whereas
the Rust implementation isn't. It's just as unfair a comparison as the one
you're describing. :P

~~~
acqq
The programs on the site follow the rules defined there, and the rules
include verification of the results:

[https://benchmarksgame.alioth.debian.org/why-measure-toy-ben...](https://benchmarksgame.alioth.debian.org/why-measure-toy-benchmark-programs.html)

The OP doesn't even specify the rules for its own benchmark, and as far as I
understand it doesn't verify the results? People here are getting NaNs?

If NaNs are produced as results, then the OP's code is not measuring the
speed of the calculations at all, but the speed of failed calculations. This
is especially problematic because the main argument is that Rust is
"attractive for the scientific community" since "it guarantees memory
safety," which is presented as good because not having it "can produce random
behaviors and affect the scientific interpretation of the results." If the
result here is NaN, the calculation doesn't even have to be performed after
the first NaN that affects the result appears, since anything + NaN is NaN,
etc.

Edit: kibwen, did I write somewhere that I "refute" something? I gave a link
to the rationale behind the "benchmarksgame" site. The rest is about the OP,
surely not about your post.

~~~
kibwen
_> The programs on the site fit the rules defined on it, and the rules include
the verification of the results_

I'm not sure what you're refuting?

Verifying that the results match is a necessary but insufficient quality for
ensuring comparability. If the algorithm could be the same in each language
but isn't (e.g. quicksort vs. bogosort), then that's not a valid comparison if
your objective is to determine the overhead imposed by the language
implementation itself (and if you're not trying to determine language
implementation overhead, then what are you measuring?). Likewise if the
implementation details could be the same in each language but aren't (e.g. if
one uses 64-bit integers and the other uses 32-bit integers).

The computer language benchmarks game was initially conceived to determine a
ballpark for how slow interpreted and managed languages are compared to C.
Quantifying the overhead of interpreters and runtimes is its raison d'être,
and it shows. When it comes to comparing low-level systems languages that have
no runtime to speak of, the best it can do is attempt to quantify the quality
of each backend's code generator (it's a missed opportunity that it doesn't
include Clang for comparison with GCC).

(And yes, I understand that the benchmarks game contains repeated massive
disclaimers that people should not take the performance results as a means of
serious comparison. Internet commentators remain undeterred.)

If you're just trying to argue that the methodology used in the OP is poor,
then obviously we're in agreement (was there ever any doubt?).

~~~
igouy
> _The computer language benchmarks game was initially conceived to determine
> a ballpark for how slow interpreted and managed languages are compared to
> C._

Today the benchmarks game is referred to from the Rust FAQ. What is one to do?

[https://www.rust-lang.org/en-US/faq.html#performance](https://www.rust-lang.org/en-US/faq.html#performance)

Back in the previous millennium -- _"[Doug Bagley's] goal was to compare all
the major scripting languages. Then [Doug Bagley] started adding in some
compiled languages for comparison…"_

[http://web.archive.org/web/20010125021400/http://www.bagley....](http://web.archive.org/web/20010125021400/http://www.bagley.org/~doug/shootout/#News)

------
J0-nas
I don't know the astrophysics domain at all but shouldn't there be more
arguments for Go?

* Safer than C

* Almost as fast as C

* Good concurrency support

* High productivity due to simple abstractions and good tooling

Is the downside of a GC language really relevant? Does astrophysics suffer
from "stop the world" pauses, or is it just about the performance? The Go GC
is already really fast.

~~~
claudius
> * Almost as fast as C

That's the key point. Judging by the (wrong) benchmark above, where both C
and Go at least seem to do some work, Go is about half as fast as C.

A grad student (if paid properly) costs maybe €60k/year. In comparison, we
regularly spend more than €250k on new and faster computers, not including
maintenance, the electricity bill, etc. If you can make software even 50%
faster by increasing development time, it is nearly always worth it in an
academic setting. Note that the benchmark implementation shown here is a very
simple example; normally, computing jobs take days to weeks (multiplied by
however many cores you can sensibly use) of CPU time.

Decreasing runtime also increases productivity a lot, since the turnaround
time becomes shorter; waiting 3 versus 6 weeks for results is a noticeable
difference.

Additionally, these tools rarely have safety concerns: In my own code, there
is no "untrusted user input". There is correct user input (good) and incorrect
user input (mostly it will crash, with common mistakes it will try to notify
the user). Safety is handled by the operating system (such that you can’t
overwrite someone else’s data, for example).

This means that the only advantage Go could have over C/C++ in academia is the
"good concurrency support". However, concurrency in HPC is usually handled via
OpenMP (shared memory) or MPI (distributed memory) parallelisation. This takes
a while to get used to, but is very different (and in a sense much easier)
than e.g. the typical case for a web server, where you wish to serve as many
users as possible at the same time using as little CPU time as possible – a
busily waiting loop is horrible in the latter case but perfectly fine in the
former (under some circumstances).

So overall, languages are not interesting to the academic HPC crowd if they
sacrifice speed for _anything_, which, incidentally, makes Rust interesting
as well, because that is precisely not the case (apparently, under some
circumstances, etc. pp.).

~~~
J0-nas
I'd agree but

> Judging by the (wrong) benchmark above where both C and Go seem to do some
> work, Go is about half as fast as C.

I think this is what is most flawed in the paper. For (some) concurrent
problems Go is about as fast as C [1].

But then again, I don't know what kind of computing requirements they have. Is
there a reason why GRP computing isn't mentioned? They are really efficient
and CUDA isn't that hard to learn.

> Rust allows the user to avoid common mistakes such as the access to invalid
> memory regions and race conditions

They seem to care about safety.

[1]
[https://benchmarksgame.alioth.debian.org/u64q/compare.php?la...](https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=go&lang2=gcc)

~~~
claudius
> I think this is what is most flawed in the paper. For (some) concurrent
> problems Go is about as fast as C [1].

Certainly, this was really only meant as a ballpark estimate of the
performance difference. Looking at your link, it seems I slightly
overestimated the difference, though the general point still stands: C is (in
most cases) faster, and there is (nearly) no case in which Go beats C.

> Is there a reason why GRP computing isn't mentioned?

> They are really efficient and CUDA isn't that hard to learn.

Sorry, I'm not sure if you mean GPU rather than GRP in the first sentence.
CUDA helps to some degree, but not always, e.g. when you are
memory-bandwidth bound (common in tensor networks in physics). I have no
experience with Monte Carlo methods and can't comment on whether they
substantially benefit from CUDA; I know of at least one (Quantum Monte Carlo)
code that runs much faster on a standard Xeon than on the Xeon Phi, though.

> They seem to care about safety.

Yes, of course safety is nice to have, and I'll gladly learn Rust when I have
some free time, to get that safety at hopefully zero cost. But sacrificing
performance is simply not competitive if you can throw someone with gdb at
the problem and get essentially the same "safety".

~~~
bluejekyll
> if you can throw someone with gdb at the problem and get essentially the
> same "safety".

That's an odd tradeoff. The time spent debugging could be significantly
longer than just writing the software in a safe language in the first place,
and in many scenarios bugs can lead to very bad things you can't recover
from. I've experienced all of these, having to fix them over long periods of
time (not all my code, but sadly some was): data loss, concurrency bugs
(tough to debug in gdb), major memory leaks (even from std libs), data
corrupted by misused non-null-terminated C strings, and array out-of-bounds
issues. Each one of these took weeks to track down -- maybe because I'm not
smart, which is a valid criticism -- but with Rust I've never had issues with
any of those (2 years and running), and given that I'm not smart, it helps me
by telling me where I got something wrong.

Be safe out there people...

