
Why Rust fails hard at scientific computing - alex_hirner
https://andre-ratsimbazafy.com/why-rust-fails-hard-at-scientific-computing/
======
mintplant
> Consequences ? You can’t use Rust arrays to represent a matrix bigger than
> 4×8, how useful is that?

I went and Google'd "rust linear algebra library" and immediately found
[https://nalgebra.org](https://nalgebra.org) which can represent such matrices
just fine, and even integrates with LAPACK. Did the author do this most
cursory of research before springing to write their rant? This is like saying
"Python fails hard at scientific computing" because the default list type
isn't an efficient basis for representing matrices, while entirely ignoring
NumPy.

Also, those are _arrays_, not slices: they aren't pointers to heap-allocated
data. So [[f32; 9]; 5] corresponds to a contiguous block of memory
representing a 5x9 matrix of 32-bit floats. It has the same memory structure
as [f32; 45], but you can index it as e.g. my_matrix[3][2] to access the
element at row 3, column 2 (zero-indexed). Again, I guess it's much easier to
bash something than to try to understand it.
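The layout claim is easy to check; a small sketch (the pointer cast exists only to demonstrate the equivalence, and assumes no padding, which holds for f32):

```rust
// A quick check of the claim above: [[f32; 9]; 5] is one contiguous,
// stack-allocated block with the same layout as [f32; 45], in row-major
// order, so element (r, c) lives at flat index r * 9 + c.
fn flatten_index(row: usize, col: usize) -> usize {
    row * 9 + col
}

fn main() {
    let mut m: [[f32; 9]; 5] = [[0.0; 9]; 5];
    m[3][2] = 1.5; // row 3, column 2, zero-indexed

    // Reinterpret the nested array as a flat one; the layouts match,
    // so this pointer cast is sound for f32 (same size, same alignment).
    let flat: &[f32; 45] = unsafe { &*(m.as_ptr() as *const [f32; 45]) };
    assert_eq!(flat[flatten_index(3, 2)], 1.5);
}
```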

~~~
coldtea
> _I went and Google'd "rust linear algebra library" and immediately found
> [https://nalgebra.org](https://nalgebra.org) which can represent such
> matrices just fine, and even integrates with LAPACK. Did the author do this
> most cursory of research before springing to write their rant? This is like
> saying "Python fails hard at scientific computing" because the default list
> type isn't an efficient basis for representing matrices, while entirely
> ignoring NumPy._

The author didn't want to use/leverage/whatever some ready made package. He
wanted to write his own.

Python does suck for scientific computing at that level as well -- Numpy and
Scipy are huge wrappers for C/Fortran etc code.

~~~
jbooth
If the author wanted to write his own, he could very easily write an nd-array-
alike wrapping a really long double[] in Rust, right?

From the author: _You can work around it by using a Vec (arbitrary sized
sequence /list) but then your matrix is allocated on the heap not the stack,
meaning slower operations. Plus that means you cannot use Rust wonderful type
system to check that you multiply matrices with compatible dimensions, say a
2×2 matrix with a 2×1 matrix, without jumping through hoops._

So he thought of that, then decided not to go that way, even though that's how
it's done in C/Fortran underneath every single linear algebra library in the
world, and his conclusion is that Rust sucks?

I'm finding it a little hard to believe that there are real-world linear
algebra problems where allocation is the bottleneck and doing it all on the
stack is the answer.
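For reference, such a wrapper is only a few lines. A minimal sketch with runtime dimension checks (names like `Matrix`, `get`, `set` are illustrative, not from any crate):

```rust
// A matrix type wrapping one long heap-allocated Vec<f64>, with
// row-major indexing, as suggested above.
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    pub fn new(rows: usize, cols: usize) -> Matrix {
        Matrix { rows, cols, data: vec![0.0; rows * cols] }
    }

    pub fn get(&self, r: usize, c: usize) -> f64 {
        self.data[r * self.cols + c]
    }

    pub fn set(&mut self, r: usize, c: usize, v: f64) {
        self.data[r * self.cols + c] = v;
    }

    // Naive multiply; dimension compatibility is checked at run time,
    // which is exactly the trade-off the article complains about.
    pub fn mul(&self, other: &Matrix) -> Matrix {
        assert_eq!(self.cols, other.rows, "incompatible dimensions");
        let mut out = Matrix::new(self.rows, other.cols);
        for i in 0..self.rows {
            for j in 0..other.cols {
                let mut acc = 0.0;
                for k in 0..self.cols {
                    acc += self.get(i, k) * other.get(k, j);
                }
                out.set(i, j, acc);
            }
        }
        out
    }
}

fn main() {
    let mut a = Matrix::new(2, 2);
    a.set(0, 0, 1.0);
    a.set(1, 1, 1.0); // identity
    let mut b = Matrix::new(2, 1);
    b.set(0, 0, 3.0);
    b.set(1, 0, 4.0);
    let c = a.mul(&b);
    assert_eq!((c.get(0, 0), c.get(1, 0)), (3.0, 4.0));
}
```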

~~~
Typhon
If you have to do X in the new language in the same way as in Fortran, the new
language is not better for doing X than Fortran, is it ?

~~~
jbooth
Sure it is. Both in larger program design and in affording several different
ways to accomplish the subscripting into the giant nd-array Vec<f64>. Maybe
Rust should also provide language support for the ndarray type, but the lack
of it isn't a barrier to being as good as or better than existing languages.

If you're doing linear algebra on big matrices, putting them on the stack is
most likely madness. It sounds like the author is interested in Rust for an
'expressive' type system that enforces logical invariants about his matrices,
which isn't what Rust's expressive type system was built for.

------
fermienrico
What happened to the readability of code? This looks dreadful. I am sure there
is a way to come up with a better syntax without sacrificing the type safety
aspects.

    
    
      fn main() {
          let my_str: Rc<RefCell<Box<MyStruct1>>> =
              Rc::new(RefCell::new(Box::new(MyStruct1)));
      
          my_trait_fn(Rc::new(RefCell::new(Box::new((**my_str.borrow()).clone()))));
          my_str.borrow().my_fn();
      }
    

I come from a Python/Java world, and I remember that when I first delved into
embedded MCU programming in C/C++, it was a cognitive overload and a half. I
think we need to take a step back: while I understand that, unlike high-level
languages (Java/Python), C/C++ carry the burden of allowing for low-level
programming, the readability of code often gets the lowest attention.

If we humans spend so much time writing and parsing code through our visual
system, programming languages need to take this aspect seriously and optimize
their syntax for readability over expressivity.

I recall Kernighan's "The Elements of Programming Style" where he talks about
making the code more understandable than "clever". The syntax of a programming
language directly facilitates (or impedes) this idea.

Edit - Everyone seems to keep pointing out that this is a contrived example.
Yes, I agree, but doesn't the syntax still make even a contrived example more
difficult to read than it needs to be?

~~~
gaius
_I am sure there is a way to come up with a better syntax without sacrificing
the type safety aspects_

Well, this is the generic curly-braces-and-semicolons style. If you want to
see nice clean yet type-safe syntax, just take a look at OCaml or Haskell.

~~~
weberc2
What is meant by "type-safe syntax"? A syntax that permits type annotations? I
will say I find Haskell and especially OCaml hard to read; I'm writing a
pretty minimal language with a Haskell-esque syntax, but that's because it's
easy to build a parser, not because I find it more readable. And the syntax
will likely change when readability becomes more important than prototyping.

~~~
evincarofautumn
I think the intent was “a _language_ with nice clean syntax that is
nonetheless type-safe”.

A large portion of readability comes from familiarity. I find most Haskell
code fairly easy to read—or at least, approximately as simple or complex
syntactically as it is semantically. Other mainstream languages tend to be
either much more complex syntactically than what they do semantically
(“verbose”) or much simpler (“magic”).

~~~
weberc2
That makes sense, and I agree with your assessment of readability.

------
evmar
I wish the HN community would collectively decide to downvote posts like
these, which reward people who use clickbaity titles to get attention at the
cost of hiding more careful analyses.

The post can be summarized as a critique of a programming language that did
not meet the author's desires, but the sections of the critique are (1)
dislike of the syntax, (2) a specific missing feature, and (3) a very specific
feature which doesn't exist in any of the popular scientific computing
languages. It's fine to talk about not liking a tool, but it doesn't merit
"fails hard".

~~~
dnautics
(3) very much exists in Julia

------
fny
Aside from all the discussion around these low-level complaints, the Rust
community is well aware of the lack of higher-level scientific computing
libraries, which IMO is the more relevant reason not to use Rust today:
[http://www.arewelearningyet.com/scientific-computing/](http://www.arewelearningyet.com/scientific-computing/)

Hell, Rust doesn't even have stable bindings around BLAS/LAPACK/LINPACK and
friends which are the lifeblood of _every_ language capable of scientific
computing.

Give it time. Scientific computing was never the primary thrust for Rust, and
other more pertinent domains have yet to reach complete maturity.[0]

[0]: [http://www.arewewebyet.org/](http://www.arewewebyet.org/)

~~~
dnautics
It's also totally fine for Rust to not be as good for scientific computing. If
you try to be everything for everyone, you'll fail to be good at any one
thing. It's hard to imagine Rust ever being competitive at scientific
programming with, say, Julia. You might get, say, a 20% boost in
number-crunching performance at the expense of developer time.

Conversely, you'd be a damn fool to try and write Firefox in Julia.

------
Animats
Lack of multidimensional arrays as a language primitive is a problem. It was a
problem in C, C++, and Go, too. There, you either have arrays of arrays, or
some macro/template that does a multiply on the subscript. The reply in one of
the Rust forums is that multidimensional arrays are only found in "domain-
specific languages", like FORTRAN.

When someone brings this up the usual excuses are trotted out:

\- "It can be done with templates".

\- "You can have an array of arrays".

Then there's bikeshedding. Some people want to be able to extract an arbitrary
subarray slice from an N-dimensional array and treat that like a regular
array. This is possible but complicated, and if implemented slows down the
simple operations. Then there are people who want N-dimensional arrays where N
is determined at run time. (This now includes the machine learning people.)
Endless arguments between these groups resulted in nothing being done in Go. I
haven't followed that discussion for Rust.

The big advantage of doing this in the language is that there's a lot of
hardware support for doing number-crunching fast on dense arrays, and you want
the compiler to be using it effectively. Matrix multiply on arrays of arrays
is much slower than on a dense array where the compiler knows the spacing
between the rows at compile time.

At the least, a language should have arrays as good as FORTRAN's. This is why
some heavy number-crunching work is still done in FORTRAN.
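A toy sketch of the layout difference described above (illustrative only, not a benchmark):

```rust
// The two representations discussed above. With &[[f64; N]] the compiler
// knows rows are contiguous and disjoint and the stride N is a
// compile-time constant; with &[Vec<f64>] each row is a separate heap
// allocation reached through a pointer, which blocks those assumptions.
const N: usize = 4;

// Dense: one contiguous block, stride known at compile time.
fn row_sum_dense(m: &[[f64; N]], r: usize) -> f64 {
    m[r].iter().sum()
}

// Array of arrays: each inner Vec lives at an arbitrary heap address.
fn row_sum_nested(m: &[Vec<f64>], r: usize) -> f64 {
    m[r].iter().sum()
}

fn main() {
    let dense = [[1.0; N]; 3];
    let nested = vec![vec![1.0; N]; 3];
    assert_eq!(row_sum_dense(&dense, 2), row_sum_nested(&nested, 2));
}
```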

~~~
nine_k
Are there any more efficient ways to represent a dense multi-dimensional array
than a one-dimensional array + index multiplication? Is there any difference
between said multiplication done by some internal function inside the
compiler, or by a regular (and trivial) function generated by a macro /
template system?

~~~
Animats
Yes. And yes.

One dimensional array + index multiplication doesn't make it clear to the
compiler that the rows are disjoint. So the compiler can't assume there's no
aliasing. It also probably can't strength-reduce the multiply to an add when
advancing through an array in the "wrong" direction. FORTRAN compilers have
been doing subscripting optimizations like that since 1956.

After those optimizations, it may be possible to use some of the SSE
instructions on x86.

~~~
mamcx
I wonder how FORTRAN arrays are implemented. I was also assuming it was a big
single array + indexes...

If I want to add this to a language, what do I need to consider to be as good
as FORTRAN?

~~~
Animats
Look at the generated code for a matrix multiply in different languages. How
tight is the inner loop?

------
blinkingled
I am going through the Rust by Example thing every weekend and also trying to
write some Rust code on my own, and I found the 'too many symbols'/unreadable
code and just the general complexity of the language very off-putting.

If I have to write low level code, I might just stick with C or C++ - the
devils I know rather than trying to tame a new incarnation of the devil that
is Rust. The complexity and learning curve costs of Rust far outweigh the
benefits at this point.

Edit: If I sounded like I am giving up on Rust above, that's not yet the case.
I am currently enjoying challenging myself to learn something complex and to
that effect, Rust is holding my interest.

~~~
steveklabnik
Did you try the book? Just curious.

~~~
blinkingled
I started there, 1st edition IIRC - went up to refs and borrowing and then
thought maybe the by example approach is something I would like better - and I
do find the example approach keeps my interest better.

Although I suppose I should go look at the 2nd ed draft when I need to look
into more theory - reading the refs and borrowing section from 2nd ed right
now :)

~~~
steveklabnik
Ah yeah, the second edition is just straight-up better, and _does_ contain
examples. For example, ownership, references, and borrowing are theory-based
in 1st edition, but are with String/&str in the second. That said, both are
great resources. For some people, the book doesn't work for them but RBE does,
and for others, RBE doesn't but the book does.

~~~
pjmlp
I went through the books, yet it was a big challenge porting an old Gtkmm toy
application to Gtk-rs, because of lifetimes on callbacks.

Initially I was referring to the fields on the struct, but ended up passing
them around as parameters.

There is a reference to this kind of problem in the NLL proposal.

But unless I missed it, the books don't talk much about this.

~~~
steveklabnik
The book does, but it basically says "Don't do that", whereas since you're
binding to a library that does that, you don't have a ton of choices.

I haven't really used Gtk-rs enough to say more than that, really.

~~~
pjmlp
Basically the idea of having structs implementing widgets, which need to
respond to UI events and handle their state, stored as struct fields.

On modern C++ you can easily use this approach with
_std::enable_shared_from_this_.

------
spaceseaman
> then your matrix is allocated on the heap not the stack, meaning slower
> operations

This doesn't seem right at all... Does anyone know more? Maybe they meant slow
to allocate/deallocate? The author's problem appears to require generating a
lot of temporary data and then throwing it away. They might benefit from just
writing their own pool allocator, so they don't have to wait for heap
allocations (assuming that's their problem).

~~~
Jare
I imagine the following problems affecting performance outside of the
alloc/dealloc themselves:

\- Heap memory may not be reused as cleanly as stack memory, which means lower
cache efficiency. Your suggestion of a pool allocator should take care of
this.

\- Runtime code for bounds checking needs to take the array size from the
object even if it's constant. I recall there are `get_unchecked()` versions of
the accessors, which skip the bounds check, but the code is going to look
ugly.

\- Having to go through an indirection from the object to the memory buffer.
Here you are at the optimizer's mercy.

So to avoid overly complex / ugly / brittle code where you need high
performance, you are pretty much forced to write your own matrix types from
the ground up, which is one of the concerns of the OP. I don't believe this is
a terribly bad thing, and if the existing crates for this are not great yet,
it would seem to me that it's just not been a focus for the Rust community,
not an insurmountable problem.
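The bounds-check point can be sketched like this, using the slice `get_unchecked` accessor (whether the checked version actually costs anything here depends on the optimizer, which can often elide the check):

```rust
// Checked vs. unchecked slice indexing.
fn checked_sum(v: &[f64]) -> f64 {
    let mut s = 0.0;
    for i in 0..v.len() {
        s += v[i]; // bounds-checked at run time (often elided: i < len)
    }
    s
}

fn unchecked_sum(v: &[f64]) -> f64 {
    let mut s = 0.0;
    for i in 0..v.len() {
        // No bounds check; sound here because i is always in range,
        // but the unsafe block is the "ugly" part mentioned above.
        s += unsafe { *v.get_unchecked(i) };
    }
    s
}

fn main() {
    let v = vec![1.0, 2.0, 3.0];
    assert_eq!(checked_sum(&v), unchecked_sum(&v));
}
```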

------
wcrichton
Extensive discussion on /r/rust:
[https://www.reddit.com/r/rust/comments/76olo3/why_rust_fails...](https://www.reddit.com/r/rust/comments/76olo3/why_rust_fails_hard_at_scientific_computing/)

------
aldanor
Re: (2), this RFC was merged: [https://github.com/rust-lang/rfcs/pull/2133](https://github.com/rust-lang/rfcs/pull/2133)

~~~
weinzierl
To add a little context here:

> 2\. Arrays in Rust are a second-class citizens, actually I think they don’t
> even have their visas. I hear them laughing at me when I try to use them.
> You can’t even clone them:

RFC 2133[1] adds Clone to arrays (and tuples); it has been merged and
implemented[2].

> 3\. Rust is still “discussing” integer as generic type parameter (since
> 2015), meaning a matrix type Matrix[M, N, float] will not exist before a
> long long time. The following github discussions are quite the read:

It links to a bunch of stuff, but fails to recognize that this also has a
merged RFC[3], with the implementation actively being worked on[4].

[1] [https://github.com/rust-lang/rfcs/pull/2133](https://github.com/rust-lang/rfcs/pull/2133)

[2] [https://github.com/rust-lang/rust/pull/43690](https://github.com/rust-lang/rust/pull/43690)

[3] [https://github.com/rust-lang/rfcs/pull/2000](https://github.com/rust-lang/rfcs/pull/2000)

[4] [https://github.com/rust-lang/rust/issues/44580](https://github.com/rust-lang/rust/issues/44580)
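For what const generics eventually make expressible (this syntax postdates the thread, and the type is a sketch, not any particular crate's API): a `Matrix<M, N>` where incompatible dimensions are a compile-time error rather than a runtime panic.

```rust
// The Matrix[M, N, float] type the article asks for, expressible once
// const generics landed.
struct Matrix<const M: usize, const N: usize> {
    data: [[f64; N]; M],
}

impl<const M: usize, const N: usize> Matrix<M, N> {
    fn zero() -> Self {
        Matrix { data: [[0.0; N]; M] }
    }

    // Only an M x N matrix times an N x P matrix type-checks;
    // anything else is rejected by the compiler.
    fn mul<const P: usize>(&self, other: &Matrix<N, P>) -> Matrix<M, P> {
        let mut out = Matrix::<M, P>::zero();
        for i in 0..M {
            for j in 0..P {
                for k in 0..N {
                    out.data[i][j] += self.data[i][k] * other.data[k][j];
                }
            }
        }
        out
    }
}

fn main() {
    let a = Matrix::<2, 2>::zero();
    let b = Matrix::<2, 1>::zero();
    let c = a.mul(&b); // 2x2 times 2x1: fine
    assert_eq!(c.data.len(), 2);
    // b.mul(&a) would not compile: Matrix<2, 1> needs a 1-row rhs.
}
```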

~~~
tom_mellior
> fails to recognize

The article is dated April 1st, your links on const generics are from later.

~~~
weinzierl
Ah, sorry, I missed that.

------
microcolonel
> _I’m not even talking about Rc, RefCell and Box which seems like security
> through obscurity._

What do you mean?

------
gravypod
I don't know if Nim would be my first choice for a future scientific computing
language. I think I'd be more focused on Julia or, recently, Crystal.

Julia has some extreme magic that it can do to make your life easy. Crystal
will attract the Ruby-like crowd of people who will create very neat and
effective libraries.

~~~
dom96
I disagree (but then again I am biased). I think Nim has a chance of
attracting the Python-like crowd of people (plus Python has some very good
scientific libraries like pandas) and I would be willing to bet that Nim's
ability to perform magic surpasses that of Julia (or at least is equivalent).

Edit: Shameless plug. My book, Nim in Action, is 50% off today with code
HalloweenPB50 (From Manning). Perfect opportunity to grab a copy and learn Nim
:) [https://book.picheta.me/](https://book.picheta.me/)

~~~
dnautics
Does Nim have first class distributed computing primitives? For embarrassingly
parallel operations, I can deploy on a slurm provisioned cluster on a
supercomputer in about 5 lines of native Julia.

There's a lot of really nice "magic" in Julia. I implemented an experimental
numerical system (a binary representation of real numbers), and after defining
the +-x/ operations, I had complex numbers, Fourier transforms, and LU
decomposition for free.

With the macro system I then wrote a library that lets me generate Verilog
code, built a hardware implementation of the number system, and deployed
comprehensive tests cross-correlating the results of all operations across
native Julia, hardware simulation in Julia, and the Verilog version transpiled
through Verilator.

I wanted to play around with Mary Wootters' discoveries about Reed-Solomon
encoding, so getting working Reed-Solomon code over arbitrary Galois fields
was as simple as building the type rules and the basic +-x/ operations, and
then again using matrix solving to solve erasures (in this case you can tell
the type system there is no order on the type, so it doesn't do partial
pivoting). Then I needed to do some brute-force searching of a large parameter
space. The core program for this search is about 25 lines of code in the main
loop, which reads like simple math statements out of a well-written text, and
the backing Galois field and Reed-Solomon libraries are about 100 and 50
lines, respectively.

------
amelius
I suspect Rust also fails hard at interactive code (UIs).

Why? Because Rust's memory management doesn't work well with cyclic
structures, and closures tend to form such cycles. Closures have proven very
useful for interactive code (see JavaScript).

Someone please prove me wrong. (I'm wondering how Firefox deals with this.)

~~~
steveklabnik
Rust uses closures quite heavily.

The biggest problem with UIs is that they tend to be based on inheritance,
which Rust doesn't have. It's not insurmountable; there are people who have
written macros to make this kind of code more concise, or alternative UI
libraries that don't rely on inheritance.

~~~
amelius
> Rust uses closures quite heavily.

I suppose they only support read-only free variables then?

~~~
steveklabnik
Not exactly, no. There's a bit more ceremony since there isn't a GC, but
that's a systems language for you.

Closures are basically syntax sugar for a struct that represents the
environment, and a function pointer which takes that struct as one of its
parameters. So you can do anything that you can do with that setup, aka,
anything :)

[https://doc.rust-lang.org/book/second-edition/ch13-01-closures.html](https://doc.rust-lang.org/book/second-edition/ch13-01-closures.html)
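A concrete version of that desugaring: a closure mutably capturing its environment (so free variables are not read-only), next to the hand-written struct-plus-method it roughly corresponds to.

```rust
// The explicit "environment struct + function" that a mutably-capturing
// closure is roughly sugar for.
struct Counter {
    count: u32, // the captured environment
}

impl Counter {
    fn call(&mut self) -> u32 {
        self.count += 1;
        self.count
    }
}

fn main() {
    // The closure version: captures `count` mutably.
    let mut count = 0;
    let mut incr = || {
        count += 1;
        count
    };
    assert_eq!(incr(), 1);
    assert_eq!(incr(), 2);

    // The hand-written desugaring behaves the same way.
    let mut c = Counter { count: 0 };
    assert_eq!(c.call(), 1);
    assert_eq!(c.call(), 2);
}
```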

------
estebank
> You might say, 30 lines, easy no? Well except that Go can be played on 9x9,
> 11x11, 13x13, 15x15 and 17x17 too, any oddxodd size, no way I'm implementing
> those 30 lines x 6 times, this is not fun

I'm curious if the author has tried using macros, which would let him write
the code only once.
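A sketch of that macro approach (the board struct here is a stand-in for the author's 30 lines, not his actual code):

```rust
// Write the size-dependent code once in a macro, then instantiate it
// for every odd board size.
macro_rules! board_type {
    ($name:ident, $size:expr) => {
        struct $name {
            cells: [[u8; $size]; $size],
        }

        impl $name {
            fn new() -> Self {
                $name { cells: [[0; $size]; $size] }
            }

            fn size(&self) -> usize {
                $size
            }
        }
    };
}

board_type!(Board9, 9);
board_type!(Board19, 19);

fn main() {
    assert_eq!(Board9::new().size(), 9);
    assert_eq!(Board19::new().size(), 19);
}
```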

------
baldfat
Scientific computing doesn't really happen in low-level languages today.
FORTRAN, C, and C++ are what power the higher-level languages.

Most of it is done in R, Matlab, Python, and Julia, and for good reason: most
people doing scientific computing are not professional coders, and they can
work productively at a higher level.

------
posterboy
for one point, an efficient programm probably wouldn't need to pack memory
intensive arrays on the stack all at once.

