
Python vs. Rust for Neural Networks - rencire
https://ngoldbaum.github.io/posts/python-vs-rust-nn/
======
tanilama
Nobody is writing NNs in Python; they are just describing them.

For NNs, or DL in general, correctness doesn't really hinge on code-quality
concerns like the ownership semantics Rust people love to talk about. It is
more about numeric stability, underflow/overflow, and such. The choice of
programming language offers limited help there.

I don't think Rust has a killer app for ML/DL community to offer as of now,
the focus is vastly different.

~~~
kuzehanka
I've had a few Rust lovers come and mention this project to me recently. None
of them had any data science or ML experience. None of them knew that Python
is just used to define the high level architecture.

At the same time, comparatively tedious languages like Rust will never attract
data science practitioners. They don't care about the kind of safety it
brings, they don't care about improving performance in a component that's idle
99% of the time.

The bulk of the load in a DL workflow is CUDA code and sits on the GPU. Even
the intermediate libraries like cuBLAS would see marginal benefits, if any,
from being reimplemented in Rust.

This is a cool project, but it has no chance to displace or even complement
Python in the data science space.

~~~
danielscrubs
I am a data scientist and I care. The time when you could just do proofs of
concept or a PowerPoint presentation is long behind us. Now we have to take it
into production, which means we get the exact same problems SE has always had.

Iff Rust helps us take it into production we will use it.

But it’s a lot of ground to cover to reach Python’s libraries, so I’m not
holding my breath.

That said, Python’s performance is slow even when shuffling data to NumPy.
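That shuffling cost is mostly per-call overhead at the Python/NumPy boundary. A minimal sketch of the pattern (sizes are illustrative, not a benchmark): crossing the boundary a thousand times versus once, with identical results:

```python
import numpy as np

rng = np.random.default_rng(0)
chunks = [rng.standard_normal(100) for _ in range(1000)]

# Crossing the Python/NumPy boundary once per chunk: 1000 small calls,
# each paying interpreter and dispatch overhead.
loop_total = 0.0
for c in chunks:
    loop_total += np.sum(c)

# Crossing the boundary once: a single call over the concatenated data.
vec_total = np.concatenate(chunks).sum()

# Both compute the same number; the difference is where the time goes.
assert np.isclose(loop_total, vec_total)
```

Timing the two with `timeit` on any machine shows the looped version losing badly, even though all the arithmetic is NumPy in both cases.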

~~~
kuzehanka
I must be missing something. Modern data science workloads involve fanning out
data and code across dozens to hundreds of nodes.

The bottlenecks, in order, are: inter-node comms, gpu/compute, on-disk
shuffling, serialisation, pipeline starvation, and finally the runtime.

Why worry about optimising the very top of the perf pyramid, where it will
make the least difference? Why worry if you spent 1 ms pushing data to NumPy
when that data just spent 2500 ms on the wire? And why are you even pushing
from the Python runtime to NumPy instead of using Arrow?

~~~
corndoge
Good lord, hopefully latency isn't 2.5 seconds!

~~~
sjwright
I can’t even. How could you ever get 2500 msec in transit? That’s like
circling the globe ten times.

~~~
justinclift
Maybe a bunch of SSL cert exchanges through some very low bandwidth
connections? ;)

Still, it's more likely a figure used for exaggeration, for effect.

------
psv1
Neural network libraries (Tensorflow, Pytorch) have a C++ backend and a Python
interface. Which is great - you get a performant compiled language as the
backend and a flexible user-friendly language as the interface.

Rust vs Python is a weird question because in reality no one writes their own
neural network with numpy, and no one expects Rust to act like an interpreted
language suitable for data science workflows. It would be more apt to compare
Rust and C++.

~~~
ianamartin
Yeah, this is the real point.

Python is an interface to C, C++, and FORTRAN for a lot of stuff. There are
even crossover libs for running R.

This is like comparing apples and steaks.

~~~
TheRealKing
"Fortran", since 1990 (FORTRAN refers to F77 :-)

------
archgoon
This seems to compare a hand-implemented neural network in Python (and NumPy)
with one in Rust. Even in this simple case, the author discovers that in the
Python case, most of the time is spent in non-Python linear algebra libraries.

Most of the major deep learning frameworks for Python (TensorFlow, Keras,
Torch, MXNet, etc.) will not normally spend the majority of their time in
Python. Typically, the strategy is to use Python to declare the overall
structure of the net and where to load data from, and then the actual heavy
lifting is done in optimized libraries, probably written in C++ (or Fortran -
I seem to recall BLAS used Fortran).

~~~
sgillen
I think BLAS is a spec rather than one library, so your version may or may not
be Fortran. I do think the original "reference implementation" was written in
Fortran, and it is sometimes called "The BLAS library", but most BLAS you see
in the wild are not that.

~~~
the_svd_doctor
Yes. BLAS was originally specified in Fortran, but many BLAS implementations
(like cuBLAS, the CUDA/Nvidia version for GPUs) don't use Fortran at all.
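As a sanity check on that division of labour: NumPy's `@` operator hands matrix multiplication off to whichever BLAS NumPy was linked against (OpenBLAS, MKL, the reference Fortran BLAS, ...), but the result matches a naive pure-Python triple loop; only the speed differs. A small sketch, not benchmarking code:

```python
import numpy as np

def naive_matmul(a, b):
    """Textbook triple-loop matrix multiply, entirely in Python."""
    n, m, k = len(a), len(b), len(b[0])
    out = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(k):
            for p in range(m):
                out[i][j] += a[i][p] * b[p][j]
    return out

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 5))
b = rng.standard_normal((5, 3))

# np.matmul dispatches to the linked BLAS (a *gemm routine under the hood).
fast = a @ b
slow = np.array(naive_matmul(a.tolist(), b.tolist()))
assert np.allclose(fast, slow)
```

Same answer either way; the point of the BLAS spec is that the implementation behind the call is swappable.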

------
sgillen
> Is rust suitable for data science workflows?

> Right now I have to say that the answer is “not yet”. I’ll definitely reach
> for rust in the future when I need to write optimized low-level code with
> minimal dependencies. However using it as a full replacement for python or
> C++ will require a more stabilized and well-developed ecosystem of packages.

I'm not sure rust is really aiming to be something used for data science
workflows. I'm not sure the community will be putting much effort into making
this a reality.

~~~
s_Hogg
Fine, but it seems reasonable for someone to check. It's much easier to make a
decision about whether a programming language is what you want if you've got
evidence from someone who has tried something similar to you, as opposed to
just hearing people say "Language X is awesome because of unimaginably low-
level (from my point of view) feature Y".

Rust seems great, to be honest, just not universally so. Nothing wrong with
defining the boundaries.

------
marmaduke
Rust seems more suitable for implementing the next OpenBLAS.

While Julia's single language mantra is great, as long as things like Python
exist, there will be a need for C/C++/Rust.

~~~
asdjlkadsjklads
Yea, this whole discussion feels weird to me. Different use cases. I love Rust
(and dislike Python, lol), but from everything I hear, a highly dynamic
frontend (like Python) has few downsides for authors of ML code. All of the
hot paths are already in other languages because Python is so slow.

The only downside I've seen is that _sometimes_ the programmer will want more
safety. In such a scenario, Rust for the "frontend" would be very useful.

So we have two concerns, frontend and backend. For the backend, Rust would be
perfectly acceptable, but I'm not sure it is fixing a safety issue in the
other (C, etc.) languages - aka, perhaps little value in the backend. For the
frontend it only has value in some areas.

Regardless, I love Rust and would totally welcome any tooling that keeps me in
Rust. However, I'm not an ML person, hah.

~~~
MiroF
A dynamic language can be super frustrating to develop with because you have
to keep tensor dimensions memorized in your head or in comments.

~~~
marmaduke
How is a statically typed language going to help with tensor dimensions?

~~~
MiroF
Named/typed axis checking - i.e., I shouldn't be able to contract along two
axes of different types (like a batch dimension contracted with a time
dimension).
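Enforcing this at compile time needs a static type system, but the idea can be sketched at runtime. Below is a minimal, hypothetical `NamedTensor` wrapper (the class name and API are invented for illustration) that refuses to contract axes whose labels don't match:

```python
import numpy as np

class NamedTensor:
    """Hypothetical sketch: a tensor whose axes carry labels, so contracting
    mismatched axes fails loudly instead of silently producing garbage."""

    def __init__(self, data, axes):
        self.data = np.asarray(data)
        self.axes = tuple(axes)
        assert self.data.ndim == len(self.axes)

    def contract(self, other, axis):
        # Refuse unless BOTH operands call the contracted axis the same thing.
        if axis not in self.axes or axis not in other.axes:
            raise TypeError(
                f"cannot contract {self.axes} with {other.axes} along {axis!r}")
        i, j = self.axes.index(axis), other.axes.index(axis)
        data = np.tensordot(self.data, other.data, axes=(i, j))
        return NamedTensor(data, [a for a in self.axes if a != axis]
                                 + [a for a in other.axes if a != axis])

x = NamedTensor(np.ones((4, 10)), ["batch", "time"])
w = NamedTensor(np.ones((10, 3)), ["time", "feature"])

y = x.contract(w, "time")       # OK: time contracted with time
assert y.axes == ("batch", "feature")

try:
    x.contract(w, "batch")      # batch vs. feature: rejected
except TypeError:
    pass
```

A static version of the same idea moves the `TypeError` to compile time, which is the safety net a typed language could offer here.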

------
xiaodai
"The first step here was to figure out how to load the data. That ended up
being fiddly enough that I decided to break that off into its own post." This
is exactly why we have R and pandas!! Julia is doing an increasingly excellent
job of it as well.

------
bayesian_horse
It's going to be tough to beat Numpy, Cython and Numba on speed with Rust.

And this kind of workload really isn't the problem Rust wants to solve.

------
pilooch
vs C++14? Indeed, most DL is in fact C++. The recent PyTorch C++ API is a
must. As professionals in this industry, my colleagues and I have switched to
full C++. I'd be interested in the advantages of Rust vs C++, rather than vs
Python (which in terms of performance is truly C in the background).

------
longemen3000
Nice to see. A Julia implementation should be fun to do, but the point of the
article is true: when you work with linear algebra, most of the time is spent
in BLAS.

------
roadbeats
I’m a newbie to NNs and was surprised to hear that no one uses NumPy in actual
NN implementations, although it’s written in C and highly optimized. Why is
that?

And how about Gonum (the Go equivalent)?

Finally, I’m currently going through the deeplearning.ai program. I’ve got one
week left, and will experiment with building some apps. Which technical stack
should I choose?

~~~
csande17
Most software (including Python and Numpy and Go and pretty much every Rust
program) runs on your computer's CPU. The CPU is good at running programs with
a lot of different instructions and if-statements and loops and stuff.

But for neural networks, people often prefer to use special hardware like
graphics cards, since graphics cards are really good at doing relatively
simple math on many pieces of data at once. So they create special libraries
like TensorFlow that can send commands to the graphics card instead of doing
the math on the CPU. (And they don't use Numpy because even though it's highly
optimized, it's highly optimized _for CPUs_ , and graphics cards are a lot
faster than CPUs at running neural networks.)

------
HelloNurse
In the end, hypothetical future good Rust libraries for linear algebra will
just be turned into Python extensions and embraced into the mainstream.

They'll also, equally well, serve the needs of Rust programs which need
serious number crunching (presumably a small niche); there is no "versus" in
the comparison.

------
timClicks
I think it's quite impressive actually that someone can pick up Rust and
manage to out-perform Numpy in their first project. BLAS implementations are
decades-long exercises in optimization.

In my own experience, Rust has been excellent for the more boring side of data
science - churning through TBs of input data.

~~~
archgoon
> I think it's quite impressive actually that someone can pick up Rust and
> manage to out-perform Numpy in their first project.

They didn't. At the end of the article they discuss this.

"In fact it’s worse than that. One of the exercises in the book is to rewrite
the Python code to use vectorized matrix multiplication. In this approach the
backpropagation for all of the samples in each mini-batch happens in a single
set of vectorized matrix multiplication operations. This requires the ability
to do matrix multiplication between 3D and 2D arrays. Since each matrix
multiplication operation happens using a larger amount of data than the non-
vectorized case, OpenBLAS is able to more efficiently occupy CPU caches and
registers, ultimately better using the available CPU resources on my laptop.
The rewritten Python version ends up faster than the Rust version, again by a
factor of two or so."

The original python code was written to be easily understood; not performant.
Shifting more of the work to the libraries improved the performance.
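The batched multiplication the quote describes can be sketched with NumPy's broadcasting `matmul`: a 3D stack of per-sample matrices multiplied against one shared 2D weight matrix in a single call, instead of a Python loop over the mini-batch (the shapes here are illustrative, not the book's actual code):

```python
import numpy as np

rng = np.random.default_rng(2)
batch, n_in, n_out = 32, 784, 30

# One activation row-vector per sample, stacked into a 3D array.
acts = rng.standard_normal((batch, 1, n_in))
weights = rng.standard_normal((n_in, n_out))   # shared 2D weight matrix

# Loop version: one small matrix multiply per sample.
looped = np.stack([acts[i] @ weights for i in range(batch)])

# Vectorized version: matmul broadcasts the 2D weights across the leading
# batch dimension, so BLAS sees one large batched operation instead of
# `batch` tiny ones, and can occupy caches and registers more efficiently.
vectorized = acts @ weights

assert vectorized.shape == (batch, 1, n_out)
assert np.allclose(looped, vectorized)
```

The results are identical; only the granularity of the work handed to BLAS changes, which is where the article's 2x swing comes from.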

~~~
mplanchard
That is a different algorithm, though, right? They didn’t rewrite the rust
code to use matrix multiplication, so it’s no longer a direct comparison, and
is instead comparing unoptimized rust with a less efficient algorithm to super
optimized C++ with a more efficient algorithm (being called from python).

In which case, IMO, it’s fairly surprising that the rust implementation is
_only_ 2x slower.

------
ageofwant
This approach, I think, is missing the point. You will write highly optimized
libraries in Rust, and then use those from Python.

This is why Python has eaten the world: not because it's the best at any one
thing, except bringing all those things together - at which it is
unparalleled, and is unlikely to be surpassed anytime soon.

numpy, scipy, pandas, tensorflow - all of those have very little actual Python
code; it's C++, and even Fortran here and there.

This whole Python vs Bla thing is just silly nonsense. I know Python and some
Bla, and so should you. Tonight someone will release SuperFantasticNewThing
implemented in Bla, tomorrow someone else will wrap that in Python, and
tomorrow night the rest of us will use PySuperFantasticNewThing, and that's
exactly how it should be.

~~~
dmos62
I think you're being unnecessarily dismissive of discussions surrounding
python's fitness, and I find your remark "that's exactly how it should be"
confusing.

Python is an ok interface language, in that it's script-like, dynamically
typed and simple to comprehend. It's popular, which makes on-boarding
efficient due to the sheer volume of tutorials online. And, it has built up a
large ecosystem, because of the last two points.

That said, it's naive to suppose that python is the currently-ideal or future-
ideal interface language. It's just ok, plus it's popular.

~~~
skohan
Yes, exactly: Python is dominant at the moment because it has a head start,
not because it is uniquely suited to the problem by any means.

This is the same logic one might have used to conclude that the LAMP stack
would be dominant forever in web development if you took a snapshot of the
world in 2004.

------
ahurmazda
I could see how Rust could be a good option for writing the prediction server.
In my use case, DL models are trained offline; it does not matter if training
is faster by 30 minutes (except maybe when I'm R&D-ing).

On the other hand, you typically have to reach for something like C++ for a
low-latency, high-throughput environment, and it can be a bear to write a
server in it. Anything more ergonomic would be welcome.

------
acollins1331
They already have a language that blows Python out of the water in ML, and
it's called Julia.

------
adsharma
[https://medium.com/@konchunas/transpiling-python-to-rust-766459b6ab8f](https://medium.com/@konchunas/transpiling-python-to-rust-766459b6ab8f)

It's possible to transpile, but the tech is not mature yet.

------
ram_rar
I don't think Rust is competing with Python in a similar space. The biggest
win for Rust would be for someone in the community to rewrite Cython in Rust
and get rid of the GIL. It's a win-win for everyone.

------
qaq
Given the work Chris Lattner is doing with Swift, it has a better chance than
Rust of becoming a contender in this space.

------
sdinsn
Rust isn't going to be used until it has better CUDA support.

------
daenz
tldr: The bottleneck is in the linear algebra libraries. Unoptimized Rust is
2x faster than unoptimized Python. Optimized Python is 2x faster than
unoptimized Rust because it vectorizes its matrix multiplications, which the
Rust version doesn't do.

My takeaway: stick to a GPU where everything is more parallel.

------
lucasbonatto
tldr "Right now I have to say that the answer is “not yet”. I’ll definitely
reach for rust in the future when I need to write optimized low-level code
with minimal dependencies. However using it as a full replacement for python
or C++ will require a more stabilized and well-developed ecosystem of
packages."

