
The Safety Boat: Kubernetes and Rust - DeathArrow
https://msrc-blog.microsoft.com/2020/04/29/the-safety-boat-kubernetes-and-rust/
======
zozbot234
It's a bit weird to see "several weeks" of effort being described as a
problematic learning curve. At least the blogpost makes it clear that the
effort pays out hugely but still, "several weeks" is not rocket surgery. It's
not learning Haskell or category theory! ISTM that they're just running with
an assumption that most devs wouldn't be professional and committed to this,
which strikes me as an unwitting gatekeeping attitude.

~~~
steveklabnik
Go has a shorter time, and that’s the measuring stick in this area.

~~~
wrs
Go _appears_ to have a shorter time, because you don't realize how much
higher-level stuff you're just expected to do The Right Way, with no support
from the language or libraries. So you're free to think you've finished
learning Go, but then the actual learning begins.

~~~
dnautics
Agreed. Worked with a senior who "learned go in a week" then faffed around for
months (to prod! because he had the implicit trust of management because he
was "a genius") deploying broken software with tons of concurrency bugs
because he didn't know how to manage shared state.

------
ernado
> we caught a significant race condition

It is a data race, not a race condition.

> and which passed the race checker for Go

No, it did not.
[https://github.com/helm/helm/pull/7820#issuecomment-60436062...](https://github.com/helm/helm/pull/7820#issuecomment-604360627)

There is a comment by the issue author which is literally a Go data race
detector warning, i.e. "WARNING: DATA RACE".

~~~
Rusky
Data races are a kind of race condition, no?

~~~
sa46
I wasn't sure. After a bit of research, this seems to be a debate [1]. Using
the common definitions, it's possible to have a data race that doesn't cause a
race condition. [2] It's also possible to have a race condition without a data
race.

[1]
[https://en.wikipedia.org/wiki/Race_condition#Data_race](https://en.wikipedia.org/wiki/Race_condition#Data_race)

[2]
[https://blog.regehr.org/archives/490](https://blog.regehr.org/archives/490)
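
To make the "race condition without a data race" case concrete, here is a
minimal Rust sketch. All names in it (withdraw_racy, withdraw_atomic, run) are
made up for illustration. Every access goes through a Mutex, so there is no
data race in either version, but the first one still has a check-then-act race
because the lock is released between the check and the update.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Racy check-then-act (hypothetical example). Each lock acquisition is free
/// of data races, but another thread can change the balance between the
/// check and the update, so several callers may all "succeed".
fn withdraw_racy(balance: &Mutex<u64>, amount: u64) -> bool {
    // The guard is a temporary here, so the lock is released at the end of
    // this statement.
    let enough = *balance.lock().unwrap() >= amount;
    if enough {
        let mut b = balance.lock().unwrap();
        // saturating_sub avoids an underflow panic, but the logic error
        // (more money handed out than was available) remains.
        *b = b.saturating_sub(amount);
    }
    enough
}

/// Check and update inside one critical section: no race condition.
fn withdraw_atomic(balance: &Mutex<u64>, amount: u64) -> bool {
    let mut b = balance.lock().unwrap();
    if *b >= amount {
        *b -= amount;
        true
    } else {
        false
    }
}

/// Runs `threads` concurrent withdrawals of `amount` from `start` and returns
/// (number of successful withdrawals, final balance).
fn run(
    withdraw: fn(&Mutex<u64>, u64) -> bool,
    start: u64,
    amount: u64,
    threads: usize,
) -> (usize, u64) {
    let balance = Arc::new(Mutex::new(start));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let balance = Arc::clone(&balance);
            thread::spawn(move || withdraw(&balance, amount))
        })
        .collect();
    let successes = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|ok| *ok)
        .count();
    let final_balance = *balance.lock().unwrap();
    (successes, final_balance)
}
```

With run(withdraw_atomic, 100, 100, 8) exactly one withdrawal can succeed.
With withdraw_racy the number of successes depends on thread timing, even
though the compiler and a data race detector have nothing to complain about.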

~~~
Rusky
I would argue that "a data race that doesn't cause a race condition" is still,
itself, a tiny race condition- just a contained one.

But you're right, this is just choice of terminology. :)

------
boulos
The bug they caught [1] is one of the reasons some languages require you to
explicitly name your captured variables. You still could have typed that code
in, especially if you started with a for loop and then made it parallel (fwiw,
perform should have been named something clearly suggesting it was parallel),
but you'd at least be confronted with "oh, you went from serial, local state
to a capture. Still think it's okay to explicitly borrow that state from this
scope?". Then again, that's the point of Rust here :).

Fwiw, it's too bad the commit message didn't say something like "Since we're
doing delete on many resources in parallel, we need to hold a lock while
updating errs/res.Deleted". The reviewer was also obviously confused at first.

[1]
[https://github.com/helm/helm/pull/7820/commits/edb2b7511bcb9...](https://github.com/helm/helm/pull/7820/commits/edb2b7511bcb9dc3c03bad3da1099b7ebffefd0b)
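
For a rough idea of how that shape of code looks in Rust, here is a sketch
loosely modeled on the situation described above (parallel deletes updating
shared "deleted" and "errors" lists). It is not a port of the Helm code;
DeleteResult, delete_resource and delete_all are hypothetical names, and the
"names starting with bad fail" rule is only there to have something to test.

```rust
use std::sync::Mutex;
use std::thread;

#[derive(Debug, Default)]
struct DeleteResult {
    deleted: Vec<String>,
    errors: Vec<String>,
}

// Hypothetical stand-in for a real API call: names starting with "bad" fail.
fn delete_resource(name: &str) -> Result<(), String> {
    if name.starts_with("bad") {
        Err(format!("failed to delete {name}"))
    } else {
        Ok(())
    }
}

/// Deletes all resources in parallel and collects the outcomes.
fn delete_all(names: &[&str]) -> DeleteResult {
    // The shared state must be wrapped in a Mutex. Pushing to a plain Vec
    // from several spawned closures would need multiple mutable borrows,
    // which the compiler rejects.
    let result = Mutex::new(DeleteResult::default());
    thread::scope(|s| {
        for &name in names {
            // The capture is spelled out: each thread gets a shared
            // reference to the Mutex, moved into its closure.
            let result = &result;
            s.spawn(move || {
                let outcome = delete_resource(name);
                let mut guard = result.lock().unwrap();
                match outcome {
                    Ok(()) => guard.deleted.push(name.to_string()),
                    Err(e) => guard.errors.push(e),
                }
            });
        }
    });
    // All threads have finished here, so the Mutex can be unwrapped.
    let mut result = result.into_inner().unwrap();
    // Thread completion order is arbitrary; sort for a stable result.
    result.deleted.sort();
    result.errors.sort();
    result
}
```

Calling delete_all(&["a", "bad-b", "c"]) would report "a" and "c" as deleted
and one error for "bad-b".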

------
melling
“For comparison, last week we caught a significant race condition in another
Kubernetes-related project we maintain called Helm (written in Go) that has
been there for a year or more, and which passed the race checker for Go. That
error would never have escaped the Rust compiler, preventing the bug from ever
existing in the first place.”

I’ve heard people brag that Haskell is a great language because it’s
supposedly easier to write correct code.

Rust has this same reputation?

~~~
lmm
Yes, for much the same reasons. Pretty much any ML-family language has the
same effect; just having proper sum types, polymorphism and first class
functions (and _not_ having null) goes a long way to preventing huge classes
of bugs.
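
A small Rust sketch of what sum types and "no null" buy you. The Lookup enum
and the functions below are invented for illustration: the compiler forces
every variant to be handled in the match, and absence is an explicit Option
rather than a null that might be dereferenced by accident.

```rust
/// A sum type: a value is exactly one of these variants (hypothetical example).
#[derive(Debug, PartialEq)]
enum Lookup {
    Found(u32),
    Missing,
    Forbidden { reason: String },
}

/// The match must cover every variant; adding a new variant later turns
/// every non-exhaustive match into a compile error.
fn describe(lookup: &Lookup) -> String {
    match lookup {
        Lookup::Found(port) => format!("port {port}"),
        Lookup::Missing => "no such service".to_string(),
        Lookup::Forbidden { reason } => format!("denied: {reason}"),
    }
}

/// Absence is modeled with Option instead of null.
fn find_port(name: &str) -> Option<u32> {
    match name {
        "http" => Some(80),
        "https" => Some(443),
        _ => None,
    }
}

/// The caller has to decide what happens in the None case.
fn port_or_default(name: &str) -> u32 {
    find_port(name).unwrap_or(8080)
}
```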

~~~
nindalf
The critical feature enabling fearless concurrency is Rust's borrow checker
though, something that the other ML languages don't have.

~~~
lmm
In practice you have fearless concurrency in every other ML language I know,
because they're all immutable-first. It's true that if you wrote some code
that mutated data then it wouldn't be concurrency-safe, but why would you do
that?
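
For comparison, a sketch of the same idea in Rust: data that is only read can
be shared across threads without any lock, and the borrow checker verifies
that nothing mutates it in the meantime. parallel_sum is a made-up helper, not
something from the article.

```rust
use std::thread;

/// Sums `data` using up to `parts` threads that all read the same slice.
/// No Mutex is needed because the slice is only borrowed immutably.
fn parallel_sum(data: &[u64], parts: usize) -> u64 {
    let parts = parts.max(1);
    // Ceiling division, and at least 1 because chunks(0) would panic.
    let chunk = ((data.len() + parts - 1) / parts).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<u64>()))
            .collect();
        // Scoped threads are joined before `data`'s borrow ends.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}
```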

------
shock
> One of the biggest ones to point out is that async runtimes are still a bit
> unclear. There are currently two different options to choose from, each of
> them with their own tradeoffs and problems. Also, many of the implementation
> details are tied to specific runtimes, meaning that if you have a dependency
> that uses one runtime over another, you’ll often be locked into that runtime
> choice.

My understanding of how async/await works in Rust is that you can have
multiple async runtimes in one Rust program. Is that not the case?

~~~
roblabla
That is the case, but it's super awkward to use. Basically, you cannot await a
tokio future on an async-std runtime, or an async-std future on a tokio
runtime. You can, however, have both runtimes running at the same time, and
use some form of message-passing to bridge them.

It's definitely easier to only deal with one runtime. Ideally, we should have
some kind of abstraction to allow crates to support both runtimes (e.g. a
trait that'd allow creating an async TcpSocket of the right "kind" for your
runtime), but AFAIK this is not currently done.

~~~
steveklabnik
> AFAIK this is not currently done.

That's correct; we're still working on these abstractions. It's the end goal
that most folks have in mind, though.

~~~
jeffdavis
Can you explain and/or link to the issues? I thought Futures were the
abstraction that lets you choose a runtime?

~~~
Rusky
Futures are part of the answer, and more specifically the way that the Wakers
passed to Future::poll use dynamic dispatch to re-schedule the task.

Other major abstractions that are missing so far include async versions of the
Read and Write traits, a Stream trait for the async equivalent of the Iterator
trait, and perhaps a way to spawn new tasks.

This series of interviews covers these in more depth:
[http://smallcultfollowing.com/babysteps/blog/2020/04/30/asyn...](http://smallcultfollowing.com/babysteps/blog/2020/04/30/async-interviews-my-take-thus-far/)
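
To illustrate the Waker part, here is a toy single-future executor using only
the standard library. It is a sketch, not how tokio or async-std are
implemented; ThreadWaker, block_on and YieldOnce are names chosen for this
example. The future only sees an opaque Waker, and what wake() actually does
(here: unpark a thread) is decided by whichever runtime created it.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// This toy runtime's notion of "reschedule the task": unpark the thread
/// that is blocked in block_on.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls one future to completion on the current thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    // Waker::from erases the concrete type; the future calls it through
    // dynamic dispatch without knowing which runtime is behind it.
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until some waker unparks this thread, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// A future that returns Pending once and asks to be polled again.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

YieldOnce would run unchanged on any executor, which is the portable part.
Things like sockets and timers need more than a Waker, and that is where the
missing shared abstractions come in.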

------
Thaxll
I don't believe for one second that it takes just a couple of weeks for an
average SE to become proficient in Rust.

~~~
steveklabnik
It really depends on so many factors it’s extremely hard to tell. We’ve
brought folks at Cloudflare up to speed roughly that fast.

“average” and “proficient” are both very variable in that statement, imho.

~~~
sitkack
They started with >1 Klabnik units and every person you bring up, it creates a
larger pool of folks to lean on for support.

~~~
steveklabnik
I can’t take credit here, while I am around to answer questions, getting folks
going is not my job.

It is true that we have a chat room with a bunch of folks, of which I’m part.

~~~
e12e
I wouldn't discount what gp is saying though - having an expert (or a few) on
hand from the start can help train the first new convert "the right way",
and they can then mentor the next one and so on.

Even just by being available to answer questions or help with code review.
Doing some pair programming sessions would probably be useful too.

~~~
steveklabnik
Oh yeah, it’s helpful for sure. I just don’t want to take too much credit!

------
xrd
After reading this article, I'm excited about finding a reason to write a
component in Rust and WASM. Can anyone recommend the best getting started
guide for dipping your toes in the water? This article didn't have a link to
anything that seemed appropriate for that goal.

------
ronlobo
It is exciting to see Microsoft putting so much effort into Rust and WASM.

The Rust onboarding experience is incredibly explicit and once things start to
click and code compiles, you're on the train.

------
conroy
I looked into WASM / WASI last week but couldn't find an answer to this
anywhere: can I write a network service in Rust and compile it to WASM / WASI?

I know that wasmtime can execute a WASM module and give it access to a file
system. Can that filesystem contain a socket that the WASM module can interact
with?

~~~
jononor
Very curious as to why you would want to do that? If you want a network
service, WASM does not seem to help with much, only complicate things?

------
Klasiaster
Good article but it somehow suggests that because there is no garbage
collection you would need to fight the borrow checker. This is not fully true
because you could put your data in a Box (so that it is stored on heap instead
of the function's stack) and you can wrap it in a mutex with reference
counting (Arc+Mutex or Rc+RefCell), which roughly gives you what garbage
collection does. Also cloning can avoid solving the borrow-check puzzle if you
don't need a shared state. Of course you would not want to pack your code with
Arc+Mutex or data copying if performance matters, but it's fine for a beginner
to start with when writing Rust and then learn to do the optimized borrow
version a bit later when needed.
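
A short sketch of those three escape hatches, with a made-up Config type and
function names chosen only for this example: Rc+RefCell for shared mutable
state on one thread, Arc+Mutex for the same across threads, and a plain clone
when no sharing is needed.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, Clone, PartialEq)]
struct Config {
    name: String,
    retries: u32,
}

/// Rc + RefCell: several owners on one thread, borrow rules checked at runtime.
fn shared_single_thread() -> u32 {
    let cfg = Rc::new(RefCell::new(Config { name: "demo".into(), retries: 0 }));
    let other = Rc::clone(&cfg); // second handle to the same value
    other.borrow_mut().retries += 1;
    cfg.borrow_mut().retries += 1;
    let n = cfg.borrow().retries;
    n
}

/// Arc + Mutex: the thread-safe counterpart of Rc + RefCell.
fn shared_across_threads(workers: u32) -> u32 {
    let cfg = Arc::new(Mutex::new(Config { name: "demo".into(), retries: 0 }));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let cfg = Arc::clone(&cfg);
            thread::spawn(move || {
                cfg.lock().unwrap().retries += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let n = cfg.lock().unwrap().retries;
    n
}

/// Cloning: no shared state at all, so nothing for the borrow checker to
/// object to. The original is left untouched.
fn clone_instead_of_borrow(original: &Config) -> Config {
    let mut copy = original.clone();
    copy.retries += 10;
    copy
}
```

None of this is free (reference counting, locking, copying), which is the
trade-off mentioned above.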

~~~
pjmlp
It doesn't give the productivity that GC allows for writing GUI code and
building UI designers.

Imagine having Jetpack Compose, SwiftUI, Qt Designer, or WPF/UWP Blend in
Rust.

~~~
wwright
Qt is actually a good example because C++ has a similar memory model to Rust
(at least with respect to GC). The Qt solution was basically to give
everything a “Cow” (copy on write) wrapper, and to use an event loop-based,
somewhat manually-annotated GC for objects that want it.

Rust could totally do the same thing, and you could probably make it way
easier to use than the mess that is Qt.

~~~
pjmlp
Having dealt with Gtk-rs, and their current solution being the clone! macro, I
am not so sure.

Remember that not only is GUI development with proper tooling very
interactive, unlike the FOSS alternative of a code-compile-check-visually
loop, there is also the whole ecosystem of third parties selling component
libraries, with no control over how they get integrated into the component
toolbox.

So whatever solution one comes up with, it needs to be more productive than
forcing users to scatter Rc<RefCell<>> everywhere, or to fix code that broke
compilation just because moving a widget in the GUI tree invalidated the
borrow checker's assumptions.

------
klitze
It has a weird taste that Microsoft is preferring Rust over Golang considering
that Golang is a Google thing.

Don’t get me wrong, all technical arguments are correct and rust does have
advantages for cloud software. But this also comes quite handy for MS. :)

~~~
pjmlp
VSCode support for Go, and some Delve improvements, were actually developed by
Microsoft.

------
sittingnut
rust and kubernetes - post with mostly useless hype monsters united.

------
rvz
> For comparison, last week we caught a significant race condition in another
> Kubernetes-related project we maintain called Helm (written in Go) that has
> been there for a year or more, and which passed the race checker for Go.
> That error would never have escaped the Rust compiler, preventing the bug
> from ever existing in the first place

While the possible security benefits of Rust are interesting in software like
Kubernetes, it seems like this whole blog post is an implicit RIIR proposal
for the Kubernetes ecosystem from a Microsoft software engineer, which isn’t
going to happen anytime soon.

> Rust has made great progress in the past year with its async story, but
> there are still some issues that are being worked out.

On top of that, there are still many crates that aren’t using async-await yet
and most are not even 1.0, thus are not stable. I would not touch such crates
if they are still immature or even unsafe.

Realistically, a Rust Kubernetes is possible but practically the effort of a
production ready version is measured in years.

~~~
steveklabnik
Kubernetes is an ecosystem. It doesn’t need to be written in Rust for Rust
components to play a part. Helm is not Kubernetes, for example, though your
comment seems to blur the two. There are folks writing stuff to interact with
the broader ecosystem in Rust. That’s one of the interesting bits of networked
systems! You can be heterogeneous with languages more easily when the
network/api is the boundary.

~~~
dilyevsky
Doesn’t microsoft own helm now? Nothing is stopping them rewriting it in rust
since it can easily interact with kubernetes via rest api

~~~
seneca
No. Helm is owned by the CNCF.

~~~
dilyevsky
CNCF doesn’t “own“ anything afaik. If you look at maintainers list most seem
to still belong to deis org which is part of msft now

~~~
seneca
CNCF holds the copyrights.

~~~
CameronNemo
That is absolutely not true. The contributors to helm retain full copyright.
No assignment or even CLA is used (only a DCO).

~~~
seneca
I stand corrected!

