
Why did we choose Rust to develop TiKV? - cyber1
https://pingcap.github.io/blog/2017/09/26/whyrust/
======
eikenberry
TLDR; the author likes Rust and wanted to use it. The article reads like a
dev rationalizing what they want to do to management. These types of
things are fine, but as a dev to a dev it is obvious that they just want to
use this cool tech. Good for them.

~~~
dullgiulio
Even more, Go is excluded because (author's opinion) goleveldb is not as mature
as RocksDB. Thus they would have had to use CGo, which is way suboptimal,
slower, etc.

The title should have been: "C++11 or Rust? We chose Rust". In a greenfield
project like this one seems to be, it's a choice I would approve.

A personal note regarding future comments to this thread: I have had enough of
negative advertising against Go in every language related thread. People who
use Go are not stupid: they know the language limits and tradeoffs and are
okay with them. Deal with it.

~~~
edem
I haven't used Go or Rust but when I compare them I have a gut feeling that
I'd be better off with Rust primarily because Go's build tools are not
adequate. What do you think about this?

~~~
jimsmart
Well, your question is simply too open-ended: nobody can really answer that
for you, as you give no indication whatsoever as to your potential use-cases.

FWIW: I've been coding in Go for a few years now, for me and (perhaps more
importantly) the kind of projects I choose to use it for, the build tools have
been more than adequate.

As with many things: it ultimately depends on what you're wanting to do, and
what your expectations are.

~~~
lilbobbytables
What kinds of projects do you use it for?

~~~
jimsmart
I've implemented some website backends, a ton of 'micro' services, various
command-line tools, and a bunch of data-processing stuff. (The distinction
between some of these is somewhat arbitrary)

These are the areas I think Go is most suited to, currently — and they've all
been a breeze to implement/test/deploy/maintain.

------
pornel
I can't wait until Rust is no longer perceived as a hipster, trendy choice.

It's a very solid language in the no-GC niche and shouldn't need a blog post
from every project that uses it.

Does it have to be 30 years old before it's not "new" and weird?

~~~
zerr
On the other hand, I can't wait until Swift becomes a general, non-Apple
language, available on most platforms (including the most popular one) "with
batteries" - that will be the end of Go and Rust I think :)

~~~
rf15
while I think that that would be a good thing (I like what happened with C#),
I see the languages specified and controlled by Microsoft, Google and Apple as
second-class languages, since they are often lacking in community input and
are usually designed with certain platform-specific goals in mind instead of
being cross-platform. (or company-strategic goals when it comes to Google)

~~~
blub
Yes, there's something not quite right with these company languages.

Whoever is smart will take note of what happened to VisualBasic and is
happening to Objective-C.

~~~
pjmlp
You mean like C and C++ being developed at AT&T, nowadays designed at ANSI,
with people on ARM, Google, Apple, Bloomberg, Sony, IBM, Microsoft's payroll?

~~~
blub
Looking into the past can have some predictive value, but looking that far
back and at a different company is of very limited value.

If one evaluates C# they should obviously look at what MS are doing. Same
thing for Swift and Apple.

~~~
pdimitar
I think what <pjmlp> is saying is that (1) these languages are too widely
adopted to fail -- somebody is bound to take the ball and continue dribbling
it even if the worst happens (company abandons language) and (2) to have a
widely adopted language means you have to be a member of a number of
committees -- because without standards you wouldn't become a widely adopted
language in the first place.

In short, we're quite safe in terms of whether C#, Swift and Go will live on.
They will.

~~~
blub
The issue with C# and Swift is that the Apple and MS dev communities enjoy
having a ready made solution that they can pick up and work with immediately.
Official support is very important.

C# and to some extent Swift are also two huge platforms, there are very few
organisations out there that would be able to steer their development.

Finally, they are not standardised in any way. MS tried something with 2.0 and
then gave up.

My impression is that these two live and die by the will of their corporate
masters. I'm not saying they will kill them or anything, that would be pretty
stupid of them to do.

~~~
pjmlp
> Finally, they are not standardised in any way. MS tried something with 2.0
> and then gave up.

They are standardized, you just need to look at the right place.

[https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/)

[https://docs.microsoft.com/en-us/dotnet/visual-basic/reference/language-specification/](https://docs.microsoft.com/en-us/dotnet/visual-basic/reference/language-specification/)

[http://fsharp.org/specs/language-spec/](http://fsharp.org/specs/language-spec/)

[https://swift.org/documentation/](https://swift.org/documentation/)

~~~
blub
Documentation & specs are not standardisation, not in the C or C++ sense. But
you already know that, so why are we having this conversation?

~~~
pjmlp
I assume you know what a _de facto_ standard is all about.

------
cyber1
I think TiKV is a good example of a team choosing Rust over modern C++. Rust
gives the same performance and is as close to the metal as C when necessary.
Outside of "unsafe" code it catches memory management mistakes at compile
time, and this is really great!

With Rust I can hack without fear! I don't have to remember tons of C++ rules
which are described in
[http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines),
the "C++ programming language" book, also here
[https://herbsutter.com/gotw/](https://herbsutter.com/gotw/), etc., and I can
focus on algorithms and implementation.

C++ combines a lot of different paradigms; maybe it would be more correct to
say "C++ paradigms hell"! No one understands which C++ subset is the right
one. Even Bjarne Stroustrup said, "Within C++, there is a much smaller and
cleaner language struggling to get out." - and where is this "smaller and
cleaner language"? What is the idiomatic style in C++? Is it the Google
guidelines, the C++ Core Guidelines or other enormous guides?

I've looked inside a lot of C++ projects and each of them has a different
style, uses different paradigms, and sometimes looks like a different language!

Rust, Go, C, Java code bases look the same, they have their own idiomatic
style, their own way.

I think Rust is the next step in the evolution of systems programming languages.

~~~
blub
C and Java code bases certainly don't look the same. In fact, claiming that
about C is simply insulting to the intellect of anyone reading your
overenthusiastic message.

I also sincerely doubt that Rust code bases will look the same in ten years.
It supports functional and OO paradigms and it's attracting very different
classes of programmers. Recently someone wrote a post about difficulties with
some OO concepts in Rust, and a top reply said that they never encountered
such issues because they program in a functional way.

Go is an exception here, but as soon as they extend the language in a
significant way (e.g. templates) differences will start to appear.

C++ doesn't have an idiomatic style because it's used in very different ways
by different people. It's impossible to have a fixed style and address the
mass market.

~~~
usrusr
I think you misread a terribly misreadable statement: what he meant wasn't
that C and Java look the same, he meant that any two C codebases will share a
lot of similarity, any two Java codebases will share a lot of similarity and
so on. C++ on the other hand can be anything from "C with objects" to deep
template metaprogramming to "has there even been a C++ before 2011?" styles,
which are as dissimilar from each other as you can get without crossing
language borders.

~~~
blub
No, there's OO C, low-level C, GLib C, etc. There's Android Java, Enterprise
Java, Standard Java. Any language catering to different customers will have
various styles.

There are two big C++ coding styles: C with classes, an old style which has
little use nowadays and modern C++, the recommended way, used in new projects.
Asking "what is the idiomatic style in C++?" must be a rhetorical question,
because it's obviously modern C++, the leaders of the C++ community have made
this clear repeatedly. Template metaprogramming is a technique, not a
programming style.

OP should do more hacking without fear and less spreading FUD.

------
jcelerier
> After years of usage of GC, it is very hard to go back time for manually
> managing the memory.

... are you guys sure of your "experienced C++ developers"? There's as much
memory management in modern C++ as in a GC'ed language: none. Create your
objects with `make_unique` or `make_shared` according to what makes sense (or
just enforce `make_shared` if you're really dubious of the coding abilities of
your team but at this point you'll have problems whatever you do).

~~~
blub
Yeah, that read like some Java advertising from the 90s.

I wish more of these posts were honest and said "we picked X cause we think
it's cool and we're gonna get paid to learn it". But they have to make up some
convoluted explanation that sounds rational and acceptable instead.

~~~
deadjoe
From the post, it looks like they really do think picking Rust is dope and
have already built a cool db system on it. Maybe they already got a bunch of
bucks in the pocket. Huhhh..

------
sriram_malhar
Considering that their team likes Go, it seems strange to me that they would
consider Rust over Go for the storage layer. A storage layer should be IO-
bound, and should hardly trouble the CPU; the choice of language really should
not be a determining factor. The big wins in that space are architectural, not
language specific.

~~~
jerf
"A storage layer should be IO-bound, and should hardly trouble the CPU; the
choice of language really should not be a determining factor."

This used to be true, but it's out-of-date now. You can now get a network pipe
into a system that a rather beefy multi-core CPU using a user-space TCP stack
can barely keep up with, let alone do any real work, and if you can scrape up
the PCI express lanes, putting a few of the latest SSDs into a system can
start getting you theoretical maximum bandwidth numbers that just a few years
ago looked more like what you'd expect for a RAM bandwidth number.

I'm of the opinion that it was already not as true as commonly supposed 5
years ago (in my experience using slow languages on putatively IO-bound tasks
was still noticeably slower than using fast languages), but the latest in
network pipes and SSDs have really ended it. It's true that on most desktop
systems you've still got more CPU than you know what to do with, but as you
step into the serious database space that's not true anymore. For a serious
database I wouldn't be perturbed if someone looked at Go's performance and
just plain discarded it on the spot, even before considering GC issues. It's
very fast for a scripting language; it's fairly slow for a compiled language.
"The compiler spends hardly any time on optimization" is not what you want to
read about your database implementation language.

(I've got one of the nvme SSDs in my laptop, and it is interesting to see just
how many CPU bottlenecks there still are in systems nowadays. In some sense, I
really shouldn't ever see a "loading" screen because you "ought" to be able to
read things off of my SSD fast enough to completely fill my RAM in 5-10
seconds; "merely" loading Firefox ought to be somewhere in the 50ms range. In
practice I still see loading screens and load waits, because the CPUs are
still doing things. Lots of things that used to be dominated by and hidden in
the load time, but aren't anymore.)

~~~
sriram_malhar
> You can now get a network pipe in to a system that a rather beefy multi-core
> CPU using a user-space TCP stack can barely keep up with.

I'd love to learn more about this. Can you please point me to some links that
elaborate on this point with benchmarks? Thanks much.

~~~
jerf
Google around for the Intel DPDK, and you can find things like this:
[http://www.cs.cornell.edu/courses/cs5413/2014fa/projects/gro...](http://www.cs.cornell.edu/courses/cs5413/2014fa/projects/group_of_dsd96_vs444/final_doc.pdf)

You can also do a simple math analysis to see it. If you have an incoming
10Gbps connection, a single core machine has approx. 1/3rd of a cycle per bit
to do everything it's going to do with that packet. Even going to a 128 core
machine and assuming perfect parallelism with some sort of magical packet
muxer gives you a whopping 43-ish cycles per bit. I've never worked on this
myself, but I saw a team in my company working with it, and they were pretty
pleased to be able to push ~2Gbps through their 10Gbps network connection with a
pretty beefy machine, and just about all they were doing was relatively simple
load balancing.
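
The back-of-the-envelope math above can be sketched in a few lines of Rust. The comment does not state a clock speed; the ~3.3 GHz figure below is an assumption, picked because it reproduces the "1/3rd of a cycle per bit" number, and the function and constant names are made up for illustration.

```rust
/// Clock cycles available per incoming bit, assuming perfect parallelism
/// across `cores` identical cores (an idealisation, as in the comment).
fn cycles_per_bit(clock_hz: f64, link_bits_per_sec: f64, cores: u32) -> f64 {
    clock_hz * f64::from(cores) / link_bits_per_sec
}

// Assumed clock speed: ~3.3 GHz. Not stated in the thread; chosen because it
// yields roughly one third of a cycle per bit on a single core.
const ASSUMED_CLOCK_HZ: f64 = 3.3e9;

// The 10 Gbps link from the comment.
const LINK_BITS_PER_SEC: f64 = 10.0e9;
```

With these assumptions, `cycles_per_bit(ASSUMED_CLOCK_HZ, LINK_BITS_PER_SEC, 1)` comes out at about 0.33 and the 128-core case at about 42, which lines up with the "43-ish" estimate.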

------
Shorel
And lack of experience with D.

Just kidding, great job!

------
bpicolo
Anybody have experience with TiDB? How does it stack up against CockroachDB?
Seems hard to find comparison. Probably hear less about it mostly because it's
developed in China? Looks like it's an impressive piece of tech, though.

~~~
sanxiyn
[http://weekly.pingcap.com/2016/10/17/how-we-build-tidb/#atomic-clocks--gps-clocks-vs-timestamp-allocator](http://weekly.pingcap.com/2016/10/17/how-we-build-tidb/#atomic-clocks--gps-clocks-vs-timestamp-allocator)
explains their main difference from CockroachDB.

~~~
bpicolo
Mostly curious how it works out in practice. Performance, etc.

------
baldfat
> its innovation in the type system and syntax gives it a unique edge in
> developing Domain-Specific Libraries (DSL).

I think Racket still has the edge for producing DSL?

------
noncoml
The one reason I would give is Algebraic Data Types.
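
For readers who haven't met them, here is a rough sketch of what algebraic data types look like in Rust. The `Command` enum and `apply` function are hypothetical, a toy key-value store invented for this example, and have nothing to do with TiKV's actual API.

```rust
use std::collections::HashMap;

/// A hypothetical command set for a toy key-value store (not TiKV's API).
/// Each variant carries exactly the data that operation needs.
enum Command {
    Get { key: String },
    Put { key: String, value: String },
    Delete { key: String },
}

/// Applies a command and returns the value it read, replaced, or removed.
/// The `match` must cover every variant, otherwise the code does not compile.
fn apply(store: &mut HashMap<String, String>, cmd: Command) -> Option<String> {
    match cmd {
        Command::Get { key } => store.get(&key).cloned(),
        Command::Put { key, value } => store.insert(key, value),
        Command::Delete { key } => store.remove(&key),
    }
}
```

Adding a fourth variant later would turn every non-exhaustive `match` into a compile error, which is presumably the appeal being referred to.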

------
baq
it'd be enough for me to say 'rust is kinda like c++ in terms of performance
and complexity but without the 0 pointer'

~~~
jhasse
Rust also has a "0 pointer":
[https://doc.rust-lang.org/std/option/](https://doc.rust-lang.org/std/option/)
The equivalent of a null pointer exception in Rust is an unwrap panic.

(IIRC if you use Option<Box<...>> None will even be represented by a null
pointer internally)
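
The representation claim can be checked with `std::mem::size_of`. A small sketch; the helper function names are made up for illustration.

```rust
use std::mem::size_of;

/// True when `Option<Box<T>>` takes no more space than `Box<T>`, i.e. `None`
/// is stored as the null pointer rather than as a separate tag.
fn option_box_is_pointer_sized<T>() -> bool {
    size_of::<Option<Box<T>>>() == size_of::<Box<T>>()
}

/// For contrast: every bit pattern of `u64` is a valid value, so
/// `Option<u64>` needs extra room to record `None`.
fn option_u64_needs_tag() -> bool {
    size_of::<Option<u64>>() > size_of::<u64>()
}
```

Both functions return `true`: `Box<T>` can never be null, so the compiler reuses the all-zero pattern for `None`.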

~~~
Ygg2
Umm unwrap panic is definitely not the same.

For one it's heavily discouraged. Any example using it is not best practice.

~~~
jhasse
I still got a few unwrap panics in libraries I've used (not in example code).

Also: Using a null pointer in C++ is also heavily discouraged ;)

~~~
Ygg2
Yeah, but using Option is encouraged, while using Option.unwrap as error
handling is discouraged.

Ok, to be more precise, Option in Rust isn't a null pointer. It's a nullable
pointer. Practically speaking, only Option::None is the null pointer. You can
either deal with it (using if-let or match) or you can `unwrap` and assume
it's never null. Only if you make that assumption and the value was actually
Option::None will it panic, which is Rust's equivalent of a null pointer
exception.

In contrast something like C/Java will allow you to use your nullable pointer
(because all pointers/references are nullable by default) without any check
and it's relatively easy to skip this step. In Rust, it's relatively hard to
skip this step.

EDIT: Changed per burntsushi post.
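
A minimal sketch of the three options described above (`match`, `if let`, and `unwrap`). `first_even` and the other function names are hypothetical, invented only to have something that returns an `Option`.

```rust
/// Hypothetical lookup used only for illustration: the first even number.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// Handles both cases explicitly; the compiler rejects a `match` that
/// leaves out the `None` arm.
fn describe(values: &[i32]) -> String {
    match first_even(values) {
        Some(v) => format!("first even: {}", v),
        None => String::from("no even number"),
    }
}

/// The same check with `if let`, falling back to a default.
fn first_even_or_zero(values: &[i32]) -> i32 {
    if let Some(v) = first_even(values) {
        v
    } else {
        0
    }
}

/// Assumes the value is present; panics if it is actually `None`.
fn first_even_unchecked(values: &[i32]) -> i32 {
    first_even(values).unwrap()
}
```

Only the last function can blow up at runtime, and the `unwrap` call makes that visible at the call site instead of hiding it in every dereference.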

~~~
burntsushi
Use of Option.unwrap (or Option.expect) in libraries is not discouraged, and
it shouldn't be. It's an excellent way of checking a runtime invariant. Use of
Option.unwrap is discouraged for _error handling_.

Stated differently, if a library you're using causes a panic, then it should
be interpreted as a bug. The bug might be in the library, or it might be in
the way that the library is being used (assuming the panic conditions have
been documented as part of the library API's contract).

A separate argument says that you should reduce the number of places where
your code can panic. That sounds like a fine goal to strive for, but must be
balanced with other things.
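
To make the distinction concrete, here is a small sketch with two made-up functions: one where failure is an expected condition and is returned as a `Result`, and one where `expect` checks a documented invariant.

```rust
use std::num::ParseIntError;

/// Error handling: bad input is an expected condition, so it is handed
/// back to the caller as a `Result` instead of becoming a panic.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// Runtime invariant: the contract says `values` must be non-empty, so an
/// empty slice is a bug in the caller and `expect` panics.
///
/// # Panics
/// Panics if `values` is empty.
fn largest(values: &[i32]) -> i32 {
    *values
        .iter()
        .max()
        .expect("largest() requires a non-empty slice")
}
```

A panic from `largest` points at a bug (here, in the caller, since the condition is documented), whereas an `Err` from `parse_port` is just ordinary bad input.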

------
amelius
If you had lots of circularly referencing data structures, would it make more
sense to choose a garbage-collected language like Go?

~~~
int_19h
Most circular data structures still have some node that is semantically the
owner of the whole thing. Having true circular _ownership_ is much rarer.
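
In Rust this usually shows up as strong references in the owning direction and weak ones pointing back. A sketch using a parent/child tree; the `Node` type and helper functions are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A tree node: the parent owns its children (strong `Rc`), while each child
/// only points back at its parent through a non-owning `Weak`.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Links `child` under `parent`. The data now refers to itself in a circle,
/// but ownership still flows one way, from parent to child.
fn add_child(parent: &Rc<Node>, child: Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

/// Name of the node's parent, if the parent is still alive.
fn parent_name(node: &Node) -> Option<String> {
    node.parent.borrow().upgrade().map(|p| p.name.clone())
}
```

Dropping the root frees the whole tree without a GC, because the back-pointers never keep anything alive.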

~~~
amelius
Closures naturally generate cycles in the data dependency graph. A way out
would be to copy the environment of a closure, but that would mean a
performance penalty.
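
For what it's worth, copying the environment is roughly what a Rust `move` closure over cloned data does. A minimal sketch; `make_greeter` is a made-up name.

```rust
/// Builds a closure that owns a private copy of `prefix`, so the closure
/// holds no reference back into the scope that created it.
fn make_greeter(prefix: &str) -> impl Fn(&str) -> String {
    // The copy: one allocation up front, paid when the closure is created.
    let owned_prefix = prefix.to_string();
    move |name: &str| format!("{} {}", owned_prefix, name)
}
```

The cost is the one-time clone of the captured data; whether that counts as a meaningful penalty depends on how large the environment is and how often closures are created.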

~~~
evincarofautumn
Whether and how closures generate cycles, and consequently the best
implementation strategy, depends heavily on the language, though. You might
have a strictly nested call stack or thunks and continuations; shared mutable
environments or immutable copies and moves; copyable environments or
linear/affine closure types; boxed closures that can be stored in data
structures or unboxed closures that can’t always; first-class or second-class
closures; a GC to rely on or none; &c.

------
StreamBright
It is kind of funny how software engineers can engage in lengthy discussion
about tooling. Imagine the same for architects. Instead of looking at the
building they would talk about the type of hammer they used while building it.

~~~
arjie
With software, the material you build your creations from influences the means.
Architects most certainly do argue about whether they should use cross-
laminated timber, reinforced concrete, glulam, or steel. They talk about these
things and write long pieces on them. The materials influence the design of
the building.

They don't talk about it on blogs on the Internet because that's not where the
audience is. But they do talk about this.

~~~
new299
> They don't talk about it on blogs on the Internet because that's not where
> the audience is. But they do talk about this.

It's a real shame. I love reading about other disciplines and knowing about
the practical trade offs.

------
klakier
When I see posts like "Why did we choose something over something other", I'm
like "Nobody cares"

~~~
sgift
And with that attitude we get the same problems over and over again.

