
The Path to Rust - adamnemecek
https://thesquareplanet.com/blog/the-path-to-rust/?
======
pimeys
In my current job, I was given the task of writing a service which should
eventually handle billions of events in the fastest possible way. The language
choice was left to me, and I was thinking that maybe I'd do it with
C++14, Scala, Go or Rust. C++ has its quirks and I'm not really enjoying its
build tool of choice, cmake. Scala I can write fast, but scaling the app in
Mesos would consume lots of memory; every task would take a huge slice of RAM
because of the additional JVM. Go I just don't like as a language (personal
taste) and I think the GC adds a bit too much pausing for this app, so I
gave Rust a shot.

The first week was madness. I'm a fairly experienced developer and the Rust
compiler rapped my knuckles constantly. Usually the only way out was to
write the whole part again with a different architecture. On the second week I
was done with my tasks, very comfortable with the language and just enjoying
the tooling. Also I was relying on tests way less because of the compiler,
even less than with Scala. If it compiles, it has a big chance of working.
Cargo is awesome. Let me repeat: Cargo is awesome. Also I like how I can write
code with Vim again and even though for some things I need to read the Rust
source, it is pretty easy to get answers if you're in trouble. In the end I
wrote some integration tests with Python and I'm quite happy with them.

Now I want to write more Rust.

~~~
gizzlon
> _Go I just don't like as a language (personal taste) and I think the GC
> adds a bit too much of pausing for this app_

Not saying you have to like Go, we all have different preferences. But FYI,
the gc has gotten _much_ better in the latest releases. I don't think you
would be bothered by gc pauses. (my understanding is that it's both faster and
more "spread out")

~~~
pimeys
We're doing real-time stuff where these pauses matter. On top of that, we
don't really need a GC for anything, and adding one to certain services does
more harm than good.

~~~
falcolas
How do you handle the reference counting pauses in Rust then; rely on them
being deterministic? Or do you completely avoid the reference counting boxes?

~~~
kibwen
Not the OP, but in my experience it's fairly rare to encounter someone using
the `Rc` type in Rust. It's nowhere near as prevalent as `shared_ptr` seems to
be in C++, for example.

~~~
pimeys
There is also the `Arc` type, which is an atomic reference counter. You still
need those, especially if you need to share stuff between multiple threads.

~~~
Jweb_Guru
You can often get away with using scoped threads.
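
A minimal sketch of the scoped-thread idea, assuming a toolchain where `std::thread::scope` is available in the standard library (Rust 1.63 or later). The helper name `sum_in_halves` is hypothetical. The threads borrow the slice directly, so no `Arc` is involved:

```rust
use std::thread;

// Hypothetical helper: sums a slice on two threads that borrow it directly,
// with no `Arc` and no reference counting.
fn sum_in_halves(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    // `thread::scope` joins every spawned thread before it returns, so the
    // threads are allowed to borrow `left` and `right` from this stack frame.
    thread::scope(|s| {
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("{}", sum_in_halves(&data));
}
```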

------
loup-vaillant
> _Rust is not the most beginner-friendly language out there — the compiler is
> not as lenient and forgiving as that of most other languages […], and will
> regularly reject your code […]. This creates a relatively high barrier to
> entry […]. In particular, Rust’s “catch bugs at compile time” mentality
> means that you often do not see partial progress — either your program
> doesn’t compile, or it runs and does the right thing. […] it can make it
> harder to learn by doing than in other, less strict languages._

I don't see how making the type system stricter makes the language harder to
learn. Maybe that's because I know another relatively paranoid type system
(Ocaml), but still.

A type system that rejects your code is like a teacher looking at a proof you
just wrote and telling you "this doesn't even make sense, and here's why". It
may be frustrating, but this kind of feedback loop is tighter than what you
would get from a REPL.

And you _do_ see partial progress: the type errors change and occur further in
the source code as you correct your program. Each error is an opportunity to
fix a typo or a misconception. The distinction between a broken prototype that
doesn't even compile and a working program isn't binary: when you correct a
type error, your program is _less_ broken, even though it doesn't compile yet.

~~~
scottlamb
> A type system that rejects your code is like a teacher looking at a proof
> you just wrote and telling you "this doesn't even make sense, and here's
> why". It may be frustrating, but this kind of feedback loop is tighter than
> what you would get from a REPL.

I've been trying to port a personal project to Rust recently.

Sometimes Rust's type system is too limited to understand why something is
safe. Here are a couple examples:

* in C++ code, you can have a class in which one field has a reference/pointer into another which came earlier in the declaration order (and thus will be constructed first and destructed last). This is often a useful thing to do (one example: [https://users.rust-lang.org/t/struct-containing-reference-to...](https://users.rust-lang.org/t/struct-containing-reference-to-own-field/1894)). You can't do that in Rust. They have to be separate instance variables on some thread. If there's no thread running to own it, you have to use reference counting (Arc or Rc) or unsafe blocks. Or maybe instead of keeping a reference, have all calls take the outer struct as a context argument and use some sort of struct/lambda which knows how to find the thing you're referencing given that (this is what I'm trying in my code). Someone proposed a language change for "self-borrowing structs" ([https://mail.mozilla.org/pipermail/rust-dev/2014-February/00...](https://mail.mozilla.org/pipermail/rust-dev/2014-February/008658.html)) but it didn't go anywhere as far as I can tell.

* in C++ code, you can loop over one instance variable and then call a private method on self which mutates a different instance variable. In Rust, you'll get errors about self being partially borrowed. I think you have to restructure the other method to not take self, which probably means grouping things into child structs. Basically Rust doesn't look across function boundaries to decide if something is safe, so it has to consider this an error even if it isn't for the particular method you're calling.

Fundamentally, I think the choice is between these three options:

* use a garbage-collected language and not have to be explicit about these details. There's no possibility of buffer overflows or use-after-free errors but you have to pay the runtime overhead of the garbage collector. Go is clear about the costs involved: pauses up to 10 ms, 25% of all CPU cycles, and 50% of RAM (see [http://golang.org/s/go14gc](http://golang.org/s/go14gc)). Other GCed languages likely have similar costs even if they aren't stated as clearly.

* use an unsafe language like C/C++, enforce these things manually in your head and with comments, and occasionally have security problems when you screw up.

* use a safe-but-explicit language like Rust/Swift and have to "show your work" to the compiler quite a bit more, finding a different way if safety is too hard to prove.
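
The partial-borrow restructuring described in the second example above can be sketched as follows. The `Stats` type and its fields are hypothetical; the point is only that a helper which takes the one field it mutates, rather than `&mut self`, can be called while another field is being iterated:

```rust
// Hypothetical struct, used only to illustrate the borrow-checker complaint.
struct Stats {
    samples: Vec<u32>,
    total: u64,
}

impl Stats {
    // This helper takes only the field it mutates. A `&mut self` receiver
    // would conflict with the loop below, which already borrows `self.samples`.
    fn add(total: &mut u64, sample: u32) {
        *total += u64::from(sample);
    }

    fn recompute(&mut self) -> u64 {
        self.total = 0;
        for &s in &self.samples {
            // Borrowing two different fields at once is fine inside one function.
            Self::add(&mut self.total, s);
        }
        self.total
    }
}

fn main() {
    let mut stats = Stats { samples: vec![1, 2, 3], total: 0 };
    println!("{}", stats.recompute());
}
```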

~~~
Jonhoo
For self-referential datastructures (your first point above), using an Rc or
Arc shouldn't have any overhead. I agree that it would be nice to be able to
express this, but it's not really that big of a problem.

For partial borrows, there has been some work on it
([https://github.com/rust-lang/rfcs/issues/1215](https://github.com/rust-lang/rfcs/issues/1215)), but I
agree that this is something that's missing. That said, I _very_ rarely run
into this, and there's usually some fairly obvious restructuring I can do to
make it work out.

~~~
scottlamb
> For self-referential datastructures (your first point above), using an Rc or
> Arc shouldn't have any overhead. I agree that it would be nice to be able to
> express this, but it's not really that big of a problem.

I don't see how that could be true. It means using a separate heap allocation
for each referenced piece (although the owning_ref thing steveklabnik
mentioned might minimize that), a bit of extra RAM for the counter, and a bit
of bookkeeping. I'm not saying the overhead is huge, but how could it be zero?

fwiw, I finished this section of my code, and the context approach I mentioned
worked out well for me. It was just a bit of a puzzle to find a way Rust would
like.
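
One possible shape for the "context argument" approach, as a rough sketch rather than the code from the project mentioned above. All type names here (`Recording`, `Segment`, `SegmentId`) are hypothetical. Instead of a field holding a reference into a sibling field, the struct stores a small handle and resolves it against the outer struct on each call:

```rust
// Hypothetical types. A field like `current: &Segment` pointing into
// `segments` of the same struct is what Rust rejects; an index is used instead.
struct Segment {
    name: String,
}

#[derive(Clone, Copy)]
struct SegmentId(usize);

struct Recording {
    segments: Vec<Segment>,
    current: SegmentId,
}

impl SegmentId {
    // The outer struct is passed in as context every time the handle is used.
    fn resolve<'a>(self, ctx: &'a Recording) -> &'a Segment {
        &ctx.segments[self.0]
    }
}

fn main() {
    let rec = Recording {
        segments: vec![
            Segment { name: "a".to_string() },
            Segment { name: "b".to_string() },
        ],
        current: SegmentId(1),
    };
    println!("{}", rec.current.resolve(&rec).name);
}
```

The trade-off is that an index is checked at run time rather than by the borrow checker, so a stale `SegmentId` would panic on lookup instead of failing to compile.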

I'm pretty happy with Rust so far even though I've had to restructure parts of
an apparently-working program to fit its model. My program is now more
obviously correct, benchmarks are pretty good so far (though I wish
profile-driven optimization were supported/mature:
[https://unhandledexpression.com/2016/04/14/using-llvm-pgo-in...](https://unhandledexpression.com/2016/04/14/using-llvm-pgo-in-rust/)),
and the open source library situation seems better than C/C++ for what I'm
doing and actively improving where C/C++ is stagnant.

------
Animats
Well, the functional crowd won. An example expression from the parent article:

    
    
        let idx = args
        // iterate over our arguments
        .iter()
        // open each file
        .map(|fname| (fname.as_str(), fs::File::open(fname.as_str())))
        // check for errors
        .map(|(fname, f)| {
          f.and_then(|f| Ok((fname, f)))
            .expect(&format!("input file {} could not be opened", fname))
        })
        // make a buffered reader
        .map(|(fname, f)| (fname, io::BufReader::new(f)))
        // for each file
        .flat_map(|(f, file)| {
          file
            // read the lines
            .lines()
            // split into words
            .flat_map(|line| {
              line.unwrap().split_whitespace()
                .map(|w| w.to_string()).collect::<Vec<_>>().into_iter()
            })
          // prune duplicates
          .collect::<HashSet<_>>()
            .into_iter()
            // and emit inverted index entry
            .map(move |word| (word, f))
        })
      .fold(HashMap::new(), |mut idx, (word, f)| {
        // absorb all entries into a vector of file names per word
        idx.entry(word)
          .or_insert(Vec::new())
          .push(f);
        // hand the accumulator back for the next entry
        idx
      });
    

Is there editor support for indenting this stuff?

~~~
lifthrasiir
More readable, and IMHO more idiomatic version: (Disclaimer: never tested)

    
    
        let mut idx = HashMap::new();
        for fname in &args {
          let f = match fs::File::open(fname) {
            Ok(f) => f,
            Err(e) => panic!("input file {} could not be opened: {}", fname, e),
          };
          let f = io::BufReader::new(f);
          let mut words = HashSet::new();
          for line in f.lines() {
            for w in line.unwrap().split_whitespace() {
              if words.insert(w.to_string()) { // new word seen
                idx.entry(w.to_string()).or_insert(Vec::new()).push(fname);
              }
            }
          }
        }
    

People who are introduced to the functional approach for the first time seem
to enjoy it so much that everything becomes a hard-to-read mess of functions.
I had similar experiences with Python list comprehension and C# LINQ.

~~~
jupp0r
Sorry, but I disagree. Although more verbose (as in more characters of source
code), I could easily understand what the original code did. Your code has a
high cyclomatic complexity and I have to keep all those nested for loops in
mind when trying to figure out what you are doing.

I write C++ for most of my day job, and this would not pass code review
because of readability problems in my team.

~~~
lifthrasiir
I know there is obviously lots of room for improvement. Indeed, if the code
became more complex I would immediately refactor. I can list some
problems with my example:

- The error handling should really have been refactored. `try!` would make
this easier. (I haven't used it since it is in the `main` function and the
code was a direct replacement.)

- Errors while reading words are not handled accurately. Again, `try!` would
make this easier.

- `words` and `idx` look coupled to each other, which ideally they shouldn't
be.

- `words.insert` is quite an opaque method; not everyone is sure whether it
returns true on a duplicate key or not. I've added a comment but frankly I'm
not satisfied with that. <deleted> One alternative is to use `words.entry`
instead, which gives a named enum variant. </deleted> Oops, `HashSet` does not
have `entry`...

- Ultimately, the body of the outermost loop should go to a function.

That said, do you really think that `line.unwrap().split_whitespace().map(|w|
w.to_string()).collect::<Vec<_>>().into_iter()` is a chunk of code which can
be read at a glance? It at least has to be named (like T-R's example).
Functional approach means that you can split functions and _individually_
review them; the original code, IMHO, didn't.
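
As a rough sketch of the refactoring listed above (loop body moved into a named function, read errors propagated rather than unwrapped), assuming a modern toolchain where the `?` operator has replaced `try!`. The function name `index_source` and its signature are hypothetical, and it is generic over `BufRead` only so that it can be tried without real files:

```rust
use std::collections::{HashMap, HashSet};
use std::io::{self, BufRead};

// Hypothetical helper: adds the distinct words of one input to the inverted index.
fn index_source<R: BufRead>(
    idx: &mut HashMap<String, Vec<String>>,
    name: &str,
    reader: R,
) -> io::Result<()> {
    let mut seen = HashSet::new();
    for line in reader.lines() {
        // `?` returns a read error to the caller instead of panicking.
        let line = line?;
        for w in line.split_whitespace() {
            // `insert` returns true only if the word was not already in the set.
            if seen.insert(w.to_string()) {
                idx.entry(w.to_string())
                    .or_insert_with(Vec::new)
                    .push(name.to_string());
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut idx = HashMap::new();
    // Byte slices implement `BufRead`, standing in for buffered files here.
    index_source(&mut idx, "a.txt", "the quick the fox".as_bytes())?;
    index_source(&mut idx, "b.txt", "fox".as_bytes())?;
    println!("{:?}", idx.get("fox"));
    Ok(())
}
```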

~~~
T-R
I think the issue with procedural loops (not to be critical, just in general)
is that there's no abstraction - it's not clear what's getting mutated, or
what the result is (or its type), it's harder to look at the individual steps,
and it's harder to refactor.

The nice thing about functions like "map", "filter", "fold", "flatmap", etc.,
is that they describe intent - when you see a "map", you know it's not
aggregating things, just applying a function: "map" has a distinct purpose
from "fold" (and depending on how pure things are, you may not even be
_allowed_ to do anything crazy).

Aside from that, the higher-order functions have pretty intuitive algebraic
laws for refactoring (like with "compose" mentioned in my other comment) -
It's not clear how you'd factor code out of a nested loop without
understanding the whole thing, whereas "flatmap" (a.k.a. "bind" for the list
monad) has laws for refactoring it.

~~~
lifthrasiir
I agree with the general sentiment. That said, Rust is not a functional
language per se; the degree of functional decomposition is thus fundamentally
limited. This is partly because it tries to be efficient, and as arielb1
pointed out, ownership sometimes makes it worse.

> It's not clear how you'd factor code out of a nested loop without
> understanding the whole thing, [...]

You are right, loops are particularly hard to refactor. I still argue that my
code is better (barring any future expansion) because it fits within a handful
of lines; you cannot easily understand 100 lines of code with the cyclomatic
complexity of 1, but you can often easily understand 10 lines of code with the
cyclomatic complexity of 8 (e.g. triple loops). The size of code, syntax and
contextual information matters as much as algebraic laws.

------
Munksgaard
> This latter point is particularly interesting; the Rust compiler will not
> compile a program that has a potential race condition in it.

I feel obliged to point out that this is false. Rust prevents _data races_,
but not _race conditions_. You can read more in the Rustonomicon here:
[https://doc.rust-lang.org/nomicon/races.html](https://doc.rust-lang.org/nomicon/races.html)
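
To illustrate the distinction, here is a small sketch in safe Rust, assuming Rust 1.63 or later for `std::thread::scope`. The function names and constants are hypothetical. Every memory access is atomic, so there is no data race, yet the separate load and store form a race condition that can lose updates:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

const THREADS: u64 = 4;
const STEPS: u64 = 10_000;

// No data race: each access is atomic. Still a race condition: another
// thread may store between this thread's load and its store.
fn racy_total() -> u64 {
    let counter = AtomicU64::new(0);
    thread::scope(|s| {
        for _ in 0..THREADS {
            s.spawn(|| {
                for _ in 0..STEPS {
                    let seen = counter.load(Ordering::SeqCst);
                    counter.store(seen + 1, Ordering::SeqCst);
                }
            });
        }
    });
    counter.load(Ordering::SeqCst)
}

// The same work as one atomic read-modify-write: no update can be lost.
fn atomic_total() -> u64 {
    let counter = AtomicU64::new(0);
    thread::scope(|s| {
        for _ in 0..THREADS {
            s.spawn(|| {
                for _ in 0..STEPS {
                    counter.fetch_add(1, Ordering::SeqCst);
                }
            });
        }
    });
    counter.load(Ordering::SeqCst)
}

fn main() {
    // The first number varies from run to run and is often below the second.
    println!("racy:   {}", racy_total());
    println!("atomic: {}", atomic_total());
}
```

Both functions compile without `unsafe`; the compiler has no way to know that the first one does not do what was intended.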

~~~
Jonhoo
In my defense, the next sentence is "Unless you explicitly mark your code as
`unsafe`, your code simply cannot have data races." That said, I've updated
the article text to now say data races in both places.

------
0xmohit
Good to see such articles that provide an insight into various aspects of a
programming language.

A couple of other beginner-friendly resources would include:

- An alternative introduction to Rust [1]

- 24 days of Rust [2]

- CIS 198: Rust Programming [3]

[1] [http://words.steveklabnik.com/a-new-introduction-to-rust](http://words.steveklabnik.com/a-new-introduction-to-rust)

[2] [http://zsiciarz.github.io/24daysofrust/](http://zsiciarz.github.io/24daysofrust/)

[3] [http://cis198-2016s.github.io/](http://cis198-2016s.github.io/)

------
joobus
I'd like to know what the author considers "systems work"; I don't consider
garbage-collected languages (Go, Python) "systems" languages.

~~~
sanderjd
"Systems" has come to include things like networked services (eg. HTTP and DNS
servers), which you can certainly do successfully in GC'd languages in many
circumstances. Personally, I'd just like to see a widely agreed upon
definition, so that we can all stop being confused when this comes up.

~~~
LionessLover
Naming confusion is completely normal. Even in fields like learning human
anatomy you'll learn different words for some things depending on who your
teacher is, even though one would think that there should have been sufficient
time for such issues to settle and the professionals agree on one word. But
naming always is an accident of space and time - where did it happen, when did
it happen, and then the path from there. Language is dynamic and a little
fuzzy - in the sciences as well. You can define all you want, the problem
always is _other people_, who choose their own definitions. It's fun - it
keeps you on your toes :-)

In this context, for example, I don't see a reason for a world government to
impose _one_ definition of "systems programming" worldwide by military force
(this is what it would take!). Just like in almost everything in human
language, you get it from the context. You should try to teach a computer to
recognize human speech _including the meaning_, then you'll realize that
almost all of it relies on context and pre-existing knowledge. "Precision"
comes from the interpretation - human language communication is the nightmare
of people who love functional programming, it's full of hidden state and
context and active interpretation.

Which is why it's so easy for this to happen:
[http://dilbert.com/strip/2015-06-07](http://dilbert.com/strip/2015-06-07) If
someone _wants_ to argue, there is no way to provide a water-tight human-
language text that "Dick from the Internet" can't attack. There always is a
way to mis-interpret human language.

~~~
sanderjd
Thanks, this is a great response. I shall henceforth embrace the fun and swear
to never again be lured by the superficial temptations of naming hegemony!

------
Jonhoo
Author of the post here. Curious that this got posted again. Was originally
posted as
[https://news.ycombinator.com/item?id=11773332](https://news.ycombinator.com/item?id=11773332).
Can the posts be merged by a mod somehow?

------
jeffdavis
I really like my experience with rust so far also, but a few caveats:

* try!() is pretty annoying

* Working effectively with C in non-lexical ways seems to involve some unstable libraries and still requires nightly Rust

* Macros are safer, but can't do some things that C macros can. For instance, they are hygienic, which means you can't conjure up new identifiers. For that, you need a syntax plugin, which is very powerful but the APIs aren't stable yet. This goes to the previous point.

* A few annoyances, like warning when you don't use a struct field as "dead code". If I'm interfacing with C I probably need that struct field whether the Rust compiler sees it or not, but I don't want to disable all dead code warnings for that.
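
A small sketch of what hygiene means for local variables in `macro_rules!` macros, as mentioned in the macro point above. The macro name `double` is hypothetical. The `x` introduced inside the macro and the caller's `x` are treated as different variables, which is the behaviour a C preprocessor macro would not give you:

```rust
// Hypothetical macro, for illustration only.
macro_rules! double {
    ($e:expr) => {{
        // This `x` belongs to the macro expansion and is invisible to `$e`.
        let x = 2;
        $e * x
    }};
}

fn main() {
    let x = 10;
    // `$e` is `x + 1`, using the caller's `x`; the macro's own `x` does not
    // capture it, and the expression fragment keeps its own precedence.
    let result = double!(x + 1);
    println!("{}", result); // prints 22
}
```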

~~~
tatterdemalion
You can attach the allow(dead_code) attribute to the struct - or even the
individual field - which will scope it to that declaration only. Still
annoying when you have a lot of them, but better in my opinion than not having
dead code warnings on fields, since the use case you describe is the minority
of Rust structs that are declared.

~~~
jeffdavis
Maybe there should be an annotation saying that the layout of a struct is
important for ABI compatibility, and it would silence warnings related to the
layout.

~~~
tatterdemalion
C FFI structs have to be marked `#[repr(C)]` to guarantee they'll be
represented how C structs would be. It might be a reasonable change to
implicitly allow dead fields on any struct with a C repr (since using the
struct in C effectively makes all of its fields public).
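
A minimal sketch combining the two attributes discussed in this subthread. The `Event` struct and the C declaration in the comment are hypothetical. `#[repr(C)]` requests C-compatible layout, and `#[allow(dead_code)]` on a single field silences the warning for that field only:

```rust
use std::mem::{align_of, size_of};

// Hypothetical struct mirroring a C declaration such as
//     struct event { uint8_t tag; uint32_t value; uint32_t reserved; };
#[repr(C)]
pub struct Event {
    pub tag: u8,
    pub value: u32,
    // Never read from Rust, but needed to keep the layout in step with C.
    // The attribute applies to this one field, not to the whole crate.
    #[allow(dead_code)]
    reserved: u32,
}

impl Event {
    pub fn new(tag: u8, value: u32) -> Self {
        Event { tag, value, reserved: 0 }
    }
}

fn main() {
    let e = Event::new(1, 42);
    println!("tag={} value={}", e.tag, e.value);
    // With `repr(C)` the fields keep declaration order and C-style padding.
    println!("size={} align={}", size_of::<Event>(), align_of::<Event>());
}
```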

------
zimbatm
Is Rust ever going to reintroduce the N:M threading model? For services which
need to handle 1M connections, system threads are too expensive and mio
brings back callback hell.

~~~
lifthrasiir
Have you tried mioco [1]? It looks fine enough to avoid the callback hell.

[1] [https://github.com/dpc/mioco](https://github.com/dpc/mioco)

~~~
zimbatm
Thanks. So no clear winner yet. Do libraries have to be rewritten to
support the coroutines?

------
georgewsinger
If I'm not hacking on something super low-level, like hardware or an OS, then
should I still try Rust? Why not stay within super high-level/expressive
programming languages like Haskell/Clojure?

I ask because a lot of extremely smart people I know like Rust.

~~~
nercury
Rust is good.

Great tooling: Libraries can be written, published and used quickly. Code can
be easily rebuilt on upcoming Rust versions without uninstalling anything.
Integrated testing and documentation generation. Cross-compilation.

Truly cross-platform. As an example, Rust even supports both GNU and MSVC
targets on Windows, which means Rust libraries can be linked into C++
projects compiled with MinGW or MSVC. All the standard library features are
cross-platform, unless namespaced otherwise.

Rust's ownership and borrowing rules require that data be borrowed either
uniquely and mutably, or shared and immutably, but never both at once. This
solves a whole class of resource management problems (memory being the most
important) in the simplest possible way. This compile-time guarantee also
works for all references to data, which become thin pointers at runtime. This,
together with Send-able and Sync-able types, also ensures no data races.
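
A short sketch of that borrowing rule in practice. The helper names `bump_all` and `total` are hypothetical. Any number of shared borrows may coexist, while a mutable borrow must be the only live borrow:

```rust
// Hypothetical helpers, for illustration only.
fn bump_all(values: &mut Vec<i32>) {
    for v in values.iter_mut() {
        *v += 1;
    }
}

fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let mut values = vec![1, 2, 3];

    // Several shared (immutable) borrows at once are fine.
    let a = &values;
    let b = &values;
    println!("{}", total(a) + total(b)); // prints 12

    // Uncommenting these three lines is rejected with error E0502: `values`
    // cannot be borrowed as mutable while `first` still borrows it immutably.
    // let first = &values[0];
    // bump_all(&mut values);
    // println!("{}", first);

    // Accepted: `a` and `b` are no longer used, so the mutable borrow is unique.
    bump_all(&mut values);
    println!("{}", total(&values)); // prints 9
}
```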

The language is up-to-date. No nulls, pattern matching, closures, generics and
attributes will feel familiar to programmers coming from various languages.
The language went through many iterations and changes, and every bit of it was
designed with care. Right now it may be a bit explicit, but sugar can always be
added later if it appears worth it.

Stable release cycle, stability guarantee. We know when the new beta will be
renamed to stable and we know what's in it! That means there is time to check
the whole ecosystem for possible breakage, even though historically Rust has
introduced hardly any breaking changes since 1.0 a year ago.

------
Mihies
One thing I am missing is dependency injection/ioc. How does one effectively
unit test without it?

~~~
Jonhoo
You can still do dependency injection in Rust..? There even exists a crate
for it: [https://github.com/Nercury/di-rs](https://github.com/Nercury/di-rs).
There's also some further discussion here:
[https://users.rust-lang.org/t/how-do-you-implement-dependenc...](https://users.rust-lang.org/t/how-do-you-implement-dependency-injection-in-rust/213)

~~~
Mihies
I saw that but it seems really complicated for something that should be
straightforward. Or does it just look that way? (coming from C#)

~~~
SideburnsOfDoom
If I understand it right, Rust has "zero-cost abstractions" meaning that they
don't exist at run-time, just when compiling. e.g.
[http://blog.rust-lang.org/2015/05/11/traits.html](http://blog.rust-lang.org/2015/05/11/traits.html)

This is the right choice if you want the language to be "a better C" but it
means that there is none of the Run-Time Type Information that we are used to
in C# (or Java).

The RTTI plays a small but key role in e.g. how a DI container works: The
container works out what parameters a constructor needs at runtime and
instantiates those recursively; and in how a mocking framework works:
Generating a class on the fly with the right interface declaration. Even
XUnit/NUnit work by using RTTI: The testing framework looks for all
classes/methods with the right attribute metadata, and runs them.

It's easily fast enough for how we use C# and Java, but it's not zero-cost.

There are unit testing tools in Rust, but they cannot work the same way -
IIRC, there is compiler support for them. So, while Rust has a good chance at
being "A better C", don't expect it to do all the things that C# does
involving types at runtime. A different goal meant a different language
design.

There are other interesting things in Rust like the macro system that might
help do similar things, but it's really not going to work like in C#.
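
For comparison, a common container-free way to get testable seams in Rust is to inject a dependency through a trait, resolved at compile time. This is only a sketch under that assumption, not the approach of the di-rs crate linked earlier in the thread; the names `Clock`, `SystemClock`, `FixedClock` and `is_expired` are hypothetical:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// The dependency is described by a trait instead of being looked up at run time.
trait Clock {
    fn now_secs(&self) -> u64;
}

// Production implementation backed by the system clock.
struct SystemClock;

impl Clock for SystemClock {
    fn now_secs(&self) -> u64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0)
    }
}

// Test double: always reports the same instant.
struct FixedClock(u64);

impl Clock for FixedClock {
    fn now_secs(&self) -> u64 {
        self.0
    }
}

// The code under test is generic over the dependency; the compiler generates
// a specialised copy per clock type, so no run-time type information is needed.
fn is_expired<C: Clock>(clock: &C, expires_at: u64) -> bool {
    clock.now_secs() >= expires_at
}

fn main() {
    println!("{}", is_expired(&FixedClock(100), 50)); // prints true
    println!("{}", is_expired(&SystemClock, u64::MAX)); // prints false
}
```

Under `cargo test`, the `FixedClock` calls would sit inside a function marked `#[test]`, which is the compiler-supported test discovery mentioned above.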

------
namelezz
> Once your code compiles, you’ll find (at least I have) that it is much more
> likely to be correct (i.e., do the right thing) than if you tried to write
> similar C, C++, or Go code.

What issues in Go does this sentence refer to?

~~~
seren
I don't think that Go can detect race condition errors at compile time. So it
could compile and sometimes fail mysteriously. In Rust, as long as you don't
use unsafe blocks, the compiler should detect it. This is not really a Go
issue, rather that Rust is much more stringent about what counts as "correct"
code compared to C, C++ and Go.

~~~
gizzlon
I think you're right. Go has a race detector, but it works at run time:
[https://golang.org/doc/articles/race_detector.html](https://golang.org/doc/articles/race_detector.html)

Guess you would normally run your tests with it, and maybe compile a dev /
beta / RC version with race detection enabled?

Never used it in prod, but it did find a problem for me.

~~~
seren
I believe this is one of the main promises of Rust, even if it is not stated
like that: if it compiles, it works and it is safe.

But obviously for any other language (C,Go,etc) you can use a compiler +
static analysis + dynamic analysis to have the same confidence.

What I like about Rust is that you implicitly know that everyone has to run
the equivalent of an analysis tool to release/deploy the code.

~~~
vertex-four
> I believe this is one of the main promise of Rust, even if it is not stated
> like that, if it compiles it works and it is safe.

Not quite. There are certain types of issue which its type system won't detect -
you can, in fact, easily get into a race condition, just not what's known as a
data race - and there's no promise that what you've written actually does what
you expect it to do, just that it'll do something without use-after-free and
other memory-related bugs, and if you've written your program well it's more
likely than e.g. Python that it does what you intended (in my personal
experience).

Getting closer to "if it compiles, it works", but still not entirely there,
there'd be the likes of Idris and other systems with proofs embedded into
their type systems - but even then you still have to verify that the proof
matches what you're actually trying to do in the real world.

~~~
seren
I agree that "it works" is an overstatement. It is more akin to "if it
compiles, it won't segfault". (Arguably this is not unique to Rust, but there
is no runtime overhead to achieve that.)

------
shmerl
This is also a pending issue:
[https://github.com/rust-lang/rfcs/issues/349](https://github.com/rust-lang/rfcs/issues/349)

