
Why Rust's ownership/borrowing is hard - wkornewald
http://softwaremaniacs.org/blog/2016/02/12/ownership-borrowing-hard/en/
======
jerf
"Rust's ownership/borrowing system is hard because it creates a _whole new
class_ of side effects."

I'd submit it doesn't create them... it _reveals_ them. They've always been
there. Almost every other language fails to shine the light on them, but that
doesn't mean they aren't there. All GC'ed languages still have issues of
ownership, especially if threaded, and all non-GC'ed languages have all the
issues Rust has... it's just that the language doesn't help you.

~~~
klodolph
Don't be so quick to say that Rust is doing the "right thing" here. An early
example in the article shows one of the big Rust gotchas: you want to borrow a
reference to a part of a structure, but if you encapsulate that in a function
which takes a reference to the enclosing structure, the entire enclosing
structure is borrowed. This isn't a _revealed_ side effect, this is an
_invented_ side effect that is an artifact of the type system.
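
A minimal sketch of that gotcha, for readers who have not run into it. The `Player` type and the `score_mut` accessor are hypothetical names made up for illustration, not taken from the article:

```rust
// Hypothetical type for illustration; not taken from the article.
struct Player {
    name: String,
    score: u32,
}

impl Player {
    // The signature borrows all of `self` mutably, even though the body
    // only touches `score`.
    fn score_mut(&mut self) -> &mut u32 {
        &mut self.score
    }
}

fn rename_and_bump(p: &mut Player) -> String {
    // Direct field borrows are disjoint, so both may be live at once.
    let score = &mut p.score;
    let name = &mut p.name;
    *score += 1;
    name.push('!');

    // Through the accessor, the whole struct stays borrowed while the
    // returned reference is alive, so this variant is rejected:
    //     let score = p.score_mut();
    //     let name = &mut p.name; // second mutable borrow of `p`
    //     *score += 1;

    // Using the accessor on its own is fine: its borrow ends with the statement.
    *p.score_mut() += 1;
    format!("{}:{}", p.name, p.score)
}

fn main() {
    let mut p = Player { name: String::from("ann"), score: 1 };
    println!("{}", rename_and_bump(&mut p));
}
```

The compiler only looks at the accessor's signature, not its body, which is why the encapsulated version is stricter than the direct field access.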

~~~
nostrademons
I thought that was one of the most fascinating parts - Rust's borrow-checker
enforces the Law of Demeter and Principle of Least Privilege as a side-effect.

Code that takes a full structure when it only needs to operate on a part of
the structure _is_ badly designed. It's not conveying the full information
about the data that it actually needs, which means that unexpected
dependencies can crop up, implicit in the body of the function, as the code is
modified later on. This is behind a lot of long-term maintenance messes; I
remember a few multi-year projects at Google to break up "data whales" where a
single class had become a dumping ground for all the information needed within
a request.

Thing is, we all do it, because taking a reference to a general object and
then pulling out the specific parts you need means that you don't have to
change the function signature if the specific parts you need change. This
saves _a lot_ of work when you're iterating quickly and discovering new
requirements. You're trading ease of modification now for difficulty of
comprehension later, which is usually the economically wise choice for you but
means that the people who come after you will have a mess to untangle.

This makes me think that Rust will be a very poor language for exploratory
programming, but a very good one for programming-in-the-large, where you're
building a massive system for requirements that are largely known.

~~~
jeremyjh
>Code that takes a full structure when it only needs to operate on a part of
the structure is badly designed.

No, it is not; this is done all the time with methods and it improves
encapsulation - you may not want your clients to be able to decompose your
data structures. Do you really mark every member of your data structures as
pub??

Sorry but this is a poor ad-hoc defense of an actual annoyance in the borrow-
checker.

~~~
pdpi
If you're trying to achieve proper encapsulation, you just have a module that
implements some sort of functionality, and shouldn't need to borrow anything
from it. The real question is why you're pulling data instead of pushing
messages.

~~~
tatterdemalion
This is incorrect. "Sending a message" involves borrowing the data so that the
method can run. `foo.bar()` borrows `foo`.

~~~
pdpi
Sorry, I wasn't clear. You are, of course, correct: you're only ever in a
position to call a method if you hold a reference to the struct you're calling
on.

My meaning was that you should favour a usage pattern that looks like you
either move/copy things into the called method, or lend a reference to
something you own (which is, presumably, not going to be held on to for very
long), and then you're either given ownership of whatever return value you
get, or get a reference whose lifetime depends on the arguments you passed in
(but not the object itself). All of this ends up being quite clean, and you
don't end up tying yourself into a borrowing knot.

You do end up in a weird place when your methods return references to fields
of the owning object. When that happens, you're restricted in what you can do
with the owning object until the reference goes out of scope. Rust mutexes are
implemented precisely like that, which highlights what sort of behaviour
you're getting from this usage pattern.
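
A rough sketch of the two patterns side by side, assuming a made-up `Counter` type (`total` and `first` are hypothetical methods); the `Mutex` part uses only the standard library's `lock()`:

```rust
use std::sync::Mutex;

// Hypothetical type for illustration.
struct Counter {
    hits: Vec<u32>,
}

impl Counter {
    // Returns an owned value: the borrow of `self` ends when the call returns.
    fn total(&self) -> u32 {
        self.hits.iter().sum()
    }

    // Returns a reference into `self`: the borrow lasts as long as the result.
    fn first(&self) -> &u32 {
        &self.hits[0]
    }
}

fn demo() -> (u32, u32) {
    let mut c = Counter { hits: vec![1, 2, 3] };

    let total = c.total();
    c.hits.push(4); // fine: nothing borrowed from `c` is still alive

    let first_copy = {
        let first = c.first();
        // c.hits.push(5); // rejected here: `c` is still borrowed by `first`
        *first
    };
    c.hits.push(5); // fine again: `first` has gone out of scope

    // A mutex guard follows the same pattern: it borrows the mutex, and the
    // lock is held until the guard is dropped.
    let m = Mutex::new(0u32);
    {
        let mut guard = m.lock().unwrap();
        *guard += first_copy;
    } // guard dropped here, lock released
    let locked = *m.lock().unwrap();

    (total, locked)
}

fn main() {
    println!("{:?}", demo());
}
```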

The former provides better encapsulation and more closely resembles the
message-passing approach to OOP, whereas the latter pattern is not only not
very ergonomic, it's quite indicative of poor encapsulation (because you're,
by necessity, asking for internal state).

~~~
tatterdemalion
Here is the issue: `self.foo.bar(self.baz())` is an error if `foo.bar()`
mutates foo, even if `baz()` doesn't touch `foo` and even if `baz()` doesn't
return a reference. This is because borrowck doesn't properly understand that
baz will be evaluated before bar, and can't distinguish which elements of a
struct are accessed by that struct's methods. Both of these are problems that
can be solved, and neither of them is actually promoting good practice in my
opinion.

All it does is force you to use unnecessary temporaries, like `let baz =
self.baz(); self.foo.bar(baz)`
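
To make the shape concrete, here is a small sketch with hypothetical `App` and `Log` types standing in for `self` and `foo`. It uses the temporary-variable form, which compiles either way; as far as I know, later compilers relaxed this check for many such calls, so read the comment as describing the behaviour discussed in this thread:

```rust
// Hypothetical types for illustration.
struct Log {
    entries: Vec<usize>,
}

impl Log {
    // Mutates the log, so it needs `&mut self`.
    fn record(&mut self, n: usize) {
        self.entries.push(n);
    }
}

struct App {
    log: Log,
    names: Vec<String>,
}

impl App {
    // Reads `names` only and returns an owned value, not a reference.
    fn name_count(&self) -> usize {
        self.names.len()
    }

    fn snapshot(&mut self) {
        // The one-line form complained about above would be:
        //     self.log.record(self.name_count());
        // The temporary ends the shared borrow of `self` before
        // `self.log` is borrowed mutably.
        let n = self.name_count();
        self.log.record(n);
    }
}

fn main() {
    let mut app = App {
        log: Log { entries: Vec::new() },
        names: vec![String::from("a"), String::from("b")],
    };
    app.snapshot();
    println!("{:?}", app.log.entries);
}
```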

~~~
Manishearth
Yeah, this is basically a borrowck "bug" which will probably be fixed post-
MIR.

Note that there are cases where such code is invalid even with the temporary,
and they can be related to Demeter. Ish. Also to API contracts; the guarantee
should be embedded in the signature (so changing the internals shouldn't cause
its usage to stop compiling), which is unwieldy to do.

------
jupp0r
Coming from a mainly C++ background, I find that the Rust language makes best
practices in languages with manual memory management (and to some degree also
in garbage collected environments) explicit. This is a great property of the
language and it definitely changed my C++ programming. You can see it as
automation of the more boring aspects of code reviews.

~~~
Manishearth
I am from a more mixed background, but I have had my fair share of C++ before
I learned Rust.

Now when I code C++ my Rust knowledge is a double-edged sword. On one hand, I
have a much better idea on how to manage my data in C++. I had this discipline
before learning Rust, but I didn't have explicit rules to it; it was just a
... nebulous bunch of ideas about how data works. Now it's explicit. On the
other hand, I am absolutely terrified when writing C++ code (something I would
do with ease in the past). Well, not exactly, but it's hard to accept
somewhat-unsafe code (which is probably safe at a macro level -- i.e. safe
when looked at in the context of its use) and while I can see that something
is safe, I can also see how it could _become_ unsafe. And I fret about it.
Rust trains you to fret about it (more than you would in C++, that is), and
simultaneously Rust handles it for you so you don't have to fret about it :)
C++ doesn't handle it, but you still fret about it since Rust taught you to.

I guess it's a "Living is easy with eyes closed" thing :P

------
dikaiosune
There were also some interesting comments yesterday when this was posted to
the Rust subreddit:

[https://www.reddit.com/r/rust/comments/45gcmh/why_rusts_owne...](https://www.reddit.com/r/rust/comments/45gcmh/why_rusts_ownershipborrowing_is_hard/)

~~~
gulpahum
There was one good point, which was also my conclusion:

"Given all that, I wonder if it makes sense to prefer plain old functions most
of the time. Is that right, or am I overlooking something?"

The response was yes. Avoid impl methods which take a mutable self.

~~~
steveklabnik
I wouldn't characterize it this way. There was a whole thread on just this
question:
[https://www.reddit.com/r/rust/comments/45j6ua/why_dont_we_pr...](https://www.reddit.com/r/rust/comments/45j6ua/why_dont_we_prefer_plain_functions_over_impl/)

------
viperscape
Try a temporary variable for state, then use the consume in a scoped let
block.

------
cm3
Why is move the default? In many code bases the number of immutable references
outweighs that of pointers.

~~~
jonreem
In early Rust, you actually did need to explicitly write `move` to move a
value. However, this was extremely annoying as you move values _a lot_ so
moving was changed to be the default, which is far more tolerable.

EDIT: There still is a `move` keyword, but it is used to indicate that
closures should take ownership of their environment vs. just borrow values
from it, not to move individual values.
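
A short sketch of both points, using only the standard library (`demo` and the variable names are made up for illustration): plain assignment moves without any keyword, while `move` on a closure makes it take ownership of what it captures.

```rust
use std::thread;

fn demo() -> (usize, String) {
    let names = vec![String::from("a"), String::from("b")];

    // Plain assignment moves the vector; no keyword is needed.
    let owned = names;
    // names.len(); // rejected: `names` was moved into `owned`

    // Without `move`, the closure only borrows `owned`.
    let count = || owned.len();
    let n = count();

    // With `move`, the closure takes ownership of `owned`, which is what
    // lets it outlive this stack frame on another thread.
    let handle = thread::spawn(move || owned.join(","));
    let joined = handle.join().unwrap();

    (n, joined)
}

fn main() {
    println!("{:?}", demo());
}
```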

------
imtringued
>Nothing in the experience of most programmers would prepare them to point
suddenly stopping working after being passed to is_origin()!

I'm not even a Rust programmer and only read three paragraphs about the borrow
checker and I instantly saw that point is moved.
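
For anyone who hasn't read the article, a sketch of the situation being quoted. Only `point` and `is_origin()` appear in the quote; the `Point` fields and the `is_origin_ref` variant are assumptions added for illustration:

```rust
// Assumed definition; the field names are not taken from the article.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Takes the point by value, so the caller's variable is moved into the call.
fn is_origin(p: Point) -> bool {
    p.x == 0 && p.y == 0
}

// Hypothetical borrowing variant: the caller keeps ownership.
fn is_origin_ref(p: &Point) -> bool {
    p.x == 0 && p.y == 0
}

fn main() {
    let point = Point { x: 0, y: 0 };

    // Borrowing leaves `point` usable afterwards.
    let by_ref = is_origin_ref(&point);
    println!("{:?} at origin: {}", point, by_ref);

    // Passing by value moves `point` into the function.
    let by_val = is_origin(point);
    // println!("{:?}", point); // rejected: `point` was moved above
    println!("at origin: {}", by_val);
}
```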

------
jorgecurio
Should I learn Rust in 2016? What are you guys building with it and why Rust
in particular?

~~~
dikaiosune
I've found it very fruitful. I had previously spent a couple of months hacking
away at a project in C++ which needed lots of fine-grained parallelism and
custom data structures (metagenomics analysis tool). When we decided to shift
our approach and that we wouldn't be saving any code, I started out trying
Rust and fell in love. Many of the issues I faced in trying to quickly put
together an application in C++ were just non-issues as a result of Rust's type
system.

~~~
pjmlp
One of our customers is using C# for sequencing and it is quite fast for his
datasets.

Of course, using Rust is even cooler.

~~~
dikaiosune
Yes it is :). Also, I've had mixed results with managed languages and loading
1.5 TB text indices into RAM (to say nothing of how much of
NCBI's nt database will fit into 1.5 TB when using different runtimes), so
using manual memory management seems like the smart move here. Also, C++ had
many libraries we could lean on for manipulating very large genomics datasets,
and Rust is starting to grow a little ecosystem as well. I'm not familiar with
a comparable availability of open source tools in C#, although I don't have
any experience in it so maybe that lack of exposure isn't reflective of the
state of things.

~~~
pjmlp
What I can disclose is that they mostly use R, Java and C#.

C++ is used for research algorithms in an HPC context or for the device
drivers for the readers.

In this case the data sets are around 1GB, but I don't know what they are
actually loading into memory.

