
Explore the ownership system in Rust - adamnemecek
http://nercury.github.io/rust/guide/2015/01/19/ownership.html
======
simfoo
Am I crazy to say I prefer the ownership and copy/move system of C++11? It
feels much easier to reason about but then again maybe that's just because I'm
more familiar with it.

~~~
dan00
> It feels much easier to reason about but then again maybe that's just
> because I'm more familiar with it.

I've been a professional C++ programmer for a decade and have started to look
into Rust, and there's no way that C++ is easier to reason about.

There's no automatic copying of complex data in Rust, you can't move something
away and later use it again, you can't invalidate iterators, and the list
goes on and on.

At the beginning you have to get used to the borrow system of Rust, learn the
idioms, but after that, it really seems to be in a lot of parts a better C++.
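
To make the first two points concrete, here is a minimal sketch written
against current stable Rust (an assumption; the language was still changing
when this thread was written). The helper `consume` is hypothetical and only
exists to show a value being moved away.

```rust
// Hypothetical helper: takes ownership of the vector and returns its length.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    let a = vec![1, 2, 3];
    let b = a.clone();      // a deep copy has to be requested explicitly
    let n = consume(a);     // `a` is moved into `consume` here
    // println!("{:?}", a); // would not compile: use of moved value `a`
    println!("{} {:?}", n, b);
}
```

Uncommenting the marked line turns the use-after-move into a compile error
rather than a runtime surprise.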

------
djur
This was very clear and made sense to me, but it unfortunately doesn't cover
lifetime annotations, which make sense to me 80% of the time and are
completely mystifying the other 20%.

------
Animats
If you've written some Rust, and you're more confused after reading that, it's
not your fault.

Here's how to think about ownership in Rust. First, realize that Rust has
basically the same memory model as C/C++. C/C++ has three big sources of
memory trouble: "how big is it", "who owns it", and "who locks it". C/C++
provides little help in dealing with those issues. Rust locks down all of
those problems, mostly through compile-time checks.

Ownership in Rust starts out simple. There's single-ownership, and multiple-
ownership with reference counting. The latter comes in two flavors, with and
without concurrency locking. Reference counting works roughly the way you
think it should.
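
A rough sketch of the two reference-counted flavors, assuming they correspond
to `std::rc::Rc` (single-threaded) and `std::sync::Arc` (atomic counts, usable
across threads) in current Rust. The function names are made up for
illustration, and the `Mutex` is only there because the shared counter is
mutated.

```rust
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Single-threaded sharing: cloning an Rc bumps the count, not the data.
fn rc_count_after_clone() -> usize {
    let a = Rc::new(String::from("shared"));
    let b = Rc::clone(&a);
    Rc::strong_count(&b)
}

// Cross-thread sharing: Arc counts atomically, Mutex guards the mutation.
fn arc_sum(n: u32) -> u32 {
    let total = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (1..=n)
        .map(|i| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                *total.lock().unwrap() += i;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("{} {}", rc_count_after_clone(), arc_sum(4));
}
```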

In addition to ownership, there's "borrowing". This usually means creating a
local reference to something. The local reference must have a scope that
doesn't outlive the thing being borrowed. That makes borrowing safe. Borrowing
is a compile-time thing with no run-time representation. Borrowed objects with
reference counts don't need reference count updates on the borrowed reference,
which speeds things up a lot. Borrowing in Rust is cheap and easy, and should
be done frequently.
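
As an illustration of the borrowing described above, here is a small sketch in
current Rust. `total_len` and `name_len` are hypothetical names; the point is
that the caller keeps ownership and that borrowing from a reference-counted
value leaves the count alone.

```rust
use std::rc::Rc;

// Borrows the slice; the caller keeps ownership of the strings.
fn total_len(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

// Borrows the string data; no reference count is touched here.
fn name_len(name: &str) -> usize {
    name.len()
}

fn main() {
    let words = vec![String::from("own"), String::from("borrow")];
    let n = total_len(&words);              // lend `words` for the call
    println!("{} bytes in {:?}", n, words); // still usable afterwards

    let shared = Rc::new(String::from("counted"));
    let before = Rc::strong_count(&shared);
    let len = name_len(&shared);            // &Rc<String> derefs to &str
    assert_eq!(before, Rc::strong_count(&shared)); // count is unchanged
    println!("{} {}", len, before);
}
```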

When a reference to something is passed into a function, the compiler needs to
know if it's being borrowed, or whether ownership is being handed off to the
function. The default is to borrow; for more complex situations, there's
special lifetime syntax. A similar issue appears when a function returns a
reference. Returning a reference is complicated - is it a new object, or a
borrowed reference to an input object? Creating new objects in functions and
returning them should be avoided if possible. It's a Rust idiom to create the
object in the caller and pass it into a function to be modified.
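
The three cases in that paragraph can be sketched roughly as follows, using
current Rust syntax. All function names here are hypothetical. `longest` shows
the lifetime syntax for a returned reference that borrows from its inputs,
`append_greeting` shows the caller-creates, callee-modifies style, and `take`
shows ownership being handed off.

```rust
// The returned reference borrows from the inputs; 'a ties them together.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// The caller owns `out`; this function only borrows it mutably.
fn append_greeting(out: &mut String, name: &str) {
    out.push_str("hello, ");
    out.push_str(name);
}

// Ownership of `s` moves into the function and ends when it returns.
fn take(s: String) -> usize {
    s.len()
}

fn main() {
    let mut out = String::new();
    append_greeting(&mut out, "bob");
    let winner = longest("ownership", "borrow");
    println!("{} / {}", out, winner);

    let n = take(out);      // ownership handed off
    // println!("{}", out); // would not compile: `out` was moved
    println!("{}", n);
}
```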

Single ownership can be handed off to another owner. This, after some
controversy, is the default action for the "=" operator. (Using "<-" for
ownership transfer probably would have been better.) Such a handoff
invalidates the variable that gave up ownership, and that variable can't be
used again. For some types, though, you get a copy instead. For other types,
you have to explicitly ask for a copy. This part of the language is kind of
ugly.
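
A short sketch of the three behaviors of "=" in current Rust: a copy for types
marked `Copy`, a move for everything else, and an explicit `clone()` when a
copy of a non-`Copy` value is wanted. `Point` and `copy_then_use` are
illustrative names, not anything from the article.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Because Point is Copy, `p` is still usable after `let q = p`.
fn copy_then_use(p: Point) -> (Point, Point) {
    let q = p;
    (p, q)
}

fn main() {
    let (p, q) = copy_then_use(Point { x: 1, y: 2 });
    println!("{:?} {:?}", p, q);

    let s = String::from("hi");
    let t = s;              // String is not Copy, so this is a move
    // println!("{}", s);   // would not compile: use of moved value `s`
    let u = t.clone();      // an explicit copy
    println!("{} {}", t, u);
}
```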

Rust has the concept of "mutability". This is just the inverse of "const".
Since immutability is the default, you write "mut" in a lot of places.

Programs in Rust need more design effort than in some other languages. You
need to plan out who's going to own what, and who gets to change what. "Agile"
types may find this troublesome. The payoff is that once the program has
compiled, whole classes of errors have been eliminated.

~~~
lohankin
Noisy, unreadable code, obsessed with ownership, may give rise to whole new
classes of errors that otherwise won't be there. Good that someone
experimented with building a language around ownership, but for me it all
demonstrates that the idea is not viable. I tried to read 10 tutorials
already, each leaving me more confused than ever before.

~~~
Animats
_I tried to read 10 tutorials already, each leaving me more confused than ever
before._

I can understand that. The language has been changing so fast that most of the
stuff on the Web is outdated or wrong. I just found a place in the official
reference manual that's out of sync with the compiler, and sent in a note.
(The lambda/closure syntax is still in a state of flux.)

Rust roughly follows C/C++/Java syntax, except that, like Go, it's a "name:
type" language. Talking about ownership in code is new, as is the syntax for
that. The ownership syntax is a bolt-on, and it shows. The lambda syntax is
kind of weird; the designers went for conciseness over clarity. Some of this
will take getting used to. The syntax is no worse than C++, and in some areas,
better.

The syntax isn't the hard part, anyway. It's living within the ownership
restrictions that's hard. It's hard because that's a design issue. There's a
Rust port of Doom, and it has far too much unsafe code, because Doom's data
structures were not designed with Rust in mind. It's not yet clear how big an
issue this will be. The remark that 10% of Servo, the web page renderer, is
unsafe code is a bad sign.

I don't have a personal opinion on this yet. I haven't written enough Rust
code. I'm writing an RSS feed parser to get a sense of what it's like to deal
with a complex tree in Rust. So far, it's going well, with no need for unsafe
code.

If Rust can eliminate buffer overflows, dangling pointers, and memory leaks,
that's enough to justify using it in place of C/C++.

~~~
eddyb
Mentioning Go for `name: type` may not be inaccurate, but it is an odd choice
(we're seeing more comparisons of Rust and Go than Rust and any language it is
closely related to). I believe the origin for Rust's choice is actually the ML
family. EDIT: I just checked more carefully and Go doesn't even have the
colon ([this blog post](http://blog.golang.org/gos-declaration-syntax) used it
in pseudocode).

~~~
Animats
Sorry. "name : type" comes from Pascal, and the Modula/Ada/Oberon family of
languages continue it. Now that type expressions have become so complex, it's
better to have a syntax where you know a type expression is expected. C/C++
struggle with this; they have to know which names are types just to parse.
This is a pain for tools which try to parse C/C++ without reading all the
include files. C originally had only built-in type names and "struct".
"typedef" came later, and made the syntax context-dependent.

------
jonalmeida
I found the rust-lang guide (now the rust-lang book[1]) very useful when first
learning Rust. The style used to teach ownership in that guide is very similar
to this, but this helped clear up a few remaining doubts I had.

Consider contributing to the book!

[1]: [http://doc.rust-lang.org/book/README.html](http://doc.rust-lang.org/book/README.html)

~~~
steveklabnik
Thanks for the kind words. I certainly plan on expanding it.

------
zaroth
I have zero experience in Rust, so maybe I shouldn't comment on an
intermediate tutorial, but just some thoughts. I'm sure the answers to most of
this are out there, this is more stream of consciousness...

"let i = 1", so let isn't picky about types, and types can be implied. In the
'fn foo(i: i64)' we see the "<name>: <type>" syntax, which I'm not an
immediate fan of, but no big deal.

We have value types, defined as a 'struct' with a 'Copy' flag. The "impl Copy
for Info {}" syntax seems weird, and the "#[derive(Copy)]" only slightly
better. Also, why does the struct have a hanging comma after the last (only)
member?

Next we add some methods to the Bob struct. Seems like there are multiple
ways, depending on the type of method? 'new' is just nested inside an 'impl
Bob' but 'Drop' is more explicitly nested inside 'impl Drop for Bob' and then
a lowercase 'drop' function. Huh.

Inside the 'new' function we see the first use of return values. And starting
to catch on that Rust likes chatty syntax, e.g. the '->' before the return
type. But yet, a very implicit structure to both initializing and returning
the new 'Bob' object! I get the sinking feeling the 'name: name.to_string()'
in the Bob initializer is important, and that '{ name: name }' would sadly not
work... Not loving the underscore in method names.

I guess the 'drop' function is a first example of an instance method,
identified by having '&self' or in this case '&mut self'. For a function that
internally doesn't explicitly mutate bob, it's curious to see the &mut in that
case. I'm going to guess 'drop' is a special case which always must be 'mut'.

The next step is "make bob value format-able when outputting to console" which
sounds an awful lot like implementing to_string() but is accomplished through
the 'Show::fmt' trait which is implemented using 'fmt::Show'. Is it really
called the "Show::fmt" trait, and not the "fmt::Show" trait?

Here's where I like the syntax less and less: "fn fmt(&self, f: &mut
fmt::Formatter) -> fmt::Result". Why must 'f' be 'mut' here? Why is the '&mut'
before the type and not before the name? The one colon for name/value versus
two colons for namespace is starting to hurt. Why a double colon instead of a
dot? Also, I have to figure out what all these ampersands are actually
doing... they are notably absent from the 'black_hole' function.

Skipping ahead... owned values can be mutated, you just have to flag it with
'let mut'. Almost wish that was simply 'met' vs 'let', or even 'mlet'. Any
owner can flag their value mutable, it doesn't have to start that way.

Then we get to "fn mutate(mut value: Bob) { ... }". Earlier we saw the 'fmt'
function apply 'mut' before the type definition, here mut keyword is before
the variable name. Is this the same, or something different happening? It
seems weird there's not just one place to put the 'mut' flag.

BTW, can you really not just do 'bob.name = "mutant";' versus 'bob.name =
String::from_str("mutant");'? Up until this point string handling seems
reasonably sane.

Further down we start putting things on the heap instead of the stack. In 'let
bob = Box::new(Bob::new("A"));' -- do you always have to 'Box::new' to get to
the heap, or is there a way to define Bob as always living on the heap, i.e.
'class' vs 'struct'? ... I think I spoke too soon, because further down when
it takes all of 'Rc::new(RefCell::new(Bob::new("A")))' to simply have a stack
pointer to a mutable instance on the heap, I think I'm ready to bolt.

It's an impressive level of control over the semantics of how objects are
created and destroyed without having to explicitly control every aspect of
memory management, and there do seem to be a lot of follow-on benefits for
going through all this trouble, but perhaps by 2.0 they will have made strides
to improve the syntax to something a bit less obnoxious.

~~~
Twisol
> Next we add some methods to the Bob struct.

`impl Bob` defines methods inherent to the type; `impl Drop for Bob`
implements methods defined by the Drop trait. (Think Java interfaces.)
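
A minimal sketch of the two kinds of `impl` block, in current Rust. The `Bob`
struct here is reconstructed from the discussion (a single `name` field is an
assumption), and the `name` accessor is added only for illustration.

```rust
struct Bob {
    name: String,
}

// Inherent methods: these belong to Bob itself.
impl Bob {
    fn new(name: &str) -> Bob {
        Bob { name: name.to_string() }
    }

    fn name(&self) -> &str {
        &self.name
    }
}

// Trait methods: the signature of `drop` is fixed by the Drop trait.
impl Drop for Bob {
    fn drop(&mut self) {
        println!("dropping {}", self.name);
    }
}

fn main() {
    let bob = Bob::new("A");
    println!("created {}", bob.name());
    // `bob` goes out of scope here and `drop` runs automatically.
}
```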

> I get the sinking feeling [...]

`{name: name}` works just fine.

> I'm going to guess 'drop' is a special case [...]

Since the signature for `drop` is defined elsewhere, by the `Drop` trait, we
must use `&mut`, regardless of whether we actually mutate `self`.

> Why must 'f' be 'mut' here?

This time the trait is `Show`, and the same reason as for `Drop` applies. Now,
however, we're actually using the fact that it's `mut`: you can't write to an
immutable formatter.
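
For anyone trying this on a current compiler, here is a sketch using
`fmt::Display`, which is the present-day trait for user-facing output; the
`Show` name from the article no longer exists. The `Bob` struct and its output
format are assumptions made for the example.

```rust
use std::fmt;

struct Bob {
    name: String,
}

impl fmt::Display for Bob {
    // `f` must be `&mut` because writing into the formatter mutates it.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Bob: {}", self.name)
    }
}

fn main() {
    let bob = Bob { name: "A".to_string() };
    println!("{}", bob);
}
```

Implementing `Display` also provides `to_string()` for free, which is roughly
the "implementing to_string()" the parent comment had in mind.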

> Why is the '&mut' before the type and not before the name?

Because mutability and ownership are information carried by the type, not by
the value.

> It seems weird there's not just one place to put the 'mut' flag.

You use `mut` to define whether a stack-allocated value can be mutated, and
also to define whether a reference can be used to mutate its referent.
Function parameters are stack-allocated, so you can put `mut` before them in
the parameter list.
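
The two positions of `mut` can be put side by side like this. It is a sketch
in current Rust with hypothetical function names: in the first function `mut`
is part of the reference type, in the second it marks the local binding of an
owned parameter.

```rust
// `&mut String` is a type: a reference that may mutate the caller's value.
fn rename_in_place(value: &mut String) {
    value.push('!');
}

// `mut value` marks the binding: the function owns `value` and may change it.
fn rename_owned(mut value: String) -> String {
    value.push('?');
    value
}

fn main() {
    let mut s = String::from("bob");
    rename_in_place(&mut s);   // the caller's `s` changes
    let t = rename_owned(s);   // `s` is moved in, a modified value comes back
    println!("{}", t);
}
```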

> Up until this point string handling seems reasonably sane.

`String` is a resizable string. `str` is not. It's a bit like Vec<T> vs. [T] -
see [http://cosmic.mearie.org/2014/01/periodic-table-of-rust-types/](http://cosmic.mearie.org/2014/01/periodic-table-of-rust-types/).

> I think I spoke too soon, because further down [...]

That incantation effectively disables Rust's compile-time ownership checking
and replaces it with a runtime-checked structure. RefCell implements `&mut`
semantics at runtime, and Rc allows multiple owners of the same data. There
are often ways to avoid needing these constructs, and it's considered a code
smell, so the incantation is long for a good reason. (If you really want a
shorter version, you can use `unsafe {}` and raw pointers. Or C++.)

You can have a mutable heap-allocated value simply by using `Box::new` and
marking the stack variable as `mut`: `let mut foo = Box::new(Bob::new("A"))` .
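
Here is a rough comparison of the two approaches in current Rust. The `Bob`
struct and all function names are assumptions for the example. The last
function shows what "runtime-checked" means: a second mutable borrow of a
`RefCell` is refused while the first is still alive.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Bob {
    name: String,
}

// Single owner on the heap: plain `mut` on the binding is enough.
fn boxed_rename() -> String {
    let mut bob = Box::new(Bob { name: "A".to_string() });
    bob.name = "B".to_string();
    bob.name.clone()
}

// Several owners of one mutable value: Rc for sharing, RefCell for mutation.
fn shared_rename() -> String {
    let bob = Rc::new(RefCell::new(Bob { name: "A".to_string() }));
    let alias = Rc::clone(&bob);
    alias.borrow_mut().name = "C".to_string();
    let name = bob.borrow().name.clone();
    name
}

// The borrow rules still apply, but they are checked while running.
fn double_borrow_fails() -> bool {
    let cell = RefCell::new(0);
    let _first = cell.borrow_mut();
    let failed = cell.try_borrow_mut().is_err();
    failed
}

fn main() {
    println!("{} {} {}", boxed_rename(), shared_rename(), double_borrow_fails());
}
```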

I agree that the syntax sometimes isn't quite there. There used to be a `box`
keyword for heap allocations, but it's been feature-gated for the alpha.

~~~
sanxiyn
> `{name: name}` works just fine.

It doesn't in this case. The `name` field is a `String`, but the `name`
argument is a `&str`.
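
A sketch of the mismatch in current Rust, with `Bob` reconstructed from the
discussion (the `from_owned` constructor is hypothetical). It also covers the
earlier question about 'bob.name = "mutant";', which fails for the same
reason.

```rust
struct Bob {
    name: String,
}

impl Bob {
    fn new(name: &str) -> Bob {
        // Bob { name: name } // would not compile: expected String, found &str
        Bob { name: name.to_string() }
    }

    // If the caller already has a String, it can be moved straight in.
    fn from_owned(name: String) -> Bob {
        Bob { name: name }
    }
}

fn main() {
    let mut bob = Bob::new("A");
    // bob.name = "mutant";          // same mismatch: &str is not String
    bob.name = String::from("mutant");
    let other = Bob::from_owned(String::from("B"));
    println!("{} {}", bob.name, other.name);
}
```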

~~~
Twisol
That's true. I thought he was making a point about syntax.

------
Astrobastard
Would these checks be possible to implement as some sort of diagnostic or
sanitizer in clang?

~~~
sanxiyn
Some of it, yes. As a matter of fact, Clang already includes one, which is
deployed on a large scale at Google.

[http://clang.llvm.org/docs/ThreadSafetyAnalysis.html](http://clang.llvm.org/docs/ThreadSafetyAnalysis.html)

------
lohankin
Some serious voodoo science is going on here, with a good chance to become a
next fad. Programming is difficult as it is, but apparently not difficult
enough for someone's taste.

~~~
andolanra
This 'ownership' system is a pattern which is used _informally_ in C, C++, and
other languages with manual garbage collection—the idea that you have a bit of
your program responsible for allocating and freeing memory, and another bit
which merely uses it. Rust takes this pattern and has the compiler enforce it.

I suppose in some ways it makes programming more difficult, but it's also
going to make low-level programming _safer_ by virtue of the fact that certain
invalid programs are no longer possible to express. Rust is one of very few
languages which can offer that.

(Also, having used Rust a fair amount, I don't think it's actually that
difficult once you're used to it. I can think of much more complicated
programming features that have been present in popular languages for decades!)

~~~
rubiquity
For me, understanding Rust's type system has been far more complicated than
understanding the ownership model. But that's likely because I have zero C,
C++ or Haskell experience.

~~~
andolanra
Has it been something about the type system in particular, or just keeping
track of all the parts? I would not call it a simple language, but I think
it's ultimately tractable, albeit with work.

