Hacker News
Common Rust Lifetime Misconceptions (github.com/pretzelhammer)
178 points by donmcc on May 23, 2020 | 43 comments



> T only contains owned types

I'd say this is true, because I consider &mut T and &T to be owned types in their own right. They own a pointer. I can cast them to raw pointers. &T implements Copy even if T doesn't. https://doc.rust-lang.org/1.43.1/src/core/marker.rs.html#794

Otherwise, the compilation is great.
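
A minimal sketch of the "&T is Copy even if T isn't" point, for anyone who wants to try it. `NotCopy` and `duplicate` are hypothetical names made up for this illustration.

```rust
// Hypothetical type for illustration: it holds a String, so it cannot be Copy.
struct NotCopy(String);

// Accepts only Copy types and hands the same value back twice.
fn duplicate<T: Copy>(t: T) -> (T, T) {
    (t, t)
}

fn main() {
    let value = NotCopy(String::from("hello"));

    // `&value` is a `&NotCopy`, which is Copy even though `NotCopy` is not.
    let (a, b) = duplicate(&value);
    println!("{} {}", a.0, b.0);

    // duplicate(value); // would not compile: `NotCopy` does not implement `Copy`
}
```

The same call with `&mut value` would be rejected, since `&mut T` is not Copy.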


Is there any type that is _not_ owned by this definition?


Not really, which is why "owned type" is kind of imprecise and usually just means "a type not constrained by any lifetime."


This is not how the author uses the word owned.


If you use the author's definition of the owned term as "some non-reference type, e.g. i32, String, Vec, etc", the misconception is indeed one. But I disagree with the definition.


Reading this document suddenly made learning Rust a lot scarier, and also increased my respect for Rust developers (people writing programs in Rust). It kind of shows that increased safety doesn't come for free.


Hi, document author here. I learned Rust because I was too scared to learn C++, lol.

Rust has a steeper learning curve than most languages but it also has a very beginner-friendly and welcoming community. So don't be scared, you can do it! If you get stuck on something it's always easy to get help.


As someone who doesn't know Rust yet I feel there's a "do this instead" missing to point 8. It's not clear to me how to do it instead.


In my (admittedly somewhat limited) experience, these don't come up that frequently, except number 5; several times I've had to be aware of where references are coming from.

I would say give Rust a go anyway. Worst case is you end up not getting along with it, best case is you now have a new tool to reach for.


For most of these, the net result is that the compiler behaves strangely, but the programs that compile still work as intended. These are intermediate to advanced Rust concepts, and most programmers can eventually learn them from experience fighting with the compiler.


If fighting with the compiler is necessary for learning, I think that would mean lots of users would drop out, not “eventually learn”.

That’s why lots of work has been done and still is being done to make that “work with the compiler’s help”.

The thing is: in _any_ programming language, whenever you pass anything mutable to another function, you have to think about ownership.

For the sake of reliability, rust makes that explicit, where other languages such as C and Java assume the programmers know what they are doing. To not make that too tedious for the programmer, rust also infers quite a few rules.

Rust is getting better on both fronts. Its ability to infer ownership grows, and it gets better at reporting problems it can’t solve on its own.


> If fighting with the compiler is necessary for learning

I don't think "fighting the compiler" is a useful concept. No matter what language you're learning, you have to learn both the language's syntax and semantics.

If your program barfs up an error at runtime because you made a semantic error the language forbids, that's no better than barfing up an error at compile time because you made a semantic error the language forbids.

So it's both more accurate and more precise to say that a language's semantics might be complicated or subtle, but the compilation step is orthogonal to the actual problem you're talking about.


The difference in when the error is flagged is a large difference, though. Usually, runtime semantic errors won’t be flagged until the program tries to actually perform some nonconforming action. That lets the programmer work on unrelated issues by simply not exercising the defective path.

Compile-time errors, on the other hand, ensure that the entire program meets some baseline of correctness before any of it is run. Neither is necessarily better or worse than the other, but they are categorically different things that have their own effects on the experience of writing programs.


You’re right, of course. To my mind, one way to classify how severe mistakes are is by looking at their potential consequences. For programming, there are unpredictable, hard-to-identify bugs with far-reaching effects at one end of the scale. On the other end are mistakes that announce themselves loudly and predictably, and that are isolated from everything else.

My original comment was a clumsy attempt to reassure ‘antpls and other Rust beginners that lifetime errors generally manifest as the less-troublesome kind on this scale.


Pretty much all misconceptions named there can be avoided by following the advice that every Rust programmer will give you: read the damn book, it will save you a lot of time.

If you are in the category of users that never RTFM, then I guess this article might be for you, but then kind of by definition, you won't read this article either, so...

---

For example, in C++ a generic `T` can only mean an owned type:

    template<class T> void foo(T t);
    template<class T> void bar(T& t);

You can't pass `foo` a reference to `T`, you'd need to pass it an owned type like `foo(std::ref(t))` instead or similar.

In Rust, when you write

    fn foo<T>(t: T);

you can call `foo` with an owned type or a reference, no problems, generic types are... well... generic.
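
A small sketch of that claim, assuming a made-up helper `inferred_type` that reports what `T` was inferred to be. It relies on `std::any::type_name`, whose exact output strings are documented as best-effort and not guaranteed to be stable.

```rust
use std::any::type_name;

// Hypothetical helper: returns the name of whatever type `T` was inferred to be.
fn inferred_type<T>(_value: T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let mut n = 5_i32;
    println!("{}", inferred_type(n));      // T is an owned i32
    println!("{}", inferred_type(&n));     // T is a shared reference to i32
    println!("{}", inferred_type(&mut n)); // T is a mutable reference to i32
}
```

All three calls go through the same `fn inferred_type<T>(_value: T)` signature; only the inferred `T` differs.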

So if you start writing Rust code with a C++ background without reading the book, then you might well think that generics in Rust work like templates in C++. One couldn't be more wrong about this, they are in fact completely different language features. They might solve very similar problems, but they are as different as Python is from Haskell.


> For example, in C++ a generic `T` can only mean an owned type:

Wrong.

    template<class T>
    void foo(T t) {
    }

    int main() {
      int x = 0;
      foo<int&>(x);
    }

Unadorned Ts as function parameters will by default be deduced as value types, but a generic T parameter can represent references just fine. This is especially important for template classes.

> you might probably think that generics in Rust work like templates in C++. One couldn't be more wrong about this, they are in fact completely different language features. They might solve very similar problems, but they are as different as Python is from Haskell.

it seems to me that they are more similar than different.


> Wrong.

Indeed, thanks for the correction!

> it seems to me that they are more similar than different.

I think the Haskell vs Python comparison is accurate. Rust generics are strongly typed (definitions are checked), while C++ templates are "weakly" typed (type checked when they are instantiated). In C++ it is trivial to write templates that will never type check when instantiated, while in Rust this is impossible.
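
A rough sketch of what "definitions are checked" looks like on the Rust side. `label` is a hypothetical function made up for this example; the commented-out `broken` shows the kind of definition that is rejected even if nothing ever calls it.

```rust
use std::fmt::Display;

// The body may only use what the bound `Display` promises about `T`.
fn label<T: Display>(value: T) -> String {
    format!("<{}>", value)
}

// Without the bound, the definition itself fails to compile,
// whether or not it is ever instantiated:
//
// fn broken<T>(value: T) -> String {
//     format!("<{}>", value) // error[E0277]: `T` doesn't implement `std::fmt::Display`
// }

fn main() {
    println!("{}", label(42));
    println!("{}", label("text"));
}
```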


>> Wrong.

>Indeed, thanks for the correction!

Sorry for the brusque response, I had a "someone is wrong on the internet" moment.

>Rust generics are strongly typed (definitions are checked), while C++ templates are "weakly" typed (type checked when they are instantiated).

True. The usual solution is to force-instantiate them against archetype classes. While it is not perfect it is a good approximation.


> True. The usual solution is to force-instantiate them against archetype classes.

This is correct, but you not only need to make sure that they accept an archetype class, you also need to make sure that they reject all others.

This often requires writing and instantiating a substantial amount of code for testing purposes, which often ends up as large or larger than the actual logic itself.

I suppose you already know the pain of doing this, but for those who do not, it is like testing a function in python for all possible inputs. Can be done, but is a lot of work.

In practice, most generic C++ code is not tested in this way due to how painful it is to do so.

So what in Rust takes no work, takes a huge amount of effort in C++.


I have programmed in something like 12 langs, and yes, the first 2 months of rust was kind of brutal to me.

But a lot of it is UNLEARNING. Before, I was doing mostly F# and believed I was a very functional developer, but now with rust I do more functional programming than before.

Most of the time, errors and problems show up because rust is trying to push you towards a particular way of designing the code. The more you fight it, the harder it is.

I think rust is right 90% of the time. Doing ERP-like work, I'd say rust is even more productive than F#, which was also great for me.

--

Rust gets truly hard when you try to do pseudo-OO, try to do introspection (the hardest part, I think), or mix mutability inside the same struct. But most of that happens outside regular development (I'm building a toy lang and hit the hard stuff often).


This article is probably not aimed at Rust beginners and you shouldn't shy away from Rust, if you feel that you cannot possibly ever understand the lifetimes. The perspective of the article seems to stem from the author's own learning process, and might make sense only after you have some experience with the borrow checker and lifetimes. Most of the time everything might go smoothly, when you know the basic rules, but when the compiler gives you an error message, and you think it shouldn't have, then reading the article might give some insight on the problem.

And the Rust developers indeed deserve respect in my opinion too. They are trying to solve a hard problem and not taking the worse-is-better-option.


In a way, it's liberating. Look at it as a service the compiler offers, where lifetime errors are a normal part of the development cycle.

I use rust and I don't worry about the lifetime stuff. Just hack away and get the job done. Overall, rust is lower startup friction than most of the tools I use (though YMMV).


Don't let it, I've only skimmed it but I think you can get started and write some effective rust without even having or understanding the misconceptions.

And even if you did gain a misconception, so what, apparently you'd just be joining other people writing rust who share the same one.


It depends on your background.

If you come from C/C++ then all the notions discussed should be familiar to you, because you used to do the borrowing and lifetime management manually, or you were doing something really wrong. :)

Whereas the rust compiler helps you with all this (super hard and annoying) stuff.


I read this article with caution. I don't want to upset the delicate balance that is my intuition of how the borrow checker works. Articles that say "Don't think of it like X" are double edged swords ;-)

In reality, though, while it took me a while to get relatively comfortable with the borrow checker (much longer than I expected as an old school C++ programmer), 99.9% of the time it works exactly as I expect. Of course the other 0.1% of the time takes a lot of my energy ;-)


Tbh, I'm not sure that reading a list of falsehoods is the best introduction to actually programming in a language.


Coming from JS it was rather counterintuitive for me that a function would "own" a value even if it was run synchronously and finished.

Also, the whole 'move' terminology sounded to me like Rust would move memory around, but it usually referred to the move of ownership. Also 'consume' didn't make any sense to me. "into_iter consumes a vector" what does it mean? Where does the vector go?

The most valuable info I got about lifetimes was: references/borrowing is usually what you want, so always throw in a &.


These two stack overflow discussions helped me with 'into_iter' and move semantics.

https://stackoverflow.com/questions/34733811/what-is-the-dif...

https://stackoverflow.com/questions/30288782/what-are-move-s...

There are two types of ownership operations in Rust: consume/move and borrow.

A consume or a move is transferring ownership. Consume is often used because it makes more sense in most cases as a word than move. The function is consuming or eating the variable so it is no longer available for another function to consume/eat. The term move comes from the formal system behind the borrow checker, the "Affine type system".

Borrowing is fairly self explanatory. The function isn't eating the value, it's just looking at it or in the 'mut' case, is modifying it in some way(if food is a good analogy, think adding seasoning or removing a crust).

As for 'into_iter', it basically can be defined as a move, borrow, and/or mutable borrow. This lets you set rules on how the data type may be used. This way the type can be used in most language constructs naturally with the compiler picking the best version based on the constraints of the program.

Apologies in advance if you weren't looking for an explanation and I misread the room.
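
For what it's worth, here is a sketch of the three forms with a plain `for` loop, which calls `into_iter` on whatever expression it is given. The function name `three_loops` is made up for this example.

```rust
// Runs the three kinds of loop over the same vector and returns
// the total length seen by the first loop plus the final owned strings.
fn three_loops(mut v: Vec<String>) -> (usize, Vec<String>) {
    let mut total_len = 0;

    // Borrow: `s` is a `&String`, like `v.iter()`.
    for s in &v {
        total_len += s.len();
    }

    // Mutable borrow: `s` is a `&mut String`, like `v.iter_mut()`.
    for s in &mut v {
        s.push('!');
    }

    // Move: `s` is an owned `String`, like `v.into_iter()`.
    let mut owned = Vec::new();
    for s in v {
        owned.push(s);
    }
    // `v` has been consumed and cannot be used past this point.

    (total_len, owned)
}

fn main() {
    let (len, owned) = three_loops(vec![String::from("a"), String::from("bc")]);
    println!("{} {:?}", len, owned);
}
```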


"Ownership", "move", "lifetimes", etc have nothing* to do with what happens in the compiled program. It's a kind of static analysis. The closest analogy from the JS world would be what Typescript is to JS. Though that analogy is far from perfect.

If a function "consumes" your owned value that means the value can no longer be used by you. It's swallowed by the function you called. The function may simply drop the value (in which case the memory is likely freed). But it's also able to reuse the underlying buffer or integrate it into another type. Which is where `into_` comes in. It can take the value away from you and give you back another type which is now responsible for managing your original value.

-----

* Note: "nothing" is perhaps a bit too strong but conceptually it helps to think of ownership as its own thing.
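
One concrete case of an `into_` method reusing the underlying buffer is `String::into_bytes`. A minimal sketch, with `into_bytes_reuses_buffer` as a made-up helper name:

```rust
// Consumes a String, converts it into a Vec<u8>, and reports whether
// the heap buffer address stayed the same across the conversion.
fn into_bytes_reuses_buffer(s: String) -> (bool, Vec<u8>) {
    let before = s.as_ptr();
    let bytes = s.into_bytes(); // `s` is consumed here and cannot be used again
    (before == bytes.as_ptr(), bytes)
}

fn main() {
    let (same_buffer, bytes) = into_bytes_reuses_buffer(String::from("abc"));
    println!("same buffer: {}, bytes: {:?}", same_buffer, bytes);
}
```

The returned `Vec<u8>` is now the value responsible for freeing that buffer.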


> Also, the whole 'move' terminology sounded to me like Rust would move memory around

That's exactly what happens: a Rust "move" is under the covers a memcpy().

> but it usually referred to the move of ownership.

That's also what happens. For instance, when you move a Vec, you are moving (through memcpy) only three words of memory (a pointer, the length, and the capacity), but you are also "moving" the ownership of the memory behind that pointer. That is, the new Vec is responsible for managing that memory, and the old Vec can no longer be used.

> Also 'consume' didn't make no sense to me. "into_iter consumes a vector" what does it mean?

It means that it takes ownership of the vector, and doesn't give it back.

> Where does the vector go?

It's destroyed within that function. In the case of into_iter() of a vector, it's broken into its component parts (the pointer, the length, and the capacity), and the struct into_iter() returns (which is not a Vec, but something else) is now responsible for managing the memory behind the pointer, and releasing that memory once it's no longer needed.
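
A quick way to see the "three words move, the heap buffer stays put" part for yourself. This is only a sketch: `move_keeps_heap_buffer` is a made-up name, and the three-word size is what current implementations do rather than a documented layout guarantee.

```rust
use std::mem::size_of;

// Moves a Vec into a new binding and reports whether the heap buffer
// it points at is still at the same address afterwards.
fn move_keeps_heap_buffer(v: Vec<i32>) -> bool {
    let before = v.as_ptr();
    let moved = v; // move: only the (pointer, length, capacity) handle is copied
    // `v` can no longer be used here.
    before == moved.as_ptr()
}

fn main() {
    println!("buffer unchanged: {}", move_keeps_heap_buffer(vec![1, 2, 3]));
    println!(
        "Vec handle: {} bytes, three words: {} bytes",
        size_of::<Vec<i32>>(),
        3 * size_of::<usize>()
    );
}
```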


> That's exactly what happens: a Rust "move" is under the covers a memcpy().

Not exactly. The Rust compiler is usually pretty smart about optimizing out memcpy for most moves.

For example when returning a large struct from a function, the optimizer could add an "out" reference parameter to the function and the function would directly initialize the return value into that reference.


> That's exactly what happens: a Rust "move" is under the covers a memcpy().

Is it really? I had always assumed that the compiler would almost always optimize a move down to nothing, simply reusing the address. If access to the address is still exclusive, what would be the benefit of moving it in memory?

memcpy sounds like what would happen in the case of a copy or clone, not a move.


Yes, the optimizer (LLVM) can often optimize memcpy(), and this is not exclusive to Rust: the same optimizations happen with C and C++ (a well-known trick to avoid some undefined behaviors in C is to do a memcpy(), which the optimizer will then elide). It doesn't change that, before the optimization passes, it's doing the equivalent of a memcpy().

> memcpy sounds like what would happen in the case of a copy or clone, not a move.

In Rust, copy and move are nearly identical; the only difference is that, after a move, the compiler doesn't let you access the old value. Clone is different, since it can be overridden to do anything the programmer desires, but by default, it's identical to copy.
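
A minimal sketch of that difference, using made-up function names. The assignment looks the same in both cases; what changes is whether the old binding is still usable.

```rust
// i32 is Copy: after `let y = x`, both `x` and `y` can still be used.
fn copy_then_use(x: i32) -> (i32, i32) {
    let y = x;
    (x, y)
}

// String is not Copy: after `let t = s`, only `t` can be used.
fn move_then_use(s: String) -> String {
    let t = s;
    // println!("{}", s); // error[E0382]: borrow of moved value: `s`
    t
}

fn main() {
    println!("{:?}", copy_then_use(7));
    println!("{}", move_then_use(String::from("hi")));
}
```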


Generally speaking, I've found Rust terminology confusing. For example, why does the `IntoIter` iterator give you owned values? What the heck does "Into" mean to a Rust developer? There are lots of things like that. I've found that fluency in Rust is partly trying to understand the concepts and partly just learning to understand the specific meanings of certain vocabulary. I've likened it to working with Ruby on Rails. There are a lot of useful facilities, but they practically force you to think exactly like the original author -- who doesn't seem to quite speak like I do.


"Into" means it turns the Vec _into_ an iterator.

Normally an iterator iterates _over_ a Vec or other collection, without moving or destroying the Vec.

There's also iter_mut which iterates over the Vec but allows you to mutate elements in-place.

Those are the 3 big groups of iterators: Immutable, mutable, and the ones that consume (i.e. destroy) the vector.


I think the thing is that because the IntoIter consumes the collection, you are right that it is turning it "into" an iterator... I guess. But for me that's not the intent of what I'm doing. If I use iter() or iter_mut(), I'm also getting an iterator. The fact that it doesn't consume the collection is not the thing I care about. I don't program thinking "Oh, I want to consume this collection when I create the iterator, so let's use into_iter()". I think, "Oh, I want to get owned objects out of the iterator. As a side effect of that, it will consume the collection". For me, anyway, it's just a seriously backwards way of saying what you want.

I find that, similar to my experience with Rails (and Rails developers), once you've drunk the Kool-Aid, it starts to make sense. You are fluent in the language that is being used and you can't think any other way about it. Even if it is awkward, it seems normal and obvious. But when you first approach it, it can be hard to acquire the idea.


I could see that. It was a little confusing to me too. At first I couldn't figure out why anyone would want into_iter(). Didn't realize it returned owned objects.

Now that I think about it, how does into_iter() work on a Vec? Is it just copying the data out of the Vec anyway? Because the data in a Vec is allocated on the heap, so how can you get an owned object that isn't just a reference (pointer) to the heap memory without copying?

If that's the case, is the only benefit of into_iter() that you don't have to explicitly deref and copy the items?
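
One way to poke at this question, as a sketch rather than a full answer: for a `Vec<String>`, each element handed out by `into_iter()` is the small `String` handle moved out of the Vec's buffer, while the string's own heap data is not copied. The helper name `buffers_survive_into_iter` is made up for this example.

```rust
// Records each String's heap buffer address, consumes the Vec with a
// `for` loop (which uses into_iter), and checks the addresses again.
fn buffers_survive_into_iter(v: Vec<String>) -> bool {
    let before: Vec<*const u8> = v.iter().map(|s| s.as_ptr()).collect();

    let mut after: Vec<*const u8> = Vec::new();
    let mut kept: Vec<String> = Vec::new();
    for s in v {
        // `s` is an owned String here, not a reference into the Vec.
        after.push(s.as_ptr());
        kept.push(s); // keep the strings alive while comparing addresses
    }

    before == after
}

fn main() {
    let v = vec![String::from("alpha"), String::from("beta")];
    println!("buffers unchanged: {}", buffers_survive_into_iter(v));
}
```

For element types without heap data of their own, such as `i32`, moving an element out is just a copy of its bytes.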


> What the heck does "Into" mean to a Rust developer?

Into means it will consume the value (you don't have it anymore): https://doc.rust-lang.org/1.0.0/style/style/naming/conversio...


(Please note this document is outdated; you’ve linked to the 1.0.0 docs. This particular page is pretty ok but we don’t ship this at all anymore, and haven’t for years.)


I think the maintained equivalent would be in the Rust API guidelines:

https://rust-lang.github.io/api-guidelines/naming.html#ad-ho...


I also came to Rust from JS! I plan to write an article in the future comparing Rust to Typescript.


This is a fantastic article, please someone bump this up to the Rust book.


This is an EXCELLENT article. Point # 8 should be directly in the Rust book.

"Once a variable is bounded by a lifetime it is bounded by that lifetime forever. The lifetime of a variable can only shrink, and all the shrinkage is determined at compile-time."
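
A sketch of the behaviour that quote describes, with the rejected line left commented out so the rest compiles. `Has` and `demo` are hypothetical names for illustration, and the fresh binding at the end is just one possible workaround, not necessarily what the article's author would recommend.

```rust
// Hypothetical struct for illustration: holds a borrowed string slice.
struct Has<'a> {
    text: &'a str,
}

fn demo() -> (String, String) {
    let long = String::from("long");
    let mut has = Has { text: &long };
    let seen_inside;
    {
        let short = String::from("short");
        // Storing `&short` forces the lifetime in `has`'s type to fit inside this block.
        has.text = &short;
        // Pointing back at `long` does not grow the lifetime again.
        has.text = &long;
        seen_inside = has.text.to_string();
    }
    // println!("{}", has.text); // rejected: `short` does not live long enough

    // One possible workaround: a fresh binding gets its own, longer lifetime.
    let fresh = Has { text: &long };
    (seen_inside, fresh.text.to_string())
}

fn main() {
    let (inside, after) = demo();
    println!("{} {}", inside, after);
}
```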



