This seems very hard for a programmer to reason about. It’s odd to do this sort of flow-sensitive analysis on the CFG for a language feature that affects whether code is accepted or rejected, instead of some optimization that’s transparent to the programmer.
So originally, lifetimes were lexical only. And when the move towards non-lexical lifetimes was being considered, this aspect was discussed heavily.
It turns out that programmers seem to think about scope in a cfg style, and don’t really think in lexical scope, even though it’s “easier.” Tons of people, even new ones, would look at some code Rust rejects and be like “but obviously this is safe because:” and then state the non-lexical reasoning. People generally find the non-lexical borrow checker to be far easier to use.
I think it is actually one of the larger things Rust has discovered, that doesn’t get talked about very much. And it’s also a great demonstration that simpler doesn’t always mean easier.
- The cfg style checker accepts a strict superset of the programs that a lexical checker would (so if you're thinking lexically then your thinking will still work).
- The checker is (modulo bugs) sound, so if it thinks a program is valid then it is (so you don't actually have to think deeply about the complex cases if you don't want to).
Not necessarily. Imagine a checker that accepted lexically valid programs and also cfg-valid but lexically-invalid programs where the number of lines was odd; such a checker would fulfil your criteria but probably be more difficult to use than a purely lexical checker.
IMO Rust has already handicapped their language evolution with NLL and the more complex they make the checker the worse a pit they're going to dig themselves into. As long as it's simple enough to understand it's not too bad, but if we get to the point where people can't reason about the checker there'll be no coming back.
I think they should have focused on making lexical lifetimes more usable and lightweight, in a general way that was available to functions rather than just language keywords. Think making it possible to write if/while/for as regular functions in the language - which would also make it possible to write async a lot less painfully, because you'd be able to write your async control flow functions "in userspace". Instead they've chosen to deeply intertwine their particular control flow model with their borrow checker, which means changing or experimenting with those control flow keywords (or introducing new ones) is going to be a lot more costly, and any radical new ideas (whatever the next 10-20 years' equivalent of async is) are going to be much harder to introduce.
This is rather fascinating to me, as someone who maintains a flow-sensitive checker. No relation to the Rust borrow checker, since mine is an unsound linter for C++ code. Though there are similarities, as mine is also about memory safety (it looks for pointers that are live across a function call that might GC, and therefore could invalidate them by moving the data they're pointing to.) I spend a lot of effort trying to make the errors comprehensible, so I have to think about the mental models people are using.
The entry point for people is almost always a reported error. The notion "live across a GC call" isn't natural to most people, so I dump out the entire live range (which might be from the previous iteration of a loop, for example) in the hopes that it'll sort of tell a story: "see, you set the variable's value to a pointer here, then you run this code, and this function call within it might GC, and then you use the variable's value here". But that's just an example path through the function, and potentially confusing if that particular path is not possible in practice (but another unmentioned path is). That's the risk of having a more precise analysis: there are more ways that the specific error can be wrong even though the report is a true positive.
Earlier, the analysis was simpler to mentally model: is there a path fragment that mentions variable X, then calls something that can GC, and finally mentions X again? But since then, I've added much more precise data flow to check whether the final use of X is an actual use and not clobbering the previous value (in which case, its live range ended sometime earlier). As with the borrow checker, that just provides reasons to not report an error that the simpler analysis would have, but those do matter to the user. Especially when they're struggling to restructure code, and having to mentally do the same analysis to figure out what changes will fix the problem.
In my experience, there are usually simple code fixes that eliminate a problem entirely and would have worked with the simpler analysis. For the more complicated cases, users are generally pretty appreciative when the analysis is smart enough to allow fixes to be more targeted and natural. (They aren't appreciative of the false positives that the more sophisticated analysis never showed them, because they pretty much only think of this stuff when they get an error report. But that's ok!)
That sounds like an interesting project! I wonder if the soundness is part of it, see nicoburns' sibling comment. But error messages are quite hard. If I recall correctly, in Rust's case, error messages got better with NLL, but that may have been because a tremendous amount of work was put into it by people who were concerned that it was going to become harder to understand, rather than some property of NLL itself. It certainly was an active concern at the time, though the exact timelines and steps are a bit fuzzy to me these days.
Is it that the CFG version better reflects programmer’s intuition about “what should pass the borrow checker,” or that programmers can more easily predict, after understanding the rules, whether the CFG version will accept or reject particular code?
I describe it as intuition because people talk about these things intuitively. They say things like "the borrow checker finally accepts code I know it should have accepted all along". For example, let's look at the example used in the very first part of the NLL RFC:
This code works just fine. But maybe we want to pull the slice out into a variable. So we do this:
fn foo() {
let mut data = vec!['a', 'b', 'c'];
let slice = &mut data[..];
capitalize(slice);
data.push('d');
}
This code would fail to compile under the old borrow checker. Why? Because you have two &mut references pointing to the same thing. People are, I think, rightfully surprised that this code doesn't just work. It is very unintuitive that extracting a temporary to a variable causes code to no longer work. But that is exactly it: you are now changing the scope from small to large, thanks to lexical scoping.
But people didn't talk about this code as like "Oh I understand that I am introducing a lexical scope and that means it conflicts." They talk about it like "but I don't use slice anywhere after the capitalize call. This code is safe. There's no point where these two mutable references are active at the same time. And you know I'm right because the temporary version works just fine, and this is equivalent." That is, their perception of scope maps closer to a CFG than to lexical rules.
Frankly, I am not even sure that many Rust programmers could describe what a control flow graph is, let alone that they somehow learn the exact rules and then apply them. People don't learn programming languages that way, just like they don't learn human languages that way. It is largely a practice of building up intuitions and testing them out.
> Frankly, I am not even sure that many Rust programmers could describe what a control flow graph is, let alone that they somehow learn the exact rules and then apply them. People don't learn programming languages that way, just like they don't learn human languages that way. It is largely a practice of building up intuitions and testing them out.
I think a lot of programmers want to be able to understand how the language works in terms of formal rules and predict whether a given piece of code will pass the compiler without guess-and-check. Indeed, it seems like a pervasive complaint about Rust is that it requires too much of this “try things until it compiles” approach, and makes it hard to develop a clear mental model of how the borrow checker works.
As a novice Rust programmer I agree with everything here lol. I don't know what a CFG is, and the 2nd code block not compiling breaks referential transparency which is very very weird to me.
If i understand correctly, the lexical scope idea is just saying that `slice` exists in scope until the end of the block regardless of what happens inside the block, whereas CFG is smarter and says that since we don't use slice after line 3, its not in scope anymore. So from my POV the non lexical scope is unequivocally better, more fine-grained reasoning from the type system.
Yes. The first code example in the blog post, referred to as "Problem Case #3", shows a case that intuitively should work, is rejected currently, and would be accepted by Polonius.
If I understand the article correctly (and I'm not a Rust programmer), the above is an optimization that permits the programmer to write code that is correct with regard to object lifetimes, but currently rejected by the compiler. This is because the current borrowing model is too simple to account for that particular case, and hence, a new model is being developed that makes compiling this code possible without sacrificing correctness.
But, still, the whole selling point of Rust is "flow-sensitive analysis (...) for a language feature that affects whether code is accepted or rejected". Accepting or rejecting some code based on whether it fits the memory model of that language is the main advantage that Rust has over other strongly typed statically compiled languages.
“Until last year” is technically correct but not really accurate. Lifetimes were made non-lexical in 2018, but only for Rust 2018 code and newer. Last year is when they were made non-lexical for 2015 code as well. For the vast majority of Rust developers, lifetimes have been non-lexical for five years now.
This is done so that the programmer doesn't have to reason about it. Polonius makes the compiler accept code that's obviously valid, but the current borrow checker isn't sophisticated enough to declare it as safe.
I've bumped into this kind of problem when writing rust, at first it was hard to understand why the compiler doesn't accept my code.
This kind of control would drive me up the wall. If I write code and my logic is wrong I appreciate compiler being able to tell me to sod off. Doing it when the logic is perfectly valid is a turn off.
That's the draw of having a borrow checker, which is arguably the contributor to the biggest pain points of Rust. Everything is a tradeoff. Rust's "unsafe" blocks are also a tradeoff. To my understanding, Rice's theorem ensures that, for checking borrows, either all invalid programs and some valid programs are rejected, or all valid programs and some invalid programs are accepted. Given Rust's goal of overall safety, the conservative route (which rejects some valid programs) is favored.
Some languages require you to declare your variables, and if you use an undeclared variable, the compiler complains and refuses to emit any code. Is that a turn off for you, too?
> It’s odd to do this sort of flow-sensitive analysis on the CFG for a language feature that affects whether code is accepted or rejected
Is this something new here with this rendition of the borrow checker? I thought the whole premise of the borrow checker was to do a data flow analysis, decide whether to accept or reject the program, and if necessary to provide information about why the program was rejected.
It looks like what this project is trying to accomplish is to use better control flow mapping to reduce the number of unnecessary rejections that programmers see.
Until last year, the borrow checker operated on lexical scopes, which are apparent in the source code and the programmer can reason about. Last year, the borrow checker was changed to support non-lexical lifetimes by operating on the control flow graph—something the programmer never sees and is hard to reason about. This project continues running further in that direction, making it harder for the programmer to understand why the borrow checker does what it does.
The new implementations are more permissive than the old ones*. So if your program is rejected, it would have been rejected under the old, simpler implementation as well, and you can use your intuition of the simpler rules to understand why.
*excluding bugfixes, of course
What's actually hard to reason about is problem cases like in TFA which seem like they should be valid, but the compiler rejects anyway. These are cases where people's intuition doesn't match the current implementation, so I think moving the implementation closer to people's mental model of borrowing will make the language easier.
I believe not. The method signature indicates that each reference claims exclusive access to the struct, so it may even be backwards incompatible to add the new behavior without a syntax addition. I believe one of the proposals to this end is "view types".
I don’t see how that code is safe or can be considered so. Your signature suggests that you mutably borrow self. Even if you only use a member, the signature could confuse fellow developers that they can play with the whole object.
I’ll use an Arc or an Rc depending on the need of thread safety. I think way too many people are over optimizing their Rust programs. Unless you are building for resource constrained devices, I think the cost of atomic primitives is negligible.
It can be safe if the language provides a mechanism to tell the compiler that borrow_field1() and borrow_field2() will never mutably borrow the same fields. This would be part of the public API of those functions, and compilation would fail if the implementations were changed to borrow the same fields.
I’m curious how adding this layer of additional flow analysis will impact compilation times. The article sort of glosses over this risk with worrying phrases like “hopefully efficient enough” and “harder to do efficiently”. The scope of what’s still left to do and even undesigned also makes the goal of incorporating this into the 2024 edition seem quite optimistic.
`cargo check` is relatively fast compared to `cargo build`, so at least currently, borrow checking is a small task relative to code generation, optimization, and linking.
There have been many similar changes in the compiler's lifetime, and so far additional type checking work hasn't caused noticeable slowdowns, apart from a few accidentally-quadratic bugs.
Well, if the doubtlessly complicated code needed to parse such uncommon borrowing scenarios is only called when such a scenario is found, then it should only affect compilation times for code using such scenarios (which is, I hope, uncommon)?
I doubt it all that uncommon. Rust reject A LOT of perfectly valid ideas. Or, at least, it used to. Apparently this has been getting better.
If you look around at lifetime recommendations, for a while they were so annoying that most people were saying “just force a copy and don’t use lifetimes”.
There's more nuance to "don’t use lifetimes". Novices typically don't understand the relationship between owning and borrowing, and try to use temporary references where they semantically don't belong (in Rust ownership can't be as ambiguous as in GC languages, and references can't be made out of thin air — they borrow from somewhere that must already be owned/stored somewhere).
So often the advice is not "don't use lifetimes, because the borrow checker can't handle them", but rather "don't use lifetimes where an owned value is required" or "don't use lifetimes until you understand how ownership works".
If the scenarios were that uncommon, it wouldn't be worth fixing. And now that they're accepted, they will undoubtedly proliferate. This is the new common.
Language design and compiler engineering is about tackling progressively smaller niche cases, as the prior ones are addressed. One can draw a line in the sand and say "I'm done, the next niche use case is too niche and the added maintenance burden isn't worth it", but where that line is drawn is different for different products.
The thing is though that a lot of these problems are NP-Hard, so you're often better of with a general problem solver that knows tricks how to optimise the search for the solution, than something hard-coded where it's impractical to do that kind of dynamic programming.
On the other hand, there may be a greater gain in specializing the solver for the problem you're tackling. Besides, they can copy some of the algorithms.
As someone who was a bit obsessed with Hamlet as a teen, it's hillarious that they named this Polonius. Polonius is Ophelia's father in the play, and they probably named it because of his famous line:
Neither a borrower nor a lender be,
For loan oft loses both itself and friend,
And borrowing dulls the edge of husbandry.
He's a tragic but somewhat foolish figure, prone to fretting and offering kind of annoyingly obvious advice of which the above is an example.
It may be obvious, but it's still worth stating! I loaned $1000 to a friend in my 20's, and ended up losing both the friendship and the money (though one was not wholly causal of the other).
That's great. In my case, the person said he needed the money because of a rent payment deadline, but then something else happened such that he could make the deadline anyway, so he spent the money I'd lent him on jewelry for his girlfriend, with no regard for the reason I lent it to him. I lent it to him because of what I thought was a need, and then he used it merely to pamper his GF, and made no motion toward paying me back.
Later in his life he went to prison for a year (for unrelated reasons).
I think the positive experiences people are presenting are great. I don't think the Hamlet quote is remotely a hard-and-fast rule, but it does embody a principle that is worth stating, and which can be useful food for thought in making a lending decision.
Absolutely! But I didn't know that at the time of the loan. It turned out it was a test of character, a test whose results were demonstrated when he eventually spent a year in prison.
I believe this is type of BS pushed by people with the agenda. What kind of friend you are (talking about real friends, not drinking buddies) if you let your friend sink.
My friend loaned me $3000 cash when I was at the vet and my dog needed an urgent surgery. And I paid him back and he did not loose money and we did not loose friendship. And later on when I started to make good money I sometimes loaned to friends. One could not pay me back but we did not loose friendship either as he simply had circumstances beyond his control. I just told him consider it a gift.
I don't think accusing folks of having an "agenda" is a productive argument. I do think some of the rest of your post is valuable.
I have both loaned money and borrowed from, at various points in my life, friends and family. When lending, I have always looked at it as 1) A way to see how reliable my friend/family really was. 2) A sort of 'gift" - knowing full well they may not repay- thus I only loan what I am willing to lose. There have been times when I have lost more money than I would have liked to, but I knew it was a possibility going in, and thus I did not lose the friendship. I just know that at that point in time, the person I loaned to was not as reliable as I had hoped, and I know to be cautious in the future.
There have been many great business deals conducted between friends. If we decide not to do business with our friends because it is possible that the deal will go sour, we are left doing business with strangers, and we lose a lot of potential opportunities, not just to improve finances, but also to strengthen relationships.
> And I paid him back and he did not loose money and we did not loose friendship.
You know all this talk about friendship, but you still mention "And I paid him back" in the same breath as "did not loose friendship". Sometimes you have to deal with the fact that you will never see the money again over such trivial things as a thoroughly mismanaged business venture.
What sort of agenda do you imagine people have? That you had a good experience with a friendly loan doesn't discount the bad experiences of thousands of people who did lose their friendships.
We've given money to friends on multiple occasions to pay for medical bills, groceries, and the like, but every time I've entered a reciprocal business arrangement with friends or family (be that a loan or a paid work arrangement), I've regretted it immensely.
Everyone's situation is different. The fact that you've regretted it means zilch for somebody else. Maybe it is you an the type of friends / family you have. I helped my daughter few times and did not think one second about it. No regrets. You can not and should not generalize a particular situation.
You're also generalizing though. The point is that either outcome could happen and you should be mindful of that. If you specifically never have an issue like that, good for you! But other people have been burned, and it's not any more valid for you to dismiss their experience.
Thanks for the explanation! But if his advice is to completely forego borrowing and lending, it's a bit counterintuitive to use his name for the borrow checker? OTOH, the borrow checker can certainly be described as "prone to fretting and offering kind of annoyingly obvious advice", so maybe it's appropriate after all?