> Of course they are considered problematic in Rust.
> And leaks are hard to code by accident in Rust.
> ...
I enjoyed this gem and its descendants from the comments. What I see instead, commonly, even in big rust projects, is that it's easy to accidentally define something with a longer lifetime than you intend. Some of that happens accidentally (not recognizing the implications of an extra reference here and there when the language frees things based on static detections of being unused). Much more of it happens because the compiler fights against interesting data structures -- e.g., the pattern of allocating a vec as pseudo-RAM, using indices as pseudo-pointers, and never freeing anything till the container itself is unused.
There's nothing wrong with those techniques per se, but the language tends to paint you into a bit of a corner if you're not very good and very careful, so leaks are a fact of life in basically every major Rust project I've seen not written by somebody like BurntSushi, even when that same sort of project would not have a leak in a GC language.
I think that regardless of what references you have, Rust frees values at the end of their lexical “scope”.
For example, in the linked code below, `x` is clearly unused past the first line, but its `Drop` implementation executes after the print statement at the end of the function.
The takeaway is that if you want a value to drop early, just explicitly `drop` it. The borrow checker will make sure you don't have any dangling references.
In general, I think "lifetimes" only exist in the context of the borrow checker and have no influence on the semantics of Rust code. The language was designed so that the borrow checker pass could be omitted and everything would compile and run identically.
The vec pattern has some advantages though, in particular you can often get away with 32 bit indices (instead of 64 bit pointers), making it a little more cache-friendly. I did it for regex ASTs that are supposed to be hash-consed, so never need to die until the whole matcher dies [0].
A more involved example is Earley items in a parser that point inside a grammar rule (all rules live in a flat vec) and back into a previous parser state, so I store two u32 offsets. If I had pointers, I would be tempted to keep a pointer to the grammar rule, an index inside it, and a pointer to the previous state [1], so probably 3x the space.
In both cases, pointers would be easier but slower. Annoying that Rust doesn't really let you make the choice though...
By all means. It's a pattern I use all the time, not just in Rust (often getting away with much less than 32-bit indices). You mentioned this and/or alluded to it, but my core complaints are:
1. You don't have much choice in the matter
2. When implementing that strategy, the default (happy path coding) is that you have almost zero choice in when the underlying memory is reclaimed
It's a pattern that doesn't have to be leaky, but especially when you're implementing it in Rust to circumvent borrow-checker limitations, most versions I've seen have been very leaky, not even supporting shrink/resize/reset/... operations to try to at least let the user manually be more careful with leaks.
I understand the problem isn’t that the tools exist, it’s that there are Rust users who are not aware of the concept and its evolution over the years, which I’d argue is not a uniquely Rust issue.
Rust (most of the time, I'm not arguing about failure modes at all right now, let's pretend it's perfect) drops items when you're done with them. If you use the item longer, you have a longer implicit lifetime. The memory it references will also be claimed for longer (the reference does not outlive its corresponding memory).
You only fix that by explicitly considering lifetimes as you write your code -- adding in concrete lifetimes and letting the compiler tell you when you make a mistake (very hard to do as a holistic strategy, so nobody does), or just "git gud" (some people do this, but it takes time, and it's not the norm; you can code at a very high level without developing that particular skill subset, with the nearly inevitable result of "leaky" rust code that's otherwise quite nice).
I don't know where you got this idea but this is wrong.
> you use the item longer, you have a longer implicit lifetime. The memory it references will also be claimed for longer
This is only true at the stack-frame level: you cannot extend a variable's life beyond that, and there's no difference from a GC in that particular case (a GC will never collect an object while a variable on the stack still points to it).
> You only fix that by explicitly considering lifetimes as you write your code
Lifetime parameters don't alter the behavior; they are either mandatory when the situation is ambiguous, or entirely redundant. And they only work at function boundaries, where lifetime extension cannot happen anyway (see above).
Please stop spreading bullshit criticisms that have no grounding in reality (Rust has real defects like everything else, but spreading nonsense isn't OK).
If you heap allocate something and store a reference to it, Rust will absolutely keep that object alive till all such references are gone. It's not "bullshit [with] no grounding in reality."
That property is, yes, what you would see in a GC language (e.g., Entity Framework back in the day in C# would have a leak if you tried to create short-lived objects from long-lived objects using the built-in DI, since EF had a bug maintaining a strong reference to those objects tied to the lifetime of the outer object). My observation is that people tend to do that much more in Rust than in GC languages because the compiler makes other coding patterns harder.
Rust does not keep variables alive until references are gone. It checks to make sure that any reference lives as long as or shorter than its referent. If the reference is longer lived than the referent, that’s a compile error.
Can you share a rust playground link with an example of what you are describing?
I still don’t follow how a reference can extend the lifetime of some memory, even on the heap.
Sounds like you are describing a language that uses reference counting to manage memory, and I’m not sure what purpose a borrow checker would serve in a reference counted language.
I did, because that's the only thing that can be kept alive longer than needed (till the end of the scope where the variable is defined).
> If you heap allocate something and store a reference to it, Rust will absolutely keep that object alive till all such references are gone.
No it doesn't.
And it shows you have no understanding of how Rust works. A GC will do that, but the borrow checker won't, it cannot, all it can do is to yell at you so that you make your variable live as long as it needs.
Why doesn't anyone rewrite some of these tools in a language that can be compiled to a native binary by GraalVM and benefit from all of Java's security guarantees?
Would it be too slow? Don't the advantages outweigh the drawbacks?
Why go for Java when you can go for .NET? Or Go? .NET seems to perform on par and seems to produce smaller executable, and Go seems to be faster in general.
Personally I don't really care what language common tools are written in (except for programs written in C(++), but I'll gladly use those after they've had a few years to catch most of the inevitable memory bugs).
I think the difference is that there aren't many languages with projects that actually end up writing full suite replacements. There's a lot of unexpected complexity hidden within simple binaries that need to be implemented in a compatible way so scripts don't explode at you, and that's pretty tedious work. I know of projects in Rust and Zig that intend to be fully compatible, but I don't know if any in Java+GraalVM or Go. I wouldn't pick Zig for a distro until the language hits 1.0, though.
If these projects do exist, someone could probably compare them all in a compatibility and performance matrix to figure out which distribution is the fastest+smallest, but I suspect Rust may just end up winning in both areas there.
Why use Java when you can use Rust? In all seriousness, Rust is a joy to work with for these kinds of tools, which typically don't have complex lifetimes or ownership semantics.
On top of that you get better performance and probably smaller binaries. But I would pick Rust over Java for CLI tools just on the strengths of the language itself.
The claim was “leaks are hard to code by accident”. I agree with gp that this is false.
Preventing leaks is explicitly not a goal of Rust, and making lifetimes correct often involves giving up and reaching for unnecessary 'static lifetimes. I see this all the time in async RPC stream code.
I think leaks are indeed harder to code by accident in non-async Rust, which was the original setting Rust was developed in. I wouldn't say it is absolutely hard (that really depends on the architectural complexity), but there seems to be some truth in that claim.
Leaks are as hard to do in Rust as with a GC though.
That is, they aren't impossible and you'll eventually have to fight against one nasty one in your job, but it's far better than without a GC or borrowck.
Cycles with reference-counted pointers are how leaks happen in Rust, and that failure mode is specific to Rust; but GCs have their own sources of leaks, and in my experience those occur roughly as often as cycles do in Rust (rarely, but not exceptionally so).
When a type is 'static, it doesn’t mean it’s leaked for the lifetime of the program. It just means it owns everything it needs and doesn’t borrow anything. It will still free everything it owns when it is freed.
What your link says is correct about `&'static T`, a lifetime on a reference. What GP says is correct about `T: 'static`, a type constraint. They both use the "keyword" 'static, but in different senses.
`'unbounded` isn't a bad idea. On my team, when someone new to Rust hits the 'static type constraint, I usually give them an overly simplified "it is not a reference", followed up with "the data does not have any constraints (or bounds) on its lifetime that this code needs to worry about."
I don't know if `unbounded` would immediately be understood by someone new to Rust, but it captures that `T: 'static` is a negating assertion. I.e. it largely says what it can't be.
Unless one is writing something where pauses are a no-go, even tiny µs-scale ones, I don't see a reason to rush toward affine type systems, or variations thereof.
CLI applications are exactly a case where it doesn't matter at all; see Inferno and Limbo.
I genuinely don't understand why this is the top comment, as it is almost complete BS and the author confuses the behavior of the borrow checker with the one of a GC (and ironically claim a GC would solve the problem when in fact the problem doesn't exist outside of a GC's world)
There's no confusion between what a borrow checker and a GC do. The borrow checker enforces safety by adding constraints to the set of valid programs so that it can statically know where to alloc/dealloc, among other benefits. A GC dynamically figures out what memory to drop. My claim is that those Rust constraints force you to write your code differently than you otherwise would (since unsafe is frowned upon, this looks like hand-rolled pointers using vecs as one option fairly frequently), which is sometimes good, but for more interesting data structures it encourages people to write leaks which wouldn't exist if you were to write that same program without those constraints and thus wouldn't exist in a GC language.
> There's no confusion between what a borrow checker and a GC do
There is: with Rust you cannot extend the lifetime of an object by keeping a reference to it, which is exactly the kind of thing that will happen (and cause leaks) with a GC.
> this looks like hand-rolled pointers using vecs as one option fairly frequently
Emphasis mine. Idk where you got the idea that this was something frequent. While it is an option that's on the table for managing data with cyclic references, and is indeed used by a bunch of crates doing that kind of work, it's never something you'd have to use in practice.
(I say that as someone who's been writing Rust for ten years in every possible set-up, from embedded to front-end web, and who's been teaching Rust at university).
> the pattern of allocating a vec as pseudo-RAM, using indices as pseudo-pointers, and never freeing anything till the container itself is unused
Are you talking about hand-rolled arena allocation? I don't see how a GC language would have a different behaviour as long as you also use arena allocation and you keep a reachable reference.
> There's nothing wrong with those techniques per se, but the language tends to paint you into a bit of a corner if you're not very good and very careful, so leaks are a fact of life in basically every major Rust project I've seen not written by somebody like BurntSushi
If I take 3 random major Rust projects like Serde, Hyper, and Tracing, none of which are written by BurntSushi, your claim is that they all suffer from memory leaks?
I wouldn't be surprised if that style of leak were more prevalent than one would expect. It's pretty subtle. But that link is the only such instance I'm aware of it happening to such a degree in crates I maintain. Maybe there are other instances. This is why I try to use `Box<[T]>` when possible, because you know that can't have extra capacity.
I find the GP's overall argument specious. They lack concrete examples.
Curious why you would call that a memory leak? The memory is still accounted for in the Vec and will get released properly when it deallocates, right? This looks like optimizing memory usage to me, not plugging a leak.
If you sit down and really think about it, I think you'll find that a precise definition of a leak is actually somewhat difficult.
I am nowhere near the first person to make this observation.
I point this out to avoid a big long thread in which we just argue about what the word "leak" means. You could absolutely define "leak" in a way that my example is not a leak. But I prefer a definition in which "leak" does include my example.
I do not care to litigate the definition. If you want to recast my example as a "space" leak instead of a "memory" leak, then I don't object, and I don't think it changes the relevance of my example. (Which I think is absolutely consistent with the context of this thread.) In particular, I don't think "memory leak" in this thread is being used in a very precise manner.
Short of using a different data structure, I'm not sure how you would get out of that one. The claim was that some of these leaks ("leaks"?) could be avoided by using a language with a GC. As far as I know, most modern languages' equivalent of Vec will do exactly the same thing, GC or not.