
Pre-Pooping Your Pants with Rust - Manishearth
http://cglab.ca/~abeinges/blah/everyone-poops/
======
PieSquared
I hope that the Rust devs end up doing the "right thing", which from this
article seems like going back and fixing Rc, instead of just marking
mem::forget as safe. They're still pre-1.0, and anything they do now they will
be stuck with for a long, long time, especially if Rust succeeds as much as I
hope it does.

Delaying 1.0 by a few weeks may seem like a big deal, but ultimately it is a
_self-imposed_ deadline. It's great to have those, but following them
dogmatically might not be the best strategy. In cases like this, I generally
lean towards slowing down and doing things _right_ : otherwise you will pay
the price ten times over later.

(That said, while I understand this issue, I don't know very much about the
context in the Rust community, so I'm not actually sure that mem::forget
_should_ be unsafe. It was just the impression from the article and from
previous Rust code I've read/written.)

~~~
Manishearth
I'd prefer the `Leak`-based solution too, but it's going to cause a _lot_ of
churn. 1.0 has already been delayed often: every time it gets planned, it is
then not followed through on because the date is fuzzy and ignorable. This time
they've set a concrete no-nonsense date which we should follow through on IMO.
We already had a second alpha.

~~~
krick
I guess it's better that way than saying something broken is "ok" only because
"I promised myself that I'll finish by next Monday". It's more like
Ubisoft-style, or something, except deadlines really matter for them, because
money depends on it, and they cannot just "move the deadline" because all
the advertising must be timed, people tend to buy games and go to the cinema
more on specific dates, etc.

For a project like Rust the deadline doesn't actually matter that much;
grasping for it is almost stupid. The only thing that can actually suffer from
breaking it is self-esteem, which shouldn't worry a reasonable man much. But
the magic "1.0" number does matter a bit more than just the deadline, because
after it there are no "breaking changes".

So I'd be much happier if Rust didn't reach 1.0 for the next 2 years, but
actually became satisfying instead. Somehow "non-stable but working" is
better suited for making software than "broken and stable". We have plenty of
"broken and stable" out there already.

~~~
kibwen
Rust is not seeking pure programming perfection, it is seeking to be useful
and to fulfill its goals of safe systems programming, and it is fulfilling
these goals with flying colors. There's absolutely no reason for this to push
the release date. There was a bug in a stdlib API, and that issue was fixed
weeks ago. Trying to pivot the language by taking a fundamental feature back
to the drawing board for nebulous benefit would be utter foolishness at this
point.

------
garethrees
There is a general problem with combining destructors and automatic collection
of reference cycles, and it's one that several languages have unexpectedly
found themselves facing.

For example, Java has the function System.runFinalizersOnExit, but if you look
at the documentation you'll see that it now says, "Deprecated. _This method is
inherently unsafe. It may result in finalizers being called on live objects
while other threads are concurrently manipulating those objects, resulting in
erratic behavior or deadlock._ "
[http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.h...](http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#runFinalizersOnExit\(boolean\))

The problem is discussed in detail in Hans Boehm's 2002 technical report
“Destructors, Finalizers, and Synchronization”.
[http://www.hpl.hp.com/techreports/2002/HPL-2002-335.pdf](http://www.hpl.hp.com/techreports/2002/HPL-2002-335.pdf)

Boehm's key points are: (1) When a destructor on an object O runs normally,
anything that O points to is still alive (because of the reference from O),
but when a cycle is collected, not everyone can go first: that is, all but one
destructors in the cycle have to run after one or more of their references are
dead. It is hard to write destructor code that is safe in all cases. (2) Any
destructor that needs to update a concurrently accessed data structure has to
take a lock, but destructors on cycles run asynchronously with respect to the
rest of the program and so an unlucky timing leads to deadlock.

If this problem were better appreciated then language designers wouldn't go
down the rabbit hole of trying to figure out how to make destructors work
together with automatic collection of reference cycles, and would instead try
the alternative approach of providing mechanisms for the program to run
destructors synchronously.

In the Memory Pool System we use a message-passing interface:
[http://www.ravenbrook.com/project/mps/master/manual/html/top...](http://www.ravenbrook.com/project/mps/master/manual/html/topic/finalization.html)

------
charlieflowers
Call Scott Meyers.

So you cannot safely assume your destructors will be called, and you need to
proactively take steps like this in case they aren't. Nothing in the natural
use of the language cues you to do this, so you need a moral guide like
"Effective C++" for Rust.

This was the kind of accidental complexity that Rust had hoped to avoid. It
seems like Rust _has_ avoided a lot of it. But I guess it was entirely too
optimistic to hope Rust could avoid all of it.

~~~
Gankro
Note that this is only something you _really_ need to think about when writing
`unsafe` code, which basically unlocks a whole new set of rules that you need
to worry about (zero-sized types in offsets, manual allocations, safety
boundaries, GEP/aliasing rules, uninit memory, double destructors, etc). This
is a drop in the bucket that anyone venturing to use `unsafe` will have to
face in some kind of "guide to unsafety".

~~~
mcguire
The problem is that all of this is _safe_ code. The problem, to my mind, is
that you _can write mem::forget_ (as in the code here's safe_forget) _without
ever using the unsafe keyword_.

Edit: I was wrong. The problem isn't with forget. This is hairy.

~~~
steveklabnik
The 'mem::forget can be written in safe code' meme comes from
[https://github.com/rust-lang/rust/issues/24456](https://github.com/rust-lang/rust/issues/24456)

Which, as you can see, uses Rc, which uses unsafe code in a way that leads to
the soundness bug. So that's not strictly true, or rather, doesn't really
change anything, as it relies on the same bug.

~~~
mcguire
nikomatsakis, in that thread:

" _1\. There were other ways to forget even before Rc, at least in some cases.
For example, if T:Send holds, you could send the value to a thread that runs
an infinite loop, or which is deadlocked on a port._

" _2\. It is true that one cannot assume that a destructor will run, and hence
that forget is not itself unsafe (rather, it is unsafe to write a dtor that
must run)...._ "
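Point 1 is easy to picture. What follows is only a minimal sketch, not code from that issue: `Noisy` and `forget_via_thread` are made-up names, and it assumes `T: Send + 'static` so that `thread::spawn` will accept the value. The value moves into a thread that parks forever, so its destructor never gets a chance to run.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical type that records whether its destructor ran.
struct Noisy(Arc<AtomicBool>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical helper: "forgets" a value using only safe code by moving it
/// into a thread that never finishes.
fn forget_via_thread<T: Send + 'static>(value: T) {
    let _handle: thread::JoinHandle<()> = thread::spawn(move || {
        let _held = value; // owned by this thread from now on
        loop {
            thread::park(); // sleeps; never returns from the closure
        }
    });
}

fn main() {
    let dropped = Arc::new(AtomicBool::new(false));
    forget_via_thread(Noisy(dropped.clone()));
    // The destructor has not run, and never will while the process lives.
    assert!(!dropped.load(Ordering::SeqCst));
}
```

The parked thread is simply killed when the process exits, which is exactly the point: nothing ever drops the value.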

And alexcrichton:

" _I commented on #24292, but the gist is that there are multiple ways to leak
memory today (e.g. #14875 and #16135), so a targeted solution at Rc may not
cover all use cases. Although as I mention in #24292 these other bugs can also
be considered separate bugs on their own which need to be fixed regardless
(but sometimes is quite difficult to do so)._ "

I especially liked dgrunwald's comment:

" _A safe mem::forget has the advantage that it makes it easier to write the
counterexample proving thread::JoinGuard unsafe. Safe mem::forget makes it
more likely that people will know that destructors are not reliable, so they
can avoid repeating this mistake._ "

But then, "Put a big freaking spike in the middle of the steering wheel and
get rid of the airbags and seat belts" has always appealed to me as an
automotive safety approach.

~~~
steveklabnik
Yeah, I mean, I guess that leaks in general are possible, which would still
let you write `forget`. You're right about that.

But the Rc unsoundness bug still relies on unsafe code which was written
incorrectly.

~~~
sirclueless
I think you need to split the unsoundness bug, which happens because
thread::scoped uses unsafe code, from the "you can write mem::forget in safe
code" bug, which is arguably not a bug but is at least an enormous footgun
waiting to happen.

When the footgun goes off in unsafe code in the standard library, you get use-
after-free and memory unsoundness. When the footgun goes off in safe code
written by mere humans, you leak arbitrary resources.

------
VeejayRampay
A quick observation from an outsider: The article goes into a lot of depth,
seems highly technical, well crafted and researched, but the whole "pre-poop
your pants" thing just ruins all that and makes it look childish. I
understand that not everything about development has to be serious, but I also
think that aligning form and content is the surest way to make sure your
message comes across.

~~~
mrbig4545
I couldn't agree more. The article is well written, but the title puts me off
a lot.

How am I going to show this to my boss, and have him take it seriously with
such an infantile title?

~~~
nosefrog
You have the right to your own feelings, and I agree that the title may be
off-putting. The audience for the article is rust core contributors and rust fans,
and they're a very informal bunch. If rusters are like similar communities I
know, talking about poop makes a blog post _more_ likely to spread.

EDIT: I do not want to argue with Dewey2, so I'm putting this here: the women
in the rust community that feel uncomfortable when they are addressed as
"guys" have _every right to feel that way_ , and I commend the rust community
on making that stand.

~~~
kzrdude
There's nothing to suggest all other rust devs like this way of presenting it.
I don't. I agree it's off-putting even if the issue itself is interesting.

------
zenojevski
Why on Earth are we still using such unsafe languages?

If the author had written this in a modern, safe language like Rust this would
not have happened.

~~~
Ygg2
Funny.

But the problem here is complex and hard to solve. Cycles and memory leaks are
nigh impossible to solve completely without sacrificing either the ability to
create custom data structures or no-GC-by-default. Either sacrifice was a
non-starter for Rust.

~~~
krick
Well, I'm still not quite proficient with Rust, so maybe my feelings are
misguided, but the whole situation seems really terrifying to me. Like, far
too terrifying for us to just make a point about "not doing that again" and
calmly move on to 1.0.

After all, why Rust? Because we're all tired of problems with C++ and stuff.
And the point isn't really about making a language that would be safe "as
often as possible"; that would be quite a terrible goal, actually. The point is, I
believe, to make it nearly impossible to write unsafe code _accidentally_ ,
without even noticing it.

And again, maybe it's just me, but what I understood from that post is totally
non-intuitive to me. I don't really understand what the guidelines are and how
many more gotchas like that there could be. "Not to mess with unsafe code" is
not really a good guideline, because it turns out that it can be not quite
obvious that the code _is_ unsafe.

~~~
Ygg2
Well, the thing is, until Rust is formally verified, as in a subset of Rust is
proven in Coq to be safe, you don't have any guarantees.

I assume much of it is safe, with probably a few caveats hidden in more
complex logic. In C++ writing code that causes this behavior is trivial; here
you need to jump through a lot of hoops. So it offers you comparatively more
safety, but bugs of this kind are a potential gold mine for crackers.

Anyway core developers are on the scene deciding how to tackle this issue.

~~~
krick
Oh, sure, but let's be reasonable. If it's something like the notorious sort
problem in Java, I'd say it's ok. It's preferable when things are simple
enough that somebody with a good understanding of the whole system can make
assumptions about its safety without using Coq. If things are complicated
enough that we actually need Agda/Coq/Idris to feel safe, it seems that things
went really wrong at some point.

> but this kind of bugs are potential gold mine for crackers

Exactly! And I'm just saying that a problem which is "not very likely to
trigger, but is very hard to find if it happens" is far, far more dangerous than
a problem which is "easy for a novice to miss, but every experienced developer
would notice".

~~~
Ygg2

> If things are complicated enough that we actually need Agda/Coq/Idris to
> feel safe, it seems that things went really wrong at some point.

I disagree. Complex software/hardware is by its nature full of bugs. What Rust
set out to do is complex and it's bound to have bugs, but that doesn't mean
it's a reason to abandon it, any more than we should discard the LHC or ITER
because they had issues starting up.

Whether it's a wrong invariant in TimSort, a really obscure multi-threading
case, or a weird glitch in a time library, it's still an error. Each one is
exploitable.

ALL software could use some coverage/tests and theorem proving. I just wish
Rust started with a proven safe subset.

    
    
> Exactly! And I'm just saying that a problem which is "not very likely to
> trigger, but is very hard to find if it happens" is far, far more dangerous
> than a problem which is "easy for a novice to miss, but every experienced
> developer would notice".

Again, I disagree. Security is a matter of making the assailant less likely to
breach your software. If you take two comparable pieces of code, the one with
more holes will take less effort to break.

If your language offers fewer avenues of exploit, it makes these exploits more
expensive. A rare bug is worth more than a run-of-the-mill bug, which limits
the number of assailants who could purchase it, which in turn reduces the
number of potential attackers.

------
Gankro
Boring edition for people who have feels about maturity:
[http://cglab.ca/~abeinges/blah/everyone-peaches/](http://cglab.ca/~abeinges/blah/everyone-peaches/)

( people getting mad about PPYP is really a victory, though; the second last
bullet was the most important:

> It's vaguely incoherent and meaningless on its own: this is a property
> inherited from its connection to the venerable Resource Acquisition Is
> Initialization (RAII) pattern.

I'm mad about RAII and not going to take it anymore! )

------
sirclueless
_Putting stuff in a HashMap that you'll never ask for again is a kind of
leak. Allocating stuff on the stack and then looping forever is a kind of
leak. The case where you put it on the heap and forget to ever free it is a
specific kind of leak that people seem particularly terrified of for whatever
reason._

I too am particularly terrified of this kind of leak. The reason is that it is
the only kind of leak mentioned that is not clearly a programmer mistake.

C++ has a whole class of "mistakes" that one cannot be aware of simply by
reading the language specification. Common ones such as writing a constructor
and destructor but not a copy-constructor are now compiler warnings, but many
will pass under the radar. As a result, to write correct code one needs to be
aware of multiple conventions and rules and patterns that are external to the
language. When reviewing code one needs to do more than observe that it
compiles and accomplishes its stated task, there are probably several
organizational standards that need to be satisfied as well to be safe.

This is precisely the kind of thing that Rust was supposed to solve. The goal
was to have a language that didn't require an "Effective <xyz>" book to be
memorized to do code review. A language where any rule like "You should always
free resources in advance in case the destructor doesn't run" is captured
statically by a compiler error.

So yes, I am particularly put off by this error.

~~~
pcwalton
> A language where any rule like "You should always free resources in advance
> in case the destructor doesn't run" is captured statically by a compiler
> error.

This only shows up if you're specifically opting into unsafe code by using the
"unsafe" keyword. Rust doesn't (yet) try to ensure safety of unsafe code,
because it's unsafe. As long as you don't step into unsafe code, you can
ignore this entire article.

~~~
mcguire
Which unsafe code is that? In the article, I don't see any use of "unsafe" in
safe_forget[1] or either main[2] in the first part of the article.

[1] Which creates a reference-counted cycle containing the thing-to-be-
forgotten, ensuring the finalizer is never called.

[2] The first main simply demonstrates that the destructor is never called,
which is ok in itself but causes problems when you're doing smart things in
the finalizers, as in the second main, where the code uses the destructors to
ensure the threads are terminated and the programmer is assuming the borrow
checker is going to warn them if the threads aren't terminated.
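For readers who have not opened the article, the cycle trick in [1] can be sketched roughly as follows. This is my own minimal reconstruction, not the article's exact code: `Leaker` and `Noisy` are illustrative names, and the only thing I'm relying on is that an `Rc` which holds a strong reference to itself is never freed.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Node that can hold a strong reference to itself, plus the value to leak.
struct Leaker<T> {
    me: RefCell<Option<Rc<Leaker<T>>>>,
    _payload: T,
}

/// Leaks `value` without the `unsafe` keyword appearing in this function.
fn safe_forget<T>(value: T) {
    let node = Rc::new(Leaker {
        me: RefCell::new(None),
        _payload: value,
    });
    // Close the cycle: the strong count can never reach zero now.
    *node.me.borrow_mut() = Some(node.clone());
}

/// Illustrative type that records whether its destructor ran.
struct Noisy(Rc<Cell<bool>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

fn main() {
    let dropped = Rc::new(Cell::new(false));
    safe_forget(Noisy(dropped.clone()));
    assert!(!dropped.get()); // destructor was skipped

    drop(Noisy(dropped.clone()));
    assert!(dropped.get()); // an ordinary drop still runs it
}
```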

~~~
steveklabnik
[https://github.com/rust-lang/rust/blob/master/src/liballoc/r...](https://github.com/rust-lang/rust/blob/master/src/liballoc/rc.rs#L408)

~~~
mcguire
Making mem::forget safe makes that point moot.

[https://github.com/rust-lang/rfcs/pull/1066](https://github.com/rust-lang/rfcs/pull/1066)

~~~
steveklabnik
Forget isn't the reason that that block is unsafe.

~~~
mcguire
Right. That block is unsafe because it's doing unsafe things. (ptr::read and
alloc::heap::deallocate)

But the use of Rc is kind of a red herring; if mem::forget is marked safe,
then you don't need safe_forget because you can forget things (fail to execute
their destructors, specifically) safely anyway.

The problem is that destructors aren't guaranteed; the bugs in the thready
thing (and potentially unbounded other things) are symptoms. Drop needs big,
red warning signs.

The power of a notation, and a type system, is in what it lets you _not_ think
about. The fact that destructors may not be called _is_ , unfortunately,
something you have to think about.
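A tiny sketch of what that means in practice, assuming a compiler where `mem::forget` is a safe function (it is in stable Rust today). `Guard` and `count_drops` are names invented for the example.

```rust
use std::cell::Cell;
use std::mem;

/// Guard whose destructor bumps a shared counter.
struct Guard<'a>(&'a Cell<u32>);

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns how many destructors ran for a single guard.
fn count_drops(forget_it: bool) -> u32 {
    let drops = Cell::new(0);
    let guard = Guard(&drops);
    if forget_it {
        mem::forget(guard); // no `unsafe` block needed, destructor skipped
    } else {
        drop(guard); // destructor runs here
    }
    drops.get()
}

fn main() {
    assert_eq!(count_drops(false), 1);
    assert_eq!(count_drops(true), 0);
}
```

Anything that relies on `Guard::drop` having run for correctness, rather than just for tidiness, is the kind of code that needs the warning signs.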

------
aturon
I want to clarify a couple of points:

Most important, _this API problem does not break Rust's basic safety
guarantee_. That is, you must write _unsafe_ code to violate memory safety.

_If Rust's basic memory safety guarantee were at risk, the core team would
absolutely consider delaying the release, or doing whatever else was needed to
address it!_ We must never allow memory safety violations in safe code, no
matter how obscure the bug that leads to them.

So this is a question of writing "safe" APIs that hide uses of `unsafe`. What
precise assumptions are you allowed to make within such code?

As it is, we are taking this API issue very seriously, and making sure that we
explore all of the options available. We have already yanked the affected
APIs, while we determine the best way forward.

There are known ways to safely re-introduce the (relatively few) places in the
standard library where unsafe code was using this pattern. Gankro talks about
one in the post; you can see a proposal here
([https://github.com/rust-lang/rfcs/pull/1084](https://github.com/rust-lang/rfcs/pull/1084))
for the `scoped` API.
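To give a feel for the closure-based shape such an API can take, here is a sketch using `std::thread::scope`, which exists in the standard library of much later Rust releases. I am not claiming it matches the linked proposal in detail; `sum_halves` is a made-up helper. The point is that the joins happen when the closure returns, not in a guard's destructor, so leaking a handle cannot let a thread outlive the borrowed data.

```rust
use std::thread;

/// Sums a slice on two threads that borrow from the caller's stack.
fn sum_halves(data: &[i32]) -> i32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().sum::<i32>());
        let r = s.spawn(|| right.iter().sum::<i32>());
        // Even without these explicit joins, `scope` waits for every
        // spawned thread before it returns.
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let data = [1, 2, 3, 4, 5];
    assert_eq!(sum_halves(&data), 15);
}
```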

The basic issue here is about a tension between a couple of different APIs, as
Gankro explained in the post. In particular:

\- `Rc` has been part of Rust for a long, long time, and is a fundamental
systems programming tool.

\- APIs like `scoped` and `drain_range` would like to use RAII for ergonomics
and consistency, but this is either not possible (`scoped`) or subtle
(`drain_range`) given `Rc`. _In either case, these APIs use `unsafe`
internally, so it's all about what that unsafe code can assume_.
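The "subtle" case can be illustrated without any `unsafe` at all. The sketch below is an analogy under loud assumptions: `Queue`, `DrainAll` and `drain_all` are hypothetical names, and real draining code in the standard library works on raw memory rather than `mem::take`. What it shows is the ordering: the owner is put into a valid state before the guard is handed out, so a leaked guard loses elements but never leaves the owner inconsistent.

```rust
use std::mem;

/// Hypothetical owner of some items.
struct Queue {
    items: Vec<String>,
}

/// Hypothetical draining guard; keeps the queue mutably borrowed while alive.
struct DrainAll<'a> {
    _owner: &'a mut Queue,
    taken: std::vec::IntoIter<String>,
}

impl Queue {
    fn drain_all(&mut self) -> DrainAll<'_> {
        // Empty the queue up front, so nothing depends on the guard's
        // destructor running later.
        let taken = mem::take(&mut self.items).into_iter();
        DrainAll { _owner: self, taken }
    }
}

impl<'a> Iterator for DrainAll<'a> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        self.taken.next()
    }
}

fn main() {
    let mut q = Queue { items: vec!["a".to_string(), "b".to_string()] };
    let got: Vec<String> = q.drain_all().collect();
    assert_eq!(got.len(), 2);

    q.items.push("c".to_string());
    mem::forget(q.drain_all()); // leaks "c", but the queue is still valid
    assert!(q.items.is_empty());
}
```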

Furthermore, there is not a strong consensus about how best to resolve these
issues, even if we had all the time in the world. Currently there are at least
three camps:

\- Stick with the status quo (perhaps marking `mem::forget` safe to reflect
that reality). Safe code can assume no leakage, but unsafe code needs to take
extra care (and sometimes avoid the RAII pattern). This is already the case
for a large number of other properties, as Gankro points out. _Leakage in safe
code can never lead to memory unsafety_ ; it is just a vanilla bug, and Rust
doesn't prevent you from making bugs.

\- Introduce something like `Leak`, meaning that you have to explicitly ask
for a given type to be guaranteed not to leak through things like `Rc` cycles.
While that allows you to write APIs like `scoped` and `drain_range` easily
using RAII, you need to be aware of the marker, and make sure to use it for
such types. Worse, though, is the interaction with trait objects: depending on
the design, you may have to write `Box<MyTrait + Leak>` to be able to store
the trait object behind an `Rc`, and that `Leak` bound needs to be present for
the entire chain of APIs leading up to that point.

\- Restrict `Rc` in some way, perhaps to `'static` data, thereby ruling out (a
class of) leaks for all types. It's not completely clear how much fallout this
would involve. The compiler currently relies on non-`'static` reference cells,
and this change would likely force channels to use a `'static` bound as well,
thereby defeating much of the purpose of the `scoped` API in the first place.

_Finally, it's worth noting that at least one version of the `Leak` proposal
can be added backwards-compatibly, later on, so there is potentially plenty of
time to explore that approach if we feel the complexity is worth it._

I believe that Niko Matsakis (also on the core team) is planning to write a
blog post explaining all of the above in much greater detail.

------
mcguire
Largely unrelated question: In the code about a third of the way into the
article, there is,

    
    
        impl<'a> Drop for Foo<'a> {
            fn drop(&mut self) {
                *self.0 += 1;
            }
        }
    

What does the line "*self.0 += 1;" do? Specifically, the ".0"? Foo is
basically a reference to an i32, so I would have thought that would be "*self
+= 1;".

~~~
steveklabnik
The .0 is a tuple access. Foo is a 'tuple struct' of length 1:
[http://doc.rust-lang.org/nightly/book/tuple-structs.html](http://doc.rust-lang.org/nightly/book/tuple-structs.html)
(though the syntax is explained in the section on tuples:
[http://doc.rust-lang.org/nightly/book/primitive-types.html#t...](http://doc.rust-lang.org/nightly/book/primitive-types.html#tuple-indexing))
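A small self-contained version, in case it helps. The struct definition is my assumption about what the article's `Foo` looks like (a one-field tuple struct wrapping `&mut i32`), and `bump_on_drop` is a made-up helper.

```rust
/// Assumed shape of the article's `Foo`: a tuple struct with one field.
struct Foo<'a>(&'a mut i32);

impl<'a> Drop for Foo<'a> {
    fn drop(&mut self) {
        // `self.0` is the first (and only) field, a `&mut i32`;
        // the leading `*` dereferences it to reach the integer.
        *self.0 += 1;
    }
}

/// Creates and drops `times` values of `Foo` over the same integer.
fn bump_on_drop(times: u32) -> i32 {
    let mut x = 0;
    for _ in 0..times {
        let _foo = Foo(&mut x); // dropped at the end of each iteration
    }
    x
}

fn main() {
    assert_eq!(bump_on_drop(1), 1);
    assert_eq!(bump_on_drop(3), 3);
}
```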

~~~
mcguire
Ah! Cool. Thanks!

------
bgdnpn
I really like the title and the first sentence. "Much existential anguish and
ennui was recently triggered.."

------
radulemnaru
This title made my day!

