
Unsoundness in Pin - hu3
https://internals.rust-lang.org/t/unsoundness-in-pin/11311
======
withoutboats
(I was the primary designer of the Pin API.)

Since (to a first approximation) every individual who has the expertise and
contextual knowledge to really evaluate this issue is a poster on
internals.rust-lang.org, it's pretty surprising to find this thread on the
front page of Hacker News. I imagine some Hacker News users who upvoted this
link did so out of technical interest, but I suspect a large portion of the
attention comes from some combination of these misconceptions:

- The misconception that this could have a practical impact on users (the
code being discussed on the thread is all obviously pathological & contrived).

- The misconception that Rust's type system and standard library never
contain soundness issues and that this is an exceptional event (in fact we
have a number of longstanding soundness issues).

We have a policy of fixing all soundness issues, so this issue will be fixed.
In the meantime, while we decide the best solution, it will have no practical
impact on Rust users. And none of the solutions we are considering would
involve significant breakage to users, or invalidate any real code.

At a high level: the soundness issue occurs because the Pin API was designed
based on certain reasoning about the behavior of pointers. This reasoning
would be sound but for the fact that we have allowed certain exceptions in
relationship to pointers to what are called the "orphan rules" (which usually
enable local reasoning like this). These exceptions allow users to introduce
code which, while contrived, allows them to violate the guarantees of the Pin
API. Such is life.

~~~
temac
> The misconception that this could have a practical impact on users (the code
> being discussed on the thread is all obviously pathological & contrived).

Famous last words?

I mean I'm not an expert in Rust and even less in Pin, but I've seen my share
of theoretical bugs that were thought to have no possible impact in the real
world because they were too theoretical. In other areas, when you debug a _triple_
segfault and you understand the crazy conditions that lead to it, or when you
render a piece of C++ code _conforming_ instead of technically UB and it then
starts to crash when in its UB form it worked perfectly, you start to consider
that everything is possible :)

~~~
withoutboats
This is exactly the kind of comment that makes me dislike Hacker News so
strongly, because I find it endemic here.

The idea that someone could think an issue is theoretical and then discover it
is practically significant is obvious and no insight at all - it reduces to
"people are sometimes wrong." I am declaring based on my significant relevant
expertise that this issue is not practically important. Your comment
contributes nothing but baseless contradiction.

------
mauricioc
Simon Peyton Jones's initial intuition [0] was that Rust would be full of
soundness holes due to not starting with a full formalization. Derek Dreyer,
Ralf Jung and others in the RustBelt project [1] did amazing work in
formalizing Rust's safety guarantees, showing that Peyton Jones's guess was not
accurate.

There is nothing wrong about having bugs, of course, but the reaction to the
bug in this thread shows that mathematical correctness is not as universally
valued as I thought it would be. I agree with the sentiment that this is a
bigger issue that shouldn't be discussed in a technical thread about a
specific bug. However, after reading this, it is unclear to me whether
mathematical correctness is regarded in the Rust project as an explicit goal,
an explicit non-goal or an unessential nice-to-have. (I do not mean to
insinuate anything with the word "unclear", as I believe all three options are
valid and appeal to different use cases. Almost all of the popular languages
don't care about this, for instance.)

[0] [https://youtu.be/t0mhvd3-60Y?t=130](https://youtu.be/t0mhvd3-60Y?t=130)
[1] [https://plv.mpi-sws.org/rustbelt/](https://plv.mpi-sws.org/rustbelt/)

~~~
steveklabnik
> However, after reading this, it is unclear to me whether mathematical
> correctness is regarded in the Rust project as an explicit goal, an explicit
> non-goal or an unessential nice-to-have. (I do not mean to insinuate
> anything with the word "unclear", as I believe all three options are valid
> and appeal to different use cases.)

It's a balance. We cannot drop everything, nor stop all development, in order
to get proofs. However, we are actively working toward formalizing the
language itself, because we do see value in it.

~~~
mauricioc
Totally reasonable stance, in my opinion, since Rust moves quickly and
formalizing things takes time. But what is the stance on soundness bugs? Are
they sometimes acceptable as permanent parts of the language, or is it a goal
to eventually fix any soundness bugs that show up?

~~~
steveklabnik
Soundness bugs are the only thing we make an exception in our stability policy
for. In general, unsoundness is treated as a serious issue. However,
"unsoundness" can mean a few different things; in the "semantics of the
language" sense, they're taken very seriously. In the "there's a compiler bug"
sense, they're taken seriously, but in accordance with how likely they are to
affect end users, and how much work they are to fix. For example, an incorrect
type signature in the standard library was fixed one day after release by
changing it to the correct one and re-releasing; an issue where casting
between floating points and integers causes some problems has still not been
fixed, because it partially has to do with LLVM's semantics combined with our
semantics, and it's not clear how to do the right thing while not seriously
regressing performance. (That being said that particular issue has had a
flurry of activity in the past week...)

The GitHub label for these bugs is A-Unsound, which has the description "A
soundness hole (worst kind of bug), see:
[https://en.wikipedia.org/wiki/Soundness](https://en.wikipedia.org/wiki/Soundness)".

~~~
mauricioc
Compiler bugs are not what I had in mind, so this answer is very reassuring!
The original thread led me to believe this was not an absolute, official
policy of the Rust project, but I am happy to hear it is.

~~~
steveklabnik
For more details, here's the written part of this:
[https://github.com/rust-lang/rfcs/blob/master/text/1122-lang...](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md#soundness-changes)

Please note that threads on internals are able to be posted to by anyone;
things expressed there may not represent the way that the team feels about
things.

------
est31
For the uninitiated: Rust differentiates between safe and unsafe code (safe
being the default and unsafe needing the unsafe keyword). The unsafe code is
like C/C++: it's more powerful than safe code, allowing you to do more things,
including allowing you to trigger undefined behaviour. Any mistake can possibly
mean UB with all the consequences of crashes, wrong behaviour and nasal
demons. Safe Rust on the other hand, is not supposed to be allowed to trigger
such UB, no matter what you write. If you _are_ able to trigger UB in 100%
safe code, it's considered a bug in the language, an unsoundness bug.
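
To make the safe/unsafe split concrete, here is a minimal sketch (the function names `safe_get` and `unchecked_get` are made up for illustration). The safe version cannot cause UB whatever index it is given; the unsafe version skips the bounds check and relies on the caller to keep a documented promise.

```rust
/// Safe Rust: an out-of-range index is reported as `None`
/// instead of reading memory it should not.
fn safe_get(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

/// Unsafe Rust: no bounds check is performed.
///
/// # Safety
///
/// `index` must be less than `values.len()`, otherwise this is UB.
unsafe fn unchecked_get(values: &[i32], index: usize) -> i32 {
    // SAFETY: the caller promised that `index` is in bounds.
    unsafe { *values.get_unchecked(index) }
}
```

Calling `unchecked_get` requires an `unsafe { ... }` block at the call site, which is where the programmer takes over responsibility from the compiler.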

Note that it was known that Pin/Generators had soundness problems when you
turned on noalias optimization flags [1]. They have however decided to
stabilize it without fixing those problems as they could be fixed afterwards
as well. I'm not entirely sure yet about the impact of this one, will have to
read the thread I guess.

[1]: [https://github.com/rust-lang/rust/issues/62149#issuecomment-...](https://github.com/rust-lang/rust/issues/62149#issuecomment-521149328)

------
jeltz
Personally I am not convinced Pin is a good idea as it works now. Not because
of this soundness bug but because it is too hard to understand. Just look at
how much explanation is required in the manual and how large the code examples
are.

[https://doc.rust-lang.org/beta/std/pin/index.html](https://doc.rust-lang.org/beta/std/pin/index.html)

~~~
kibwen
Agreed that `Pin` is imposing by dint of trying to provide "more fundamental"
guarantees than Rust was originally designed for (see my sibling comment on
this topic). Though ideally in the long run I'd hope that no regular user
would ever have to encounter it (right now I think it may show up in some
contexts related to futures, but as Rust's async story matures I would hope to
see that need diminish).

~~~
fluffything
Pretty much anybody writing `async` code, using `Futures`, etc. needs to
deeply understand `Pin`, pinning, etc.

Otherwise it is impossible to become productive with that part of the
language.

~~~
steveklabnik
End users will almost never need to deal with pinning whatsoever; they'll be
writing code with async/await.

~~~
the_mitsuhiko
Even with async await it’s not hard to run into it. The moment you end up
wanting to box a future you run into pin.
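
A rough sketch of what that looks like in practice, using only the standard library: the return type of a boxed future already has `Pin` in it. The names `boxed_answer`, `poll_once` and `NoopWake` are invented for this example, and the do-nothing waker stands in for a real executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Boxing a future for dynamic dispatch: `Pin` shows up in the signature.
fn boxed_answer() -> Pin<Box<dyn Future<Output = u32>>> {
    Box::pin(async { 42 })
}

/// Polls a boxed future exactly once, as an executor would.
fn poll_once<T>(mut fut: Pin<Box<dyn Future<Output = T>>>) -> Poll<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // `as_mut` reborrows the pinned box as `Pin<&mut dyn Future>`.
    fut.as_mut().poll(&mut cx)
}
```

Because the `async` block never awaits anything, a single poll is enough to finish it here; a real future may return `Poll::Pending` and need to be polled again.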

------
staticassertion
Unsoundness in Pin aside, I think the conversation between Boats and Ralf is
more interesting - are soundness issues going to be so strange and edgecasey
that they can be addressed as one-offs, or does Rust need a more formal model
to ensure edge cases are covered?

I have no answer, it's just the only interesting part of that discussion imo.
Everything else is just details of an issue that's too weird to really fully
understand for me, and not novel or interesting enough to invest into
understanding, anyway.

------
svnpenn
I am new to Rust, but this:

    
    
        for<'a, T: ?Sized> &'a T !: DerefMut 
    

Seems like a hunk of unreadable code. Is this valid Rust?

[https://internals.rust-lang.org/t/unsoundness-in-pin/11311/3...](https://internals.rust-lang.org/t/unsoundness-in-pin/11311/33)

~~~
Rusky
No, it's bits of valid syntax thrown together informally to express an idea.
There is no `!:` in Rust, and that `for` quantification never applies to types
or bounds that way.

~~~
bascule
`for` used in a bound (in conjunction with a lifetime) is the syntax for
Higher-Rank Trait Bounds (HRTB):

[https://doc.rust-lang.org/beta/nomicon/hrtb.html](https://doc.rust-lang.org/beta/nomicon/hrtb.html)

~~~
Rusky
Yes, but it's not used that way in this snippet...
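
For contrast, this is roughly what a `for<'a>` bound looks like when it is valid Rust. The function name `apply_to_local` is invented for this sketch; the bound says that `f` must work for every lifetime `'a`, including that of a string created inside the function.

```rust
/// `f` must accept a `&str` of *any* lifetime and return a `&str`
/// tied to that same lifetime.
fn apply_to_local<F>(f: F) -> usize
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    // `local` only lives inside this function, so no caller-chosen
    // lifetime could describe it; the higher-rank bound covers it.
    let local = String::from("  padded  ");
    f(&local).len()
}
```

Passing `str::trim` works because it has exactly that shape: it takes a `&str` and returns a slice borrowed from it.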

------
kibwen
For those who haven't heard of `Pin` before (which is reasonable even for Rust
users, for whom this is a rather obscure corner of the standard library): it's
essentially a wrapper type around a pointer that indicates that the value it
points to cannot be moved from its current location in memory.

Here's an analogy: Rust already differs from most languages in that the
behavior that you get "by default" when defining a new type is very limited.
For example, if I define a type `struct Foo`, then by default this type can't
be copied around: `let x = Foo; let y = x;` is a simple operation that in most
languages would involve copying memory, but in Rust involves a _move_ instead.
You can think of a move as a copy where the original value is no longer
accessible (whether or not any bytes in memory _actually_ get copied as a
result of a move is an implementation detail, and is (hopefully) often
optimized away). In order to make our type copyable, we would add an
implementation of the `Copy` trait.

The observation is that moving is a more fundamental operation than copying,
therefore, it is easier for types to opt _in_ to copying than to opt _out_.
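
A small sketch of that difference (the `Bar` type and the function name are made up for illustration): `Foo` has no `Copy` impl, so assignment moves it, while `Bar` opts in to copying.

```rust
/// No `Copy` impl: assigning a `Foo` moves it.
#[derive(Debug)]
struct Foo;

/// Opts in to copying, so assignment leaves the original usable.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bar(u8);

fn move_then_copy() -> (Bar, Bar) {
    let x = Foo;
    let y = x; // move: `x` is no longer accessible
    // println!("{:?}", x); // would not compile: use of moved value
    let _ = y;

    let a = Bar(1);
    let b = a; // copy: `a` is still accessible
    (a, b)
}
```

Uncommenting the `println!` line makes the compiler reject the function, which is the "original value is no longer accessible" part of a move.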

When Rust was released in 2015 this was a relatively radical concept for a
language targeting a mainstream audience (we're essentially describing Rust's
entire concept of ownership here, after all). But it turns out that it's
possible that Rust may not have been radical enough! Consider: what if you
want a type that not only can't be copied, but can't be _moved_ as well?

Before Rust 1.0 it wasn't clear that such a concept of "unmoveability" would
be generally useful. There were certainly cases, such as self-referential
structs, that could have benefited from such a concept, however making
moveability opt-in rather than opt-out would have required all types that _do_
want to be moveable to explicitly announce that fact (via something like
`#[derive(Move)]`), which is an annotation burden on all other code that must
be considered.

It wasn't until the async/await work that another, more critical need for
unmoveability was found: if the generators internally produced by `async` are
capable of moving, then that means that generators are incapable of containing
references, which means that async/await would become drastically less useful;
there would be an entire fundamental part of the language that simply couldn't
be used with it. Thus `Pin` was born in order to denote things that cannot be
moved (and whose design is itself a very, very long discussion).
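
To illustrate the kind of value being described, here is a hand-written self-referential struct, loosely modelled on the pattern in the `std::pin` documentation. It is only a sketch: the `SelfRef` type and its methods are invented here, and compiler-generated generators are more involved. The struct holds a pointer to its own field, so moving it would leave that pointer dangling; `Pin<Box<_>>` plus `PhantomPinned` rule the move out.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct SelfRef {
    data: String,
    /// Points at `data`; only valid while the struct stays put.
    data_ptr: *const String,
    /// Makes the type `!Unpin`, so safe code cannot move it out of a `Pin`.
    _pin: PhantomPinned,
}

impl SelfRef {
    fn new(text: &str) -> Pin<Box<SelfRef>> {
        let mut boxed = Box::pin(SelfRef {
            data: text.to_string(),
            data_ptr: std::ptr::null(),
            _pin: PhantomPinned,
        });
        let ptr: *const String = &boxed.data;
        // SAFETY: we only assign a field; the value is never moved
        // out of its heap allocation.
        unsafe {
            boxed.as_mut().get_unchecked_mut().data_ptr = ptr;
        }
        boxed
    }

    fn via_pointer<'a>(self: Pin<&'a Self>) -> &'a str {
        // SAFETY: `data_ptr` was set in `new` to point at `self.data`,
        // and pinning guarantees `self` has not moved since then.
        let text: &'a String = unsafe { &*self.data_ptr };
        text.as_str()
    }
}
```

A caller would write something like `SelfRef::new("hello").as_ref().via_pointer()`. The `unsafe` blocks are where the author promises never to move the value, which is the contract `Pin` exists to express.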

For this reason I'm somewhat amused by one of the comments in the OP asking
something like "if `Pin` had remained unstable for an additional year, would
this particular instance of unsoundness have been caught?" Because,
conversely, we could ask whether Rust itself could have remained unstable for
another five years and found a design that would have obviated `Pin` entirely
by making moveability opt-in. However, of course, such things are easy to ask
in hindsight, and stability is a prerequisite for having a broad base of
adoption. Finding a balance between immediate stability and eventual
perfection is the holy grail of industrial language design.

(Regarding the bug in the OP itself, I think it's unfortunate but I'm not
especially worried by it at this juncture. The fact that Ralf Jung's team is
looking into it gives me confidence that formal methods will eventually be
applied here to more thoroughly explore the soundness of `Pin` in general
(Ralf being one of the people who shaped `Pin` originally), and in the
meantime I wouldn't be opposed to a band-aid fix, given that working
extensively with `Pin` in the way shown in the OP is rare for normal users.)

~~~
jimmaswell
> Before Rust 1.0 it wasn't clear that such a concept of "unmoveability" would
> be generally useful

I'd like to ask, why is moving generally useful? Why would I want to move x to
y and invalidate x?

~~~
steveklabnik
We can go back to the reason that Pin even exists in the first place. With
Rust, you create a computation, and it implements the Future trait. When you
want it to execute, you give it to an executor. This often moves the data
structure that represents the computation to the heap, so that it has a stable
address, and can continue running after the current stack frame is over. So,
before the future starts executing, it can be moved around in memory (and
probably will at least one time), but after it starts executing, it must never
move again.

Does that make sense?

~~~
jimmaswell
So the main use is convenience to seamlessly make stack variables permanent?
Intuitively that doesn't sound worth the language design troubles compared to
making the programmer declare these things on the heap manually. I'm not that
familiar with Rust though.

~~~
steveklabnik
That’s one use case, it’s not the only one. And it’s not always “put it on the
heap”, so you can’t solve the issue that way.
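
As a sketch of the "not always the heap" case: the standard library's `std::pin::pin!` macro pins a future in the current stack frame. Before pinning, the future is an ordinary value that can be moved; afterwards it is only reachable through a `Pin<&mut _>`. The names `run_on_stack` and `NoopWake` are made up for this example, and the no-op waker stands in for a real executor.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn run_on_stack() -> Poll<u32> {
    let fut = async { 1 + 2 };
    // Not yet polled, so moving it around is fine.
    let moved = fut;
    // Pinned to this stack frame; no heap allocation involved.
    let mut pinned = pin!(moved);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // From the first poll onwards the future must stay where it is.
    pinned.as_mut().poll(&mut cx)
}
```

The trade-off is that a stack-pinned future cannot outlive the function that pinned it, which is why executors that keep tasks around tend to box them instead.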

------
eximius
I didn't follow all of it, but it sounds not so bad.

1 & 2 both involve unsafe code that, while possibly reasonable in a complex
application, is obviously wrong in the simple case. Of course turning a
reference into a mutable reference will cause trouble. Was Pin SUPPOSED to be
resilient to unsafe code? In any case, seems like a bug in DerefMut and Clone,
not Pin.

The others are a bit more esoteric and do seem potentially concerning, but I'm
not sure.

But still, I'm left with my earlier question: just how resilient is Pin
supposed to be?

~~~
steveklabnik
> Was Pin SUPPOSED to be resilient to unsafe code?

Remember, unsafe does not mean "I can do whatever I want," it means "I am
promising to uphold some guarantee on my own." That is, let's assume we have a
function:

    
    
      /// Makes a new Foo.
      ///
      /// # Safety
      ///
      /// x must never be greater than 5
      unsafe fn new(x: i32) -> Foo {
          // ...
      }
    

and you write

    
    
      let x = 6;
      unsafe {
          new(x);
      }
    

You have a bug, even though new is marked unsafe.

In my understanding of the unsafe examples above, the unsafe code was
upholding the invariants that it was required to uphold, and so would be more
like having written `let x = 2;` in the above example. If that caused an
issue, clearly there's a bug, and it is not in the calling code.
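
A compilable version of that example might look like the following. The `Foo` field, the `try_new` wrapper and `make_valid_foo` are assumptions added for illustration; only the `unsafe fn new` signature and its safety comment come from the snippet above.

```rust
#[derive(Debug, PartialEq)]
struct Foo {
    x: i32,
}

impl Foo {
    /// Makes a new Foo.
    ///
    /// # Safety
    ///
    /// x must never be greater than 5
    unsafe fn new(x: i32) -> Foo {
        Foo { x }
    }

    /// A safe wrapper that checks the contract itself.
    fn try_new(x: i32) -> Option<Foo> {
        if x > 5 {
            return None;
        }
        // SAFETY: we just checked that `x` is not greater than 5.
        Some(unsafe { Foo::new(x) })
    }
}

fn make_valid_foo() -> Foo {
    // SAFETY: 2 is not greater than 5, so the contract is upheld.
    unsafe { Foo::new(2) }
}
```

If `make_valid_foo` ever led to undefined behaviour, the caller would have done nothing wrong; the fault would lie with `new` or with the contract it documents.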

~~~
eximius
I think this is a little too dumbed down for me to see the relevance to the
actual problem.

Also, I believe 1 & 2 both require unsafe code to implement the trait that allows
the unsoundness.

For what &T can I safely implement DerefMut? Is it the fault of Pin that a
type has DerefMut implemented that the unsafe block of that implementation
doesn't uphold safety?

~~~
steveklabnik
The relevance to the problem is that even if these examples needed unsafe
code, that doesn't mean that there's no problem. "resilient to unsafe code" is
not the right way to think about unsafe.

> For what &T can I safely implement DerefMut? Is it the fault of Pin that a
> type has DerefMut implemented that the unsafe block of that implementation
> doesn't uphold safety?

It is the fault of Pin's safety guidelines, which suggest a contract that is
not actually enough to uphold safety. That's why this is considered an
unsoundness in Pin, and not a problem with the code written with unsafe. That
is:

> that the unsafe block of that implementation doesn't uphold safety?

is not correct: the unsafe block does everything it's supposed to do to uphold
safety.

~~~
eximius
I can see the argument that Pin should be safe if you meet the contract it
claims to need. The requirements are not sufficiently tight. Fair enough.

