
Rust pointers for C programmers - eatonphil
http://blahg.josefsipek.net/?p=580
======
evmar
When I was learning Rust I leaned heavily on C-based analogies like this but I
found they ultimately ended up being kinda harmful, because I was thinking (in
C mode) "do I want to pass a pointer or a value to this function" rather than
(in Rust mode) "do I pass ownership or a borrow to this function".

For an example of the difference, here's a function that consumes a 1 KB
buffer, returning the sum of some of its contents. No pointers or references
at all, everything "passed by value".

[https://godbolt.org/g/icoVvz](https://godbolt.org/g/icoVvz)

To my C programmer's eye this looks wrong, but from the output you see there
is no memcpy. (Edit: this is wrong, both C and Rust do memcpy, see below
comments, but I think my point stands.)

I'd be shocked to see a C programmer write the equivalent code ("passing by
value a big struct to a function"), while in Rust it's not only idiomatic to
write that code but frequently also necessary, depending on whether the
function wants to take ownership of its argument or some of its contents.
(Imagine a more complicated 'stuff' struct that contained embedded pointers to
other things.)

I'm a little embarrassed to say I ended up looking at the output on
rust.godbolt.org a lot to convince myself that the bits of code I was writing
were "ok", which is something I never did when learning e.g. Haskell or Go. I
think the reason here is that Rust feels so close to C that there's a bit of
an uncanny valley effect, where you can mostly pretend it's C up until you hit
a wall.
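
A minimal sketch of the "ownership or a borrow" question, for illustration only. `Stuff`, `sum_owned`, and `sum_borrowed` are hypothetical names and this is not the code behind the godbolt link; it shows what the two signatures let the caller do, not what machine code is generated:

```rust
// Hypothetical 1 KB struct, standing in for the buffer described above.
struct Stuff {
    buf: [u8; 1024],
}

// Takes ownership: the caller cannot use its `Stuff` after the call.
fn sum_owned(s: Stuff) -> u32 {
    s.buf.iter().map(|&b| u32::from(b)).sum()
}

// Borrows: the caller keeps ownership and can keep using the value.
fn sum_borrowed(s: &Stuff) -> u32 {
    s.buf.iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    let s = Stuff { buf: [1; 1024] };
    let a = sum_borrowed(&s); // `s` is still usable afterwards
    let b = sum_owned(s); // `s` is moved into the function here
    // sum_borrowed(&s); // would not compile: `s` has been moved
    assert_eq!(a, b);
    println!("{}", a);
}
```

Which signature to pick depends on whether the callee needs to keep or destroy the value, not on how large it is.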

~~~
Rusky
This is wrong. Passing a large struct by value _does_ introduce a memcpy... in
the caller: [https://godbolt.org/g/26annS](https://godbolt.org/g/26annS)

The same is true in Rust, though it's trickier to show on godbolt because the
equivalent of `extern` is less accessible in that context.

Rust programs often put those large structs in Boxes when they need to
transfer ownership of something large, which is something straight out of "C
mode."
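
A small sketch of the Box approach, assuming a hypothetical 4 KB struct called `Big`. Moving the `Box<Big>` hands over ownership while only a pointer-sized value changes hands:

```rust
use std::mem::size_of;

// Hypothetical large struct; the name and size are illustrative.
struct Big {
    data: [u8; 4096],
}

// Taking `Box<Big>` by value transfers ownership of the heap allocation.
fn consume(b: Box<Big>) -> usize {
    b.data.len()
}

fn main() {
    // The struct itself is 4096 bytes; the box is one pointer wide.
    assert_eq!(size_of::<Big>(), 4096);
    assert_eq!(size_of::<Box<Big>>(), size_of::<usize>());

    let b = Box::new(Big { data: [0; 4096] });
    assert_eq!(consume(b), 4096);
}
```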

~~~
noncoml
You are absolutely right. The Rust book makes it sound like passing structs
is free, but it’s absolutely not. It’s always a bit-wise copy. Relevant
Reddit and Stack Overflow discussion with godbolt examples:

[https://www.reddit.com/r/rust/comments/8ts6b4/is_anyone_else...](https://www.reddit.com/r/rust/comments/8ts6b4/is_anyone_else_worried_of_performance_issues_due/)

I felt a bit betrayed when I read them..

~~~
steveklabnik
RVO and NRVO are supposed to happen, as far as I know, which is why I said it
in the book. Compiler bugs do happen.

But also, from that thread:
[https://www.reddit.com/r/rust/comments/8ts6b4/is_anyone_else...](https://www.reddit.com/r/rust/comments/8ts6b4/is_anyone_else_worried_of_performance_issues_due/e1bt47w/)

For example: [https://godbolt.org/g/DTMQM5](https://godbolt.org/g/DTMQM5) (or
[https://godbolt.org/g/LHMgHr](https://godbolt.org/g/LHMgHr) with C
side-by-side) (and yes, I know that char != u8, this is just for
demonstration purposes)

Here, you have

    mov rdi, rsp
    call example::hello@PLT

not a memcpy. Right? Looks the same as the C version that takes char*.

~~~
cwzwarich
[stupid comment]

~~~
steveklabnik
Isn’t that the array initialization?

------
steveklabnik
This is a great post! One additional thing:

> The only differences between borrowed references and raw pointers are:

There’s one more: &mut T is restrict, in C terms, as long as T doesn't contain
an UnsafeCell<T>.

~~~
cwzwarich
Actually, &mut isn't equivalent to restrict, since two restrict pointers are
allowed to alias in C as long as none of the intervening accesses are writes.

~~~
steveklabnik
Isn’t that the same in Rust? For example, split_at_mut?

(I mean, it’s not like this is 100% nailed down yet...)

~~~
Jweb_Guru
split_at_mut doesn't allow you to read through aliased "active" &mut pointers
either (I would assume that restrict pointers that cannot be either read from
or written to don't count).

~~~
steveklabnik
Right, but they both exist. I guess that’s what I’m trying to say... it’s the
active bit that matters.

~~~
Jweb_Guru
Hm, in that case you don't need to appeal to split_at_mut--reborrowing is
sufficient to demonstrate that point. I think what cwzwarich is saying is more
subtle--that restrict allows "multiple readers, exclusive writer", which means
that both shared pointers without UnsafeCell _and_ mutable references in Rust
are restrict in this sense, while &mut is more restrictive.
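
For readers who haven't met either feature, here is a rough sketch of both. The function names are made up for illustration. In `bump_halves` two `&mut` slices into the same array exist at once but cover disjoint elements; in `reborrow` a second `&mut` is derived from the first, and only one of them is in use at any point:

```rust
// Two `&mut` slices into the same array, covering disjoint halves.
fn bump_halves(v: &mut [i32]) {
    let mid = v.len() / 2;
    let (left, right) = v.split_at_mut(mid);
    for x in left.iter_mut() {
        *x += 1;
    }
    for x in right.iter_mut() {
        *x += 10;
    }
}

// A reborrow: `r2` is derived from `r1`, and `r1` is not used while `r2` is.
fn reborrow(r1: &mut i32) {
    let r2 = &mut *r1;
    *r2 += 1;
    // `r2` is not used past this point, so `r1` can be used again.
    *r1 += 1;
}

fn main() {
    let mut v = [0, 0, 0, 0];
    bump_halves(&mut v);
    assert_eq!(v, [1, 1, 10, 10]);

    let mut n = 0;
    reborrow(&mut n);
    assert_eq!(n, 2);
}
```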

~~~
steveklabnik
That’s fair, but I feel like split_at_mut is far more well-known than
reborrowing.

------
wcrichton
Another important detail around pointers in Rust: it's common for "smart
pointers" like reference counting (std::rc::Rc in Rust, std::shared_ptr in
C++) and even boxes to be structs that contain pointers plus other information
(like a reference count).

However, for ergonomic reasons, Rust wants you to be able to use a Box<T> or
Rc<T> in the same way you would a &T, i.e. dereferencing with *t returns the
underlying value in every case. Therefore, Rust allows you to overload the
dereference operator with the Deref trait, e.g. the implementation of Box [0]
does a double dereference to access the inner struct member.

Note that this is distinct from "auto-deref" (or deref coercion), another Rust
feature that will automatically dereference pointers as necessary in certain
cases, like calling a method on a struct (so there is no arrow operator "->"
in Rust like in C++). See the Book [1] for more details.

[0]: [https://doc.rust-lang.org/src/alloc/boxed.rs.html#536-542](https://doc.rust-lang.org/src/alloc/boxed.rs.html#536-542)

[1]: [https://doc.rust-lang.org/book/second-edition/ch15-02-deref....](https://doc.rust-lang.org/book/second-edition/ch15-02-deref.html)
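
A minimal sketch of both features, using a hypothetical wrapper called `MyBox` that holds its value inline (so it is not how `Box` itself is implemented). Implementing `Deref` makes `*x` work, and deref coercion lets a `&MyBox<String>` be passed where a `&str` is expected:

```rust
use std::ops::Deref;

// Hypothetical wrapper type, for illustration only.
struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let x = MyBox(5);
    // Explicit `*` goes through `Deref::deref`.
    assert_eq!(*x, 5);

    let s = MyBox(String::from("Rust"));
    // Deref coercion: &MyBox<String> -> &String -> &str.
    assert_eq!(greet(&s), "Hello, Rust!");
    // Method calls auto-deref too: `len` is found on the inner String.
    assert_eq!(s.len(), 4);
}
```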

~~~
Diggsey
That's not quite true: the reference counts for `Rc` and `shared_ptr` are
_not_ stored adjacent to the pointer. In Rust these types are guaranteed to be
pointer-sized (when the target is Sized) or fat-pointer-sized otherwise.

The reference counts are necessarily stored on the heap at the target of the
pointer, since the count must be shared between all the pointers. In
pseudocode:

    struct RcBox<T> {
        strong_count: usize,
        weak_count: usize,
        value: MaybeInit<T>,
    }

Additionally, when you use `into_raw`/`from_raw` you're actually getting a
pointer to the value member of the `RcBox` struct (i.e. the reference count is
stored at a negative offset from that pointer).
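
The size claims and the `into_raw`/`from_raw` round trip can be checked with a short sketch like this. `round_trip` is a hypothetical helper, and the code only observes sizes and counts; it does not inspect the heap layout itself:

```rust
use std::mem::size_of;
use std::rc::Rc;

// Hypothetical helper: turn an Rc into a raw pointer to the value and back.
fn round_trip(rc: Rc<u64>) -> Rc<u64> {
    let raw: *const u64 = Rc::into_raw(rc);
    // Safety: `raw` came from `Rc::into_raw` and is converted back exactly once.
    unsafe { Rc::from_raw(raw) }
}

fn main() {
    // One pointer for a Sized target, a fat pointer for a slice.
    assert_eq!(size_of::<Rc<u64>>(), size_of::<usize>());
    assert_eq!(size_of::<Rc<[u8]>>(), 2 * size_of::<usize>());

    let a = Rc::new(42u64);
    let b = round_trip(Rc::clone(&a));
    // The count is shared through the heap allocation and survives the trip.
    assert_eq!(*b, 42);
    assert_eq!(Rc::strong_count(&a), 2);
    drop(b);
    assert_eq!(Rc::strong_count(&a), 1);
}
```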

~~~
wcrichton
Sure, agreed that they aren't literally adjacent in the struct. My broader
point is just that there's a level of indirection which is abstracted over by
the Deref trait.

~~~
Rusky
> there's a level of indirection which is abstracted over by the Deref trait.

There isn't, though. Box and Rc have the same number of levels of indirection
as the built-in pointers: one.

The double * you see in the Box source you linked only arises because the
deref method takes self by reference, the same way unique_ptr's operator*
takes this as a pointer. (And further, that implementation looks circular? I
believe that part of Box is still built-in.)

These are both functions that should always be inlined (in fact the unique_ptr
implementation I'm looking at goes through _at least four_ functions to
retrieve the actual pointer) so any extra address-taking and dereferencing you
see is merely compile-time bookkeeping.

~~~
steveklabnik
It is! [http://manishearth.github.io/blog/2017/01/10/rust-tidbits-bo...](http://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/)

------
ainar-g
See also, the Rust container infographic[1], and the /r/rust thread about
it[2].

[1] [https://i.redd.it/moxxoeir7iqz.png](https://i.redd.it/moxxoeir7iqz.png)

[2] [https://www.reddit.com/r/rust/comments/74yrdp/rust_container...](https://www.reddit.com/r/rust/comments/74yrdp/rust_container_cheat_sheet_reposted/)

~~~
RaleyField
There's a newer version[1].

[1] [https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd...](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)

------
Jweb_Guru
> The simple answer here is that you cannot make a [T]. That actually makes
> perfect sense when you consider what that type means.

While this is true, I believe there's ongoing work to allow allocating [T]
(and other dynamically sized types) on the stack under certain circumstances,
using alloca. Which is quite nice since this was an area where Rust lagged
behind C.

~~~
PeCaN
Ada can do this conveniently and it can be nice for efficiency and embedded
systems (which may not allow heap allocation). It would be nice to see the
same feature in Rust.

------
dochtman
I'm not sure why the distinction for Box between a pointer and a struct with a
single pointer in it matters. For all intents and purposes, aren't they the
same thing?

~~~
steveklabnik
At the binary level, yes, they're the same thing: a struct with one member is
the same as just the member.

However, in the end, it's all just binary: that doesn't mean different
phrasing can't help understanding. If it helps it make sense to the OP, I'm
all for it.
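
As a rough check of the "same thing at the binary level" point, here is a sketch with a hypothetical newtype `Handle` wrapping a raw pointer. The `repr(transparent)` attribute is added so that the matching layout is guaranteed rather than just what the compiler happens to do:

```rust
use std::mem::{align_of, size_of};

// Hypothetical newtype; `repr(transparent)` guarantees the field's layout.
#[repr(transparent)]
struct Handle(*const u8);

fn same_layout() -> bool {
    size_of::<Handle>() == size_of::<*const u8>()
        && align_of::<Handle>() == align_of::<*const u8>()
}

fn main() {
    assert!(same_layout());

    let byte = 7u8;
    let h = Handle(&byte);
    // Safety: `h.0` points at `byte`, which is still alive here.
    assert_eq!(unsafe { *h.0 }, 7);
}
```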

------
psyclobe
Hope Rust runs on Solaris, for Josef's sake at least lol.

~~~
steveklabnik
It’s tier 2: [https://forge.rust-lang.org/platform-support.html](https://forge.rust-lang.org/platform-support.html)

------
Koshkin
Sometimes I have a feeling that there is something wrong with the world in
which the tools are much more complex than the things you make with them.
(This has not always been the case; these days the signs of over-engineering
and over-design are everywhere.)

~~~
kybernetikos
It's sort of the opposite. If something is very simple, it's likely it is
simple because you're standing on top of a massive tower of complexity and not
worrying about it.

Rust is complex because it makes you worry about a bunch of stuff that other
languages allow you to forget. But this also means that it allows you the
control to create simpler things than can be created with the tools that make
you feel like everything is simple.

------
brian-armstrong
I really want to like Rust, but coming from C++, the language just feels
incomplete. There’s a lot you should be able to do safely but the compiler
just isn’t there yet to figure it out. I was surprised to find you can’t
safely pass ownership of a Box through a channel for example (that is, Box
doesn’t implement Send).

I’m also not a fan of the crazy chained functions everywhere, and making ? do
an implicit return makes it hard to visually scan code for control flow.

I’m hoping Rust matures into a better language. For now I think C++ is still
far more productive and ergonomic, and there are ways of making it safe too.

~~~
Rusky
Box totally implements Send: [https://doc.rust-lang.org/std/boxed/struct.Box.html#syntheti...](https://doc.rust-lang.org/std/boxed/struct.Box.html#synthetic-implementations)

Also I'm not sure how ? obscures control flow to the same level as C++, which
propagates exceptions with _no_ syntactic marker at all.
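
For anyone unfamiliar with `?`, a short sketch of what it stands for. Both function names are made up; `add_strs_explicit` spells out roughly the early return that each `?` in `add_strs` performs (the real operator also converts the error type with `From`, which is a no-op here):

```rust
use std::num::ParseIntError;

// Each `?` returns early with the error if parsing fails.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x + y)
}

// Roughly the same control flow written out by hand.
fn add_strs_explicit(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

fn main() {
    assert_eq!(add_strs("2", "3"), Ok(5));
    assert!(add_strs("2", "x").is_err());
    assert_eq!(add_strs("2", "x"), add_strs_explicit("2", "x"));
}
```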

~~~
brian-armstrong
Well to be fair, I would never advocate using exceptions.

Also I’m quite sure Box doesn’t. I’ve seen compiler errors when attempting to
pass one via mpsc.

~~~
steveklabnik
Your parent’s link says Box implements Send only when its T is Send, which
makes sense.

~~~
brian-armstrong
I guess? If you’re transferring ownership of something to another thread,
that’s perfectly safe. Needing to implement (??) Send for a struct or pod type
is cumbersome when it’s something you’d do pretty often.

~~~
GolDDranks
You don't need to implement Send for POD types – it is automatically
implemented. Almost all types implement Send. There are only a few exceptions
that I know of: Rc (the non-atomic reference counted pointer) doesn't
implement Send because it doesn't support atomically updating the count. Also
raw pointers don't implement Send.
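
A minimal sketch of passing a Box through an mpsc channel, assuming a hypothetical plain-data struct `Point`. Nothing is implemented by hand: `Point` is `Send` because its fields are, and so `Box<Point>` is too:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical plain-data struct; `Send` is derived automatically.
struct Point {
    x: i32,
    y: i32,
}

// Sends a boxed value to another thread, which takes ownership of it.
fn send_boxed(p: Box<Point>) -> i32 {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let p: Box<Point> = rx.recv().unwrap();
        p.x + p.y
    });
    tx.send(p).unwrap();
    handle.join().unwrap()
}

// Compile-time check helper: only accepts types that are `Send`.
fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Box<Point>>();
    // assert_send::<std::rc::Rc<Point>>(); // would not compile: Rc is not Send
    assert_eq!(send_boxed(Box::new(Point { x: 1, y: 2 })), 3);
}
```

A compile error when sending a Box usually points at the contents, for example an `Rc` or a raw pointer somewhere inside `T`.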

~~~
brian-armstrong
The fact that we’re even having this conversation, caused by compiler error
ambiguity, demonstrates my point about ergonomics nicely :)

~~~
steveklabnik
Please file bugs if the errors are confusing!

~~~
brian-armstrong
Eh, they’re probably fine for regular users. They are just nonsensical when
you’re starting out.

~~~
steveklabnik
We care just as much for people just starting out as we do for regular users;
possibly even more!

