
A Rust curiosity: pointers to zero-sized types - luu
http://www.wabbo.org/blog/2014/03aug_09aug.html
======
pcwalton
> In fact, it’s not clear to me why this code insists on transmuting a non-
> zero value, because doing this seems to work just fine:

Because it needs to compare nonequal to the null reference.

~~~
ajb
So 'kam' had a reply to this which is now dead, and I don't understand why
because it looked informative to me. Could someone explain what was wrong with
it?:

"To expand on that: Rust references are guaranteed to be non-null. This allows
the compiler to optimize Option<&T> to be the same size as a pointer by
encoding None as 0 and Some(p) as the pointer value. When unsafe code violates
that guarantee by transmuting 0 to a reference, Some(p) becomes None! Example
here: h t t p : / / i s . g d / S I o g w i" (link expanded in case that was
the issue).
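
The size claim in that quoted reply can be checked without any unsafe code. A minimal sketch, assuming a current Rust toolchain; the function names are made up for illustration and this deliberately does not reproduce the transmute-of-zero from the linked playground example, since that is undefined behavior:

```rust
use std::mem::size_of;

/// True if `Option<&u64>` takes exactly as much space as a plain `&u64`,
/// i.e. `None` is stored in the otherwise-unused null bit pattern.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u64>>() == size_of::<&u64>()
}

/// The same check for a reference to a zero-sized type.
fn option_unit_ref_is_pointer_sized() -> bool {
    size_of::<Option<&()>>() == size_of::<&()>()
}

fn main() {
    println!("Option<&u64> pointer-sized: {}", option_ref_is_pointer_sized());
    println!("Option<&()> pointer-sized: {}", option_unit_ref_is_pointer_sized());
}
```

Because `None` occupies the null encoding, a reference forged from 0 would be indistinguishable from `None` once wrapped in `Some`, which is the effect the reply describes.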

~~~
Guvante
Link shorteners are frowned upon. There is no need for a short URL because
comments have no character limit, and shorteners can lead to bad behavior.

I know reddit filters them, apparently HN does too.

~~~
steveklabnik
It's been awkward. The reason that is from a shortener is because the program
is encoded in the url. The actual URL there is

[http://play.rust-
lang.org/?code=fn%20main%28%29%20{%0A%20%20...](http://play.rust-
lang.org/?code=fn%20main%28%29%20{%0A%20%20%20%20let%20unsafe_reference%3A%20%26%28%29%20%3D%20unsafe%20{%20%3A%3Astd%3A%3Amem%3A%3Atransmute%280u%29%20}%3B%0A%20%20%20%20let%20x%20%3D%20Some%28unsafe_reference%29%3B%0A%20%20%20%20println!%28%22{}%22%2C%20x%29%0A)}

HN displays that kinda okay, but if you don't have it looking like HN does, it
can get to be almost two thousand characters.

~~~
codemac
Not only that, but HN parses the full link incorrectly (the last } isn't
included in the link) such that the only way for the link to have worked is to
use some indirection.

Like, you know, a URL shortener.

~~~
Sniffnoy
Well, you can %-escape the '}'...

[http://play.rust-
lang.org/?code=fn%20main%28%29%20{%0A%20%20...](http://play.rust-
lang.org/?code=fn%20main%28%29%20{%0A%20%20%20%20let%20unsafe_reference%3A%20%26%28%29%20%3D%20unsafe%20{%20%3A%3Astd%3A%3Amem%3A%3Atransmute%280u%29%20}%3B%0A%20%20%20%20let%20x%20%3D%20Some%28unsafe_reference%29%3B%0A%20%20%20%20println!%28%22{}%22%2C%20x%29%0A%7D)

------
steveklabnik
I'd like to mention that we've been recently discussing ways of getting rid of
the `::<>` syntax. Then

    mem::size_of::<T>()

would be

    mem::size_of<T>()

which is less visually noisy. This syntax is probably the noisiest thing we
haven't cleaned up yet.

~~~
MichaelGG
For a constant like size_of, can you not eliminate the function call and make
it a value? Just mem::size_of<T> ?

~~~
pcwalton
Not without associated constants, which Rust does not yet support (and likely
will not until after 1.0).
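
For later readers: associated constants did land in Rust after 1.0, so the idea can be sketched today. This is only an illustration; the trait `HasSize` and its constant `SIZE` are hypothetical names, not part of the standard library:

```rust
use std::mem::size_of;

/// Hypothetical trait exposing a type's size as a constant instead of a call.
trait HasSize {
    const SIZE: usize;
}

/// Blanket impl: every sized type gets the constant, computed at compile time.
impl<T> HasSize for T {
    const SIZE: usize = size_of::<T>();
}

fn main() {
    println!("u32: {}", <u32 as HasSize>::SIZE);
    println!("(): {}", <() as HasSize>::SIZE);
}
```

The call site still needs the `<T as Trait>::` spelling, so this trades one kind of syntactic noise for another.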

------
StefanKarpinski
Cool. Somewhat different but similar... In Julia, types with no fields are
automatically singletons, and an array of size-zero immutable values takes no
storage:

    
    
        julia> sweet_nothings = Array(Nothing,typemax(Int))
        9223372036854775807-element Array{Nothing,1}:
         nothing
         nothing
         ⋮
         nothing
         nothing
    
        julia> sizeof(sweet_nothings)
        0
    

This seems like a cute, useless trick but sometimes it's quite useful. For
example, the Set{T} is implemented by wrapping a Dict{T,Nothing} and the array
of values in the Dict object takes no storage and of course all access is
optimized away.
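
For comparison, a rough Rust counterpart of the same trick, as a sketch with made-up function names. It assumes nothing beyond the standard library: an array of unit values has size zero, and a `Vec<()>` only has to track its length.

```rust
use std::mem::size_of;

/// Size in bytes of an array holding one million unit values.
fn unit_array_bytes() -> usize {
    size_of::<[(); 1_000_000]>()
}

/// Pushes `n` unit values into a vector and returns its length.
/// The elements are zero-sized, so there is no element data to store.
fn count_units(n: usize) -> usize {
    let mut v: Vec<()> = Vec::new();
    for _ in 0..n {
        v.push(());
    }
    v.len()
}

fn main() {
    println!("bytes for a million units: {}", unit_array_bytes());
    println!("units pushed: {}", count_units(1_000));
}
```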

------
the_mitsuhiko
I actually like that C++ enforces a struct to be at least a byte long because
it means you can have pointers to them and they will compare as non equal even
if you do not know the type. That is actually quite handy sometimes.

I wonder why Rust does not do that.

~~~
AnimalMuppet
IIRC, C says that you get a real pointer back from malloc(0). Repeated calls
return distinct pointers.

For whatever that's worth...

~~~
pbsd
Not exactly; from §7.20.3 in C99:

    
    
        If the size of the space requested is zero, the behavior is implementation-
        defined: either a null pointer is returned, or the behavior is as if the size were some
        nonzero value, except that the returned pointer shall not be used to access an object.

~~~
mcguire
" _or the behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object_ "

I love the C standards. Is that defining the behavior of malloc or the
behavior of the code calling malloc?

~~~
pbsd
The caller code. It is saying that the caller cannot dereference the returned
pointer, under pain of undefined behavior.

~~~
AnimalMuppet
In particular, you can't access (read _or_ write) off either end of an
allocated block. Well, if the block is of (nominal) length 0, there is nowhere
to access safely.

------
dsymonds
It's interesting that this is one of the things that Go and Rust have agreed
upon: that the addresses of zero-sized values are the same (in Go we spell a
zero-sized type as "struct{}").

It has some surprises, though. If you write `var key struct{}` in a
package and then try to use &key as a map key (cf.
[http://blog.golang.org/context](http://blog.golang.org/context)), then you'll
collide with other packages. You've got to use a type of non-zero size instead.

~~~
pcwalton
I don't think we actually defined the addresses of zero-sized values to be
equal; I believe it's actually unspecified. I don't think this has ever come
up in practice, though; we could define them to be equal if we wanted to.

That issue hasn't come up in Rust, probably because equality and hashing look
through references (per the definitions of Hash and Eq in the standard
library), so you'd really have to go out of your way to try to use unit as a
way to make package-specific keys like that.
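
A small sketch of what "look through references" means in practice, using only the standard library (the function names are illustrative). Two references to separate unit values compare equal and hash identically, so they collapse to a single key and cannot act as distinct per-package tokens:

```rust
use std::collections::HashSet;

/// Compares two references to distinct unit values.
/// `==` on references compares the pointed-to values, not the addresses.
fn refs_equal_by_value() -> bool {
    let a = ();
    let b = ();
    &a == &b
}

/// Inserts references to two distinct unit values into a set and
/// returns how many keys the set ends up holding.
fn distinct_unit_keys() -> usize {
    let a = ();
    let b = ();
    let mut set: HashSet<&()> = HashSet::new();
    set.insert(&a);
    set.insert(&b);
    set.len()
}

fn main() {
    println!("equal by value: {}", refs_equal_by_value());
    println!("keys in set: {}", distinct_unit_keys());
}
```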

------
ch
> As you might imagine, this operation is HIGHLY DANGEROUS. There’s nothing to
> stop you from transmuting 0 into a borrow, &T and then trying to dereference
> it. Voila, null pointer dereferencing in Rust!

Is this right? Why doesn't this function need to be marked unsafe?

~~~
fjh
transmute is marked as unsafe, so it can only be used in unsafe blocks or
functions.
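
To illustrate, a minimal sketch of a well-defined transmute (the helper name `f32_bits` is made up). Removing the `unsafe` block makes the compiler reject the call:

```rust
use std::mem::transmute;

/// Reinterprets the bits of an `f32` as a `u32`.
fn f32_bits(x: f32) -> u32 {
    // SAFETY: f32 and u32 have the same size, and every bit pattern
    // is a valid u32, so this reinterpretation is well-defined.
    unsafe { transmute::<f32, u32>(x) }
}

fn main() {
    println!("{:#x}", f32_bits(1.0));
}
```

In real code `f32::to_bits` does the same job safely; the transmute is used here only to show where the `unsafe` requirement bites.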

~~~
ch
My mistake. I clicked through to the docs, and didn't see any annotation.

[http://static.rust-
lang.org/doc/master/std/mem/fn.transmute....](http://static.rust-
lang.org/doc/master/std/mem/fn.transmute.html)

~~~
chrismorgan
Erk—that’s a bug.

