
Abstraction without overhead: traits in Rust - steveklabnik
http://blog.rust-lang.org/2015/05/11/traits.html
======
haberman
Great article. The main bit of new information for me was that while Rust
supports dynamic dispatch, its implementation has a noticeable difference from
C++. In C++, the vtable pointer is in the object itself. In Rust, it's stored
inside what is essentially a "fat pointer." Pointers to traits ("trait
objects") are actually two pointers: the pointer to the vtable and the pointer
to the actual object.

This seems to have one major downside (pointers are twice as big), but lots of
upsides:

    
    
    - allows traits to be implemented for existing types, as opposed to
      C++, where the type's declaration has to list all base classes.

    - allows a type to be used through dynamic dispatch while letting
      users who don't need it avoid the vtable overhead.

    - one less indirection in the call sequence for dynamic dispatch.
    

I like it a lot overall, though the idea of 16-byte pointers on 64-bit
architectures does make me slightly queasy.

~~~
pcwalton
It also makes multiple inheritance (which Rust has, through Java-like
interfaces) easy to implement, and fast at runtime. The virtual inheritance of
C++ is a real mess, by contrast [1].

[1]:
[http://www.phpcompiler.org/articles/virtualinheritance.html](http://www.phpcompiler.org/articles/virtualinheritance.html)

Edit: I don't mean to bash C++ here, BTW; the skinny-pointer approach has a
lot of benefits when all you need is single inheritance (and there are early-
stage proposals to add it to Rust too). But I don't think it works well for
multiple inheritance.

~~~
fmstephe
I apologise in advance that this may not be the most appropriate place to ask
this question.

I am looking for a Rust tutorial. It looks like there was one, but it was
deprecated in favour of 'the book'. But 'the book' doesn't seem to have a
tutorial yet.

Are there any tutorials running through how to build a small piece of working
software?

I found the Golang tutorial, where you build a very basic wiki, extremely
enjoyable. Does Rust have anything similar?

[https://golang.org/doc/articles/wiki/](https://golang.org/doc/articles/wiki/)

Thanks

~~~
steveklabnik
The book basically has two sets of tutorials: the "Syntax and Semantics"
section is a bottom-up tutorial, and the "Learn Rust" section is a project-
based, more top-down one. It's true that only one chapter of Learn Rust has
landed at the moment. It's basically what I'm doing right now. Should have two
or three more chapters over the next few days.

~~~
fmstephe
Great, thanks Steve.

------
coolsunglasses
Typeclasses benefit from the same "zero runtime overhead" in Haskell (but not
Scala). This is particularly important in non-strict languages where inlining
is a more prominent aspect of making code performant. Fortunately, GHC is a
lot easier to understand WRT optimizations than gcc.

You revert to the OOP-style vtables if existential quantification is
introduced because you have to pack code with the data, rather than being able
to inline the data from a statically known index of methods into the concrete
call-sites.

As it stands, I'm more likely to want Clean-style uniqueness typing in Haskell
for a non-GC'd life-cycle for memory than I am to use Rust, but it's nice to
see what we can accomplish with linear types baked into the compiler. Wish
regions hadn't been abandoned, but it seems like somebody wanted Rust pushed
into becoming a product quickly.

~~~
kibwen

      > Wish regions hadn't been abandoned, but it seems like 
      > somebody wanted Rust pushed into becoming a product quickly.
    

No idea what you're referring to here. Lifetimes are entirely based on the
regions literature, and four frigging years of design iteration is hardly
"pushed into becoming a product quickly". :P

~~~
coolsunglasses
Lifetimes aren't regions.

Four years, when it's a recapitulation of existing technology, hasn't had time
to push anything forward, and there are already PLs with similar facilities?
That's a product-oriented rather than research-oriented direction whether you
think it's too fast or not. For comparison, Haskell dates to the early 90s,
based on non-strict FP languages from the 80s. ML itself started in the 70s.
Much of what makes Haskell nice today happened because it had a decade to
gestate without the demands of industry coming first. Applicative wasn't
discovered until _2008_. Those discoveries are a big part of why I happily use
Haskell for work today.

This is emphatically _not_ a value judgment, I will likely not use Rust in
anger so I'm not your customer anyway at least WRT the programming language.
Representations of linear types embeddable in dependently typed languages
(such as Brady is figuring out in Idris) will probably be the next step.

In my ideal universe, there's a language for people to experiment with
theoretical models and practical applications of linear typing such as Idris
provides for DTPLs. This is particularly appealing as it could enable
programmers to define their own linearly typed models for the compiler to
enforce.

~~~
sixbrx
I don't think Rust was ever intended to be primarily a research language. From
what I gathered even from early docs (in Graydon's reign), it was intended to
implement ideas previously introduced in research languages in a practical
_systems_ language. They even made mention of adopting only _safe_ ideas (and
I'm too lazy to look for those quotes). I think they've had to innovate some,
but that wasn't the primary intent.

~~~
coolsunglasses
That would explain a lot. How do the goals contrast with C++ then? Smaller
language?

~~~
kibwen
For reasons of backwards-compatibility, it's basically impossible to design
extensions to C++ that make the language memory-safe by default. C++11/14 are
admirable best-effort approaches to doing so, but they provide only tools for
helping to enforce memory safety without providing any actionable guarantees.
Rust guarantees memory safety, with the only possibility of unsafety being
relegated to blocks of code specifically denoted as `unsafe`.

In addition to memory safety, Rust's other goal was to improve the ability of
programmers to reason about low-level concurrency, motivated by the enormous
pain that both the Firefox and Chrome developers are currently experiencing by
trying to adapt their browsers to a multicore world. The serendipitous
discovery was that the same mechanism used to guarantee memory safety will
also statically guarantee that your program is free of data races.

TL;DR: Rust's goals are guaranteed memory safety, guaranteed freedom from data
races, and zero runtime overhead relative to C++.

~~~
asadotzler
kibwen, this whole reply should be an FAQ somewhere if it isn't already. Nice
precision!

------
bkeroack
There is _always_ overhead when adding abstractions--the only question is
whether you pay at runtime or at compile time. C++ (and presumably Rust)
choose the latter; Python and Go choose the former.

~~~
steveklabnik
Yes, in these discussions, the overhead being referred to is runtime overhead.

There's also overhead in the sense of complexity for the programmer, which
isn't really either of those two.

~~~
humanrebar
> There's also overhead in the sense of complexity for the programmer

Well, from the programmers' perspectives there are both read-time and write-
time overheads. In C++-land, the discussion about the new (to C++) 'auto'
keyword is about the trade-offs between the two.

------
megaman821
For a lower-level language, Rust's abstractions really make it feel closer to
a higher-level language than to C. It will be even more so if HKTs land.

~~~
ajanuary
HKT = Higher-Kinded Polymorphism [1]

[1] [http://www.hydrocodedesign.com/2014/04/02/higher-kinded-
type...](http://www.hydrocodedesign.com/2014/04/02/higher-kinded-types/)

------
pron
> What you do use, you couldn’t hand code any better.

... given you have no information about runtime behavior other than the static
code.

How to achieve "zero-cost abstractions" is, as usual, a design tradeoff.

Rust -- like C++ -- lets you choose in your code whether you'd like to pay for
an abstraction or not. This does produce good machine code, but has two costs:
1/ the language gets more complicated and the programmer needs to be aware of
what she's paying for, and 2/ you might end up paying for stuff you end up not
using (for example, all button listeners might end up being the same type, and
a vtable adds unnecessary cost), so that the generated code is only optimal if
you have no other information about runtime behavior.

There's another way to add zero-cost abstractions: have a single simple
abstraction and a JIT to figure out at runtime -- based on observed behavior
-- what the optimal machine code is. This is what the JVM does. All method
calls are virtual from the programmer's perspective, and no choice needs to be
made ahead of time. At runtime, the JIT views the class hierarchy and usage,
and decides whether a specialized, inlined version of a function is produced
or whether a vtable is actually necessary (HotSpot even makes a special case
when there are exactly two implementations, replacing the vtable with an
`if`). If runtime behavior changes (a new type of listener is added or even
new code is loaded at runtime that adds another implementation), the JIT will
notice, reconsider and recompile. So in Java, virtual or even interface method
calls are also zero-cost, even though you have no choice about using them; the
decision on how to implement the abstraction is done by the (JIT) compiler _at
runtime_. This, too, is a tradeoff -- a simpler language and truly optimal
code that takes into account not just static structure but actual runtime
behavior -- at the cost of a possibly significant warmup time and possible
non-optimal "mistakes" by the JIT.

~~~
Narishma
I wouldn't call that zero-cost. It's more like variable-cost, which could be
even worse than always doing virtual calls in some application types.

~~~
pron
The average case is always as expensive or less than a virtual call. But yes,
the JIT most certainly introduces unpredictability, which may be unsuitable
for hard realtime applications (hard realtime Java programs employ AOT
compilation for those classes that require absolute predictability).

~~~
Rusky
It's a problem for more than just realtime applications. Requiring a JIT means
requiring a runtime, and that makes it much harder to do things like embed
Rust libraries in scripting languages or expose a C interface.

~~~
pron
JIT has little to do with interoperation. It's quite simple to generate C
symbols pointing to stubs. What makes interoperation hard is usually a GC
rather than a JIT (it's just that most JITted languages also employ a GC; look
at Go, say: it's just as hard to embed or link against, and it doesn't employ
a JIT at all -- its runtime "just" performs scheduling and GC).

---

BTW, Java can be embedded in scripting languages because those languages run
on the platform itself and _share_ the runtime. Because of the JIT -- that
optimizes across libraries and languages -- the interoperation is cheaper than
with C. So much so, that you get the following story: As part of the work
being done at Oracle on Graal, HotSpot's next-gen JIT, they've ported various
scripting languages to the new JIT, among them Ruby. They've found[1] that if
they interpret/JIT the _C code_ of the native Ruby extensions they get better
performance than a "plain" Ruby runtime calling into statically compiled C,
because the JIT is able to optimize across the language barrier.

[1]:
[http://www.chrisseaton.com/rubytruffle/cext/](http://www.chrisseaton.com/rubytruffle/cext/)

~~~
Rusky
Right, GC is hard to embed and inlining is a powerful optimization. But the
runtime that both JIT and GC require makes things hard for JIT on its own.

------
EugeneOZ
> Traits are interfaces

So why not use "interface" keyword?

~~~
ajross
Probably for the same reason that they call their unions "enums" or their heap
blocks "boxes". Standardization of jargon has never been a thing with rust,
just roll with it.

~~~
steveklabnik
Well, enums aren't unions, they're tagged unions. And other languages call
boxed values by that name too.

~~~
ajross
Stop it, they're unions. :) Among the target market, people are going to be
intimately familiar with the use of "enum" and "union" from the C language.
Rust's concept of a single object that can store exactly one of several types
of sub-objects matches one quite closely, and not the other. Having a runtime
tag and affirmative checking doesn't change the nature of the thing. We don't
call "cars" something different when we add cruise control or anti-lock
brakes.

"Enumerate" in English is just a fancy word for "count" -- it means to assign
numbers to a bunch of things. Which is exactly what the C concept did. The
Rust usage (to mean "something that can be in one of a few different states")
is new, though Java has something fairly close too.

It was a poor choice, sorry. Likewise being deliberately difficult with
"trait" vs. "interface" (picking Self's jargon instead of the term that
literally everyone already knows from decades of OOP) didn't serve you well.
Thus we have blog posts like this needing to tell us what we probably should
have been able to figure out from context.

Finally, regarding "box" vs. "block". Other languages (C# is the only one that
comes to mind off-hand) have used the idea of "boxing" to imply the allocation
of space for and copying of pass-by-reference data. That's sort of a different
notion than simple heap allocation, so it sort of gets its own jargon I guess.
I didn't complain anyway. But with Rust, a "box" really is used to refer to a
dynamic heap block in any context. We sort of already had a perfectly good
word for that.

Pretentious jargon isn't the worst crime in the world, but I do think Rust
seems needlessly complicated in the way it likes to play Shakespeare with
existing concepts.

~~~
TorKlingberg
I'll have to admit enums are where I gave up the first time I read the Rust
tutorial. They seem almost completely unrelated to enums in C and other
languages.

~~~
ajross
Right, because they're unions. :)

Just go back to that tutorial with your C hat on, substitute the word "union"
for "enum", and I promise it will all make sense. All your intuition about C
unions will cross over just fine, and the new Rust rules (they're tagged at
runtime, and the compiler enforces that you can only ever use the fields of a
runtime-checked variant) are straightforward extensions.

Likewise the linked blog post begins, comfortingly, with "Traits are
interfaces". Once you get beyond the new jargon, you find it wraps a concept
which is 95% compatible with something you've been using for years.

That the Rust team seems to find no value in this kind of naming, preferring
the excess precision that comes with Create-Your-Own-Name, is what I was
calling "pretentious jargon" in a previous post in the thread. It's really not
that bad (I mean really, they're just names), but it doesn't speak well to
where the designers heads were when they invented this stuff.

Really, that's what's starting to creep me out about Rust. Just like C++ 30
years ago, it seems like Rust has caught itself up in an internal rush (among
its rock-star language nerd designers) for Masterpiece Status and sort of
forgotten the goal of creating a practical tool for working programmers... At
some point in the near future I have to wonder if we're going to start seeing
blog posts about choosing a "sane subset" of Rust with which to develop
software.

~~~
kibwen

      > it seems like Rust has caught itself up in an internal 
      > rush (among its rock-star language nerd designers) for 
      > Masterpiece Status and sort of forgotten the goal of 
      > creating a practical tool for working programmers
    

This is complete hogwash. Just because you disagree with the chosen
terminology doesn't justify attacks on the character of the Rust developers.

~~~
ajross
Sigh... it's an opinion. I even used "seems". I was around when we all watched
C++ go from "exciting new tool we should all use" to "wait, does anyone else
understand that new stuff because I don't anymore". This feels exactly the
same.

I'm no dummy, yet Rust is just confusing as hell sometimes. And you guys
_frankly don't seem to care_ (again: note the marker "seems", indicating a
personal opinion and not a "character attack"). That turns me off. It turns
lots of people off. And I don't see any significant effort being made at
making it an easy tool to learn and use.

~~~
pcwalton
> And I don't see any significant effort being made at making it an easy tool
> to learn and use.

Just to name a few off the top of my head:

1. Lots of focus on friendly compiler error messages, including typo
correction, automatic lifetime suggestions, and error message explanations.

2. A strong worse-is-better approach in many aspects of the language design,
such as preventing reference-counted cycles (we don't try to), numeric
overflow (we don't try except in debug mode), typeclass decidability (it's
deliberately undecidable in corner cases to avoid complex rules), prevention
of deadlocks (we don't try), userland threads (we don't implement them
anymore), asynchronous I/O (it's out of the domain of libstd for now), etc.

3. Blog posts like this one to introduce aspects of Rust, as well as the
tutorial.

4. The Cargo package manager, as well as crates.io.

5. Naming conventions designed to fit well with C, for example choosing
"enum" over "data"/"datatype" as in ML, "trait" over "class" as in Haskell
(since the latter means something totally different), but modified in some
cases to avoid leading programmers of C-like languages astray (for example,
"interface" changing to "trait"). This naming process has taken time, but I
think Rust is in a pretty good place now. There are obviously disagreements as
to the naming, but we can't please everybody.

Certainly we weren't perfect, but there was a lot of effort put into making
Rust as easy to use as possible.

------
nathan_long
"new traits can be implemented for existing types (as with Hash above). That
means abstractions can be created after-the-fact, and applied to existing
libraries."

Sounds a bit like a Ruby monkey patch. What happens in case of conflict - my
trait adds a .hash method, but there already was one?

~~~
phaylon
Trait methods are only visible when the trait is in scope. So a conflict would
only appear when both traits are imported and the method is called, in which
case you'll have to disambiguate.

~~~
nathan_long
Cool, thanks for explaining. :) Would it be a compile-time or runtime error?

~~~
phaylon
It would be a compile-time error saying (in essence) "multiple applicable
methods in scope" and giving you a list of the available methods.

~~~
nathan_long
Awesome!

