
What Rust Can Do That Other Languages Can't - tatterdemalion
http://robert.ocallahan.org/2017/02/what-rust-can-do-that-other-languages.html
======
ed_blackburn
I spent an afternoon messing about with Rust and found it infuriating. The
compiler kept (quite rightly) telling me how crap my code was and wouldn't
compile it for me. I found it frustrating but thoroughly educational.

For systems programmers, Rust looks a fantastic option. For line of business
apps, it is inaccessible and the safety dial is turned up too far. But then I
guess that is because Rust isn't intended to be a general purpose language for
writing line of business apps?

I can see it being useful for writing platform neutral, gnarly code and offering
easy hooks for popular, managed languages such as JavaScript, C#, Java et al to
hook into.

Outside of systems programming, I wonder if Rust will be part of a silent
revolution?

~~~
tatterdemalion
I am a volunteer who works on Rust & I do almost all of my open source work in
Rust. I am paid as a Ruby web developer. I personally have little interest in
systems programming & want Rust to be a productive language for application
development.

I frequently witness this rush to judge Rust as too complex to be used for
business applications based on a few hours of hacking. For this to be a valid
judgment, I would think one of two things would need to be true: a) Rust
remains this difficult to use once you have experience with it, or b) most
business applications are worked on for only a few hours.

In my experience, neither of those premises holds.

I believe that the rules of Rust not only give you exceptional performance
without memory issues (as this article tries to demonstrate), but also that
its rules engender a higher degree of correctness and 'well-factoredness' than
most languages. I believe Rust is an excellent language for writing
applications at every level of the stack when 'craftsmanship' matters.

~~~
jonathanstrange
For me personally Rust is not very interesting because it doesn't have a
garbage collector. Manual memory management is just a pain in the ass and
really not worth it for the vast majority of applications - games and realtime
audio processing are an exception.

Besides, if I don't want automatic memory management, I could also use Ada or
Free Pascal, which are both reasonably safe, though not generally as safe as
Rust, _and_ are mature and have a well-established toolchain.

~~~
foepys
I have never done anything with Rust but I was under the impression that
Rust's borrowing and ownership system made manually freeing memory more or
less obsolete except for a few edge cases where the compiler isn't intelligent
enough. Is this impression false?

~~~
adrianN
Manual resource management in Rust is about as convenient as manual resource
management in modern C++. That is, you don't really do any "manual" memory
management at all in >90% of your code, it's all taken care of by RAII.
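
A minimal Rust sketch of what "taken care of by RAII" means here. The `Resource` type and the log it writes to are hypothetical, made up for illustration; the only language feature relied on is the standard `Drop` trait, which runs when a value goes out of scope.

```rust
use std::cell::RefCell;

// Hypothetical resource type; it records when it is released.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Resource<'_> {
    // Runs automatically when the owning variable goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

// Returns the order in which the two resources were released.
fn release_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Resource { name: "a", log: &log };
        let _b = Resource { name: "b", log: &log };
        // No explicit free: both values are dropped at the closing brace,
        // in reverse declaration order.
    }
    log.into_inner()
}

fn main() {
    for line in release_order() {
        println!("{line}");
    }
}
```

Nothing in `release_order` frees anything by hand; the cleanup is attached to the scope.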

~~~
pjmlp
Yes, but with the added benefit of not requiring external tooling to validate
that those RAII patterns are used correctly, and of forcing not-so-security-savvy
developers to actually use them.

~~~
kazagistar
So it's better, safer, but still manual memory management.

~~~
pjmlp
Not quite, because the rules are enforced by the compiler.

In C++ the rules are enforced by optional tooling that most developers don't
use.

At CppCon 2015, only 1% of the audience acknowledged using such tools at
Herb's talk about the core guidelines.

------
WalterBright
In D, the example looks like (if I understood it correctly):

    
    
      struct X {
        Y y;
        ref Y getY() return { return y; }
      }
    

This tells the compiler that getY() returns a reference based on the implicit
'this' reference. It's available now (DIP25 was the proposal for it).

We're now implementing DIP1000, which adds similar support for pointer values
(DIP25 only dealt with references).

To see it in action:

    
    
      alias Y = int;
    
      struct X {
        Y y;
        ref Y getY() return { return y; }
      }
    
      ref Y foo()
      {
        X x;
        return x.getY();
      }
    

Compiling it yields:

    
    
      test.d(11): Error: escaping reference to local variable x

~~~
tatterdemalion
All due respect, but I would be extremely surprised to learn that this feature
is as capable as Rust's lifetime system. While you've shown that you can't
have an escaping reference to a local variable, what Rust provides goes far
beyond that. You can return the references to Y (for example if your function
has a reference to X as an argument), you can store them in the heap, you can
reassign them to be references to different Ys, etc, all so long as the Ys
they point to live longer than they do.
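
A small sketch of those three uses (returning, storing in a heap-allocated container, reassigning) in Rust. The types `X` and `Y` mirror the article's example; the function names are made up for illustration.

```rust
struct Y(i32);

struct X {
    y: Y,
}

// Returning a reference derived from an argument: the result may not
// outlive the `X` it points into.
fn get_y(x: &X) -> &Y {
    &x.y
}

fn sum_via_refs() -> i32 {
    let x1 = X { y: Y(1) };
    let x2 = X { y: Y(2) };

    // Store references in a heap-allocated vector.
    let mut refs: Vec<&Y> = vec![get_y(&x1), get_y(&x2)];

    // Reassign a reference so that it points at a different Y.
    let mut current: &Y = refs[0];
    let first = current.0;
    current = get_y(&x2);
    refs.push(current);

    first + refs.iter().map(|y| y.0).sum::<i32>()
}

fn main() {
    println!("{}", sum_via_refs());
}
```

All of this compiles because `x1` and `x2` outlive every reference that points into them.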

~~~
WalterBright
> I would be extremely surprised to learn that this feature is as capable as
> Rust's lifetime system.

So would I. It isn't. But we're working on much more than this in order to
provide guaranteed memory safety (although using a very different approach
than Rust's).

It's just that Rust does not stand alone with this particular example.

~~~
tatterdemalion
I think a key fact about this example is that `&Y` is just another type of
value which can be used like any other; if you can't return it or put it in a
vector or whatever, it's really not comparable.

------
qznc
D is working on it, if I understand DIP1000 [0] correctly. The example would
look like:

    
    
      struct X {
        Y y;
        scope ref Y getY() { return y; }
      }
    

[0]
[https://github.com/dlang/DIPs/blob/master/DIPs/DIP1000.md](https://github.com/dlang/DIPs/blob/master/DIPs/DIP1000.md)

~~~
pavanky
How would it be used from user code ?

~~~
qznc
See Walter's reply. He is actually doing it.
[https://news.ycombinator.com/item?id=13578315](https://news.ycombinator.com/item?id=13578315)

------
akie
I have no idea what this means, and I have a masters in Computer Science and
20 years of industry experience. Enlighten me?

~~~
m1el
Let's translate this code to C++:

    
    
        class X {
          public:
            // Returns a pointer into this object; nothing ties the
            // pointer's validity to the lifetime of the X.
            Y* getY() {
              return &y;
            }
          private:
            Y y;
        };
    

`getY` returns a pointer to the struct field, this field is allocated as a
part of the struct. There is no way (in C++) to verify that this pointer does
not outlive the struct itself. In other words, your pointer may become stale,
and the compiler has no way to check for it.

Rust has a resource ownership model (borrow checker) that allows it to check
at compile time that references do not outlive the data.
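
For comparison, a sketch of the same accessor in Rust. The field name `value` and the function `read_interior` are illustrative assumptions, not taken from the article.

```rust
struct Y {
    value: i32,
}

struct X {
    y: Y,
}

impl X {
    // Interior reference; its lifetime is tied to `&self`.
    fn get_y(&self) -> &Y {
        &self.y
    }
}

fn read_interior() -> i32 {
    let x = X { y: Y { value: 7 } };
    let y_ref = x.get_y();
    // Uncommenting the next line makes the program fail to compile,
    // because `x` would be moved while `y_ref` is still in use:
    // drop(x);
    y_ref.value
}

fn main() {
    println!("{}", read_interior());
}
```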

~~~
humanrebar
A little closer would be to use a shared_ptr<X> and shared_ptr<Y> that share
the same reference count (1). It's not exactly the same thing, though, since
Rust gives you some safe usage checks that you don't get with shared_ptr. That
is, vanilla Rust objects have lifetime semantics closer to unique_ptr than
shared_ptr.

(1): See constructor number 7 here, which was designed for this purpose:
[http://en.cppreference.com/w/cpp/memory/shared_ptr/shared_pt...](http://en.cppreference.com/w/cpp/memory/shared_ptr/shared_ptr)

~~~
dbaupp
It also isn't the same in that it comes with the cost of allocation and
reference counting.

------
ngrilly
> Most other languages simply prevent you from giving away an interior
> reference, or require y to refer to a distinct heap object from the X.

Go lets you return an interior reference. But Go uses a garbage collector
instead of lifetime checks at compile time (which is an advantage or a
drawback, depending on your requirements).

~~~
tatterdemalion
I don't know much about the Go compiler internals, but it sounds like in most
cases (possibly all) the object being referenced in that way will be heap
allocated; the entire point of this article is that no matter what (across
function boundaries, the reference itself put into the heap, whatever) the
member `y` of `X` will be inline with its stack representation.

[https://golang.org/doc/faq#stack_or_heap](https://golang.org/doc/faq#stack_or_heap)

~~~
topspin
Go does escape analysis like most (all?) good GC implementations to detect
optimization opportunities. You can allocate an array of objects, for
instance, and Go will try to make a contiguous, dense allocation for all the
members, enabling extremely efficient pointer arithmetic to resolve member
offsets thus avoiding an indirection, and precluding the GC bookkeeping for
individual elements, and perhaps improving processor cache utilization.

But the key point stands wrt Go; Go is among the languages that "can't do
this." First, there is no way to get a reference to a field inside a struct in
Go (the workaround is reflection which has runtime overhead, and therefore
doesn't count) and if you could get such a reference the compiler would be
forced to reference count the member via the GC to satisfy the Go memory
model.

~~~
EdiX
> First, there is no way to get a reference to a field inside a struct in Go

what?

[https://play.golang.org/p/1OJ5U5ZwFJ](https://play.golang.org/p/1OJ5U5ZwFJ)

> if you could get such a reference the compiler would be forced to reference
> count the member via the GC to satisfy the Go memory model

no reference counting in go.

~~~
saghm
I'm not that familiar with Go, but my guess is that "reference count" here
refers to the action that the garbage collector does when it pauses the world
(i.e. "count as a reference") rather than the actual paradigm "reference
counting".

------
zrm
Can anyone explain the reason why C or C++ compilers can't do this? Obviously
the language specs allow you to do the unsafe thing, but suppose we add some
"-Wreference-lifetime" flag to gcc that warns if it can't statically verify
that a reference or pointer doesn't outlive the referenced object, and then
compile everything with "-Wreference-lifetime -Werror" from now on.

What aspect of the language in particular makes that impossible? Or is it?

~~~
m1el
In principle, nothing stops C++ compilers from implementing this feature.

However, this

\- requires additional input from the programmer, such as lifetime parameters
in Rust.

\- C and C++ don't have ADTs which help working with lifetimes A LOT (e.g.
having Option<Box<T>> as nullable pointer).

\- will reject 99.9% of useful libraries and programs written in C.

\- would be equivalent to using Rust.

~~~
lmm
For the record C++17 does introduce std::variant. It's horribly cumbersome but
it does at least exist.

~~~
m1el
It's horribly cumbersome AND it doesn't solve the same problem.

    
    
        // Rust
        let v: Result<i32, i32> = Ok(1);
        // std::variant (doesn't compile)
        std::variant<i32, i32> v = 1;

~~~
wyldfire
I had to RTFM a bit to understand this example, so here's a more explicit
description of this case for those playing along at home:

AFAICT the idiom for Result<i32, i32> is a pair of "numerical-result-if-
success"/"errno-if-failure". Seems like a great idea, allows us to not
overload a simple integer result and end up ignoring the failure path by
accident. "Ok(n)" would get assigned to the "numerical-result-if-success" and
"Err(m)" would get assigned to the "errno-if-failure". I think this means that
"v.unwrap()" would panic if it had been assigned "Err(m)", which is what most
people would want/expect.

AFAICT std::variant can't do overload resolution correctly when both of the
union types are the same. I'll wager that if you use an aliased type it will
still choke on this ambiguity. So if you wanted something similar in C++ maybe
you'd have to wrap it so it's not POD?

~~~
m1el
It's not quite a "pair". It's a "tagged union" in C-speak, "enum" in Rust-
speak and "sum type" in CS-speak. In this case, numerical result and errno
will be located in the same memory, but have different tag.
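
A short Rust sketch of the tag doing the work: both variants carry an `i32`, and `match` tells them apart. The function name `describe` is made up for illustration.

```rust
// Both variants carry an i32; the tag is what distinguishes them.
fn describe(v: Result<i32, i32>) -> String {
    match v {
        Ok(n) => format!("ok {n}"),
        Err(code) => format!("err {code}"),
    }
}

fn main() {
    let v: Result<i32, i32> = Ok(1);
    let e: Result<i32, i32> = Err(1);
    println!("{}", describe(v));
    println!("{}", describe(e));
}
```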

------
amelius
There's one area where Rust's typesystem will probably be counterproductive,
and that's flexible and efficient graph algorithms. (Someone please prove me
wrong).

~~~
rkangel
Which is where the 'get out of jail' card in the form of 'unsafe' code is
useful. You can always fall back to not having those guarantees checked if
you're writing core, low level, well tested code. You can then ensure the
right guarantees are made by typing at the interface, and ensure the code is
used correctly.

This is why writing data structures in Rust is considered an 'advanced' topic.
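
A deliberately tiny sketch of that pattern, assuming a hypothetical `NonEmpty` type: the invariant is enforced by the safe interface, and the single `unsafe` block relies on it.

```rust
// Hypothetical wrapper type: a buffer that is never empty.
pub struct NonEmpty {
    items: Vec<i32>,
}

impl NonEmpty {
    // The only constructor; it establishes `items.len() >= 1`.
    pub fn new(first: i32) -> Self {
        NonEmpty { items: vec![first] }
    }

    // Only ever grows the buffer, so the invariant is preserved.
    pub fn push(&mut self, item: i32) {
        self.items.push(item);
    }

    // Safe signature; the unsafe block relies on the invariant above.
    pub fn first(&self) -> i32 {
        // SAFETY: `items` is private and always holds at least one
        // element, so index 0 is in bounds.
        unsafe { *self.items.get_unchecked(0) }
    }
}

fn main() {
    let mut buf = NonEmpty::new(3);
    buf.push(4);
    println!("{}", buf.first());
}
```

Callers cannot break the invariant because the field is private, which is the "typing at the interface" part.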

~~~
bsaul
This last sentence sounds a bit worrying. Writing data structures such as
graph is just as common as writing if then else... if this is an advanced
topic, then maybe an intermediate library providing common patterns for
building elaborate structure would be welcome ( that's usually what stdlib
types are about, but with the borrow checker, maybe those aren't enough).

~~~
plinkplonk
I've been wrestling with this very concept. I often find myself writing custom
variants of datastructures and relatively uncommon datastructures, and a
language that makes this difficult isn't very motivating to get into.

What is really needed is good documentation on this specific aspect.

 _If on the one hand the messaging is "if you find yourself reaching for
unsafe, it is likely you need to rethink your code's design" and then also
"common datastructures are hard to write without unsafe", there is some slight
dissonance there (imho, ime)._

I'm not looking for 'blessed libraries' of prebuilt datastructures. I want to
code up datastructures (e.g quadtrees, graphs etc), and good guidance on when
exactly unsafe is the only way to get this done.

I don't really want to pause in the middle of a project and have to spend
serious amounts of time coding up a (custom/unusual) datastructure just
because there is no way without using unsafe (etc).

Right now there doesn't seem to be good _guidance_ on _this aspect_ (writing
datastructures in Rust) which is presented as an "advanced topic" and "the
wrong way / the hard way" to learn Rust. (there is the linked list book, which
serves as a starting point, and using Cell etc seems to work - I'm still
struggling - which is fine, that is how people learn.)

PS: None of the above is really a criticism of Rust the language or the team.
I think Rust is great, I'm just a bit frustrated with the 'build common
datastructures being "advanced" ' aspect, but in the end it is probably just
that I haven't been able to wrap my head around Rust yet
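
One approach that stays in safe Rust, sketched here as an assumption rather than anything proposed in this thread, is to keep nodes in a `Vec` and store edges as indices instead of references. The `Graph` type below is hypothetical.

```rust
// Hypothetical index-based graph: nodes live in one Vec and edges are
// stored as indices, so no references between nodes are needed.
struct Graph {
    labels: Vec<&'static str>,
    edges: Vec<Vec<usize>>,
}

impl Graph {
    fn new() -> Self {
        Graph { labels: Vec::new(), edges: Vec::new() }
    }

    // Adds a node and returns its index.
    fn add_node(&mut self, label: &'static str) -> usize {
        self.labels.push(label);
        self.edges.push(Vec::new());
        self.labels.len() - 1
    }

    // Adds a directed edge. Cycles are fine because edges are plain
    // indices. Panics if `from` is not a valid node index.
    fn add_edge(&mut self, from: usize, to: usize) {
        self.edges[from].push(to);
    }

    // Labels of the nodes directly reachable from `node`.
    fn neighbors(&self, node: usize) -> Vec<&'static str> {
        self.edges[node].iter().map(|&i| self.labels[i]).collect()
    }
}

fn main() {
    let mut g = Graph::new();
    let a = g.add_node("a");
    let b = g.add_node("b");
    g.add_edge(a, b);
    g.add_edge(b, a);
    println!("{:?}", g.neighbors(a));
}
```

The trade-off is that a stale or wrong index is caught at run time (as a panic) rather than at compile time.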

~~~
Manishearth
I think the messaging is like this because folks approach Rust and often try
to write Rust code as if it were C. You can do this in C++; you can't do this
in Rust.

unsafe is highly discouraged, but designing the innards of abstractions like
datastructures is one of the places where you basically need it.

Be aware that good generic datastructures are just as hard in C++. See
[https://news.ycombinator.com/item?id=13580418](https://news.ycombinator.com/item?id=13580418)
. Naive datastructure implementations often have strict aliasing bugs or mess
up on the destruction behavior. Using unsafe to write datastructures in Rust
isn't particularly hard; it's just as tricky as it is in C++. Having to use
unsafe is annoying, but it's a minor annoyance. The nomicon helps teach how
you're supposed to use unsafe.

------
digi_owl
Seeing the comments here I fear that Rust will end up binned alongside the
likes of Ada. Because people want to write code that runs and that's it,
correctness and safety be damned...

~~~
falcolas
So long as correct programs are incorrectly rejected, and it takes extra time
to fix those mis-matches, many people will simply avoid the pain in the first
place.

At the end of the day, many of us are paid to ship features. For many
industries, correctness and safety just can't compete with new features. It's
the reason Lisp, Python, Ruby, and Go are so popular - you can quickly write
programs which are both fast enough and memory safe.

The gaming industry, one of the bigger C++ consumers, will also probably not
move to Rust for similar reasons: hitting a schedule is much more important
than not crashing or not having memory leaks. You can always patch a game.

~~~
ccostes
The counter-argument to this being that the safety checks imposed by Rust
prevent errors in the first place, saving debugging time and getting features
out more quickly.

~~~
falcolas
> the safety checks imposed by Rust prevent errors in the first place

Not all errors, only a particular class of memory errors. Most of those memory
errors are those that a GCed language doesn't have to worry about, and C/C++
have a bevy of tools to find those errors before they leave the developer's
hands as well.

That said, Rust's borrow checker will also help protect against a class of
shared memory mutation errors as well. Is the time cost of pleasing the borrow
checker for every memory allocation worth this benefit? Probably something
only individual developers can answer. My answer is, for now, no.

------
sampo
> _pointer addition_

Isn't it a bit confusing to talk about pointer addition when the example
struct is just

    
    
        struct X {
          y: Y
        }
    

and thus the addresses &x and &y are (probably) the same. Maybe the author
thinks that &y is obtained as &x+0, but.

~~~
Manishearth
The author is generalizing to when there are more fields here.
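
A sketch of the multi-field case in Rust. The extra field `a` and the `repr(C)` attribute are assumptions added so that the layout, and therefore the offset, is predictable.

```rust
// `repr(C)` fixes the field order so the offset below is predictable.
#[repr(C)]
struct X {
    a: u64,
    y: u32,
}

// Byte distance from the start of `x` to its field `y`.
fn offset_of_y(x: &X) -> usize {
    let base = x as *const X as usize;
    let field = &x.y as *const u32 as usize;
    field - base
}

fn main() {
    let x = X { a: 1, y: 2 };
    println!("a = {}, y starts {} bytes into X", x.a, offset_of_y(&x));
}
```

Taking `&x.y` is therefore the address of `x` plus a constant known at compile time, with no extra load.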

------
augustk
In what situation would you want to do this?

~~~
ekidd
Things like this are extremely common in performance-critical code. In this
case, you want to:

1\. Allocate both X and Y in the same block of memory, maybe on the stack,
perhaps in a heap. In inner loops, making calls to malloc is pure performance
poison. Most common garbage collected languages are completely unable to do
this, with the admirable exception of the C# and the other .NET languages (and
possibly some others).

2\. You want to access Y by reference, so that you can work with it without
needing to make a copy. Again, this is very common in performance-critical
code. You can do this in C, C++ and other existing "systems" languages.

3\. You don't want to accidentally keep using Y once X (and hence the
underlying storage for Y) is destroyed. This is a subtle and vile bug that can
cause memory corruption, once-a-week crashes on production, and week-long
debugging sessions, among other headaches. This is where C and C++ fail, and
where Rust
nails it.

~~~
rbehrends
> In inner loops, making calls to malloc is pure performance poison. Most
> common garbage collected languages are completely unable to do this, with
> the admirable exception of the C# and the other .NET languages (and possibly
> some others).

Modern GCs use bump allocators and (ideally) combine allocations of objects
and their subobjects where that is possible.

~~~
ekidd
Even with a bump allocator, you're still putting large amounts of data into
the nursery generation. If you're in a performance critical inner loop (for
each line of a 50 gigabyte CSV file, or for each scan-line in an image codec),
this is still a bad idea, performance-wise. You still can't beat static stack
allocation, where you increment a pointer once to create a stack frame, and
decrement it to free memory.

In really fast Rust code, my goal is usually to entirely eliminate all heap
allocation in favor of zero-copy parsers. This can be tricky—especially for
streaming I/O using buffers, because a single chunk of input might get split
over two buffers—but Rust's borrow checker makes it possible to maintain
references into other people's buffers without shooting yourself in the foot.
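
A minimal sketch of the zero-copy idea, assuming a simplified comma-separated format with no quoting. It does not handle the split-buffer case mentioned above, and the function names are made up.

```rust
// Yields the fields of one line without copying: every `&str` produced
// points into `line` and cannot outlive it.
fn fields<'a>(line: &'a str) -> impl Iterator<Item = &'a str> {
    line.split(',').map(str::trim)
}

// Returns a slice of the input, not a new String.
fn longest_field(line: &str) -> &str {
    fields(line).max_by_key(|f| f.len()).unwrap_or("")
}

fn main() {
    let line = String::from("a, bbb ,cc");
    println!("{}", longest_field(&line));
    // The result borrows from `line`, so the compiler rejects any attempt
    // to use it after `line` has been dropped or modified.
}
```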

~~~
jerven
With Java's scalar replacement (made possible by escape analysis, partial or
not), calling new Point(x,y) in a hot loop does not allocate an object on the
heap at all (or even on the stack) unless it escapes.

In the newer JITs that escaping allocation will be deferred until the last
possible point in time.

This is actually a nice point about JVM languages: _new_ has defined behavior
and can thus be elided if the effect is not observable. malloc (or new in C++,
although I am not sure about the spec there) does not, and eliding such a call
may break the language specification, because a function that should have been
called is not.

I don't know what Rust does here, and what is allowed in regards to the new
operator. i.e. must a local variable be on the stack or can it generate direct
values in machine registers?

~~~
Rusky
Rust doesn't have a new operator (its equivalent to malloc is Box). When
that's used, it forces the value onto the heap- but that's not what you would
write most of the time.

Local variables that are not explicitly heap-allocated (and don't have their
address taken) can be on the stack or in registers, and this is true of both
Rust and C/C++.
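
A sketch of the two cases in Rust; the `Point` type and function names are illustrative.

```rust
struct Point {
    x: i32,
    y: i32,
}

fn sum_local() -> i32 {
    // Plain local: no heap allocation is requested. The compiler is free
    // to keep it on the stack or in registers.
    let p = Point { x: 1, y: 2 };
    p.x + p.y
}

fn sum_boxed() -> i32 {
    // `Box::new` explicitly asks for a heap allocation, which is freed
    // when `p` goes out of scope.
    let p = Box::new(Point { x: 1, y: 2 });
    p.x + p.y
}

fn main() {
    println!("{} {}", sum_local(), sum_boxed());
}
```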

------
Yan_Coutinho
Rust seems to be a good language for game development. I watched this guy
([https://www.liveedu.tv/gexon/videos/evAbX-dotakiller-
gamedev...](https://www.liveedu.tv/gexon/videos/evAbX-dotakiller-gamedev-
indiedev-onedev-rust-42)) and it made me want to try it.

------
kutkloon7
C can do exactly the same, it just won't perform the lifetime checks for you.
I haven't written much Rust, but if I understand correctly you can write code
that is valid, but won't compile because you're not 'following the rules' of
the constraints/type checker.

I find Dafny to be a more elegant and nice solution, since this language
allows you to actually prove your code is valid when it's not obvious from the
static analysis. But this might be because I'm somewhat more mathematically
inclined (and i admit that Dafny is still by no means easy to use)

------
dorianm
Other languages can definitely do it:

Ruby:

    
    
        class X
          attr_accessor :y
        end
    

Crystal (full working example):

    
    
        class Y
        end
    
        class X
          def initialize(y : Y)
            @y = y
          end
    
          def y
            @y
          end
        end
    
        y = Y.new
        x = X.new(y: y)
        puts x.y

~~~
willvarfar
The article is not about returning a reference to a struct member, it's about
returning a reference to something on the stack rather than heap.

~~~
topspin
Rust ownership applies to heap allocated objects as well; if you return a
reference to a member of a heap allocated struct that reference cannot outlive
the struct allocation. The point of the article is more general than "stack vs
heap;" other languages are either incapable of expressing a reference to a
member of a heap allocated struct or incapable of ensuring the memory safety
of such a reference.
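
A sketch of the heap-allocated case in Rust, using the article's `X` and `Y` names; `read_from_heap` is a made-up function name.

```rust
struct Y(i32);

struct X {
    y: Y,
}

fn get_y(x: &X) -> &Y {
    &x.y
}

fn read_from_heap() -> i32 {
    // The whole X, including its inline Y, lives in one heap allocation.
    let boxed: Box<X> = Box::new(X { y: Y(42) });
    let y_ref = get_y(&boxed);
    // Uncommenting the next line is rejected at compile time, because it
    // would free the allocation while `y_ref` still points into it:
    // drop(boxed);
    y_ref.0
}

fn main() {
    println!("{}", read_from_heap());
}
```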

~~~
willvarfar
As the article notes, garbage-collected languages can ensure the memory safety
of a pointer into a heap object.

~~~
TuringTest
But not at compile time, which is what Rust provides.

------
mnw21cam
> What Rust Can Do That Other Languages Can't

But other languages are Turing-complete too...

~~~
amelius
Yes, but Rust might still be able to do what other languages can't. For
example, to express a given program using at most N input symbols.

~~~
wtallis
Perhaps more relevantly, we can talk about the relative sizes of the sets of
invalid Rust programs that rustc will fail to reject, and invalid C programs
that gcc will fail to reject. The halting problem tells us that neither set is
empty, but there's still a lot of room for differentiation.

(Actually, upon further thought, it may be that rustc rejects all invalid Rust
programs and some valid ones. gcc certainly accepts some invalid C programs.)

~~~
tatterdemalion
My rough approximation is that the difference between a sound type system and
an unsound type system is that a sound system rejects some valid programs and
an unsound system accepts some invalid ones.
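
A small Rust illustration of "rejects some valid programs": borrowing two distinct elements of a slice mutably is harmless, but the borrow checker cannot see that the indices differ. The function name is made up; `split_at_mut` is the standard library method for this situation.

```rust
// Swaps the first two elements. Panics if the slice has fewer than two.
fn swap_first_two(v: &mut [i32]) {
    // Rejected by the borrow checker even though the two elements are
    // distinct, because both borrows go through `v` as a whole:
    // let a = &mut v[0];
    // let b = &mut v[1];
    // std::mem::swap(a, b);

    // Accepted: `split_at_mut` hands out two non-overlapping halves.
    let (left, right) = v.split_at_mut(1);
    std::mem::swap(&mut left[0], &mut right[0]);
}

fn main() {
    let mut v = [1, 2, 3];
    swap_first_two(&mut v);
    println!("{:?}", v);
}
```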

------
shadowmint
I don't think this is a particularly useful example of what rust can do.

Other languages can return inner pointers; it's not an interesting feature.

The only interesting thing is that it is both safe and has no gc... but
really, this is just another example of safety in rust, via a relatively
obscure code snippet.

If you want to pitch Rust's safety features, we can do something a bit more
interesting, surely?

~~~
roca
Most safe languages are unable to return inner references. Heap-allocating the
inner object doesn't count; object inlining is important for systems
programming.

You're absolutely right that there are much more interesting examples, e.g.
[https://t.co/HJV7G4n0PD](https://t.co/HJV7G4n0PD). Doesn't fit in six lines
though :-)

