
Rust Optimizations That C++ Can't Do - adamnemecek
http://robert.ocallahan.org/2017/04/rust-optimizations-that-c-cant-do.html
======
wfunction
I find it funny and think it's pretty telling that there are bold claims made
that are only crossed out after the author is corrected. I feel like a C++
compiler (or any compiler, really) could do practically any optimization that
any other one can do, unless that language has explicit restrictions
prohibiting such an optimization (e.g. like if the language prohibits the
implementation from generating multiple copies of code).

The question is just how smart the compiler would need to be, and that changes
as a function of how strict the language is. The looser the language is,
the smarter the compiler needs to be. The more information you encode in
the type (or make easy to infer), the less smart the compiler needs to be.
Whether it's practical to develop a smart-enough compiler in a reasonable
amount of time is probably the real issue.

~~~
dbaupp
_> I find it funny and think it's pretty telling that there are bold claims
made that are only crossed out after the author is corrected_

... What else would you expect? The author observed something, drew a mistaken
conclusion, wrote it up, and then had the mistake pointed out, and so
retracted. I can't imagine the author realised their claims were wrong before
it was pointed out, so of course they're only going to cross them out after
being corrected. Of course, it's rather unfortunate that an incorrect write-up
is being upvoted to the top of HN but that's not the author's problem (other
than the choice of the rather clickbaity title...).

For the specifics of this post, there are some subtle differences between how
Rust and C++ behave that could easily explain how the author observed
differing behaviour and thus jumped to the conclusion, as explored on /r/rust:
[https://www.reddit.com/r/rust/comments/63ijkw/rust_optimizat...](https://www.reddit.com/r/rust/comments/63ijkw/rust_optimizations_that_c_cant_do/)

~~~
kibwen
_> I can't imagine the author realised their claims were wrong before it was
pointed out_

It should be noted that the Rust community team has an open offer to proofread
the draft of any Rust-related blog post to help ensure technical accuracy,
specifically to help prevent embarrassing things like this. :P Preventing the
generation of misinformation is cheaper than curing its dissemination, after
all. I've already reached out to the author to make sure they're aware of this
offer for the future.

~~~
TomasSedovic
Is this described anywhere? I've been using Rust for a few years, been hanging
around its subreddit for about as long, and this is the first time I've heard
about it.

This is a fantastic idea but maybe we need to make it more visible?

------
DannyBee
Errr, C++ can do this. It just requires analysis to do it in some cases (not
all).

As a trivial example, the point about const is wrong (at least as far as I
understand it). You can const_cast stuff, but you still can't modify it; to do
so is UB: modifying a const object through a non-const access path and
referring to a volatile object through a non-volatile glvalue both result in
undefined behavior. So yeah.

As for the ABI, any ABI arguments are silly. Really. The compiler can always
clone the function, rename it, and do what you want. Or inline it. Or ....

If it really matters, you'd also just change the ABI. There is no C++ standard
ABI. There's one a lot of compilers use. If we could get x% better performance
using a different one, people would do that :)

The corollary is "If you think you can only get x% better performance by doing
that, you are probably wrong".

(I'm excluding things that are low level abis, like x32, etc, since we are
talking about language abis)

~~~
int_19h
The point about const is not wrong. What you said about const is true if, and
only if, the object was originally const - i.e. you can't do this:

    
    
       const int n = 1;
       const_cast<int&>(n) = 2;
    

However, when the function receives a pointer or reference to const, it
doesn't know whether the referenced object is originally const or not. It only
knows that _it_ cannot write through that pointer; but someone else still
might be able to do so. Example:

    
    
      int n = 1;
      void foo() { n = 2; }
      int bar(const int& x) { int y = x; foo(); return y + x;  }
      bar(n);
    

Even though x is a reference to const, the compiler has to do two reads
from x here, because foo can - indeed, does - change the value referenced by
x.
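
A compilable version of the example above, as a sketch. `reset_and_run` and `demo` are hypothetical helpers added here so the call is repeatable and checkable; they are not part of the original comment.

```cpp
#include <cassert>

int n = 1;

void foo() { n = 2; }

// x is a reference to const, but it may still refer to the mutable global n.
int bar(const int& x) {
    int y = x;     // first read
    foo();         // may change the object that x refers to
    return y + x;  // second read has to observe the new value
}

// Hypothetical helper: resets n so that the call can be repeated.
int reset_and_run() {
    n = 1;
    return bar(n);
}

void demo() {
    assert(reset_and_run() == 3);  // 1 (read before foo) + 2 (read after foo)

    const int k = 1;               // a distinct object that foo never touches
    assert(bar(k) == 2);           // both reads see 1
}
```

Caching the first read of x would turn the first result into 2, which is why the compiler cannot do it without knowing what foo touches.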

Worse yet, the function can often change it itself via aliasing:

    
    
      int n = 1;
      int bar(const int& x, int& y) { y = x; return x; }
      bar(n, n);
    

Again, this is perfectly legal.
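
A variation that makes the effect of the aliasing observable, as a minimal sketch. `add_twice` and `demo` are made-up names for illustration.

```cpp
#include <cassert>

// Adds x to y twice. x is const-qualified, yet a write through y can change
// it when both references name the same object.
int add_twice(const int& x, int& y) {
    y += x;
    y += x;  // x has to be re-read here: the previous line may have changed it
    return y;
}

void demo() {
    int a = 1, b = 1;
    assert(add_twice(a, b) == 3);  // distinct objects: 1 + 1 + 1

    int n = 1;
    assert(add_twice(n, n) == 4);  // aliased: n becomes 2, then 2 + 2
}
```

If the compiler kept x in a register across the first write, the aliased call would return 3 instead of 4.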

The problem is that both C and C++ lack a way to say "this is a pointer or
reference to something that is immutable", as opposed to "... something that
you can't mutate". The other problem is that C++ (but not C, thanks to
"restrict") lacks a way to say "the object referenced by this pointer or
reference is not aliased in any other respect that matters to this function".

Rust, OTOH, does let you say that something that's referenced is immutable. In
fact, it is the case by default. So the compiler can aggressively optimize
around the fact.

In C++, compilers can only optimize as well when they can do a full program
analysis, and determine that the referenced value is immutable in practice.
Which can be hard to do for non-trivial programs, and outright impossible to
do across dynamic library boundaries.
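
To illustrate why the compiler cannot simply copy a const-referenced value on its own: the sketch below compares reading through the reference with copying into a local first. The function names and the data are assumptions made up for this example. The two versions agree when nothing aliases and disagree when the output aliases the input.

```cpp
#include <cassert>
#include <cstddef>

// Reads factor through the reference on every iteration. If total and factor
// name the same object, each write to total also changes the multiplier.
void accumulate_by_ref(const int* data, std::size_t len,
                       const int& factor, int& total) {
    for (std::size_t i = 0; i < len; ++i) {
        total += data[i] * factor;
    }
}

// Copies factor into a local first, so writes to total cannot affect it.
// The copy can live in a register, but the result differs under aliasing.
void accumulate_by_copy(const int* data, std::size_t len,
                        const int& factor, int& total) {
    const int f = factor;
    for (std::size_t i = 0; i < len; ++i) {
        total += data[i] * f;
    }
}

void demo() {
    const int data[] = {1, 2, 3};

    int total = 0;
    accumulate_by_ref(data, 3, 2, total);
    assert(total == 12);  // no aliasing: 2 + 4 + 6
    total = 0;
    accumulate_by_copy(data, 3, 2, total);
    assert(total == 12);  // same result

    int t = 2;
    accumulate_by_ref(data, 3, t, t);
    assert(t == 48);      // aliased: 2 -> 4 -> 12 -> 48
    t = 2;
    accumulate_by_copy(data, 3, t, t);
    assert(t == 14);      // aliased: 2 + 2 + 4 + 6
}
```

The programmer can choose the copying version by hand; the compiler may only do so when it can prove that the two references never alias.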

~~~
gpderetta
    template<class T> struct const_ { const T value; };

    template<class T> const_<T>& freeze(T&);
    template<class T> T& thaw(const_<T>&);

    template<class T>
    void do_something(const_<T>& x) {
        /* the language guarantees that x can't be mutated from anywhere */
    }

~~~
pjmlp
You can cast away const.

~~~
gpderetta
'value' is a const qualified object, not a const qualified reference. If you
cast away the const and write to the object you are into UB, ergo the compiler
can assume that isn't done.

Note that the trick still doesn't help much in practice: GCC conservatively
still won't optimize based on const qualification of values.
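
A compilable sketch of the wrapper idea. One assumption is labeled loudly: `freeze` here copies the value, because the thread does not say how the reference-returning version would be implemented. `read_twice` and `demo` are hypothetical names as well.

```cpp
#include <cassert>
#include <type_traits>

template <class T>
struct const_ {
    const T value;  // a const-qualified object, not merely a const view of one
};

// Assumption of this sketch: freeze makes a copy. The declaration in the
// comment above returns a reference instead.
template <class T>
const_<T> freeze(const T& v) {
    return const_<T>{v};
}

// The const member removes the implicit copy assignment operator, so the
// wrapper cannot be overwritten by plain assignment.
static_assert(!std::is_copy_assignable<const_<int> >::value,
              "const_<int> must not be assignable");

// Reads the wrapped value before and after an opaque call.
int read_twice(const const_<int>& x, void (*g)()) {
    int first = x.value;
    g();
    return first + x.value;
}

void do_nothing() {}

void demo() {
    const_<int> c = freeze(7);
    assert(c.value == 7);
    assert(read_twice(c, do_nothing) == 14);
}
```

Whether a given compiler actually exploits the const member to skip the second read is a separate question, as noted above for GCC.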

~~~
pjmlp
GCC is just one C++ compiler among many, but yes we are in UB territory.

~~~
DannyBee
Any compiler that likes can place the object in readonly memory.

So you'd discover your error pretty quickly if it did :P

------
socmag
I think this is a very objective and in-depth article on the matter.

"Criticizing the Rust Language, and Why C/C++ Will Never Die"

[https://www.viva64.com/en/b/0324/](https://www.viva64.com/en/b/0324/)

Not sure why certain members of the Rust community aren't happy with just
having a neat language. Making one false claim after another isn't doing them
any favors. It's just aggravating, especially when people who should know
better start to believe the hype.

FAKE NEWS!

~~~
kibwen
_> certain members of the Rust community_

Fortunately the overwhelming majority of the community isn't afraid to call
this post out, judging by the comments at
[https://www.reddit.com/r/rust/comments/63ijkw/rust_optimizat...](https://www.reddit.com/r/rust/comments/63ijkw/rust_optimizations_that_c_cant_do/)
:) I've never seen the author engage with the Rust community in any great
capacity (they're not a regular in any forum that I frequent, and I can't name
any libraries that they've written), so I wish he'd take the time to learn
Rust more before blogging about it.

~~~
socmag
Thanks for posting this because this has been getting out of hand recently,
and it's very good to see some sanity.

There is an opportunity to do some very cool things with Rust and I think most
people would love to see a "better together" mindset being shared.

~~~
dbaupp
It's worth noting that the sanity you're appreciating is, IME, very typical of
the Rust community. Mistakes/overblown claims will get called out (if someone
notices them), and "better together" is something people generally embody:
e.g. even the language itself incorporates good ideas from "competitors".

------
joaodlf
I love new languages (unlike a lot of HN, I'm excited about Rust AND Go), but
please remember that C/C++ compilers have been around, sometimes for more than
30 years. Do not think new compilers will just waltz in and suddenly topple
all this previous work - That will be a long march.

~~~
Manishearth
Alias analysis has been the holy grail of compiler development for a while. In
C++ it's a global, imperfect, and expensive analysis. In Rust you get it for
free.

C++ can't do the same thing Rust does because it would be a breaking change to
have finer-grained distinctions between the types of references.

This is a matter of language semantics, not compiler implementation. rustc
isn't doing anything new here, LLVM already knows how to make this
optimization and is making it.
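
A small C++ sketch of what alias information buys, with hypothetical function names. With two pointers of the same type the compiler has to assume they may overlap. With unrelated types the strict aliasing rule already lets it assume they do not. Rust's `&mut` references give the second kind of guarantee for every type.

```cpp
#include <cassert>

// Both pointers have the same type, so they may alias: the compiler has to
// reload *a after the store to *b.
int store_same_type(int* a, int* b) {
    *a = 1;
    *b = 2;
    return *a;
}

// An int object and a float object cannot overlap under the strict aliasing
// rule, so the compiler is allowed to return 1 without reloading *a.
int store_different_type(int* a, float* b) {
    *a = 1;
    *b = 2.0f;
    return *a;
}

void demo() {
    int x = 0, y = 0;
    float f = 0.0f;
    assert(store_same_type(&x, &y) == 1);       // distinct objects
    assert(store_same_type(&x, &x) == 2);       // aliased: the second store wins
    assert(store_different_type(&x, &f) == 1);
}
```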

~~~
tmccrmck
What specifically makes you think alias analysis is the holy grail of compiler
optimization?

~~~
Manishearth
I spent some time studying the subject a few years ago. Most of the current
efforts I saw were about improving alias analysis. Could have been a skewed
view of the thing, but the researchers I talked to seemed to feel like it was.
Some compiler devs I've talked to do too.

In general it does unlock a lot of opportunities when it comes to
optimization.

------
kccqzy
> a "const" value can be modified through other mutable references, so copies
> induced by pass-by-value could change program behavior.

This doesn't mean the compiler can't make this optimization. Unless we are
talking about multiple threads here (which we aren't), there is no other code
that is executing that can change what this const reference is pointing to. If
we are talking about threads, then data races like this are UB in C++, and the
compiler can do anything it wants, including emitting nonsense code.

~~~
Ace17
> Unless we are talking about multiple threads here (which we aren't), there
> is no other code that is executing that can change what this const reference
> is pointing to.

... except if your code calls potentially 'impure' functions.

    void f(const int& value)
    {
      const int a = value;
      g();
      const int b = value;
      assert(a == b);
    }

One can't tell if the assertion will succeed.

Indeed, 'value' might have been modified as a consequence of calling 'g'.

You don't have this uncertainty if 'value' is marked as immutable.

~~~
DannyBee
"One can't tell if the assertion will succeed."

If value was originally a const object (i.e. immutable), it is well defined and
guaranteed to always succeed.

For example:

    const char* myString = "Hallo World!";
    const_cast<char*>(myString)[1] = 'e';

This is illegal, the same as anything modifying value would be if it was
originally const.

If value was originally not a const object, it wasn't immutable, so anything
goes.

"You don't have this uncertainty if 'value' is marked as immutable."

In C++ you can mark the original immutable in such a way that you can tell
whether the assertion will succeed.
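
Both cases side by side, as a sketch. `counter`, `bump`, `stable_across_call` and `demo` are names invented for this example. The function body is the same in both calls; only the object behind the reference differs.

```cpp
#include <cassert>

int counter = 0;

void bump() { ++counter; }

// Returns true when the referenced value is unchanged across the call to g.
bool stable_across_call(const int& value, void (*g)()) {
    const int a = value;
    g();
    const int b = value;
    return a == b;
}

void demo() {
    static const int fixed = 1;                  // originally a const object
    assert(stable_across_call(fixed, bump));     // no legal way to change it

    assert(!stable_across_call(counter, bump));  // mutable object: bump changes it
}
```

Inside `stable_across_call` the two cases look identical, so the information has to come from seeing the call site as well.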

~~~
int_19h
The _compiler_ can't tell the difference (in general), because what it sees is
a const int&, but the actual referenced object may be a non-const int.

~~~
DannyBee
Only in a world where people who care about performance don't hand the
compiler enough of the program to make that determination (which only requires
being able to see the original allocation site and the call).

That's crazytown.

"Please optimize this as hard as you can with two hands tied behind your
back".

So like I said, I don't disagree that it's a useful thing to do, I'm just not
sure I believe it matters in practice to people who care about performance.

I.e. the purported benefit is already achievable and achieved, quite regularly.

Again, that doesn't make it less useful in general, I'm just not sure I believe
it is somehow amazing for performance to people who care about performance.

------
simion314
AFAIK there are optimizations that Java/C# can do at runtime that compiled
languages can't do.

------
hellofunk
Someone should fairly offer as a counterpart any article showing the myriad of
reasons why C++ generates much faster code than Rust in most cases. There are
many such articles out there I've seen from time to time.

~~~
AstralStorm
Typically by not generating code, such as when more advanced templates are
used or constexpr, or by using a specialized or partially specialized version.

Traits, while cool, are not potent enough at times.
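
A small sketch of the two mechanisms mentioned, constexpr evaluation and partial specialization. `factorial` and `is_raw_pointer` are illustrative names, not taken from any codebase discussed here.

```cpp
#include <cassert>

// Evaluated by the compiler when used in a constant expression, so no code
// for the computation needs to run at runtime.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}
static_assert(factorial(5) == 120, "computed during compilation");

// Primary template: the general case.
template <class T>
struct is_raw_pointer {
    static constexpr bool value = false;
};

// Partial specialization: selected at compile time for any pointer type.
template <class T>
struct is_raw_pointer<T*> {
    static constexpr bool value = true;
};

void demo() {
    assert(factorial(10) == 3628800ULL);
    assert(is_raw_pointer<int*>::value);
    assert(!is_raw_pointer<int>::value);
}
```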

~~~
hellofunk
It's more than that. While I understand that technically it should be possible
to write code in either language that creates the same machine code as the
other, the common idioms of each language lead to, in most cases, the C++
generated code being faster. I think it is a classic tradeoff in the safety a
language provides vs. its overall performance. For example, the Mozilla team
has to use unsafe blocks in the Rust codebase to get the same performance as
C++ for those areas where it is important. That is not idiomatic Rust. Mozilla
resorts to unsafe blocks just to handle basic doubly-linked lists, as well as
optimized hashtables, in order to get similar performance to C++ data
structures. My point is that when you stick to the philosophy of Rust, you
will in general have slower code than when you write normal idiomatic C++.

~~~
kibwen
_> That is not idiomatic Rust_

I'm not sure where one would get the idea that `unsafe` would be entirely
forbidden in idiomatic Rust. "Idiomatic" Rust acknowledges the danger and
subtle pitfalls of unsafe code, and thereby discourages the use of `unsafe`,
but also acknowledges that it is still possible to build safe abstractions out
of internally-unsafe code.

 _> in order to get similar performance to C++ data structures_

By Rust's definition, every line of code in a C++ codebase counts as unsafe
code. :P If Rust can get equivalent performance to C++ while only exposing 1%
of the code to unsafety, that's an enormous win.

------
dkarapetyan
It's not just about the compiler being smart. There is nothing more you can do
with x -> 2 + x other than compile it as is. Now if I somehow prove that x
will never be greater than 10 then all of a sudden the compiler can in theory
just turn that addition into a lookup table or do all sorts of other things I
couldn't think of.

So it's more than just sufficient cleverness. It is also about the language
providing you the tools to express certain invariants that the compiler can
mechanically verify and then use in an optimization pipeline.

Edit: This is mostly out of context now and can be ignored.
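
The lookup-table idea from this comment, as a toy sketch. The bound of 10 comes from the comment; the names `kPlusTwo`, `add_two_table` and `demo` are invented here. A real compiler would hardly prefer a table over a single add, so this only shows how a known range makes such a rewrite legal.

```cpp
#include <array>
#include <cassert>

// Precomputed results of 2 + x for every x in [0, 10].
constexpr std::array<unsigned, 11> kPlusTwo = {
    {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}};

// Precondition assumed by this sketch: x <= 10. Without that invariant the
// table lookup would not be a valid replacement for the addition.
unsigned add_two_table(unsigned x) {
    assert(x <= 10);
    return kPlusTwo[x];
}

void demo() {
    for (unsigned x = 0; x <= 10; ++x) {
        assert(add_two_table(x) == 2 + x);
    }
}
```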

~~~
wfunction
> It's not just about the compiler being smart. There is nothing more you can
> do with x -> 2 + x other than compile it as is.

> It is also about the language providing you the tools to express certain
> invariants that the compiler can use.

Compiled code is useless if it is never called. The moment it's called,
whether the caller is known during compilation or not, the caller can pass
down any additional information it has, whether for that call or to help the
callee optimize future calls. Heck, if it really wants to, the compiler could
simply let the "compiled" code be the AST or source code itself (plus a
JITter), and make the program generate code on the fly when the caller calls
it, optimized using whatever information the caller is willing to provide.
Again, nothing fundamentally in a language preventing this. It's a tradeoff
between implementation difficulty, execution latency, program size, and the
like.

~~~
dkarapetyan
That language with your described properties already exists and is called SBCL
([http://www.sbcl.org/](http://www.sbcl.org/)). I'm not sure what point you're
making at this point though.

Compilers, optimizations, type systems, and proof systems are all very closely
related and if you want correct optimizations then you must annotate your code
with types that the compiler can verify and use as assumptions in an
optimization pipeline. These things are not easy to engineer. Saying just JIT
it doesn't make much sense.

------
filmor
Please drop the ?m=1 from the URL so that the correct page version loads on
desktop.

