
Making Rust Fly with MIR [video] - bluejekyll
https://air.mozilla.org/making-mir-fly/
======
kibwen
This is actually a rather light introduction to MIR, I'm not sure if we have
an ideal summary document somewhere but for now the original RFC may be more
instructive:
[https://github.com/nikomatsakis/rfcs/blob/mir/text/0000-mir....](https://github.com/nikomatsakis/rfcs/blob/mir/text/0000-mir.md)

------
unfamiliar
I'm interested in things like how the intermediate representation allows
easier optimisations, the control flow graph optimisation algorithms, and the
general optimisation passes. Can anyone suggest a good resource where I can
learn this stuff well enough to implement it? Where do the people working on
this compiler tech get their background/base knowledge from?

~~~
Retra
Generally speaking, the Rust language is quite rich, so it's difficult for
the computer to reason about performance, but easy for it to reason about the
high-level semantics. At the other end, LLVM 'assembly' is very primitive,
so it's easy to reason about performance, but not in a way that would preserve
the desired semantics.

An intermediate language allows you to describe the high-level semantics in a
more primitive form, so it is easier to both manipulate and reason about while
preserving the high-level semantics. In this case (as far as I understand it),
MIR needs to be as primitive as possible while still modeling Rust's scoping,
safety, and type rules. This is an optimization design pattern that occurs
quite frequently in practice.
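To make that concrete, here is a sketch of what "a more primitive form" can mean. This is not real MIR syntax; the `_1`/`bb1`-style names in the comments are only illustrative. The hand-lowered function mimics the flat, branch-based shape MIR uses, with the iterator sugar replaced by explicit state and jumps:

```rust
// High-level form: rich semantics, easy to check, hard to reason about
// performance directly.
fn sum_high_level(n: u32) -> u32 {
    (0..n).sum()
}

// Roughly what the loop becomes after lowering: an explicit counter, a
// single loop head, and a conditional exit -- one "basic block" per step.
fn sum_lowered(n: u32) -> u32 {
    let mut acc = 0; // a temporary local (think `_1` in MIR-speak)
    let mut i = 0;   // the iterator state, made explicit (`_2`)
    loop {
        if i >= n { break; } // bb1: the loop-exit test
        acc += i;            // bb2: the loop body
        i += 1;
    }
    acc // bb3: return
}

fn main() {
    assert_eq!(sum_high_level(5), sum_lowered(5));
    println!("lowered form agrees: {}", sum_lowered(5));
}
```

Both functions compute the same result; the lowered one is just easier for an optimizer to manipulate while still carrying the original meaning.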

~~~
jakub_h
"but easy for it to reason about the high-level semantics."

Is it? Does it make it possible to argue about such things as whether mappings
can be fused, for example? I keep wondering about the possibility of
very-high-level algebraic transformations of programs (although admittedly,
this could be shifted into some kinds of CASE tools instead).

~~~
Retra
It makes it easy to verify that the high-level semantics are only those
allowed by the language. It's fairly easy to look at an AST and tell if it is
a legal construct in your language.

>Is it? Does it make it possible to argue about such things as that mappings can be fused, for example?

It makes it possible to know that you are "fusing mappings." In the assembly
language, you would have a hard time knowing that this is what you were doing,
since assembly expresses the computation at too fine a granularity.
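As a small illustration of what "fusing mappings" means at the source level (this is the classic map-fusion law, not anything specific to MIR): two chained `map`s are equivalent to one `map` with the composed function, which is obvious where the semantics are still visible but hard to recover from optimized assembly.

```rust
fn main() {
    let v = vec![1, 2, 3];

    // Two separate mappings, chained.
    let two_maps: Vec<i32> = v.iter().map(|x| x + 1).map(|x| x * 2).collect();

    // The fused equivalent: one mapping with the composed function.
    let fused: Vec<i32> = v.iter().map(|x| (x + 1) * 2).collect();

    // The fusion law: map g . map f == map (g . f)
    assert_eq!(two_maps, fused);
    println!("{:?}", fused);
}
```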

------
saosebastiao
Are all types resolved before the MIR stage?

~~~
tatterdemalion
Yes, though lifetimes are not.

~~~
saosebastiao
Interesting. From what I understand, lifetime resolution is pretty
deterministic: mallocs always happen at the beginning of a lifetime and frees
always happen at the end. Would it be possible during the MIR stage to
perform analysis and optimization to modify lifetimes? (i.e., array
preallocations, chunked frees, etc.)

~~~
tatterdemalion
The short answer to your question is 'possibly.'

Ownership and borrowing are actually two complementary systems, not one.
Because of the ownership/initialization rules Rust has, we know that mallocs
occur when the object is constructed and that the data is freed when it goes
out of scope, but for the borrow checker dynamic memory has really been
abstracted away - the borrow checker guarantees the validity of pointers
into the stack just as much as pointers into the heap. In fact, a language
could have ownership without having borrowing (at one point, Rust did).

The main thing performing borrow checking on MIR allows for is actually a more
expressive lifetime system, particularly what are called single entry,
multiple exit (SEME) or non-lexical lifetimes - that is, knowing that in one
branch a borrow lasts longer than in another branch. The classic example of
how this is useful is that the naive way to write "get or insert" for a
hashmap is a borrow checker error today. With non-lexical lifetimes it won't
be an error, because the checker will know that you only perform the "insert"
when you didn't get a reference from the "get."
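For reference, this is the "get or insert" pattern being described (a sketch; `get_or_insert` is just an illustrative name). At the time of this thread the `None` arm was rejected, because the borrow taken by `get` was treated as lasting for the whole `match`; under non-lexical lifetimes, which later stabilized in Rust 1.31, it compiles, because the checker sees that no borrow is live in the `None` arm.

```rust
use std::collections::HashMap;

// The naive "get or insert": under the old, purely lexical borrow checker
// the `insert` conflicted with the borrow from `get`; with non-lexical
// lifetimes the borrow is known to end before the `None` arm runs.
fn get_or_insert(map: &mut HashMap<String, u32>, key: &str) -> u32 {
    match map.get(key) {
        Some(&v) => v, // the borrow of `map` is only live on this path
        None => {
            map.insert(key.to_string(), 0); // legal: no live borrow remains
            0
        }
    }
}

fn main() {
    let mut m = HashMap::new();
    assert_eq!(get_or_insert(&mut m, "a"), 0); // inserted a default
    m.insert("b".to_string(), 7);
    assert_eq!(get_or_insert(&mut m, "b"), 7); // found the existing value
    println!("ok");
}
```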

It's possible that, because MIR carries stronger guarantees than LLVM IR can
normally assume, there will be optimization passes that make sense on it -
reordering codepaths to be more optimal, like what you were suggesting - but I
don't know.

