Miri: Interpreter for Rust's mid-level intermediate representation (github.com/rust-lang)
218 points by ingve on March 27, 2019 | 38 comments



As of today, you can easily install miri on the nightly channel, which is why this is popping up today.

  $ rustup component add miri --toolchain=nightly
You can then “cargo +nightly miri test” to run your tests under miri. You probably should “cargo clean” first.

Of course you can skip the explicit nightly flags in those commands if your default toolchain is nightly.


A resource for those trying to install nightly on OSX is the page tracking component status for nightly builds: https://rust-lang.github.io/rustup-components-history/x86_64... (the page also contains links to the status of components for all other platforms)

Unfortunately both Clippy and RLS have failed to build since 3/24 but 2019-03-23 should work.

    rustup install nightly-2019-03-23


Would it be accurate to say that running Miri against your code might catch some issues in unsafe code that the standard compiler will not?


Pedantically, it's running your code via Miri. Miri is an interpreter, so you have to trigger undefined behavior during a specific run of Miri for it to be reported.


I think that’s a fair point. It sounds like it’s possible with Miri to write unit tests for specific UB concerns, and it’s more likely to catch them than standard unit tests. This will be useful.
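
A minimal sketch of what such a targeted test could look like. The helper `byte_at_unchecked` and its contract are hypothetical, made up for illustration; only `get_unchecked` is a real std method. The point is that the undefined behavior is reachable only on runs that pass a bad index, so a checker that works by executing the code sees it only if a test actually takes that path.

```rust
/// Hypothetical helper: returns the byte at `idx` without a bounds check.
///
/// # Safety
/// The caller must guarantee `idx < data.len()`.
unsafe fn byte_at_unchecked(data: &[u8], idx: usize) -> u8 {
    // An out-of-bounds `idx` is undefined behavior here, but only on runs
    // that actually pass such an index.
    unsafe { *data.get_unchecked(idx) }
}

/// Safe wrapper that upholds the contract by checking first.
fn byte_at(data: &[u8], idx: usize) -> Option<u8> {
    if idx < data.len() {
        // SAFETY: `idx` was just checked against the length.
        Some(unsafe { byte_at_unchecked(data, idx) })
    } else {
        None
    }
}

fn main() {
    let data = [10u8, 20, 30];
    // Exercise both sides of the boundary the unsafe code relies on.
    assert_eq!(byte_at(&data, 2), Some(30));
    assert_eq!(byte_at(&data, 3), None);
    println!("ok");
}
```

If the `idx < data.len()` check were accidentally written as `<=`, a plain test run might still pass by luck, which is the kind of case where running the same test under Miri would presumably be more informative.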


I haven't put any thought or time into it yet, but I assume you could combine it with a fuzzer. cargo-fuzz [1] already does this with LLVM's sanitizers, but I'm thinking something like proptest [2] could be used to help find those specific inputs and run under miri.

It may even be as simple as creating the normal proptest tests and then doing `cargo miri --test` or something similar!

[1]: https://github.com/rust-fuzz/cargo-fuzz

[2]: https://crates.io/crates/proptest


Fuzzing + miri sounds like it will be incredibly slow. Probably slower than just using asan.

The good thing about UB in rust is it's easy to target - you don't need to explore the whole space, just module boundaries that encapsulate that space. So something more fine grained should be fine with miri, and fuzzing can capture more complex errors like panics.


Yes, that's what I mean about using proptest (if you are unfamiliar, it's akin to QuickCheck). You'd write the proptest functions for your dedicated unsafe section.

As an example, I use this with Jetscii [1] in an attempt to thoroughly test SIMD vs non-SIMD code. If Miri detected errors with SIMD (I don't think it does now), then I could run those tests inside of Miri, having them pull double duty.

[1]: https://github.com/shepmaster/jetscii/
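
A dependency-free sketch of that idea: compare an unsafe implementation against a safe reference over many generated inputs, so the same test can double as a workload to run under Miri. This does not use proptest's API; the tiny xorshift generator is a stand-in for it, and `find_byte_fast` and `find_byte_reference` are hypothetical names, not Jetscii's.

```rust
/// Safe reference implementation: index of the first `needle` in `haystack`.
fn find_byte_reference(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Hypothetical "fast" version using raw pointers, standing in for SIMD code.
fn find_byte_fast(haystack: &[u8], needle: u8) -> Option<usize> {
    let base = haystack.as_ptr();
    for i in 0..haystack.len() {
        // SAFETY: `i < haystack.len()`, so `base.add(i)` stays in bounds.
        if unsafe { *base.add(i) } == needle {
            return Some(i);
        }
    }
    None
}

/// Minimal xorshift64 generator so the example needs no external crates.
/// The seed must be nonzero.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Runs `cases` random comparisons and returns how many were checked.
/// Panics on the first input where the two implementations disagree.
fn check_agreement(seed: u64, cases: usize) -> usize {
    let mut rng = XorShift(seed);
    for _ in 0..cases {
        let len = (rng.next() % 32) as usize;
        // A small alphabet makes hits and misses both likely.
        let haystack: Vec<u8> = (0..len).map(|_| (rng.next() % 8) as u8).collect();
        let needle = (rng.next() % 8) as u8;
        assert_eq!(
            find_byte_fast(&haystack, needle),
            find_byte_reference(&haystack, needle)
        );
    }
    cases
}

fn main() {
    println!("checked {} cases", check_agreement(0x2545_F491_4F6C_DD1D, 200));
}
```

Given the speed concern raised above, the case count and input sizes are kept small on purpose; an interpreter run would likely need far fewer cases than a native one.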


Oh, yes, agreed that proptest is way more viable here and probably a good solution to pair with miri.


It would be very nice if cargo-fuzz targets could be swapped to work with proptest, quickcheck, and miri. It's something I've wanted to do for a while, but I don't have as much time to maintain cargo-fuzz.


Absolutely, that is in fact the point :)


This is giving me an error:

    > rustup component add miri --toolchain=nightly
    info: downloading component 'miri'
    info: installing component 'miri'
    info: rolling back changes
    error: component manifest for 'miri' is corrupt

If Miri is an interpreter, are there any plans to bring a REPL to Rust? It's one thing I sorely miss.


Are you on the latest nightly? It will only work on the latest nightly, and going forward into the future.

There’s another sub-thread on HN about that, check it out :)


FWIW, I manually attempt to update to nightly (nightly-x86_64-apple-darwin) once a day, not because I need to, but out of curiosity. I find that clippy often prevents me from updating for a day or two before working again.

Today happens to be a day (and I think yesterday was, too) when I get: "error: component 'clippy' for target 'x86_64-apple-darwin' is unavailable for download for channel 'nightly'" and sure enough, I get the misleading "manifest for 'miri' is corrupt" when I try to add miri.

Surprisingly, I used "rustup component remove clippy" and still couldn't update to nightly, still with a complaint about clippy.

I'm not mentioning any of the above because it bothers me; I expect things to work tomorrow or the next day. I just assume this might be biting a lot of people (perhaps everyone on MacOS).


Yep, that's the issue with nightly builds: you're living on the edge, and sometimes, you get cut :/

Totally, and I appreciate it. Good luck. :)


Yeah, the tools team is looking into providing a nightly-with-tools experience that's smoother about this stuff.


Namely here, in case they drift apart: https://news.ycombinator.com/item?id=19501277 :)


I often find myself yearning for a Rust REPL. I wonder if it'd be possible to run the rust compiler to lower the code down to MIR, and then run that in the Miri interpreter?


You might be interested in issue #511: Building a REPL on top of miri [1]

If you just want to get the MIR, you can do that today. In the playground, expand the "Run" menu and select "MIR".

[1]: https://github.com/rust-lang/miri/issues/511


I think the problem is not really evaluating Rust code on the fly: you could just as well compile it on the fly and execute it, which is what the Haskell interpreter ghci does.

The difficulty is more in defining what it means to enter code one line at a time: what about context, scope, lifetimes, all that stuff?

If a more knowledgeable Rust team member can confirm…


I think some of the typing would be problematic, as it allows action at a distance.

e.g.

  fn bar(i : &u16) {
     i + 1;
  }

  let a = 8; // a is a u8.

  bar(&a); // Now a needs to be a u16,
           // but we executed the line of code
           // above, which forced it to be u8.
           // Is this a type error?
You can probably solve this, by requiring all blocks of code to be executed repl style to be standalone.

i.e. the above code is entered as:

  fn bar(i : &u16) {
     i + 1;
  }

  ## Execution happens in the REPL

  {
      let a = 8;

      bar(&a);
  }

  ## Execution happens in the REPL.
And make the first version above give "a not defined in this scope".


(small note, a would be an i32 here, not a u8. your point stands though)
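
A small sketch of that point, using `std::mem::size_of_val` to observe which integer type inference picked. An unconstrained integer literal falls back to `i32`, while a later use in the same function can still steer it to `u16`; that second case is what line-at-a-time evaluation would lose. Here `bar` returns its result, which is an adaptation of the snippet above rather than the original.

```rust
use std::mem::size_of_val;

// Adapted from the thread's `bar`: returns the sum instead of discarding it.
fn bar(i: &u16) -> u16 {
    i + 1
}

fn main() {
    // Nothing else constrains `a`, so the literal falls back to i32 (4 bytes).
    let a = 8;
    assert_eq!(size_of_val(&a), 4);

    // The call to `bar` further down constrains `b` to u16 (2 bytes), even
    // though it appears after the `let`.
    let b = 8;
    assert_eq!(size_of_val(&b), 2);
    assert_eq!(bar(&b), 9);

    println!("a: {} bytes, b: {} bytes", size_of_val(&a), size_of_val(&b));
}
```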


I think you solve this by just giving an error

In [0] fn bar(i: &u16) { i + 1 }

In [1] let a = 8; // a is a i32.

In [2] bar(&a)

Out[2] Type error: a is a i32 and it needs to be a u16

In [3] let a: u16 = 8; // The original a is shadowed by the new a, the new a is a u16

In [4] bar(&a)

Out[4] 9

Or a similar situation where there isn't enough information to infer a type

In [0] let x = Vec::new();

Out[0] Type error: Need to know the type of x

In [1] let x: Vec<bool> = Vec::new();

In [2] x.push(true); // Note that if we had entered this at the same time as Vec::new() the type could have been inferred.


That forces you to write very unnatural Rust - it makes it a much clunkier language.


The Playground [0] and cargo-script [1] are the best options atm. In some respects, I think I prefer these. The only downside is the lack of introspection which is diminished by the use of static typing.

[0] https://play.rust-lang.org/ [1] https://github.com/DanielKeep/cargo-script


Introspection in the Playground is solvable with two things: a) everything implementing Debug, and b) `let _ : () = a; ` forcing Rust to tell you what type it has inferred.
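
A sketch of both tricks in one place. `Point` and `type_of` are hypothetical names for illustration. The `let _: () = a;` line is left commented out because it is meant to fail: the compiler's mismatched-types error names the inferred type. `std::any::type_name` is a runtime alternative, though its exact output is documented as best-effort, so the example prints it without relying on a particular string.

```rust
use std::any::type_name;

// Deriving Debug is what makes `{:?}` usable for value introspection.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Hypothetical helper: name of the static type of the referenced value.
fn type_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let a = vec![Point { x: 1, y: 2 }];

    // (a) Inspect the value via Debug.
    println!("{:?}", a);

    // (b) Uncomment to make the compiler report the inferred type of `a`
    //     in a "mismatched types" error:
    // let _: () = a;

    // Runtime alternative to (b); the exact text is not guaranteed.
    println!("{}", type_of(&a));
}
```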


It'd probably work best in a 'notebook' format. Take something like Jupyter Notebooks, throw a rust backend on it with each cell being a compilation unit... either throw them all back into a global scope or make each cell a module, I'm not sure. But repls make sense for python or something where you can realistically do line-by-line programming. Rust... I very frequently want to have an expression extend across multiple lines or define structures, etc, that a notebook just makes more sense.


Related is how to use Rust "as a library". I wish I could "compile to Rust":

https://internals.rust-lang.org/t/pre-rfc-first-class-suppor...


Are there any plans to use an interpreter towards having a fast compile cycle? My reference point is Haskell's ghci, which reloads changes very quickly (whereas the normal compiler is similar in speed to Rust).


Not particularly; or rather, this doesn't help with that directly. There are other, bigger fish to fry first.


For a quick way to play with Miri, you can use the Rust playground [1]. Enter some code and then select Tools > Miri.

As a starting example, here's some code that Miri reports as violating Rust's rules for references [2]:

    fn main() {
        let mut i = 42;
        
        unsafe {
            let a = &mut *(&mut i as *mut _);
            let b = &mut *(&mut i as *mut _);
    
            println!("{:p}, {:p}", a, b);
        }
    }
The output is:

    error[E0080]: constant evaluation error: Borrow being dereferenced (Uniq(1797)) does not exist on the borrow stack
        --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/fmt/mod.rs:2050:24
         |
    2050 |         Pointer::fmt(&(&**self as *const T), f)
         |                        ^^^^^^^ Borrow being dereferenced (Uniq(1797)) does not exist on the borrow stack
         |
         = note: inside call to `<&mut i32 as std::fmt::Pointer>::fmt` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/fmt/mod.rs:1016:17
         = note: inside call to `std::fmt::write` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/io/mod.rs:1266:15
         = note: inside call to `<std::io::StdoutLock as std::io::Write>::write_fmt` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/io/stdio.rs:493:9
         = note: inside call to `<std::io::Stdout as std::io::Write>::write_fmt` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/io/stdio.rs:737:9
         = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/thread/local.rs:299:16
         = note: inside call to `std::thread::LocalKey::<std::cell::RefCell<std::option::Option<std::boxed::Box<dyn std::io::Write + std::marker::Send>>>>::try_with::<[closure@DefId(1/1:1025 ~ std[82ff]::io[0]::stdio[0]::print_to[0]::{{closure}}[0]) 0:&std::fmt::Arguments, 1:&fn() -> std::io::Stdout], std::result::Result<(), std::io::Error>>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/io/stdio.rs:731:18
         = note: inside call to `std::io::stdio::print_to::<std::io::Stdout>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/io/stdio.rs:753:5
    note: inside call to `std::io::_print` at <::std::macros::println macros>:2:3
        --> src/main.rs:8:9
         |
    8    |         println!("{:p}, {:p}", a, b);
         |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         = note: inside call to `main` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:64:34
         = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:52:53
         = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:293:40
         = note: inside call to `std::panicking::try::do_call::<[closure@DefId(1/1:1830 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:289:5
         = note: inside call to `std::panicking::try::<i32, [closure@DefId(1/1:1830 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe]>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panic.rs:388:9
         = note: inside call to `std::panic::catch_unwind::<[closure@DefId(1/1:1830 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:52:25
         = note: inside call to `std::rt::lang_start_internal` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:64:5
         = note: inside call to `std::rt::lang_start::<()>`
         = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)

[1]: https://play.rust-lang.org/

[2]: https://doc.rust-lang.org/stable/book/ch04-02-references-and...
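
For contrast, a sketch of one way to get two handles to the same integer without creating two live `&mut` references: derive a single raw pointer and do every access through it or a copy of it. `bump_twice` is a hypothetical function, and the claim that this pattern is acceptable is my understanding of the aliasing model rather than something the Miri output above confirms, so treat it as an assumption.

```rust
/// Hypothetical example: increments the target twice through raw pointers.
fn bump_twice(i: &mut i32) -> i32 {
    // One raw pointer derived from the unique reference.
    let p: *mut i32 = i;
    // Raw pointers are Copy; `q` is the same pointer, not a new `&mut`.
    let q = p;
    // SAFETY: `p` and `q` point to a live i32 that nothing else accesses
    // for the duration of this block.
    unsafe {
        *p += 1;
        *q += 1;
        *p
    }
}

fn main() {
    let mut i = 42;
    println!("{}", bump_twice(&mut i));
}
```

The difference from the failing example is that no second `&mut` is ever materialized, so there is no moment at which two supposedly unique references to `i` coexist.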


> As a starting example, here's some code that Miri reports as violating Rust's rules for references

So...this was determined statically? Or "statically"? If there's no/few false positives could this be used as a check in rustc? Or clippy?

BTW for this one specifically, why can't we look at the AST and see that a and b alias?

EDIT: I guess it's because it's lost in the deref/take addr?


Miri is an interpreter, and so it works dynamically. Think ubsan.

> why can't we look at the AST and see that a and b alias?

The exact aliasing rules for raw pointers are under discussion at the moment. We may be able to look at the AST in this case, but in some cases, that requires control flow analysis. ASTs aren't great for this, but MIR is built around it, hence miri. (That said, I don't think that's the primary reason Miri works on MIR, it's more that MIR is kinda like "core Rust", the simplest possible expression of the language itself.)


It was determined dynamically; determining it statically with no false positives is impossible.

Try sticking a `println!("Hello");` statement at the start of main: it will get run before we run into this error.


> It was determined dynamically

So it looks like the defects that miri can detect overlap heavily with the existing sanitizers (but I see at least one language-specific constraint that isn't covered by sanitizers). I suppose it's nice not to have the hassle/complexity of the sanitizers (especially nice not to have to deal with the ODR noise from UBSan), but is the intent that miri could find much beyond what the sanitizers could?

Regarding aligned accesses: does miri consider the alignment constraints of the most restrictive target? Or are there language specifications regarding alignment?

To my uneducated eye, it looks like many/most of these defects that miri detects require unsafe to cause (barring compiler defects). Is that accurate?


> is the intent that miri could find much beyond what the sanitizers could?

Miri serves two purposes. One of them is completely unrelated to this stuff, and that's to deal with const functions.

This stuff, on the other hand, yes. Those tools don't know anything about the language themselves. Miri is able to model the exact rules that we lay out for pointer usage. This means that it should be a lot better, not only at finding things, but also stuff like error messages, eventually.

> Regarding aligned accesses: does miri consider the alignment constraints of the most restrictive target? Or are there language specifications regarding alignment?

Because it's an interpreter, miri can run like any target. That said, I'm not 100% sure that 'cargo miri' exposes this easily yet; I gave it a shot but it's giving me an interesting error. The idea is that you'd specify the target you're testing for.

> To my uneducated eye, it looks like many/most of these defects that miri detects require unsafe to cause (barring compiler defects). Is that accurate?

Yes. Safe code should never produce undefined behavior, that's a hard design constraint of Rust.
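
On the alignment question, a sketch of the kind of code where it matters. Dereferencing a `*const u32` requires the pointer to be 4-byte aligned, which an arbitrary offset into a byte buffer does not provide; `std::ptr::read_unaligned` drops that requirement. `read_u32_at` is a hypothetical helper, and the value it returns is in native byte order.

```rust
use std::ptr;

/// Hypothetical helper: reads a native-endian u32 starting at `offset`,
/// or returns None if fewer than four bytes are available there.
fn read_u32_at(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    if end > bytes.len() {
        return None;
    }
    // SAFETY: `offset..end` is in bounds, and `read_unaligned` places no
    // alignment requirement on the pointer. A plain `*p` here would be
    // undefined behavior whenever `p` is not 4-byte aligned.
    let value = unsafe {
        let p = bytes.as_ptr().add(offset) as *const u32;
        ptr::read_unaligned(p)
    };
    Some(value)
}

fn main() {
    let bytes = [0u8, 1, 2, 3, 4, 5, 6, 7];
    // Offset 1 is very unlikely to be 4-byte aligned.
    let v = read_u32_at(&bytes, 1).unwrap();
    assert_eq!(v, u32::from_ne_bytes([1, 2, 3, 4]));
    assert_eq!(read_u32_at(&bytes, 5), None);
    println!("{:#010x}", v);
}
```

Whether a misaligned plain read merely appears to work depends on the hardware, which is why being able to check against a chosen target would be useful.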


> is the intent that miri could find much beyond what the sanitizers could?

I believe so. AIUI miri is checking that the code follows a formally specified definition of what is defined behavior, at least as far as the memory model goes (it doesn't check data races for instance). While it can't say that on any input your program doesn't hit undefined pointer behavior, it can say with certainty that the execution it ran through did not (w.r.t. the memory model).

Here's an example of some code that miri does catch, and where ubsan doesn't catch the closest translation I could make of it to C, despite both containing undefined behavior. If you just run the code (at least with debug settings on the current stable) it does what you would naively expect despite the undefined behavior.

https://play.rust-lang.org/?version=stable&mode=debug&editio...


I initially misread the headline and thought that the Machine Learning Research Institute had made progress in deobfuscating AI decision processes for human scrutiny.



