As someone who's relatively new to Rust, I'm curious: what is an example of a si...

brundolf · on Nov 4, 2020

Rust's key feature - the borrow-checker - relies on the idea that each value has a single "owner" at any given time. This owner can be a function, another value (a parent struct), etc. You can put these values on the heap, but if you use Box (the go-to for heap allocation), that pointer still has to have a single logical "owner". Under idiomatic Rust, each value effectively lives in one single "place". This allows the compiler to determine with 100% confidence at what point it's no longer being used and can therefore be de-allocated.
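A minimal sketch of that single-owner rule, assuming a made-up `Config` type and `consume` function that exist only for illustration: the boxed value moves into the function, and the heap allocation is released when that function's scope ends.

```rust
// Hypothetical type, used only to have something to put on the heap.
struct Config {
    name: String,
}

// Takes ownership of the boxed value and reports the length of its name.
fn consume(cfg: Box<Config>) -> usize {
    cfg.name.len()
} // `cfg` goes out of scope here, so its heap allocation is released

fn main() {
    let cfg = Box::new(Config { name: String::from("demo") });

    // Ownership moves from `main` into `consume`.
    let len = consume(cfg);
    println!("len: {}", len);

    // println!("{}", cfg.name); // would not compile: `cfg` was moved (E0382)
}
```

The commented-out line is the interesting part: after the move there is no second owner left to use the value, which is how the compiler knows where the deallocation belongs.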

Now, these values can be lent out ("borrowing") to sub-functions and such via references (mutable or immutable). Multiple immutable references can be handed out at once, but a mutable reference to a value has to be the only reference to that value of any kind, at a given time.
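Roughly, the borrowing rules look like this in practice. The function names `total` and `bump_all` are invented for the sketch.

```rust
// Reads through a shared (immutable) reference.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Mutates through an exclusive (mutable) reference.
fn bump_all(values: &mut [i32]) {
    for x in values.iter_mut() {
        *x += 1;
    }
}

fn main() {
    let mut v = vec![1, 2, 3];

    // Any number of shared borrows may coexist.
    let a = &v;
    let b = &v;
    println!("sum: {}", total(a) + total(b));

    // A mutable borrow is fine here because `a` and `b` are not used again.
    bump_all(&mut v);
    println!("v: {:?}", v);

    // Uncommenting all three lines below would fail with E0502, because the
    // shared borrow `r` would still be live across the mutable borrow:
    // let r = &v;
    // bump_all(&mut v);
    // println!("{:?}", r);
}
```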

The problem is, some domains really don't lend themselves to this restricted model. No two objects or functions can point, mutably, to the same object at the same time. You simply can't create a graph of inter-referenced objects where a single value may have multiple "parents". And sometimes even with a perfectly tree-like ownership structure moving values around can get complicated, because Rust has to know for sure that the ownership model is adhered to. This is where explicit lifetimes and such can come into play. Even writing a linked-list in Rust without using unsafe { } (or Rc's) is hard (https://rust-unofficial.github.io/too-many-lists/).

In Rust, Rc's are kind of an admission of defeat. You're telling Rust not to perform its normal "compile-time" automatic deallocation, instead having it track references at runtime (which comes with overhead) to know when to de-allocate. What this buys you is basically an out from the ownership system: instead of handing off a plain reference to multiple places, which Rust may not let you do, you just clone the Rc and hand off that "new" value which can go anywhere it wants. That Rc is then what gets tracked by the ownership system and de-allocated, and when de-allocated it decrements the count (again, at runtime), and eventually that runtime mechanism (hopefully) decides the real value can be de-allocated.
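As a small sketch of that runtime counting, assuming a hypothetical `Parent` struct and `share` helper: two parents hold the same child through cloned `Rc`s, and `Rc::strong_count` shows the count moving as owners come and go.

```rust
use std::rc::Rc;

// Hypothetical node type: several parents can share one child through Rc.
struct Parent {
    child: Rc<String>,
}

// Builds two parents that both point at the same child.
fn share(child: &Rc<String>) -> (Parent, Parent) {
    (
        Parent { child: Rc::clone(child) },
        Parent { child: Rc::clone(child) },
    )
}

fn main() {
    let child = Rc::new(String::from("shared"));
    println!("count: {}", Rc::strong_count(&child));

    // Each clone is a new owned handle; the count is bumped at runtime.
    let (p1, p2) = share(&child);
    println!("count: {}", Rc::strong_count(&child));

    // Dropping a handle decrements the count; the String itself is freed
    // only once the last handle is gone.
    drop(p1);
    println!("count: {}", Rc::strong_count(&child));
    println!("child via p2: {}", p2.child);
}
```

Note that `Rc` only gives shared read access; mutating the shared value needs something like `RefCell` on top, which moves the borrow checks to runtime as well.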

Basically any part of your code that uses Rc/Arc is giving up one of the biggest features of Rust. Which is totally fine, if you're reaping those advantages elsewhere and you just need to bridge a gap where ownership is too limiting. But if heap-juggling is going to be the primary thing your program is doing, you'll probably have a better overall time with a GCed language.

steveklabnik · on Nov 4, 2020

Here's an example. Someone wants to do some computations on an array of values:

    fn main() {
        let mut v = vec![1, 2, 3];
        
        for i in &mut v {
            *i += 1;
        }
        
        println!("v: {:?}", v);
    }

They want to speed this up with threads. So they ask "how do I do threads in Rust" and get pointed to std::thread. So they write this code:

    use std::thread;

    fn main() {
        let mut v = vec![1, 2, 3];
        
        for i in &mut v {
            thread::spawn(move ||{
                *i += 1;
            });
        }
        
        println!("v: {:?}", v);
    }

and they get this error message:

    error[E0597]: `v` does not live long enough
      --> src/main.rs:6:18
       |
    6  |         for i in &mut v {
       |                  ^^^^^^
       |                  |
       |                  borrowed value does not live long enough
       |                  argument requires that `v` is borrowed for `'static`
    ...
    13 |     }
       |     - `v` dropped here while still borrowed

(there's more to the error message but I'm cutting it to the start)

So they ask "hey how do I make v live for 'static" and someone says "you use Arc" so they write this:

    use std::thread;
    use std::sync::Arc;

    fn main() {
        let v = Arc::new(vec![1, 2, 3]);
        
        for i in v.iter_mut() {
            thread::spawn(move ||{
                *i += 1;
            });
        }
        
        println!("v: {:?}", v);
    }

and get this error:

    error[E0596]: cannot borrow data in an `Arc` as mutable
     --> src/main.rs:7:18
      |
    7 |         for i in v.iter_mut() {
      |                  ^ cannot borrow as mutable
      |
      = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::sync::Arc<std::vec::Vec<i32>>`

So then they ask "hey I have an arc, but I want to mutate things inside of it, how do I do that?" and the answer is "use a mutex", so they write this:

    use std::thread;
    use std::sync::{Arc, Mutex};

    fn main() {
        let v = Arc::new(Mutex::new(vec![1, 2, 3]));
        
        for i in v.lock().unwrap().iter_mut() {
            thread::spawn(move ||{
                *i += 1;
            });
        }
        
        println!("v: {:?}", v);
    }

but this still doesn't work, because the lock is held during multiple threads of execution. So they figure out that they can do this:

    use std::thread;
    use std::sync::{Arc, Mutex};

    fn main() {
        let v = Arc::new(Mutex::new(vec![1, 2, 3]));
        let mut joins = Vec::new();
        
        for i in 0..3 {
            let v = v.clone();
            
            let handle = thread::spawn(move ||{
                v.lock().unwrap()[i] += 1;  
            });
            
            joins.push(handle);
        }
        
        for handle in joins {
            handle.join().unwrap();
        }
        
        println!("v: {:?}", v);
    }

I've skipped a few iterations here because this comment is already too large. The point is, they've now accomplished the task, but the boilerplate is way way way out of control.

A more experienced Rust person would see this pattern and go "oh, hey, these threads don't actually live forever, because we want to join them all, but the compiler doesn't know that with thread::spawn because it's so general. What we want is scoped threads" and writes this:

    use scoped_threadpool::Pool;

    fn main() {
        let mut pool = Pool::new(3);
        let mut v = vec![1, 2, 3];
        
        pool.scoped(|scope| {
            for i in &mut v {
                scope.execute(move ||{
                    *i += 1;       
                });
            }
        });
        
        println!("v: {:?}", v);
    }

and moves on with life. Way more efficient, way easier to write, extremely hard for a new person to realize that this is what they should be doing.
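For readers on a newer toolchain: since Rust 1.63 the standard library has its own scoped threads in `std::thread::scope`, so the same idea can be sketched without an external crate. This is an alternative to the scoped_threadpool version above, not a drop-in replacement: it spawns one OS thread per element rather than reusing a pool, and `increment_all` is a name made up for the sketch.

```rust
use std::thread;

// Adds one to every element, using one scoped thread per element.
fn increment_all(v: &mut [i32]) {
    thread::scope(|s| {
        for i in v.iter_mut() {
            // Each thread gets exclusive access to exactly one element.
            s.spawn(move || {
                *i += 1;
            });
        }
    }); // every thread spawned in the scope is joined before `scope` returns
}

fn main() {
    let mut v = vec![1, 2, 3];
    increment_all(&mut v);
    println!("v: {:?}", v);
}
```

Because the scope guarantees the threads finish before it returns, the compiler can accept plain `&mut` borrows of `v`, with no `Arc` or `Mutex` involved.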

psadauskas · on Nov 4, 2020

This is exactly what I was struggling with over the weekend in a side project. My "vec" is lines from a file read from the filesystem, but my real goal is for it to be lines in the request body from an HTTP POST. As a Rust beginner, I get to go through these exact steps all over again but with tokio-flavored error messages instead, and it's at least 2x more complicated. Like you said, it's "extremely hard for [me] to realize [what it is I] should be doing."

steveklabnik · on Nov 5, 2020

Sorry to hear that. This is partially why there's a culture of helping people with questions; ideally when you run into an issue, you should be able to hop onto the forums or Discord and get help, and people should be able to help suss out context. It's not always easy though :/

brundolf · on Nov 5, 2020

If you want an easier to use web framework, might I recommend Rocket https://rocket.rs/