
One of my favorite things about Rust is one of the practical applications of the safety. Specifically, it's that I can write multithreaded code without fear because the compiler won't let me get it wrong. It's far far far too easy to screw up multithreaded code if you're using any kind of shared data, and Rust is the only language I know of that truly makes it safe without compromising on performance.

As a trivial example, some time ago I fixed a subtle threading bug in fish-shell. The code used RAII and lock guards, which is good, but in this particular case the lock guard was created using the wrong lock. So it was locking, it just wasn't locking the correct lock, meaning the data it was mutating was subject to a data race. As I fixed that, I found myself wishing the program had been written in Rust, because that sort of bug simply won't happen in Rust.
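
To make the "won't happen in Rust" claim concrete, here is a minimal sketch of the usual pattern, where the lock owns the data it protects. The `History` type and `record_from_threads` function are hypothetical names made up for illustration; they are not taken from fish-shell.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for some shared shell state.
struct History {
    entries: Vec<String>,
}

/// Spawns `n` threads that each append one entry, then returns the count.
fn record_from_threads(n: usize) -> usize {
    // The Mutex owns the History, so the data cannot be reached
    // without going through this particular lock.
    let history = Arc::new(Mutex::new(History { entries: Vec::new() }));

    let handles: Vec<_> = (0..n)
        .map(|i| {
            let history = Arc::clone(&history);
            thread::spawn(move || {
                // lock() returns a guard; the data is reachable only through it.
                let mut guard = history.lock().unwrap();
                guard.entries.push(format!("command {i}"));
                // The guard is dropped here, which releases the lock.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let count = history.lock().unwrap().entries.len();
    count
}
```

Because `entries` lives inside the `Mutex`, there is no second lock that could be taken by mistake: the only path to the vector is the guard returned by that mutex.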




One thing I hate about Go is how easy it is to shoot yourself in the foot with it.

Really, it has a decent way to manage inter-thread communication - channels. But the language still permits me to read/write a "non const" global variable from a goroutine.

This is horrible, especially when refactoring.

Each goroutine should have its own scope (unless explicitly defined).


More than once I've also had to kick myself for making the rookie mistake of closing over a loop variable. So something like:

    for _, v := range values {
      go func() {
        // Do stuff with v
      }()
    }
Instead of:

    for _, v := range values {
      go (func(v string) {
        // Do stuff with v
      })(v)
    }
It's rare enough, and subtle enough, that every instance tends to result in 5-10 minutes of puzzled debugging with stdout printing statements until I realize what's going on. What does Rust do here?

I like to say that Go's strictness is unevenly distributed. Unused imports are illegal, but Go is perfectly happy to let you shadow variables, have closure use loop variables, or reassign the built-in values ("nil", "true", etc.). It's like a parent who locks the scissors away in a drawer but doesn't mind leaving drain cleaner on the kitchen table.


> What does Rust do here?

This code results in a compilation error:

    for v in values {
        spawn(|| {
            // Do stuff with v
        });
    }
Error output as follows:

  error[E0373]: closure may outlive the current function, but it borrows `v`, which is owned by the current function
   --> <anon>:7:15
    |
  7 |         spawn(|| {
    |               ^^ may outlive borrowed value `v`
  8 |             v;
    |             - `v` is borrowed here
    |
  help: to force the closure to take ownership of `v` (and any other referenced variables), use the `move` keyword, as shown:
    |         spawn(move || {
You can play with the code here: https://is.gd/f4guCw

As the error message says, the real problem with this code, given Rust's semantics, is that the closures are being given pointers to memory that might not be valid by the time the closure gets around to executing (unlike in Go (or any other GC'd lang), the mere existence of a pointer is not enough to keep memory alive). And as the help text at the bottom of the error message describes, one solution is to have the closures themselves assume ownership of the data via the `move` keyword on closures.
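
For completeness, a sketch of the version that compiles once `move` is added. It assumes `spawn` above means `std::thread::spawn`; the function name `lengths_in_threads` is made up for the example.

```rust
use std::thread;

/// Spawns one thread per value and returns each thread's result in order.
fn lengths_in_threads(values: Vec<String>) -> Vec<usize> {
    let mut handles = Vec::new();
    for v in values {
        // `move` transfers ownership of this iteration's `v` into the closure,
        // so the thread never holds a pointer into the loop's stack frame.
        handles.push(thread::spawn(move || v.len()));
    }
    // Joining in spawn order keeps the results aligned with the input.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Each iteration binds a fresh `v`, so every thread sees its own value rather than whatever the loop variable held last.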


Excellent, thanks. Great error message.


Go might have some complex concepts initially and be a big language but its error messages are impressively helpful and precise.

I'm very thankful for that level of dedication.


The error message above was from Rust, not Go, is that what you meant? :P


Fortunately, that mistake isn't possible in Java with lambda expressions or anonymous classes. It is a compile error if any captured variables are not "effectively final".

Prior to Java 8, which introduced lambdas, variables used with anonymous classes were required to be final. That restriction is now relaxed as the compiler can infer it.

Interestingly, the for-each loop variable is effectively final, unless you explicitly modify it, so this code is legal (and correct):

  for (String v : values) {
      executor.execute(() -> System.out.println(v));
  }


I like that in C++11, you specify what should be captured and how it should be captured (by value or by reference).


Rust is the same, though it's less flexible[0]: it doesn't have capture lists, only equivalents of [&] and [=], and they're very slightly different:

* the default is similar to [&] but will infer the capture mode and use the "simplest" possible one (reference, mutable reference or value) depending on use on a per-value basis.

* `move` closures are similar to [=] but use Rust ownership semantics: they copy Copy values and move non-Copy values. A `move` closure can also capture external references (mutable or not) by value, so if you need to explicitly tweak your captures, that's the one you'd use.

[0] OTOH it's more readable


I'm not sure that "less flexible" is accurate; we can accomplish the same things, but through different means.


It might have been unclear, but I meant the closure syntax/capture itself is less flexible. You can get the same result e.g. using a move closure and declaring references outside the closure then capturing that by value, but I'm sure you could also do that in C++.
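
A small sketch of the two capture modes described above, including the "declare a reference outside, then capture it by value" trick. All names here are invented for illustration.

```rust
/// Returns (length read through a captured reference, running total).
fn capture_modes() -> (usize, usize) {
    let names = vec![String::from("a"), String::from("b")];
    let mut count = 0;

    // Default closure: each variable is captured in the least demanding mode
    // its use requires. `names` is only read (shared borrow), while `count`
    // is mutated (mutable borrow).
    let mut tally = || {
        count += names.len();
    };
    tally();
    tally();

    // `move` closure: everything named inside is captured by value. To keep
    // `names` usable afterwards, take a reference first and move the
    // reference (shared references are Copy) instead of the Vec itself.
    let names_ref = &names;
    let len_via_ref = move || names_ref.len();
    let len = len_via_ref();

    // `names` is still owned here because only the reference was moved.
    drop(names);
    (len, count)
}
```

The result is the same as a C++ capture list mixing by-value and by-reference captures; the difference is that the choice is spelled out in ordinary `let` bindings rather than in the closure syntax.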


Ah, yes. Cool :)


I can never decide if enforcing (effectively) final is a good feature or a bad one. Sure, you cannot shoot yourself in the foot with these limits, but on the other hand you lose all the power of real closures.


You can easily modify variables outside of the stream. For example, to increment a counter declared outside of the lambda, use a final AtomicInteger or similar class instead of an int. For any other value, use a final class that wraps it.

The point of enforcing final variables is to prevent programmers from accidentally modifying things they don't want to. It does not prevent programmers from modifying variables intentionally.


I know very well that I can use inner mutability to work around that restriction, that doesn't change the fact that it is a workaround, nothing more.


It's not a workaround. It was designed to work that way on purpose. The people who developed this feature could have easily prevented programmers from using any variables outside of the stream, final or not, but they chose not to because they wanted to allow programmers to use final variables in this way.

Furthermore, it does not cause you to lose "all the power of real closures" like you said it does. All you lose is the ability to use a closure around non-final variables, which is a trivial drawback in any use case I've ever come across. You lose only a little bit of the power of closures.


There are no final variables in Go. Anybody who pretends Go has a good type system is a liar. Go's type system is broken. It doesn't make the language bad, it makes it a missed opportunity.


>There are no final variables in Go.

const?


Consts aren't final variables; you can't have a const pointer or a const array in Go.


Shadowing a variable really ought to be an error. At least, you shouldn't be able to hide a local variable with another local variable. Yes, sometimes you have to write "vv" in the inner loop instead of "v", and not feel as l33t. Deal with it.

(A shadow variable problem in C just turned up in firmware for a surface mount reflow soldering oven I have. The variable name is "avgtemp". This may explain why some ovens scorch PC boards.[1] Read the discussion. Note that none of the people writing about the issue understand that "static" would confine the scope to one file, and that uninitialized variable declarations at top level are global to the whole program. That's probably because they came up from Arduino land, where people are not taught to think about that stuff. Arduino land is really full C++ using gcc, but it's not taught that way.)

[1] https://github.com/UnifiedEngineering/T-962-improvements/iss...


>Note that none of the people writing about the issue understand that "static" would confine the scope to one file

That seems a bit surprising, unless they did not know C even reasonably well. IIRC (though I haven't used C much lately, I did a lot with it earlier), that is a not-too-advanced feature of the C language. I think it is covered in the K&R C book, near the middle or in the latter half (don't have it handy right now to check).


I don't think the parent is talking about problems with shadowing.


They wrote "but Go is perfectly happy to let you shadow variables...", although their main issue was with closures.


Whoops, right you are!


I feel like this specific issue Go just got wrong. 99% of the time you want a new variable in the range loop.

I think Go's strictness is very practical -- it's strict where they saw real problems and the cure was easy.

Something like threading race conditions is a real problem, but not easy to fix.

Go does give you a pretty good runtime tool with the race checker, though. It would quickly catch something like the fish bug of grabbing the wrong lock.


>Something like threading race conditions is a real problem, but not easy to fix.

It is. It's called channels.


Sure, you could only use channels and you'd never get races, but in practice that'd be unwieldy and slow, which is why people write traditional lock-based sync stuff in Go all the time, and depend on old-fashioned debugging (and the race-checker).

Rust really fixes this (with its ownership system), but it wasn't easy.


> Sure, you could only use channels and you'd never get races

Go allows sending pointers to non-locked structures over channels, so it's quite easy to "only use channels" and still get race.

You can also hit that issue if you spawn multiple goroutines sharing the same initial lexical environment if they make use of a mutable structure from that environment.
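
For contrast, a sketch of what sending a buffer over a channel looks like in Rust, using the standard `std::sync::mpsc` channel. The function name `sum_on_worker` is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a buffer to a worker thread over a channel and returns its sum.
fn sum_on_worker(data: Vec<u32>) -> u32 {
    let (tx, rx) = mpsc::channel::<Vec<u32>>();

    let worker = thread::spawn(move || {
        // The receiver becomes the sole owner of the buffer.
        let mut received = rx.recv().unwrap();
        received.push(1);
        received.iter().sum::<u32>()
    });

    // send() takes `data` by value, so ownership moves into the channel.
    tx.send(data).unwrap();
    // Touching `data` here (e.g. `data.push(2)`) would be rejected at
    // compile time as a use after move.

    worker.join().unwrap()
}
```

Sharing the buffer on both sides instead would require something like `Arc<Mutex<Vec<u32>>>`, which makes the sharing explicit in the type.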


If you're accessing mutable data from multiple goroutines, then you're not only using channels! :)


Of course you are. Go has essentially no support for immutable data structures, and even if you send large structures by value (which can get expensive) they themselves probably embed pointers to mutable data, and you're back at square one.


I didn't say it was a good idea (in fact, if you read far enough back up the comment chain, you'll see I was saying it's a bad idea), but you could, if you wanted to, restrict yourself to only communicating between goroutines via channels of immutable data and be confident that your code had no data races.


>unwieldy and slow

So force one to do it consciously (kind of like unsafe).


You should use go vet

https://golang.org/cmd/vet/#hdr-Struct_tags

(There seems to be a documentation error.)

And try the race detector:

https://golang.org/doc/articles/race_detector.html

Though of course, it's hard to argue against the idea that it would be nice to prevent these from compiling.


Indeed, it's invaluable. I use "go vet" and have enabled all of the Gometalinter linters [1] that make sense.

"go vet" never warned me about goroutine closure errors, not sure why, maybe I was running an old version. But I'm glad that it's supported.

That said, some of the things that "go vet" catches should, in my opinion, be errors.

[1] https://github.com/alecthomas/gometalinter


Well, that's an "easy" bug.

What about not-shadowing a "non-loop" variable?

Take a look at this question:

http://stackoverflow.com/questions/18499352/golang-cuncurren...

It's easy to mess up during refactoring, and there's pretty much no reason to allow it.


Rust will prevent you from doing this, mostly. Basically, the borrow checker will either force you to move the value, preventing it from being used elsewhere, or copy/clone the value which prevents any issues.


I don't know Go, so I don't really understand what the issue is with the code you posted. In Rust, the closure used to spawn a new thread needs to own its environment; if it does then there's no problem. If it doesn't that's a data race and you have a compile time error about it.


The error in the Go snippet is that "v" is a single memory location shared throughout the loop.

The closure doesn't get a copy, so what usually ends up happening is that every goroutine gets the last value (since the goroutines are probably not scheduled until the end of the loop).

This problem doesn't just affect loops, of course: A goroutine can access any local environment outside its scope, and local variables can mutate.

The worst surprise I ever encountered was this (simplified):

    func makeWorkersDoComplicatedStuff() {
      ch := make(chan string)
      defer func() {
        if ch != nil {
          close(ch)
        }
      }()
      for i := 0; i < numWorkers; i++ {
        go func() {
          for {
            select {
              case s := <-ch:
                // ...
            }
          }
        }()
      }
      close(ch)
      ch = nil

      // ... More stuff ...
    }
What will happen here is that the goroutines will all block forever, because "ch" becomes nil, and in Go, receiving from a nil channel blocks forever (it's an interesting design choice in such a strict language).

Rewriting and refactoring this was trivial enough, but catching it was time wasted. Lessons learned: (1) be scrupulous about closure environments, (2) be super careful about nil channels, and (3) try to avoid defers whose concerns don't fully encapsulate the function body (so defers in the middle of a function are often a code smell).


> The error in the Go snippet is that "v" is a single memory location shared throughout the loop.

In Rust this is not true. However, the same concept applies - say it was a variable from outside the loop. In that case, you would get a clear compile time error about moving the value into the thread's closure more than once.

A similar error would apply if you iterated over a container by reference (instead of by value).
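
A sketch of the usual way around that "moved more than once" error when the value really does need to be visible to every thread: wrap it in an `Arc` and give each thread its own handle. The names below are invented for the example.

```rust
use std::sync::Arc;
use std::thread;

/// Shares one read-only string with `n` threads.
fn share_with_threads(n: usize) -> Vec<usize> {
    // Declared outside the loop. Moving a plain String into the closure on
    // every iteration would not compile, because the first iteration would
    // already have moved it.
    let config = Arc::new(String::from("shared"));

    let handles: Vec<_> = (0..n)
        .map(|i| {
            // Each thread gets its own handle to the same allocation.
            let config = Arc::clone(&config);
            thread::spawn(move || config.len() + i)
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

`Arc` only hands out shared (read-only) access, so mutation would additionally need a `Mutex` or an atomic inside it.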


I had a relatively simple CLI tool written in Go that nevertheless had a data race bug that I hit about once a month. I wasn't able to track it down until Go actually introduced a data race detector, at which point I found it immediately. The race occurred when I spawned a task and watched the task's stdout and stderr. I assumed it used one goroutine that listened on both pipes, but it turned out that the stdlib actually used one goroutine per pipe, meaning the shared buffer that I'd captured in both callback functions was being raced on. Of course, the stdlib didn't document how many goroutines it used in that scenario. So yeah, footgun, meet foot.


I was going to leave basically this same comment -- I like Go, but the fact that it doesn't have nice locking support always gets to me.

(Almost every Go program I've written uses goroutines so I end up using locks in almost every program I write)

There's no way to do this without language support or sacrificing perf with interface, so I get why nicer locks don't exist, I just wish they did.


Ooh, a chance to talk about Rust in the context of fish.

I did the initial introduction of pthreads to fish 1.x, when it was thread-oblivious, and it was very difficult because there was a lot of global data (it's a shell, after all). Races and deadlocks were initially very common, and Rust could have helped with at least the first problem. It would have been very valuable.

More generally, the Rust pattern of having a lock own the data it protects is really nice. This can be more-or-less implemented in C++11 via lambdas, which should solve the wrong-lock problem. (I didn't use that technique at the time because I was targeting C++98.)

It's fun to think of a shell in the context of Rust. One place where Rust's guarantees cannot provide much help is signal handling. Signal handlers require global data to do anything, and signals are incompatible with most of Rust's ownership machinery for globals, primarily locks.

Code that runs post-fork is similar: it must not allocate memory, and there's no protection against that in Rust (or in C++).

A second place is the system interface. fish often has to dip below the surface-level APIs. It can't use getenv() or strerror(), and instead has to access `environ` and `sys_errlist` directly. Many of Rust's safe interfaces would not be usable, and we would have to write replacements using unsafe code.

fish also uses shared memory (mmap, shm_open) in a few places. Rust is supposed to protect against data races, but I don't see how it can do so in the context of shared memory.

Last and probably most importantly, shells make use of a lot of the horrible termios stuff, which tends to vary across systems and make heavy use of the C preprocessor. This will be gross anywhere, but especially gross in anything that's not C or C++. I first attempted to write a shell in Go, and this is why I gave up.

Overall IMO Rust could provide a lot of value for a shell, but probably will require a fair amount of unsafe code at the edges and fiddling to support legacy interfaces.


We've not had any issues developing the Ion shell in Rust -- performs as well as Dash. Basically, you can't come at Rust with the mindset of object-oriented programming. Make your shell event-driven and you'll do perfectly well. Queue your signals and act on them at a later time when you can.

I've been particularly fond of Rust's Iterator trait which has been quite valuable for efficient parsing in Ion.


> I don't see how it can do so in the context of shared memory.

Basically, it does not let you access shared memory without _some_ kind of synchronization primitive in safe code. That might be an atomic variable, or maybe a mutex, whatever.
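
A minimal sketch of the atomic-variable case, for memory shared between threads of one process (this says nothing about `mmap` or `shm_open`). The function name is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Increments one shared counter from several threads and returns the total.
fn count_with_atomics(threads: usize, per_thread: usize) -> usize {
    // A plain `usize` could not be mutated from several threads in safe
    // code; the atomic type is what makes the shared writes legal.
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    // Joining every thread first guarantees all increments are visible.
    counter.load(Ordering::SeqCst)
}
```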


If two threads independently mmap the same file, what is the thing that Rust forces them to synchronize on?


The de facto memmap crate for Rust makes it unsafe to access the contents of a memory map: https://docs.rs/memmap/0.5.0/memmap/struct.Mmap.html#method....

This doesn't actually answer your question, but the presence of `unsafe` will at least warn you that you need to put some thought into what you're doing.


mmap can't be exposed with a safe interface, in my understanding.


Of course, you can access `/proc/self/mem` safely so technically it's all possible in safe code.

But that's not something Rust can prevent.


That's a known bug: https://github.com/rust-lang/rust/issues/32670

This bug was posted on April 1st, being a bit more ha-ha-only-serious than the emoji-based error handling one that the core team put on. AFAIK, it's the only WONTFIX safety bug in Rust's issue tracker.


FWIW, you can avoid allocation in Rust by limiting yourself to libcore, which doesn't know about allocation. Stick the #![no_implicit_prelude] attribute on your module and now you can't use anything from liballoc or libstd without an explicit `use` directive.


I just picked up a book on Rust yesterday and am looking forward to working with it. I'm glad to hear that Rust makes concurrency easy, since that is a huge requirement in my current project. Since you obviously know more about it than I do (I'm currently a chapter-one-level neophyte), does Rust's ease come via the compiler catching errors for you, or are there mechanisms in the language itself that protect you from the pitfalls of concurrent programming?

There are other languages that make concurrent programming easy and less error prone. Elixir is the one I'm probably most familiar with. It achieves this ease, in part, because it doesn't allow you to store state in objects, the way, say, Java does. Instead, you pass data through a function chain that will always produce the same result, unlike object-oriented code. There's more to it than that, of course, but that's a good starting point for understanding how Elixir helps with concurrency.

If you don't mind me asking, what about Rust makes concurrency less error prone?


I read this quote somewhere and it's helpful:

The problem with threading is shared mutable state. Most functional languages solve the issues with threading by eliminating "mutable", i.e. only allowing "shared <strike>mutable</strike> state". Rust solves the issues with threading by enforcing shared xor mutable, i.e. only allowing "<strike>shared</strike> mutable state" or "shared <strike>mutable</strike> state".

> does Rust's ease come via the compiler catching errors for you, or are there mechanisms in the language itself that protect you from the pitfalls of concurrent programming?

Yes, the Rust compiler catches violations of "the rules preventing pitfalls of concurrent programming"[1] and responds with (very helpful) error messages. Rust has a powerful type system and a borrow checker[0]; it does not have special mechanisms in the language related to concurrent programming. The type system and borrow checker together happen to also solve the pitfalls of concurrent programming.

[0] Both the powerful type-system, and the borrow checker, are extremely useful for other reasons not related to concurrent programming.

[1] "The rules preventing pitfalls of concurrent programming" are not defined by the compiler, but rather by the stdlib. Similarly, session types (http://munksgaard.me/papers/munksgaard-laumann-thesis.pdf) are a cool way to solve "the pitfalls of protocol programming". Again, there is nothing explicitly in the language regarding protocols; the problem is solvable by a Rust library because of the combination of the powerful type system and the borrow checker.
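
A small sketch of "shared xor mutable" across threads. It assumes a toolchain where `std::thread::scope` is available (Rust 1.63 or newer); the function name is made up.

```rust
use std::thread;

/// Reads a vector from two threads, then mutates it from one.
fn shared_then_exclusive() -> (i32, Vec<i32>) {
    let mut data = vec![1, 2, 3];

    // Phase 1: shared, not mutable. Both threads borrow `data` immutably,
    // so neither of them could push to it.
    let total = thread::scope(|s| {
        let sum = s.spawn(|| data.iter().sum::<i32>());
        let len = s.spawn(|| data.len() as i32);
        sum.join().unwrap() + len.join().unwrap()
    });

    // Phase 2: mutable, not shared. A single thread borrows `data` mutably;
    // a second thread reading it in this scope would not compile.
    thread::scope(|s| {
        s.spawn(|| data.push(total));
    });

    (total, data)
}
```

Nothing here is concurrency-specific syntax: the same borrow rules that apply to single-threaded code decide which closures may run on other threads.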


Thank you. That is very useful.

Jessica Kerr posted on Twitter: "GOTO was evil because we asked, 'How did I get to this point of execution?' Mutability leaves us with, 'How did I get to this state?'" That feels relevant to this discussion.


> quote somewhere

I've said this a lot. Here's one version of that talk, IIRC: https://vimeo.com/144809407


Thanks. I will watch this.


You might enjoy reading https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.h...

It's from before 1.0, so some of the method names have changed, and scoped threads are in an external library instead of std, but the principles are all the same.


Thank you!


Which book did you pick up?


A pre-release copy of Programming Rust, by Jason Orendorff and Jim Blandy.


Clang provides annotations that can do this for C++.


Which annotations are you thinking of?



Cool, was just curious. Thanks.


> the compiler won't let me get it wrong. It's far far far too easy to screw up multithreaded code if you're using any kind of shared data, and Rust is the only language I know of that truly makes it safe without compromising on performance.

While I am, in general, a fan of Rust's focus on safety, I think this particular feature (data race prevention) may actually be somewhat problematic in terms of making code safer. I'm worried that it may be sort of analogous to "all-wheel-drive" being marketed as a winter driving safety feature that ends up (perhaps apocryphally) causing more accidents because it instills a sense of overconfidence that results in drivers neglecting more important/effective safety practices (reduced speed, snow tires, etc.). I think it's beneficial to read one of the Rust issue threads, "Rust does not guarantee thread-safety #26215"[1], about why they stopped referring to Rust as "thread safe".

Data races only occur in the context of "casually" sharing objects between asynchronous threads. That is, accessing a shared object from asynchronous threads directly, instead of through a "fail-safe" access control mechanism. Some programmers may be of the position that directly accessing shared objects is perfectly fine in some contexts. In those cases Rust's data race safety feature is a plus.

But data races are really just a subset of race conditions, and Rust doesn't prevent those other race conditions. The practice of directly accessing shared objects is prone to both (low-level) data races and (higher-level) "non-data race" race conditions. I'm worried that the larger effect of touting/marketing Rust's data race safety is to "(over-)legitimize/condone" the practice of "casually" sharing objects asynchronously, resulting in the neglect of prudent access control mechanisms (even if only by the inexperienced), and an increase in "non-data race" race condition bugs.

So, in the interest of public safety, perhaps all-wheel-drive cars should be bundled with some sort of warning/notice that prudent winter driving practices (and speeds) should render all-wheel-drive almost irrelevant as a safety feature. And perhaps an analogous one for prudent asynchronous object sharing practices and Rust's data race safety.

[1] https://github.com/rust-lang/rust/issues/26215

And a related article on safer asynchronous object sharing in C++ (shameless plug): https://www.codeproject.com/articles/1106491/sharing-objects...


> I think it's beneficial to read one of the Rust issue threads, "Rust does not guarantee thread-safety #26215"[1], about why they stopped referring to Rust as "thread safe".

No they didn't. That issue was closed as WONTFIX and rust-lang.org still says to this day "… and guarantees thread safety".

More generally, Rust can't protect you from logic errors, but it does more than just guarantee freedom from data races. The very issue you referenced has a discussion on this topic, about how the phrase "thread safety" isn't well-defined, but that Rust does give you a stronger notion about consistency in a multi-threaded world than just freedom from data races.

I genuinely don't understand your all-wheel-drive comparison. You seem to be arguing that the consistency guarantees Rust provides are actually bad because it will trick users into thinking that they don't have to give any thought at all to logical races in threading. And that's nonsense. Users have to think about that regardless of the consistency guarantees the language provides. The fact that Rust does most of the heavy lifting for you makes it a lot easier to reason about the logical races, because you know you don't have to even consider the consistency issues that Rust protects you from, which means you have much less complexity to reason about. In addition, most of the synchronization mechanisms that you need in order to share mutable values across multiple threads will tend to protect you from logical races too. For example, if you have a value that you want protected by a lock, you can't just stick the lock in the value and lock/unlock it in every method, because that doesn't help you share the value itself across threads. So instead you'd probably wrap the value itself in a Mutex, which you can now share easily (e.g. via Arc), and now the Mutex guards the whole value instead of just guarding every function call, meaning you won't have logical race issues when calling several methods on the value in a sequence.
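
A sketch of that last point, with a hypothetical `Account` type that has no locking of its own and is wrapped whole in a `Mutex` shared via `Arc`. None of these names come from a real codebase.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical account type with no internal locking.
struct Account {
    balance: i64,
}

impl Account {
    fn can_withdraw(&self, amount: i64) -> bool {
        self.balance >= amount
    }

    fn withdraw(&mut self, amount: i64) {
        self.balance -= amount;
    }
}

/// Runs `attempts` concurrent withdrawals of `amount`; returns the final balance.
fn concurrent_withdrawals(start: i64, amount: i64, attempts: usize) -> i64 {
    let account = Arc::new(Mutex::new(Account { balance: start }));

    let handles: Vec<_> = (0..attempts)
        .map(|_| {
            let account = Arc::clone(&account);
            thread::spawn(move || {
                // One guard covers both the check and the update, so no
                // other thread can run between the two method calls.
                let mut acct = account.lock().unwrap();
                if acct.can_withdraw(amount) {
                    acct.withdraw(amount);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let balance = account.lock().unwrap().balance;
    balance
}
```

With a starting balance of 100 and ten attempts to withdraw 30, exactly three succeed and the balance ends at 10. It is still possible to write the racy version by locking separately for the check and for the update, so this is a pattern the types encourage rather than a guarantee.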



