"One of the original designers of the Subversion version control system, [Jim Blandy is] a committer to the SpiderMonkey JavaScript engine, and has been a maintainer of GNU Emacs, GNU Guile, and GDB."
> But all those other languages include explicit support for null pointers for a good reason: they’re extremely useful. [....] The problem with null pointers is that it’s easy to forget to check for them.
There is no inherent reason for that; it's just that mainstream languages which support `null` haven't been checking `null` usage. Some of the recent ones do. For example, Kotlin has non-nullable types by default, and nullable types have to be explicitly marked and checked for null-ness.
Even for Java, null analysis is built into Eclipse with the help of annotations. Though, of course, it would be far nicer to have it baked into the language.
And indeed, it's the representation Rust uses. When you have the type Option<T>, if Rust knows that a value of type T can never be zero (as is the case with pointer types), then the compiler uses zero as the machine-level representation for None, which indicates the absence of a value. I get into that a few paragraphs later.
What's important is that you can't use a value of type Option<T> as if it were T; you have to check it first. This is helpful for non-pointer types as well; I include an example of that.
It's different from a traditional Option type, though, in that Option<Option<T>> is inexpressible, and that the distinction between map and flat_map is reduced.
In case anyone thinks that's an academic point, this is exactly the reason why so many dynamic languages can't tell the difference between 'key is not in hashtable' and 'key is in hashtable with value null'. I constantly get bitten by this and similar confusions when writing JavaScript or Clojure.
Removing a layer of indirection when you had a reference type anyway is an optimization that runs into the issue that you describe. The obvious solution is to not apply it when null is already a valid inhabitant of that reference type.
Without collapsing that indirection, Option<Option<T>> is perfectly expressible:
    Just (Just foo): <ptr> -> <ptr> -> foo
    Just Nothing:    <ptr> -> <0>
    Nothing:         <0>
It's true that some languages expose "nullable references" as a distinct type and not a detail of representation, and it's true (... and sometimes obnoxious) that this doesn't layer as nicely (in particular, it's not a functor).
It may be worth pointing out that the `Option<T>` type in Rust doesn't inherently involve any pointers at all, assuming `T` is not a pointer type itself.
For example, `Option<i32>` is probably (the compiler gets to choose) going to be represented as two four-byte values: the discriminant, which distinguishes the `Some` and `None` cases, and then a space for the value `v`, for when the discriminant says we have `Some(v)`. Since zero is a perfectly fine value for an `i32`, we have to store the discriminant separately.
But note that this is just a flat eight-byte value. There's no heap allocation involved. It's just as if you'd written in C:
    struct O { enum { Some, None } discriminant; int32_t value; };
I compiled a program that uses `Option<Option<i32>>`, and looked at the DWARF debugging info to see what the compiler did with it. It seems to represent this as a twelve-byte value: four bytes for the discriminant for the outer `Option`, followed by an eight-byte `Option<i32>` value laid out as before. Since you can get the address of a value held by an enum, I guess this makes sense; the compiler can't combine the discriminants or do anything clever like that.
Option<i32> could have discriminant values 0 or 1, and Option<Option<i32>> could use discriminant value 2 for None, and 0 or 1 mean the object's an Option<i32>.
Yeah, while the ability to provide a pointer seems meaningful, it might be illusory. If our value is None, then a pointer to the inside Option<i32> is... what?
Edited to add: With your proposed encoding, the memory contents of an Option<Option<i32>> is identical to an Option<i32> precisely when there is an Option<i32> to speak about. And when there is no Option<i32>, you can tell that with one comparison, rather than by backing out an unknown number of levels. I like it.
If one alternative is larger than the others and it has an enum tag or pointer inside of it, such that you can squeeze the other alternatives before/after that tag word, you're good to go. If you had multiple alternatives of maximum size, they'd need some word in them, at the same offset/size, such that they don't have conflicting representations. An enum tag could overlap with a non-null pointer, and two enum tags with user-declared tag values (such that they do not overlap) could work too.

Or if you're crazy, you could discard having each type's representation be computed solely as a function of what the type is made of, and choose representations based on how well they pack into other types. Or worse yet, if you've got a borrow checker and your types are memcpyable, you could pack things and rejigger bytes however you wanted, then unfurl them into a temporary whenever something uses the interior value (unless I'm overlooking something). So if you had Either<Either<u32, u32>, Either<u32, u32>>, you could represent that with a single tag whose value is 0, 1, 2, or 3. If the tag's 2 or 3 and you make a reference to the interior value, either fix the tag in place and fix it back when you're done (if you're the sole owner of the Either<Either...>), or copy the value out and let the borrower borrow that (and copy it back in when it's done, if it was a mutable borrow).
The `value` you are returning has type T. T is not the same type as Optional<T>.
With AnimalMuppet's approach, you would need to return a pointer to a T (so, `&value` in something Cish). Note that this is not the same thing as a nullable reference in, for instance, C#.
`null or T` is a valid representation only when T is a non-nullable reference. It doesn't work when T is optional, and it also doesn't work when T is a primitive type, a struct, &c.
That said, there are obvious reasons one might want this as an optimization where it's possible. For that specific case, `Just value` could be specialized to `value`, but that can be done by the compiler (akin to automatic unboxing, elsewhere).
Yeah, I'm curious about this too. It seems like (continuing to use Rust parlance) your three examples are (respectively) `None`, `Some(None)`, and `Some(Some(some value of interest))`. There is no `None(None)` case, so there's no problem.
This is fair. I was thinking of this in the context of GC'd languages, where one instead has nullable types (or sentinel null values). Eg. Java, C#, Python, Ruby, Scala.
The C++ case where a distinction between `foo = null` and `*foo = null` exists is indeed much closer to an option type. You're right to point it out.
The book is exactly what it says it is. "Why Rust", an argument for Rust. 50+ pages. It's a fun read, but not too useful. My own comment on Rust is that the borrow checker is brilliant, and we'll see that again in other languages. The type/generic system picks up where C++ left off, and is very complex. But that may get better as the mess settles down and gets better documentation.
We now need a "programming in Rust" book which is not by one of the Rust developers, who are too close to the design.
Jim Blandy may work for Mozilla, but otherwise he has no connection to the Rust project (nor Servo). He has not been involved in any of the design discussions, and his only commits to the Rust repo are four minor documentation fixes.
I've assisted Jim in giving presentations on Rust at OSCON this year, and trust me when I say that he has no qualms about criticizing the language when he sees fit. His perspective is definitely that of an outsider, and like you I agree that that's ideal from a pedagogical point of view.
> The type/generic system picks up where C++ left off, and is very complex.
That's not entirely true. There are some mechanisms that you can't really do in C++, at least not without a lot of boilerplate. And Rust's generics aren't quite as powerful (as in metaprogramming) as C++'s yet. I hear they're working on it, though.
I know they're working on higher-kinded types; I think there's some fancy template-template syntax you can use in C++ to get the same effect.
It's easy to define a list of integers in pretty much any language,
    class IntList { void addInt(int i) { ... } }
Most languages let you make a generic List,
    class List<T> { void add(T t) { ... } }
The thing that's cool about HKT, is it lets you swap the other side,
    class T<Integer> { T operate(Integer i) { ... } }
So you can deal with a Set or a List or an Optional or whatever might want to deliver integers. It seems a little goofy, but it's sort of like a Super Lambda. An anonymous function can only really do its one thing, but this lets you bundle a bunch of related lambdas together. So if you have, say, Pay, you can group together all the different types of tax that are applied to pay, calculate total tax, etc. And it works over whatever random data structure you want.
It seems a little crazy at first, but with clean syntax, it's a fantastic way to split the work apart from the way the data is represented.
Thanks for clarifying that, I'll be excited to read his book when it comes out. Also looking forward to grabbing a paper copy of yours – thanks for making that happen!
This book may be useful to people learning C or C++, as a primer on some of the issues you have to think about if your code in those languages is to be reliable.
I don't think he was implying that; he was saying that there needs to be a book "Programming in Rust" written by someone outside the Rust project, e.g. like Dave Thomas and Andy Hunt did with the Pickaxe book for Ruby back in the early 2000s.
Then I guess the OP had another definition of "book" in mind. However, I'll say this: if Jim wants to write a complete book on Rust, I'd love to buy it. I'm more than halfway through his report and it has exactly the tone that I'm looking for in a programming book: clear, logical progression, to the point and no excessive humour that some books suffer from (I'm reading a book on Go (the game) at the moment and every paragraph contains a self-deprecating joke. It gets old after 2 or 3.)
To expand on steveklabnik's sibling comment, Jim Blandy is writing a complete book on Rust, which will be published by O'Reilly, and is aiming for release by the end of the year. :)
It's actually gone entirely now, so using nightly doesn't help you there. And 80% of crates.io runs on stable; it really depends on exactly what you're doing.
(The 'crossbeam' and 'scoped_threadpool' crates have different implementations of this idea, though.)
- I want to use Huon Wilson's SIMD crate, which requires nightly.
I'm not saying these things should be stable right now, but I am saying if you try to use Rust right now you _will_ encounter a lot of stuff marked as unstable and/or only usable on nightly due to XYZ. I still think Rust is amazing and everyone should give it a shot, but it's not as stable/polished an experience as I think it will be in, say, a year from now. The language is stable, but the ecosystem just isn't yet.
P.S. You've helped me personally so much over the Rust IRC channel, so thank you a million times for that! :)
I also think that year-from-now Rust will be way nicer than at the moment, it's absolutely early days. But there's a lot you can do, even on stable, right now. This is why I said it depends on what you're doing. Some use cases have absolutely no choice but to use unstable things. Others have options, and others only use stable.
As to your three points:
1. You can configure things so that you only need nightly when doing benchmarking, and run on stable the rest of the time.
2. Compiling `rustfmt` needs nightly, but using it doesn't, as it's a binary.
3. Yeah, the SIMD stuff just happened, so it's gonna take some time.
It would be good if the core Rust team owned `rustfmt` the way the Go team owns their extremely opinionated format tool. It heavily enforces very consistent code formatting throughout all publicly available Go code.
Developers like to say they want choice when it comes to formatting, but take it away and you're left with resigned programmers who are very productive at reading code.
^^ Perhaps not the right outlet for this, but maybe you can point me to where this discussion is taking place.
Yes, our intent is to officially adopt rustfmt, and we expect that it will be widely used. It's just not ready yet. nrc, who's leading the work, is a Mozilla employee, and in general, everyone wants it to be ready. Software just takes time :)
The first two are developer tools. It's fine to have to use nightly at dev time (better, in fact, due to compile time improvements!), and then ensure your build works on stable too.
Also, rustfmt is a binary which can work without nightly once you've compiled it.
Rust's users are all developers so there's no distinction between expert and novice users.
I built rustfmt with nightly Rust and it dynamically links all kinds of shared objects from the Rust install. This doesn't happen for a simple hello world so there must be something different in rustfmt's build.
There is an important distinction when it comes to stability. Stability for libraries is much more important than stability for devtools.
Stability mainly matters when you release a library and it stops working on a newer compiler, breaking everyone downstream. That's a pain for the downstream users, because they don't necessarily understand your library's internals, and the software they're building will be completely broken until you update it.
If an optional developer tool breaks, it's only you that's affected, not downstream users. You can wait for it to be fixed, no problem.
Rustfmt uses internal Rust APIs to parse and reason about the code. It could use something else (like syntex), but that would be more work.
Won't nightly be the new stable in about ~15 weeks? That is pretty damn amazing. Using multirust [1] makes it easy to track (cargo,rust) for stable, beta and nightly on a per directory basis, super handy.
Current nightly will be stable in 7 weeks, actually; there's a release a week from tomorrow.
That doesn't mean that everything that's available on nightly will be available in stable in seven weeks, of course. Everything lands as unstable and then is made stable at some point in the future, at least one full cycle after it's landed. There's lots of pre-Rust 1.0 stuff that's still nightly-only.
I know this is not the optimal place for this comment/question, but I think it likely that many who are interested in this post would be able to speak to it.
I'm also a huge fan of the Chapel Programming language. Anyone else think they're both hitting a sweet spot and would potentially have a beautiful love child?
There is nothing in Rust that would ease solving hard problems. Graphs are unsupported. No transactional memory. No distributed computing. No migration. No support for trying. No memoization. No proofs.
Depends on your definition of a hard problem. Most systems programming requires none of these things.
It may come as a surprise to some but most systems engineering has very little hard computer science involved and is mostly about achieving very simple tasks in reliable ways with very robust error handling.
You rarely implement novel data structures or algorithms. The lack of built-in cyclic data structures in Rust is not really a problem for most systems applications. That said, you actually -can- build them in Rust, and it's likely there will be nice libraries with various different allocation/memory-management approaches and other runtime tunables and performance tradeoffs.
If you exclusively consider hard computer science problems as "hard problems" then no, Rust is probably not for you. Consider Haskell or Julia.
One might argue that "guaranteeing your web browser doesn't have memory leaks" is a "hard problem." Ergo https://github.com/servo/servo . I agree, though, that the problems you mention are all ones where I'd love to see a greenfield programming language with as much attention to detail and support as Rust has.
In what conditions could Rust code leak memory, excluding the use of unsafe code? What are the technical reasons you can't guarantee an absence of leaks?
> What are the technical reasons you can't guarantee an absence of leaks?
Well, 'leak' is one of those things that's easy for a programmer to understand, but a hard thing for a computer to understand, because it's really about intent. How long did you intend for some resource to live? Any global value is, in some sense, a leak. We had a long discussion about this, and, at least currently, we couldn't come up with a formal enough definition of 'leak' to even start tackling the problem of "how do we solve leaks." (It is entirely possible that I am unaware about research on this topic... but given that solving leaks wasn't a goal of Rust, fixing it would just be gravy anyway. You have to choose your battles, and Rust certainly isn't perfect.)
As for how safe Rust can leak:
    let x = Box::new(5);
    std::mem::forget(x);
which can itself just be implemented in safe code:
    use std::cell::RefCell;
    use std::rc::Rc;

    fn forget<T>(val: T) {
        struct Foo<T>(T, RefCell<Option<Rc<Foo<T>>>>);
        let x = Rc::new(Foo(val, RefCell::new(None)));
        *x.1.borrow_mut() = Some(x.clone());
    }
or something like "I have a thread that is holding the receiving end of a channel that infinitely loops without reading anything off," in which case anything sent down that channel leaks.
That's a very intriguing block of code there. I'd like to think that while Rust may not have been designed to guarantee that leaks are prevented, it so happens that anything that could leak memory sticks out like a sore thumb in code review, since the code to get something to leak needs to be explicit. I'd consider a global variable or a channel that has the possibility of not reading all its values to follow that pattern. Whereas it's much easier to get something "stuck" in memory without a reference in C++. But you're absolutely right that memory safety does not at all imply "leak" safety, nor is the latter well defined at all!
It is technically possible to guarantee the absence of leaks by introducing a `?Leak` trait to the language. There was a point in time when many wanted Rust to go that way, but it was ultimately decided against (it complicates other things).
Note that the trait-based scheme referenced here would only prohibit specific kinds of memory leaks. "Leaking", in its broadest usage, isn't a solvable problem in a Turing-complete language since an infinite loop can be considered a leak.
In Servo, DOM nodes are Rust values managed entirely by the SpiderMonkey GC, and clever (and evolving) techniques are used to teach Rust to integrate reliably with SpiderMonkey. Rust has the flexibility (or will) to safely integrate external GC systems.
Do you know enough about low-level computer architecture to speculate whether mild-to-moderate changes would be warranted in things like, e.g., cache sizes on CPUs? Branch prediction was implemented on generalizations drawn in research papers analyzing procedural code. I'm wondering if the concurrency abstraction will add a significant working set to the runtime that adversely impacts existing prediction algorithms. For example, predicting in a for loop is one thing, but when the target of the branch is always some new portion of memory (graphic, text, etc), it'll always be a cache miss, and there'll never be branch history for it-- because neither have touched it yet.
this is probably a silly question because the content, though it might always be in RAM, will be branched to an order-of-magnitude less than it'll be used (once it's been loaded to cache)...but still, there should be an observable temporal-boundary to the context stored in cache and branch predictor in a procedural language, that would be manifest as an abstraction-boundary in a more parallelized environment
I don't really understand what you're trying to get at, but the concurrency abstractions/safety in Rust are static; there's little-to-no dynamic cost over the raw C/C++ APIs. And, in fact, the checks mean that one can sometimes be more aggressive about what designs can be used, without risking weird runtime corruption.
Rust is a way better replacement for C/C++. If you didn't need those then you don't need Rust. It will make your software faster while not exploding in your face like C/C++, but it's not meant for you. There are other languages solving the kind of problems you care about. See: Erlang, Haskell, Idris/Agda, Scala.
And frankly: Your definition of "hard" is a bit restrictive.
Not a fan of C++, but I personally think having to wade through the building blocks of C was an immensely valuable reverse Turing test (the ability of a human to exhibit human-like behavior? apparently this term is already taken, but you get my idea), and I bemoan the future where everyone is protected from their mistakes. How else did one build motivation to do it right? The gamble of a mistake is much more fun than a compiler error.
What a strange objection. In Rust, your mistakes are discovered at compilation time. In C++, your mistakes are discovered when your code blows up in your face at runtime (and cross your fingers that the failure is deterministic). Either way the programmer is still making mistakes, but one of these failure modes is clearly preferable.
yes. I'm suggesting the capacity trained and required to write safe-C is the same capacity applied to, e.g., higher level functions of your mind-- such as conceiving of, not developing, but conceiving of some of the performant optimizations in Rust in the first place.
Like a fertilizer that stimulates growth while young but a crutch which stunts advancement in old age.
The primitives to build a Rust/OTP aren't there (specifically: blessed green threads or some kind of blessed event system + reactor, for better or worse)
Without those, we're going to see a hundred different solutions crop up, none of them being interoperable.
mio is doing pretty well as a most-popular event system and reactor. I wouldn't be surprised if it's blessed in the future, but at the moment the community seems to be there.
I'm also sort of hopeful that higher-kinded types can do something about writing code that can run on any such system. If you can have some functions on an abstract Promise<T> type, like fn then(Promise<T>, fn(T) -> Promise<U>) -> Promise<U>, it doesn't super matter which backend you're using. (As I understand it, JS is already here, because they can duck-type their promise spec.)
I'm not sure what you mean by "Graphs are unsupported," exactly, but in a language as low-level as Rust, all of that other stuff consists of library features, not language features.
I assume the comparison is to environments like Coq, Agda, Isabelle/HOL, Idris, etc. But in most of those environments, there's still a distinction between the language itself and the proof assistant.
I would not be shocked if someone figured out a brilliant way to add dependent types to Rust 1.x within the next few years.
Yes. For instance, say you're implementing a red-black tree. In case your algorithms are as rusty as mine are: it's a particular form of binary tree where every node has a "color", red or black, the children of any red node are black, all leaves are black, and the count of black nodes from the root to any leaf (aka the "black-height") is the same. The intention is that the tree is roughly balanced by following these rules: the minimum height is the black-height (all black nodes), and the maximum height is double the black-height (alternating red and black).
You might want to check that the property "the number of black nodes from the root to any leaf is the same" is always valid. With unit testing, you'd implement the insertion and deletion functions, and write a unit test where you give them a few trees and make sure that the insertion and deletion functions don't break that property, probably by looping over every leaf and comparing the black-height. If this passes, you know that your functions are probably correct and at least not obviously broken, but you don't know that there isn't some edge case you haven't thought of.
With a language that supports proofs, you'd tell the compiler about the concept of black-height, and the type of a red-black tree would include the black-height. Therefore, in order to construct a valid red-black tree, you have to have constant black-height. Just as much as you couldn't construct a red node with a red child, you can't construct a node with two children of varying black-height. When you say that your insertion function returns a red-black tree, as part of type-checking, the compiler makes sure that these properties hold -- because in turn, every helper function you call, every constructor, etc. also requires these properties to hold so that they can return an object of type red-black tree.
The downside is that it's much more involved than a few randomly-selected test cases to convince the compiler that you're upholding the red-black tree rules in every single case. This textbook has an example of doing this in the Coq proof assistant (search for the section labeled "Dependently Typed Red-Black Trees"):
> It’s very difficult to write multithreaded code, which is the only way to exploit the abilities of modern machines.
QED? Really? I'm not so sure I trust the author to give rust fair treatment anymore. An operating system that does multithreading is not the same as a modern machine.
Edit: Any brave down voters want to explain why? Threads are a way to model concurrency. There are other ways.
I'm not the downvoter. But you may be getting the downvotes because you're throwing out snark ("Really? I'm not so sure I trust the author to give rust fair treatment anymore") and dogmatic statements ("An operating system that does multithreading is not the same as a modern machine") with no explanation or justification. It sounds like you're expecting us to agree with your statement as self-evident, or be relegated to the ranks of the ignorant if we don't. That is, it sounds like a one-up game ("I'm smarter/better informed than you"), rather than real conversation.
If you don't want to come across that way, give us some explanation of what you're thinking and why. Give us some evidence that your statement is true. Something besides just dogmatic assertions that you're right and the author of the article is stupid and/or biased.
Moving on...
In the context of the OS, multithreading (or something essentially equivalent) is the only way to exploit the abilities of "modern" machines. But "modern" doesn't mean very modern (at least, as I understand it). It means the difference between MSDOS and Windows: "Windows is multitasking, whereas DOS... DOS is serially multitasking." (I don't recall who said that, but it was brilliant.) Without this capability, you're only running one program at a time.
Now, I don't care if you implement this as "theads" or something else, but if you don't have it, your OS is pretty much worthless, and has been since about 1990.
Thanks for the advice on dogmatism coupled with the statement about having a worthless OS since 1990.
> In the context of the OS, multithreading (or something essentially equivalent) is the only way to exploit the abilities of "modern" machines.
I noticed that you qualified multithreading with "or something essentially equivalent". I guess you would agree that qualifying it this way is a good idea. Mr Blandy did not care to do that though: "[...] is the only way to exploit the abilities of modern machines."
I don't like that because it sounds like he's trying to popularize one approach to concurrency without even acknowledging that it is just one approach.
It's important to mention here that Rust's approach to multithreading changes the rules of the game. Multithreading has rightfully earned its popular ire over the past few decades, but once you have a typesystem that prevents data races it's almost like there's an entirely new paradigm at your fingertips. So if Rust folk seem insistent on mentioning the use of threads, it's not because they want to push some agenda, it's because they want to overcome the suddenly-obsolete stigma against multithreading.
Note as well that Rust intends to have world-class support for a wide variety of approaches to concurrency and parallelism. Threading and channels are already in the stdlib, and fork/join and SIMD are in the works now. The goal is to enable the programmer to be able to choose the best tool for the job.
What are the other ways you want to model concurrency? Message-passing without shared memory does not exploit the abilities of modern machines. Message-passing with shared memory counts as multithreaded for the purposes of this discussion (language-level tracking of who else can access memory), regardless of whether it's a thread or a process or a task or a coroutine or a whatever.
Yeah, that's well-suited to Rust's design (although Rust didn't ship with libgreen, the requirements of libgreen influenced the type system, and you can do a similar thing with a third-party library).
In any case the distinction between "multiple processes" and "multiple threads" here is fairly minimal, at least on Linux or OS X. As far as the kernel is concerned, they're the same object (Linux calls them threads; Mach calls them tasks). It's just that sometimes, groups of these things share pointers to things like memory map or file descriptor table, and sometimes they have their own pointers. There isn't a whole lot of difference, either at the language level or the kernel level, between running multiple processes from the same executable image that map a shared heap versus just running multiple threads.
Isn't that exactly what a thread is? Well, threads generally have some management by the OS to allow more threads than CPU/share tasks evenly, but at their core, they allow you to give each core some code to run on.
(And yes, Rust's threading safety applies equally well to that. It doesn't care about the details of how code is running concurrently/in parallel, just that it could be.)
If you're talking about systems programming threads are by and large the main mechanism for concurrency.
Only in really odd architectures do you see things like manually DMA'ing to separate execution units (à la the PS3), and those are certainly the exception to the rule.
I agree that threads are probably the dominant form of concurrency in modern programming, but it's a relatively recent phenomenon. In pre-2.4 Linux kernels, for example, the threading was inadequate (remember LinuxThreads?) and the dominant model was multiprocessing, as it was in most of Unix itself. There are plenty of servers that still fork by default (Apache).
I believe it was Windows that really popularised multithreading as a form of concurrent programming as process creation is so expensive on that platform. I recall Linus ranting against adding threading to the kernel on more than one occasion.
Threads were introduced in UNIVAC 1108 EXEC 8, first demoed in 1967 and running in production from about 1969. They were called "activities", but were created with "fork". The OS, which ran on multiprocessors, used them internally, and applications could use them as well. EXEC 8 also had synchronous and asynchronous I/O (callbacks). It's called OS 2200 today, and still running.
When I went from EXEC 8 to UNIX in 1978, my main observation was that the I/O was better but the CPU management in UNIX was much worse.
Shared-state threads in the same memory space. They could do I/O independently, lock and unlock critical sections, and block on semaphores. Worked fine. To create a new thread, you called FORK$. See section 8, page 2, of the 1966 EXEC 8 manual.[1]
Here are Dijkstra's P and V functions for the UNIVAC 1108, to run in user space. Written in 1972.[2]
I once added support for threads to Pascal for that machine. Had to add per-thread stacks and manage non-contiguous stack growth.
I thought it was well-acknowledged that the threaded servers get better performance, even on Linux (with NPTL), than the prefork ones. Am I just confused?
No, you are probably right, at least in most cases, particularly where shared memory is involved. I was mainly responding to what seemed to be an assumption that multithreading is the norm and everything else is freakishly rare. It's not the case - even today, Python programmers regularly make use of the multiprocessing module rather than use threads in order to avoid the GIL.
Does the Rust stdlib include concurrent data structures that make use of hardware-supported compare and swap? I'm thinking of an analogue to java.util.concurrent.
Not yet, but Aaron Turon, working full time on Rust, has a personal goal of giving Rust awesome support for this sort of thing (lock-free data structures and other abstractions for concurrent programming). I suspect that Rust could very well become the "best" in this area.
You can see Aaron's initial work on the basics (memory management) on his blog[1], and a higher-level introduction into the power of Rust's concurrency on the main Rust blog (also written by Aaron)[2].
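In the meantime, the raw hardware primitives are already exposed through `std::sync::atomic`. As a minimal sketch (not a stdlib data structure; the helper names here are made up for illustration), here's a lock-free counter built on a compare-and-swap retry loop:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// A minimal lock-free counter built on compare-and-swap (CAS).
// Each thread retries its increment until the CAS succeeds.
fn cas_increment(counter: &AtomicUsize) {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        match counter.compare_exchange_weak(
            current,
            current + 1,
            Ordering::SeqCst,
            Ordering::Relaxed,
        ) {
            Ok(_) => return,
            // Another thread won the race (or the weak CAS failed
            // spuriously); retry with the freshly observed value.
            Err(actual) => current = actual,
        }
    }
}

fn count_with_threads(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    cas_increment(&c);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

fn main() {
    // 4 threads x 1000 increments, no lock anywhere.
    println!("{}", count_with_threads(4, 1000)); // prints 4000
}
```

Something like java.util.concurrent would be built from exactly these pieces, plus the memory-reclamation work Aaron describes.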
Eliminating deadlocks in general isn't possible in a Turing-complete language. What's cool about Rust is that it does statically eliminate a certain subset of race conditions known as data races. So while deadlocks are still possible in Rust, the typesystem is guaranteed to prevent concurrency errors that would result in corrupted data.
> Eliminating deadlocks in general isn't possible in a Turing-complete language.
It is possible. For example, you could just remove threads and synchronization primitives. :)
But it's not easy -- IMO, compared to data races it's much harder to eliminate deadlocks statically without restricting expressiveness too much. It's even harder if you count I/O related deadlocks as deadlocks (e.g. deadlocking reads on Unix pipes).
Bah, I should have been more precise with my terminology. :P I tend to consider an infinite loop a form of deadlock, even if it only involves a single unit of execution.
Eliminating deadlocks with a type system is just as possible as eliminating data races (albeit maybe a bit more awkward). The question of Turing completeness doesn't really come into it, because you are rejecting some dynamically correct programs with a static type system.
To expand on my reply to pcwalton, I consider a deadlock to be anything that causes my program to become unresponsive from which it will never recover, which includes infinite looping (or its ilk, such as infinite (possibly mutual) recursion (and even if that blows the stack eventually in practice, I don't consider that a form of recovery :P )). Can you suggest better terminology for what I mean?
This is what's usually called a "liveness" property: eventually the system will do some particular thing. These two in particular are often called "progress" in some contexts, but that's rarely formal.
In general, any of these properties can be checked statically, just like czwarich said. You have to be conservative, but that's not any different than a type system, or Rust's borrow checker. There are certainly languages that enforce termination, and you can design systems that enforce higher-level progress properties (such as absence of deadlock).
The major difference between a liveness property and the other kind (called a safety property: at no point does this bad thing happen) is that you can't check for liveness properties dynamically.
Statically checking that progress is made does imply that the language is not Turing-complete if I understand it correctly, which is why Coq, the proof assistant language where all programs have to terminate, is not Turing-complete.
This is my understanding as well, and underlies my assertions above. After all, if a compiler can tell you for certain that any given program written in a Turing-complete language will terminate, then you've literally just solved the halting problem. :P
Of course, it's true that it can't be proven whether any arbitrary program will halt or not, but in practice most (possibly all) useful programs can be proven to halt or not halt.
This is not correct. It's possible to write a sound static checker for termination of a Turing complete language. There are some terminating programs on which it will have to say "I don't know", however.
"X isn't possible because of Turing completeness" is almost always wrong. GC is another "impossible" task, yet those are everywhere.
The real goal is to find a strategy that works well enough, often enough. Dynamic GC gives up perfect collection (aiming to be merely good enough). Rust's data race guarantees prevent safely expressing some subset of correct programs.
Whether deadlock prevention can be done nicely enough for general case usage is an open question.
GC works great in practice, but as you say it isn't provable. Likewise, Rust goes a long way towards preventing deadlocks in practice, but that also isn't provable. All I'm talking about here are guarantees. Rust's type system guarantees that there are no data races; that's firm bedrock that I can rely upon. I can't rely on there being no deadlocks just as I can't rely on a garbage collector actually returning any memory. That doesn't mean these tools aren't useful in practice, only that I have to take their failure modes into consideration if I want to write correct programs.
"Check out our awesome new car! It makes driving 'easy and safe'!"
"Does it prevent crashes?"
"No, that's impossible! But here, look at our onboard computer that prevents many types of driving errors."
I like Rust. I want it to succeed. But I think the rhetoric gets ahead of the language sometimes, and not prioritizing higher level concurrency tools because you have the borrow checker is a mistake.
It's incorrect to say that Rust hasn't prioritized higher-level concurrency tools. Many of the design decisions over the previous years have been made with an eye towards supporting every concurrency paradigm.
I'm also confused about your perception that the rhetoric gets ahead of the language. The type system does indeed make things easier and safer. When people ask what that means, we're eager to elaborate on the precise guarantees that Rust provides. Misleading people as to Rust's capabilities is not on the agenda.
>It's incorrect to say that Rust hasn't prioritized higher-level concurrency tools
I think it's a fair characterization given that 1.0 shipped without them, and there doesn't seem to be any timeline for standardization (please correct me if I'm wrong).
I don't think I have to tell you that concurrency is one of the most important challenges in modern programming. But all Rust gives you today (and for the foreseeable future) is a pthreads wrapper.
>I'm also confused about your perception that the rhetoric gets ahead of the language
The parent comment says verbatim "Rust goes out of its way to make it easy and safe to write multithreaded code". This apparently doesn't include preventing deadlocks, a problem that is common, hard-to-avoid, and difficult to recover from. Does that not strike you as a bit of an overreach?
> deadlocks, a problem that is common, hard-to-avoid, and difficult to recover from
Deadlocks are by far the easiest concurrency problem to debug, since it's very obvious when your program is deadlocked, and in most cases a stack trace of the involved threads is sufficient to debug and fix a deadlock. Also, if your program is deadlocked it won't corrupt user data.
Rust's std::sync::Mutex is fortunately non-recursive, which makes it easier to find deadlocks during testing.
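A quick illustration of why a non-recursive mutex surfaces deadlocks early: re-acquiring `std::sync::Mutex` on the thread that already holds it would block forever, which `try_lock` makes observable instead of silently succeeding the way a recursive mutex would (a sketch using current std APIs):

```rust
use std::sync::Mutex;

// Returns true if a second acquisition on the same thread would block.
fn would_self_deadlock(m: &Mutex<i32>) -> bool {
    let _guard = m.lock().unwrap(); // first acquisition succeeds
    // A recursive mutex would let this second lock through silently;
    // std's non-recursive Mutex reports it would block. In a real
    // program, calling lock() here would deadlock this thread, which
    // is exactly the kind of bug a test run exposes immediately.
    m.try_lock().is_err()
}

fn main() {
    let m = Mutex::new(0);
    assert!(would_self_deadlock(&m));
    println!("second lock on the same thread would block");
}
```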
Data races are far worse since they may cause arbitrary effects at a later point in program execution, so they take a lot of developer time to track down and very likely lead to data corruption.
If deadlocks are so easy to fix why do they happen so often? Seriously, how many applications have you seen hang in your life? (not to discount the role of other race conditions in causing hangs, which the borrow checker also can't categorically prevent).
I believe the comment reads that deadlocks are the "easiest concurrency problem" to debug -- but your question is more so aimed at the point that they are easy to introduce in a system.
As in the example, from a deadlock the locked threads and their stack traces can be determined, which helps point towards the cause of the problem. Compare that to situations where a data race causes an unexpected/invalid state but the problem doesn't manifest until later. Fixing that type of problem is harder, as any exception/stack trace that does crop up usually conveys no information about how the state became invalid in the first place. I tend to see those types of bugs more often than deadlocks, and in my experience they're always more involved to debug.
This is not always the case. Null pointer derefs are just indicators that the program got into an invalid state. Is the fix to provide some alternative pathway when the program got in a bad state or is it to prevent the bad state from happening in the first place? If the fix is to prevent the bad state, tracking down how that state occurred in the first place can often be time consuming in the case of data races.
And that's a great achievement! If there were a Nobel prize for practical use of a type system, the Rust team would get it.
But overselling it ("Rust goes out of its way to make it easy and safe to write multithreaded code" except oh yeah it can deadlock at any time) is a mistake.
You act as though deadlocks in Rust are trivial and pervasive, but they aren't. It just makes no guarantees that deadlocks will not occur. Show me a language that statically prevents deadlocks and I'll be quite curious to check it out. In any case, writing concurrent code in Rust is safe and easy. I suggest you try it. :)
>You act as though deadlocks in Rust are trivial and pervasive, but they aren't
Writing nontrivial multithreaded code using thread and mutex primitives is very hard to get right (not only due to data races, but also race conditions and deadlocks). Has your experience been different?
>Show me a language that statically prevents deadlocks
OK, idiomatic Go and Erlang will never deadlock. Sure, you can use mutexes in Go, but unlike in Rust they aren't the only means of achieving high levels of concurrency (in fact, they are explicitly discouraged).
I find this attitude from the Rust community disheartening. Not everything needs to be statically verified to be useful.
You really must not have done much concurrent programming if you think Go and Erlang won't deadlock. :P Throwing the "idiomatic" qualifier on there isn't a defense; "idiomatic Rust" doesn't deadlock either.
Let me be more precise then. Goroutines enable a fan out pattern in which each task, even small ones, is a separate asynchronous routine. Tasks fan out from a coordinator routine and their results fan back in:
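(A rough sketch of that fan-out/fan-in shape, written here with Rust's `std::sync::mpsc` channels rather than goroutines; the workloads and worker count are made up for illustration:)

```rust
use std::sync::mpsc;
use std::thread;

// Fan-out/fan-in: a coordinator spawns one task per unit of work,
// then collects exactly as many results as it spawned.
fn fan_out(inputs: Vec<i32>) -> i32 {
    let (tx, rx) = mpsc::channel();
    // If this count ever drifts out of sync with the number of
    // spawned workers, the recv() loop below blocks forever.
    let n = inputs.len();
    for x in inputs {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(x * x).unwrap(); // each worker sends exactly one result
        });
    }
    drop(tx); // coordinator keeps no sender of its own
    (0..n).map(|_| rx.recv().unwrap()).sum()
}

fn main() {
    println!("{}", fan_out(vec![1, 2, 3])); // 1 + 4 + 9 = prints 14
}
```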
>Forget to update that 3 after fiddling with the channels? Deadlock
To be fair, how long would that take to debug?
>Forget to unlock a mutex when dealing with shared data?
You'll note there are no mutexes. RAII mutexes are great (although less useful without exceptions). But the entire point of Go's concurrency model ("share by communicating") is that you don't need to deal with them.
>Use mio if you want goroutine-like efficiency
For a personal project, I would. But will any commercial entity use an unstable, unportable library with a single maintainer for critical functionality in their app? Because that's the concurrent IO situation in Rust, now and for the foreseeable future.
Your example is tiny and trivial, the same thing written with locks/whatever would be equally easy to debug. Problems in a more complicated application would be harder to debug, no matter if you use either channels or locks.
> But the entire point of Go's concurrency model ("share by communicating") is that you don't need to deal with them.
Exactly the same thing works in Rust, and works "better": the lack of other sharing (except by message passing) is enforced at compile time. Rust ensures that other options are available with as much help for correctness as possible.
> But will any commercial entity use an unstable, unportable library with a single maintainer for critical functionality in their app?
Concurrent IO is inherently unportable, and mio has support for the major platforms (OSX, Linux and Windows, with tests run on all) so I don't know what you mean by that. mio won't be the first or last lib with a single maintainer that a commercial entity uses.
(Instability is of course a perfectly reasonable criticism, and I'm sure it'll disappear as the library ages.)
>Your example is tiny and trivial, the same thing written with locks/whatever would be equally easy to debug.
Write the version with explicit locks and condition variables and we'll see if it's as easy to debug. :P
>Exactly the same thing works in Rust, and works "better": the lack of other sharing (except by message passing) is enforced at compile time.
Which is one of the reasons why Rust can (should) eat Go's lunch. All it's missing are lightweight coroutines and concurrent IO.
And I didn't know Mio supports Windows now — that's good news! If Mio stabilizes and Rust gets lightweight coroutines (probably requiring compiler support), Rust could be the best of all worlds.
Both of those languages will semantically deadlock: forget to send a message on a channel and a different thread will sit there waiting forever. It's not a mutex-deadlock, but it achieves the same thing. Neither of those languages protects against this. (And Rust has exactly that level of "deadlock" freedom: it has channels too.)
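A sketch of that "semantic deadlock" using Rust's own channels: the sender side is alive but never sends, so the receiver waits forever. It's bounded here with `recv_timeout` purely so the hang is observable rather than actual:

```rust
use std::sync::mpsc;
use std::time::Duration;

// Returns true if the receiver gave up waiting: the sender is still
// alive (not dropped) but never sends, so a plain recv() would block
// forever -- a deadlock with no mutex in sight.
fn receiver_starves() -> bool {
    let (tx, rx) = mpsc::channel::<i32>();
    // tx is kept alive, so the channel isn't disconnected -- the
    // receive doesn't error out, it just waits for a message that
    // never comes.
    let result = rx.recv_timeout(Duration::from_millis(50));
    drop(tx);
    result.is_err()
}

fn main() {
    assert!(receiver_starves());
    println!("a plain recv() here would never return");
}
```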
In any case, you seem to be ignoring what everyone is saying: Rust doesn't guarantee deadlock freedom, but it still tries to help. Mutexes can be an important building block for some things, but they're not the final story. There's atomics and channels in the standard library right now, and now that 1.0 is released, there'll be a growth of even better abstractions.
One of the people working full time on Rust has a PhD in concurrent programming, and has it as a personal goal to make Rust great at it. You can see his initial work on his blog[1], and read his thesis which introduces "reagents" (something he has expressed interest in implementing in Rust)[2].
Either way, redefining the word does nothing to make mutexes and native threads safer or easier to use in practice.
I can't emphasize this enough — data races are not the only kind of concurrent programming error, yet they are the only kind prevented by borrow checking. Go, Erlang, and Node have high level approaches to concurrency that reduce other categories of errors (in addition to providing massive performance benefits vs. naive native threading).
I think the time to add good concurrency abstractions to a language is before 1.0, but I'm in strong disagreement with the community there. And judging by this thread, concurrent IO isn't even on the core team's roadmap. This is not a good sign for a language that sells itself on concurrency!
Look at the situation 10 months ago. How much has it improved?
Channels are a step in the right direction, but are pretty limited without coroutines (channels without coroutines are just synchronized queues). Atomics are cool, but they address a totally different problem. Reagents, well, let's see an implementation.
C++ has needed a successor for many years now, and Rust is the best candidate. But lofty claims notwithstanding, the Rust concurrency situation is pretty dreadful.
> Putting the word "semantically" in front of deadlock doesn't change its meaning:
And focusing on hangs in concurrent programs that are caused by using a data structure called "mutex" doesn't stop one getting exactly the same symptoms via your apparently-perfect channels.
One can use channels to get all four of the conditions required for a deadlock, especially Go's synchronous-by-default channels.
Any time you have a protocol of multiple tasks communicating with each other in some structured way, it's possible to break that protocol and hence have tasks sitting around waiting for messages that aren't coming. Especially in languages like Go/JavaScript/... which aren't powerful enough to model things like session types in their type system.
> I can't emphasize this enough — data races are not the only kind of concurrent programming error, yet they are the only kind prevented by borrow checking
Yes, that's exactly why the whole Rust community tries to be very careful about using "data race" when talking about concurrency in Rust.
However, it is the case that data races are the worst sort of concurrency bug: they are undefined behaviour and so can lead to arbitrary memory corruption, possibly only appearing a long way from the actual place with UB. Deadlocks and other problems are, by default, much more controlled in their failure modes.
Rust focuses on truly outlawing large classes of horrible problems: dangling pointers, iterator invalidation, data races (and all without requiring a garbage collector, although a GC barely helps with the latter two). It also tries to help with other problems with fewer guarantees, but even just being memory safe is a huge step up from widely used low-level languages (i.e. C/C++).
All the languages you mention are quite opinionated in their concurrency, imposing costs that Rust doesn't and can't, for its target space. (And, Go certainly doesn't provide any guarantees at all, not even data race freedom.)
> in addition to providing massive performance benefits vs. naive native threading
Important qualification: for IO bound tasks. Which is perfectly fine, but it needs to be understood.
> I think the time to add good concurrency abstractions to a language is before 1.0, but I'm in strong disagreement with the community there
What's so important about being pre-1.0? You seem focused on it, but I don't understand why. What benefit does Rust gain by delaying the release of 1.0 for months/years just to get good async IO support? Why is this particular pet feature any more important than everyone else's pet feature? (There have been so many requests: "why couldn't X make it into 1.0?")
If you're concerned about theoretical fragmentation of the ecosystem... that's not a problem in practice: mio is the standard.
> And judging by this thread, concurrent IO isn't even on the core team's roadmap
Wrong, it's very much on the roadmap, e.g. Alex Crichton (core team member) has been adding windows support to mio himself.
> Look at the situation 10 months ago. How much has it improved?
A lot. There's a burgeoning ecosystem built around mio.
---
In any case, Rust has been stable for barely 3 months. Be patient, and give it time for the concurrency story to blossom from the seeds that have been sown so far. Based on the experience so far, I'm pretty confident that Rust can easily be much better (i.e. more performant and reliable) than both Node and Go and even Erlang. (Of course, it may be syntactically less nice, since those languages bake it in deeply, while Rust is less opinionated.)
I've never used Go, but it has some interesting concurrency ideas. So do Node and Erlang. I naively expect Rust to adopt the best ideas from each.
>One can use channels to get all four of the conditions required for a deadlock, especially Go's synchronous-by-default channels.
One can, yes. But it's pretty easy to avoid cyclic locking patterns when each request is handled by a separate goroutine, as is idiomatic. One thread per request in Rust will drag pretty quickly.
Yes, the borrow checker is an impressive achievement. But is it enough for Rust to succeed? Marketing yourself as safer C++ is what Java already did (with tremendous success) 20 years ago. And the market for systems languages has only shrunk since then (my phone runs Java).
>All the languages you mention are quite opinionated in their concurrency, imposing costs that Rust doesn't and can't, for its target space.
And yet Rust has already partially standardized channels. Finish the channels, add coroutines and you've implemented Go! (I'll note there are already coroutine implementations for C and C++, which do not limit their use as systems languages).
>(And, Go certainly doesn't provide any guarantees at all, not even data race freedom.)
Which, interestingly, hasn't hindered its ability to become a successful language! A lesson worth remembering.
>Wrong, it's very much on the roadmap, e.g. Alex Crichton (core team member) has been adding windows support to mio himself.
> I've never used Go, but it has some interesting
> concurrency ideas. So do Node and Erlang. I naively
> expect Rust to adopt the best ideas from each.
This is where your naivete shows. Rust originally did have the same thread model as Go baked into the language and standard library, and it labored for years to find a usable compromise between Go's green thread model and the native threading model. And a compromise is indeed necessary, firstly because we don't just need another Go, and secondly because Go's threading model imposes horrific costs when trying to interoperate with non-Go code (literally thousands of times the overhead that you'd expect). For a language like Rust that intends to interoperate with the native ecosystem, that overhead is unacceptable. After about three or four complete redesigns and rewrites the entire green threading infrastructure was chucked to the curb. Fortunately, Rust is low-level enough that libraries like mio can pick up the slack on their own, and in the meantime libraries that don't need green threads don't have to pay the price.
>Rust originally did have the same thread model as Go baked into the language and standard library
Ehhhhhh not quite. Go provides one threading API, and it's green threading. Rust tried to provide both green threading and native threading using identical APIs. That was a unique and in retrospect quixotic decision. Most of the problems identified in the RFC stem from the unified API issue:
You're the third person in this thread to tell me that Go-like concurrency requires a big runtime, and it remains false. Here are analogous concurrency implementations in C and C++:
(Concurrency in C and C++ is also third-party libraries... The whole point of languages like Rust and C++ is that powerful functionality like this can be built externally, so that different trade-offs can be made. Languages like Go and Node force one approach, and so when you need something outside it, you're forced to do something suboptimal.)
Without the documentation, stability, portability, quality guarantees, and compiler support (that's a big one — code generation for coroutines needs to be good) of a standard library.
>Concurrency in C and C++ is also third-party libraries
C++ is on track to standardize concurrent file and network IO. Draft specifications have already been published, and Microsoft shipped coroutines in VS 2015. It would be a damn shame if C++ got concurrent IO before Rust.
I would like to use Rust professionally, and I'm sure you do/would as well. But no one can possibly sell their boss on using a project with a single part time maintainer to provide critical functionality.
The C/C++ libraries you're holding up as examples get no compiler support.
That said, I do think a much better style of coroutines for Rust would be a C#-esque async/await transformation, converting stack-frames/local variables into an enum, allowing literally zero-cost coroutines (all the state is stored inline, no need to allocate a separate stack). Relevant issues:
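A hand-rolled sketch of that transformation (the type names are made up for illustration): a two-step "coroutine" whose local state lives inline in an enum variant between resumptions, with no separate stack allocated anywhere:

```rust
// The "coroutine": resume it repeatedly; it yields once midway,
// keeping its local accumulator inline in the enum rather than on
// a separately allocated stack.
enum Summer {
    Start(Vec<i32>),
    Halfway { rest: Vec<i32>, acc: i32 },
    Done,
}

enum Step {
    Yielded(i32),  // partial sum so far
    Complete(i32), // final sum
}

impl Summer {
    fn resume(&mut self) -> Step {
        match std::mem::replace(self, Summer::Done) {
            Summer::Start(items) => {
                let mid = items.len() / 2;
                let acc: i32 = items[..mid].iter().sum();
                let rest = items[mid..].to_vec();
                // "Suspend": stash the locals into the next state.
                *self = Summer::Halfway { rest, acc };
                Step::Yielded(acc)
            }
            Summer::Halfway { rest, acc } => {
                let total = acc + rest.iter().sum::<i32>();
                *self = Summer::Done;
                Step::Complete(total)
            }
            Summer::Done => panic!("resumed after completion"),
        }
    }
}

fn run() -> (i32, i32) {
    let mut co = Summer::Start(vec![1, 2, 3, 4]);
    let partial = match co.resume() {
        Step::Yielded(p) => p,
        _ => unreachable!(),
    };
    let total = match co.resume() {
        Step::Complete(t) => t,
        _ => unreachable!(),
    };
    (partial, total)
}

fn main() {
    assert_eq!(run(), (3, 10));
    println!("partial=3 total=10");
}
```

A compiler-driven async/await feature would generate this enum-and-resume machinery mechanically from straight-line code.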
I'm pretty sure this is quite non-trivial to implement automatically.
---
C++ has had 20 years of stability, Rust only 3 months. Measured on that time scale, Rust will have concurrent IO far earlier in its life than C++ did.
The goal with 1.0 was to stabilise enough of the language that people can start using it to write libraries that work into the forseeable future, allowing them to seriously explore the space of, for example, concurrent IO in Rust. Once enough exploration has been done (maybe you think enough has been done for async IO now), the functionality can start to become more official.
>The C/C++ libraries you're holding up as examples get no compiler support.
You can open VS 2015 today and use C++ coroutines backed by Microsoft (and their compiler, which is developed alongside their standard library).
And I am by no means saying that Goroutines are the final story in concurrent IO. Stackless coroutines in Rust would be a dream.
>C++ has had 20 years of stability, Rust only 3 months. Rust will get concurrent IO before C++ has on that time scale.
Concurrent IO is a hell of lot more important than it was in the 1990s, and the relative timescale is irrelevant for people choosing between Rust and C++ today (or Go, Scala, Clojure, C#, etc.).
>Once enough exploration has been done (maybe you think enough has been done for async IO now),
Exactly the opposite — I think the number of developers working on this (the Mio author plus Alex Crichton, maybe some offshoots) is far too few.
And the attitude I'm seeing from some core developers in this thread (concurrent IO is a "pet" feature that the community will someday deliver fully formed and ready for "blessing") is a huge disappointment.
> I think it's a fair characterization given 1.0 shipped
> without them, and there doesn't seem to any timeline for
> standardization (please correct me if I'm wrong).
The release of 1.0 was not an indication that the language was 100% complete, or that the stdlib was 100% comprehensive. The 1.0 release represented a stable foundation upon which to build an ecosystem. Going forward, the Rust developers absolutely do care about providing more concurrency primitives. Here's Aaron Turon's recent work on implementing epoch-based memory reclamation for implementing lock-free data structures: http://aturon.github.io/blog/2015/08/27/epoch/ .

Of the two interns the Rust project was granted this summer, one of them spent the entirety of their time working on supporting SIMD in the language (http://huonw.github.io/blog/2015/08/simd-in-rust/) while the other spent their time specifying the behaviors that are allowed inside Rust's `unsafe` blocks, which includes taking a good long stare at Rust's memory model and all its concurrency primitives and making sure that they're sound.

While we're on the topic of interns, you're also overlooking the `Arc` pointer in the stdlib, which is an intern project from more than three years ago which allows one to safely share memory between threads. Meanwhile, pcwalton (Rust core team member and full-time Servo developer) has been working on shmem support and multiprocess allocators for Servo.

In addition, the Rust stdlib contained an implementation of fork/join prior to 1.0, but at the last minute the API was found unsound for a few edge cases and so it was deprecated and a working implementation was deferred until later (today there are at least two working reimplementations of this on crates.io with the unsoundness fixed).
> This apparently doesn't include preventing deadlocks, a
> problem that is common, hard-to-avoid, and difficult to
> recover from. Does that not strike you as a bit of an
> overreach?
Not at all. Statically preventing data races is an enormous leap forward in the state of the art. Here's a quote from Matthias Felleisen of Northeastern University on teaching Rust to students without experience in concurrent programming: https://www.youtube.com/watch?v=JBmIQIZPaHY&feature=youtu.be...
"I had them program parallel programs in Rust, and as some of you know, Rust prevents race conditions with its type checking. [Note that he's incorrect here, Rust prevents data races, not race conditions in general.] I will admit, the first two weeks, because we used the beta release, was a mess. We couldn't understand the type error messages. But once we got over the hump, I was blown away, that these kids never had problems writing parallel programs in imperative style. There were no race conditions. The type system slapped their fingers. We taught them how to design, the type system enforced it, and lo and behold, I hate to admit this but I have to admit it, they just didn't have problems with this stuff."
SIMD, lock-free data structures, and `Arc` pointers are great. But what's the timeline for concurrent IO (either async or coroutines)? Preferably one that is stable, portable, and efficient (mio isn't the first two).
>Statically preventing data races is an enormous leap forward in the state of the art
The borrow checker is a great achievement. But it doesn't prevent deadlocks. And because Rust doesn't provide higher level concurrency tools than threads and mutexes, deadlocks are going to be a significant problem in practice until such tools are standardized.
Could you describe how you've developed the misconception that Rust does prevent deadlocks? Whatever it is, we should try to fix it right away, because Rust has never tried to prevent deadlocks, and suggesting that it does is not good.
> And because Rust doesn't provide higher level concurrency tools than threads and mutexes
Reasonable people can disagree about what "safe" and "easy" mean, but I don't think Rust's concurrency primitives are either given the (very real, very bad) possibility of deadlock.
>Rust's standard library has channels.
Which are a good start, but only a partial solution without coroutines (and the other additions you mentioned).
> but I don't think Rust's concurrency primitives are either given the (very real, very bad) possibility of deadlock.
OK. Then I don't know what to say. An infinite loop is a deadlock. Unless you have a programming language that can guarantee termination, you can't prevent deadlocks statically.
So yes, I think a brief marketing pitch of "safe and easy concurrency" is totally appropriate given the domain Rust is shooting for. In fact, it is one of Rust's strengths, so to not advertise it as such would be quite odd.
> Which are a good start, but only a partial solution without coroutines (and the other additions you mentioned).
OK. But you said "And because Rust doesn't provide higher level concurrency tools than threads and mutexes" which just isn't true. I'm trying to clear up what it is Rust does have. I'm not saying it has everything. Writing software takes time.
> not prioritizing higher level concurrency tools because you have the
> borrow checker is a mistake.
Ahh, but to mix up your analogy here, the borrow checker is the engine, and the higher level tools are like building fancier cars. You have to get your foundations built before you can build higher-level things on top of them.
In general, there are many reasons (fragmentation, quality, portability, dependency management, compiler support) it's preferable to have concurrency tools standardized. Standardization is only going to get harder in the future.
Specifically, it's great that Rust has a capable atomics library. I'm sure we'll see many good concurrent data structures come from it. But that's not quite what I meant by concurrency tools. The thread primitives Rust offers are the same ones POSIX standardized twenty years ago. It would be very valuable to have something like OpenMP (parallel loops, etc.) and something for concurrent IO (either asynchronous or coroutine-based).
There's no reason Rust can't have the performance of C++ and the easy concurrency tools of Go or Erlang.
We do have libraries for asynchronous IO and coroutines; they're just not ready yet. I absolutely agree with you that these things are useful, but the code needs to be written by someone. We're just at the stage where these things are getting good enough to try out.
(And the language can't have exactly the same stuff as Go or Erlang without adding a runtime, which is contrary to the goals of the project.)
> Here is 90% of what Go gives you as a C library. No heavyweight runtime necessary
Rust has channels in its standard library. Admittedly, they do not have the same functionality as Go's concurrency primitives. The two most significant omissions are probably a multi-producer/multi-consumer channel and a more complete (and stable) `select` construct.
There is also my `chan` library, which replicates Go functionality with respect to channels: http://burntsushi.net/rustdoc/chan/ --- Notably, you still have to use native threads.
Your `libmill` example is a coroutine library with a scheduler and everything. That is definitely too much runtime for standard Rust. However, there are people working on coroutines in Rust, on which something like `libmill` could be built: https://crates.io/search?q=coroutine
Is it incompatible with native threads? (no) Does it affect non-coroutine functions? (no) Does it have any effect whatsoever when not using the library? (no)
I encourage you to look over the code before dismissing it as "definitely too much runtime."
>It would be very valuable to have something like OpenMP (parallel loops, etc.) and something for concurrent IO (either asynchronous or coroutine-based).
There was another library out there that provided tons of concurrency utilities, but I can't find it now. Parallel loop-like syntax should be easy with scoped_threadpool and a macro though.
Rust tries to avoid stuffing everything into the standard library. So the lack of utilities in the stdlib isn't an artifact of some ignorance of concurrency; it's a deliberate choice not to keep everything in the stdlib. Rust keeps the basic framework for thread safety in the stdlib (though, strictly speaking, it doesn't need to), and the rest is built on top of it by the community.
What happens when one has 10 different ways of handling async IO? Massive ecosystem fragmentation.
Ruby has EventMachine, the standard lib, and Celluloid. None of these are interoperable.
Python has the standard lib, Twisted, and a number of other projects. All have the same problems.
C++ has Boost, Asio, a few others. This is more along the lines of where Rust is headed.
C has /countless/ options. None of them are standard, and it's totally understood in such an old language.
---
Go has goroutines: it's a harmonious, unified ecosystem. I greatly dislike Go, but this is one thing they do absolutely correctly.
Erlang has fully-preemptive multitasking (which the entire ecosystem is built upon).
Haskell has incredible parallelization and concurrency primitives built right into the stdlib.
Certain things belong in the stdlib. Async IO is definitely one of them. If not an implementation, a defined, common interface of some nature, to help prevent some of the fragmentation.
async/await, if I am not mistaken, will require compiler and borrow-checker support. This would be a good start.
> That's a valid philosophy, but also one that leads to problems with fragmentation, quality, portability, dependency management, and compiler support.
No, not necessarily. We would like for the standard library to remain minimal, but one of its important functions is to collect common interfaces to maximize interoperability between crates. This has worked well in practice so far.
Similarly, the standard library provides portable facades on top of platform specific APIs, for example, for performing IO. Crates can take advantage of this so that they can be portable themselves. Moreover, crates themselves can also provide portable facades over platform specific APIs, so I'm not convinced that this will be a problem in practice.
Dependency management is handled quite well by Cargo. It has been a wonderful tool to have at our disposal and is really the crux of what makes a small standard library possible.
Compiler support is a good argument, but one that I hope becomes weaker in time as we stabilize more functionality.
To be clear, I agree that a small standard library has its own downsides. In particular, quality is IMO one of the best arguments against a small standard library. Fortunately, we're trying to mitigate this by adopting officially blessed libraries into the `rust-lang` organization: https://github.com/rust-lang/rfcs/blob/master/text/1242-rust... --- This allows us to avoid the problems with a big standard library (too much stuff that is hard to evolve because of stability) while still providing quality with crates that we promise to maintain.
> so I'm not convinced that this will be a problem in practice.
Historically, it's been a massive problem. See my other post. Rust is already seeing IO-related fragmentation.
C, C++, Ruby, Python, etc. are all massively-fragmented ecosystems regarding IO, concurrency, (safe) parallelism. I don't think we need to soil a great language (Rust) with these same mistakes.
I was specifically speaking about the size of Rust's standard library. Python, at least, has a massive standard library, so I'm not sure how that's applicable here. (Or, at the very least, it demonstrates that a big standard library isn't sufficient. What matters is what is in the standard library, and that is not fundamentally incompatible with a small one.)
Also, none of those languages started out with a tool like Cargo.
I made a few other comments about mitigating this as well that should be considered.
I don't agree with the idea that Cargo is just going to magically make all fragmentation disappear, simply because it's convenient, good, and available. If this were true, I would suspect we'd have seen some of this fragmentation stop in these other ecosystems: it hasn't, despite excellent tooling.
I'm not suggesting a Python-esque stdlib. I'm suggesting a multi-threaded, cross-platform, (ideally, edge-triggered) event system, above which higher level primitives can be introduced and safely interoperate.
If Rust already has std/net, this is not that far of a gap to close. Granted, implementing a reactor or (higher-level) green threads greatly affects the way your programs execute, but in my opinion the benefits of a "blessed way" would outweigh the problems.
Also, as with all of those languages with fragmented ecosystems: nothing is preventing a developer from implementing their own solutions, they're just heavily encouraged to be compatible.
I never meant to imply that any one tool is a panacea. If this is an argument you think I'm making, then let's just squash that right now: I'm not. What I'm suggesting is that we have thought through this very problem and come up with a number of ways to mitigate it. We don't want to see the ecosystem fragmented. Cargo is one strategy. Blessed crates are another. In particular:
> To be clear, I agree that a small standard library has its own downsides. In particular, quality is IMO one of the best arguments against a small standard library. Fortunately, we're trying to mitigate this by adopting officially blessed libraries into the `rust-lang` organization: https://github.com/rust-lang/rfcs/blob/master/text/1242-rust... --- This allows us to avoid the problems with a big standard library (too much stuff that is hard to evolve because of stability) while still providing quality with crates that we promise to maintain.
I could absolutely see one of these crates providing async IO. I am less sure of seeing it wind up in std. Some examples of crates that are currently on track to being blessed---but maybe never end up in std---are regex and rand.
I disagree with pointing at ecosystems that added good package tooling after the fact, and using its failure to prevent fragmentation there as a reason why Cargo will be ineffective here. Overcoming inertia is hard. Having Cargo at the outset is a nice advantage we have working in our favor. We should acknowledge that.
> If Rust already has std/net, this is not that far of a gap to close.
It's a pretty big gap IMO, especially if you want to provide a common high level interface. std::net is a portable interface around platform specific APIs and not much else. Async IO is quite a bit more involved.
> I'm not suggesting a Python-esque stdlib. I'm suggesting a multi-threaded, cross-platform, (ideally, edge-triggered) event system, above which higher level primitives can be introduced and safely interoperate.
Note that my comment was specifically about arguing against a Python-esque stdlib, or rather, in favor of a small standard library. It was not meant to target omission of any one particular feature, which is what you seem to be focused on. (The criticism I responded to was not specific to async IO.)
If you directly use low-level locking primitives, no. However, Rust provides many higher-level concurrency mechanisms that avoid the problem entirely. And typically, if you had some complex data structure with multiple levels of locks and ordering requirements on those locks to prevent deadlock, you'd want to encapsulate that data structure with methods that handle those locking requirements internally.
>Rust provides many higher-level concurrency mechanisms
I'm looking at the standard library and I see only threads and channels. Is there anything higher level? Parallel map, reduce, etc., something like OpenMP?
One of the design goals of Rust is to design the language in such a way that they're simply not possible. I thought it was a new FOTM language, but after reading a bit more about it I'm very excited for its future.
That said, I wish they had looked a little further than Reddit's homepage rendering for parallelization inspiration. Reddit does not take a long time to render. However, CNN.com [which they looked at too] does, so...
Be careful with terminology here when you say that something is simply not possible. The parent is asking about deadlocks, which includes scenarios such as entering an infinite loop or having a channel block on a message that will never come, and neither of these are things that Rust can guarantee will never happen. As I say in my sibling comment, what Rust's typesystem guarantees is the absence of data races, which is still an amazing achievement (unprecedented AFAIK) but is only a subset of the general category of race conditions.
http://www.oreilly.com/programming/free/why-rust.csp
"One of the original designers of the Subversion version control system, [Jim Blandy is] a committer to the SpiderMonkey JavaScript engine, and has been a maintainer of GNU Emacs, GNU Guile, and GDB."