Actually, since I'm so used to articles that place benchmarks at the end, I immediately scrolled down to the bottom after reading the preface. Then I realized the pretty graphs are at the beginning.
Anyway, this is an excellent way to present an idea to the public.
I'm extremely excited about posts/libraries like these. They really show off the unique powers of Rust in a way that's a bit more compelling than "Rust is just C++14 with a bit stronger guarantees and cleaner semantics."
Also worth noting, and something I missed when I read the draft of this post: this isn't just an implementation of a lock-free structure, it's a whole library, Crossbeam[1], that helps you implement your own lock-free structures.
> In general, it’s possible to take lock-free algorithms “off the shelf” (the
> ones on the shelf generally assume a GC) and code them up directly against
> Crossbeam in this way.
I agree with you, and this post and the library are awesome, but honestly "it has all the great stuff C++14 has, but without the decades of compromise" is not a bad start for a pitch either. Playing in the same league as C++14 is already a major achievement for a language.
> I'm extremely excited about posts/libraries like these. They really show off the unique powers of Rust in a way that's a bit more compelling than "Rust is just C++14 with a bit stronger guarantees and cleaner semantics."
Maybe I'm just too used to C and see it everywhere, but in this specific case, I don't really see the "unique powers of Rust". Personally I find that perfectly alright - there are only a few people who actually will (and can) write correct lock-free code, and that's fine as long as that lock-free code can easily and seamlessly be integrated into the rest of the environment.
Using the borrow checker to help enforce correct usage of the API is indisputably unique, given that C doesn't have anything approximating a borrow checker.
I totally get that. Even though I spent 50% of my time on a take-home math final writing it in XeTeX from scratch (or "article"), it was still worth it, even if only for the motivation. I love the choice of Minion in your paper, but I needed something more math-friendly so I went with my other favorite: Arno Pro.
Oh, and thank you for including a colophon, I really wish more books did that :)
I haven't read it. On IRC he said he was gonna post this and then go to lunch, but given that he submitted it here, I'm sure he'll be back to elaborate at some point :)
I'll echo that Rust really does seem to be something new and exciting in the land of native languages.
I feel like it's still got some rough spots (around FFI, and some APIs still being unstable), but there are so many things to like about the current direction.
Yeah, it's unfortunate that some FFI APIs are still marked unstable, but Rust is still a language with memory safety, no garbage collector, and opt-in to a C-compatible ABI. That's a pretty big deal in my opinion!
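For example (a minimal sketch, not code from the post), opting in is just an attribute plus a calling-convention annotation:

```rust
// Minimal sketch (not from the post): export a Rust function with a
// C-compatible ABI. `#[no_mangle]` keeps the symbol name unmangled so a
// C caller (or any language with a C FFI) can link against it.
#[no_mangle]
pub extern "C" fn add_u32(a: u32, b: u32) -> u32 {
    a + b
}
```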
For each use of unsafe in Rust code, presumably the developer sees some safety guarantee that the compiler does not understand but that he thinks will guarantee that the code is indeed safe. Can anyone shed light on what form these guarantees (that are not seen by the borrow checker) take, and whether it will be possible in future Rust versions to write such code without unsafe (either by improvements to the borrow checker or implementation of an additional safety checker)?
Usage of `unsafe` generally has to do with subverting the Rust compiler's notion of ownership. Take a look at this snippet from the OP discussing the usage of unsafe code in the client example:
> The operation is unsafe because it is asserting that:
> * the Shared pointer is not reachable from the data structure,
> * no other thread will call unlinked on it.
This is only necessary because the library here is concerned with sharing a data structure among multiple owners (the owners being threads in this case) who all wish to mutate the data structure.
There are other places like this where one may wish to subvert the borrow checker, and when such a pattern becomes widely observed it can be placed in the standard library behind a safe interface, (ideally) removing the need for unsafety in client code entirely. The `Rc` smart pointer in the stdlib is one such example of a safely-exposed interface to code that internally uses unsafety to subvert ownership in a specific way. And as long as you trust the compiler to be correct, you can also trust that the "unsafe" code in the stdlib is also correct (or at the very least that the Rust developers are guaranteed to fix any incorrectness that is found).
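As a rough illustration of that pattern (a hypothetical toy type, not the real `Rc`): the `unsafe` blocks live inside one module, which maintains the invariant that the raw pointer is always valid, and callers only ever see a safe API.

```rust
/// Toy example of "unsafe inside, safe outside" (hypothetical type).
pub struct MyBox<T> {
    ptr: *mut T,
}

impl<T> MyBox<T> {
    pub fn new(value: T) -> MyBox<T> {
        // Invariant established here: `ptr` points to a live, uniquely
        // owned allocation for as long as this MyBox exists.
        MyBox { ptr: Box::into_raw(Box::new(value)) }
    }

    pub fn get(&self) -> &T {
        // Safe for callers: the borrow is tied to `&self`, and the module
        // guarantees `ptr` stays valid while `self` is alive.
        unsafe { &*self.ptr }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        // Reclaim the allocation exactly once.
        unsafe { drop(Box::from_raw(self.ptr)); }
    }
}
```

From the caller's side there is no `unsafe` at all; if the invariant is ever broken, the bug is confined to this one small module.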
I understand those concepts. My question was more along the lines of: could the code you mention from the article one day be written without unsafe? Somehow the programmer is very confident that neither of the two restrictions you mention will be broken. Is there any way that it can be proven from the code that they will not be broken, and therefore remove the need for unsafe? Or, if safety cannot be proven from the code, how is the programmer so certain of it?
Isn't there theoretically a risk that one thread could hang indefinitely while accessing (or holding a reference to) the data structure, which would prevent the epoch from ever advancing and prevent all memory from being reclaimed?
(On the other hand, I'm not sure what guarantees GC-based lock-free data structures can make in the same situation.)
> Although limbo lists are accessed using lock-free operations, and garbage collection does not interfere with other mutator processes, this reclamation scheme is not strictly lock-free. For example, a process which stalls for any reason during a shared-memory operation will not observe updates to the epoch count. In this situation the limbo lists will never be reclaimed and memory cannot be reused. Other processes can make progress only until the application reaches its memory limit. This drawback may also affect preemptively-scheduled systems, in which a process may be descheduled in the middle of a shared-memory operation with no guarantee when it will be rescheduled.
Yes. In practice this is extremely unlikely for correct lock-free structures. But you can also protect against it by releasing the epoch on every iteration of the retry loop, rather than just at the outset. (This may be a good idea for other reasons.)
If I read correctly, you only increment the epoch when the active bits for the current epoch are clear. As long as your stalled thread doesn't clear its active bit at an unsafe time, the epoch will stall with it and you'll be ok.
Not entirely sure how a thread is supposed to set its active bit without racing with a guy who just noticed that all active bits are clear. I guess that's why it says "try to increment the epoch" and not "increment it". i.e. it can fail. EDIT: I guess this is why you go two epoch units back and not one. That, combined with the idea that everybody only goes forward in time, would probably do it.
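A heavily simplified sketch of the "try" part (hypothetical, not the Crossbeam internals): advancing is a single compare-and-swap, skipped if any thread is still active in the observed epoch, and it simply fails if another thread got there first.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static GLOBAL_EPOCH: AtomicUsize = AtomicUsize::new(0);

/// Attempt to advance the global epoch from `observed` to `observed + 1`.
/// Returns false if some thread is still active in `observed`, or if
/// another thread won the race and already advanced the epoch.
fn try_advance(observed: usize, all_active_bits_clear: bool) -> bool {
    if !all_active_bits_clear {
        return false;
    }
    GLOBAL_EPOCH
        .compare_exchange(observed, observed + 1, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok()
}
```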
The epoch-based approach is basically one of the many variants of RCU (as used e.g. by the Linux kernel - thoroughly documented on lwn.net, BTW).
It does mean that a stuck or killed process can cause rather noticeable pain. With a GC you usually have an upper bound on the amount of memory one stuck process can prevent from being reclaimed; not so with a generation-based approach.
On the other hand, it does often void concerns around ABA-style problems.
Oh, one more thing: an approach based on hazard pointers often makes it possible to reduce the impact of a stuck or slowed-down process, without using a GC, and while still protecting against ABA. Another good thing is that, luckily, the patent application around it was rejected.
A couple years ago I saw a 3.5x improvement migrating a lockfree skiplist from hazard pointers to epochs. The problem with hazard pointers is that they require memory barriers after every assignment, whereas epochs only require one at the start.
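For anyone curious where that per-assignment barrier comes from, here's a rough sketch of the hazard-pointer protection step (a generic sketch, not any particular library's API):

```rust
use std::sync::atomic::{fence, AtomicPtr, Ordering};

/// Publish a hazard for the pointer read from `src`, then re-check that
/// `src` hasn't changed. The SeqCst fence on every protected load is the
/// per-operation cost mentioned above; epoch pinning pays a comparable
/// cost only once, when the operation pins.
fn protect<T>(src: &AtomicPtr<T>, hazard: &AtomicPtr<T>) -> *mut T {
    loop {
        let p = src.load(Ordering::Acquire);
        hazard.store(p, Ordering::Release);
        fence(Ordering::SeqCst); // make the hazard visible before re-checking
        if src.load(Ordering::Acquire) == p {
            return p; // reclaimers now see the hazard and won't free `p`
        }
        // `src` changed underneath us; retry with the new value.
    }
}
```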
Hazard pointers aren't any better when it comes to stuck processes, right? If you aren't removing your nodes from the hazard pointer list, you're stopping reclamation just as with this algorithm.
The epoch algorithm strikes me as pretty much strictly better than hazard pointers.
HP is much more granular and a stuck thread does not inhibit system wide progress (yes, the slots won't be reclaimed but other entries will continue to be reclaimed).
Very nice. RCU like this can be very effective (we have a custom per-cpu version based on https://lwn.net/Articles/650333/ here, which scales better than per-thread versions for obvious reasons.) I'm glad to see more wide use of the techniques.
Cool! Who is "we" and where can this code be found?
You say "RCU like this", but AIUI RCU and epoch-based collection like in this article are quite different from each other.
RCU involves no-synchronization reads of a data structure and writers that wait for a "quiescent" state to delete old nodes. It is used for data structures that have pure readers (like a frequently read but infrequently-updated list of things).
Epoch-based GC involves lock-free writes but puts garbage in freelists to be collected at a later time. It is used for data structures that are mostly based around writes (for a stack or queue, both push and pop are mutating operations).
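To make the contrast concrete, here's a toy sketch of the epoch-style write path (simplified, not the Crossbeam implementation): unlinked nodes are parked on a per-epoch limbo list and only freed once the epoch has advanced far enough that no pinned reader can still reach them.

```rust
/// Toy limbo lists keyed by epoch modulo 3. Real implementations keep
/// these per-thread and advance the epoch lazily; this only shows the
/// deferred-free idea.
struct LimboLists<T> {
    lists: [Vec<Box<T>>; 3],
}

impl<T> LimboLists<T> {
    fn new() -> LimboLists<T> {
        LimboLists { lists: [Vec::new(), Vec::new(), Vec::new()] }
    }

    /// Called by a writer after unlinking a node: don't free it yet,
    /// just park it in the current epoch's list.
    fn retire(&mut self, current_epoch: usize, garbage: Box<T>) {
        self.lists[current_epoch % 3].push(garbage);
    }

    /// Called when the global epoch advances: anything retired two epochs
    /// ago can no longer be reached by any pinned thread, so drop it.
    fn collect(&mut self, new_epoch: usize) {
        self.lists[(new_epoch + 1) % 3].clear();
    }
}
```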
Could someone explain the big difference between RCU and epoch-based reclamation? It seems that the only difference is that RCU has quiescent periods between reschedules and epoch-based has them when you don't have a Guard struct active. Seems like six of one, half a dozen of the other.
They are very similar; I think of the epoch scheme as a particular way of doing RCU that does not impose the need to find quiescent states yourself (RCU can be employed in userspace as well if your application can provide quiescent states in some way).
I think a key property of RCU as traditionally presented is that writers always wait for global quiescence directly after their write, and then immediately delete the garbage. This approach wouldn't make much sense for data structures like stacks and queues where every operation is a write. I see Epoch based GC as an approach that defers the deletion to make the write path cheaper.
> I think a key property of RCU as traditionally presented is that writers always wait for global quiescence directly after their write, and then immediately delete the garbage.
A rather common, in my opinion, way to use something like RCU is to delay freeing (or reusing) memory to the next grace period, without blocking until then. E.g. in the kernel you can use kfree_rcu(..); instead of synchronize_rcu(); kfree(..); for that. That's basically what the epoch-based approach does with the lists of to-be-freed allocations.
According to the paper, it simply makes everywhere outside of data structure code a quiescent state. This may be a big difference in practice though, because one of the appeals of RCU is being able to interact with data structures with no fear of following dangling pointers.
However, with both RCU and the epoch-based scheme, you can make quiescent states as fine- or coarse-grained as you desire.
"According to the paper, it simply makes everywhere outside of data structure code a quiescent state" -- yes, from a per-thread point of view, but of course all the hard work is in detecting global quiescence.
RCU is a higher-level abstraction. You can implement RCU with epoch reclamation, though typically there are more performant ways to detect grace periods in kernel space.
"We" is "infrastructure teams at Google". You apparently work there too, my username is the same.
RCU is an overloaded name; it's both a family of implementations of epochs, and a pattern for using epoch-like systems. The name literally refers to the latter (a read-mostly RWlock setup where we don't bother protecting the read side of a data structure, but _copy_ and atomically update access to the data, using epochs to dispose of the old version safely) but is often used to describe the whole system (tracking quiescence and determining when disposal is safe). One could easily use the kernel's RCU (or my userspace one) to build a data structure like this.
I've been wondering something for a while now. Rust is supposed to be a kind of C/C++ replacement, right? But the last I heard, it's still much slower. So what's the roadmap for speeding it up? Googling for it failed me.
In general it's possible to write Rust code that is competitive in speed with C and C++. Some safety features like bounds checking do have a small performance cost, though in typical performance-sensitive code it is on the order of 1–2%. You can write unsafe code to avoid the penalty when really necessary.
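A contrived example of what that trade-off looks like (generic code, nothing to do with the post):

```rust
fn sum(xs: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..xs.len() {
        total += xs[i]; // bounds-checked on every access (often optimized away)
    }
    total
}

fn sum_unchecked(xs: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..xs.len() {
        // Sound here because `i < xs.len()` by construction, but the
        // compiler no longer verifies that for us.
        total += unsafe { *xs.get_unchecked(i) };
    }
    total
}
```

In practice, iterating with `xs.iter().sum::<u64>()` sidesteps the question entirely, which is the more idiomatic fix.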
The programs submitted to the benchmark game currently show Rust faster than g++ in some benchmarks, slower in others, but generally in the same ballpark. Of course this changes as the compilers and the programs improve, and the benchmark game isn't representative of anything in particular, but it illustrates some of what's possible.
There may be cases where the compiler doesn't optimise as well as older C/C++ compilers (e.g. in my SIMD work, C/C++ code compiled with GCC occasionally comprehensively beats both the same code compiled with Clang and equivalent Rust with rustc (both built on LLVM) due to better optimisations and instruction selection: not a language issue), or where the pure-Rust libraries aren't as optimised as C/C++ equivalents. However, my experience is that it is rarely a lot slower.
"How does Rust's speed compare to C/C++" is something of an interesting question. I looked at some micro benchmarks that Rust did poorly on, and listened to some chatter, and this is some of what I saw:
1. Alioth's Regex DNA: this was one of Rust's worst in the benchmark game. The author (burntsushi) of the Regex library being used (written in Rust) hypothesized that it was because of the algorithm the lib used. He worked on it, and now Rust performs excellently on that benchmark.
2. Alioth's n-body: the fast implementations are directly using SIMD. In Rust that is currently on an active path towards stabilization.
3. A networking micro benchmark -- I lost the link here, but the Rust implementation made the simple mistake of allocating a large array by pushing one element at a time onto an initially empty dynamic array. Rust compared well after the fix (see the sketch after this list).
4. Rust's buffered reader zeros its internal buffer, and its default buffer size is [was?] pretty large. This made some IO operations pretty slow without manually tuning the buffer size, and still left some amount of overhead after tuning. In the Reddit thread where this complaint was brought up, someone announced a pull request (since merged) to significantly reduce the zeroing overhead. Tuning the buffer size is also sketched below.
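Rough sketches of the fixes for points 3 and 4 (generic examples, not the actual benchmark or stdlib code):

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};

fn build_buffer(n: usize) -> Vec<u64> {
    // Point 3: reserving the final size up front avoids the repeated
    // reallocation and copying you get when growing from empty.
    let mut buf = Vec::with_capacity(n);
    for i in 0..n as u64 {
        buf.push(i);
    }
    buf
}

fn read_file(path: &str) -> io::Result<Vec<u8>> {
    // Point 4: the buffer size of a BufReader can be tuned explicitly
    // instead of relying on the (large) default.
    let mut reader = BufReader::with_capacity(16 * 1024, File::open(path)?);
    let mut data = Vec::new();
    reader.read_to_end(&mut data)?;
    Ok(data)
}
```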
I found these examples encouraging, except for the last. The former are regular growing pains of any language ecosystem and benchmark implementation. Only the last looked like an issue where Rust's design might cause a measurable run time cost. In all cases it was cool seeing people tackle the issues and get a good speedup. Worth mentioning is that although Rust's benchmark game results are still behind C's, they are now in a similar ballpark with C++'s.
Browsing the API docs, I kinda like lots of things about Rust. However, how is the portability, i.e. targeting Windows/Linux/Mac at once? Is it easy to cross-compile?
Is it easy to include C/C++ header files and use them?
I'd kinda like to link with the Nuance C SDK.
Cross-compiling is easy... once you have the standard library for the target, which usually requires compiling the compiler manually (work on better cross-compiling is focusing on removing this requirement). For this particular library, I've been running tests for the author on some ARM hardware (both Android and plain Linux) without issue: it's just a `cargo build --target=...` away.
1. Cross compiling has worked in Rust since before 1.0, but it's required some manual configuration (downloading the platform's stdlib and adding command line switches, etc.). A goal for the future is push-button cross compilation [1]. I'd also add that the standard library has most of what you'd want for cross-platform compatibility, from higher-level interfaces that work cross-platform (like std::thread [2]) to lower-level bindings for each platform, like std::os::unix::fs [3] or the OsStr system [4].
2. Yep, C interfaces are pretty trivial. However, they do need to be manually defined; there is no built-in way to parse C header files or declarations (a minimal sketch follows below). [5]
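A minimal sketch of what such a manual declaration looks like (the C function name is made up for illustration; tools like bindgen can also generate these for you):

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

// Hand-written binding mirroring a declaration from a C header
// (hypothetical function; nothing parses the header for you).
extern "C" {
    fn c_library_init(config_path: *const c_char) -> c_int;
}

fn init(config_path: &str) -> i32 {
    let c_path = CString::new(config_path).expect("no interior NUL bytes");
    // Calling into C is inherently unsafe: the compiler cannot check that
    // the C side upholds its documented contract.
    unsafe { c_library_init(c_path.as_ptr()) }
}
```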
Google Spell Check, WordCount, and Wrap Plus for docs.
LLVM for IR highlighting, Rust for Rust highlighting.
I've tried racer in the past but it just doesn't scale with the current compiler architecture. I'm hoping the ongoing refactoring work will get us a great incremental/parallel/continuous compilation system for autocomplete/red-squigglies.
My two biggest pains are: "trivial" errors (syntax, typos, etc) which I ideally would just see as red squigglies as I go (not a compile-fail loop); and the absolute lack of any kind of incremental compilation (which paired with the community's over-use of generics is super painful for codegen time).
Note that epoch reclamation is only competitive with amortization, otherwise you may as well use hazard pointers for simple structures to benefit from stronger progress guarantees. Of course, QSBR is best if you have quiescent points.
Given that about every other line in the example stack implementation is marked as "unsafe", I fail to see how Rust is any more safe or well-suited for this than other low-level programming languages (C++, D, etc.).
As others have said, in a low-level setting like this it would be very difficult to ensure total safety, which depends on complex data structure invariants.
That said, Rust is able to eliminate some sources of unsafety here -- in particular, it guarantees that you have pinned to an epoch before you get your hands on any snapshots, and provides statically safe access to the snapshots for as long as you're pinned. This is very similar to the way that Rust deals with locks, which is explained more in an earlier post: http://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.ht...
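A stripped-down sketch of how that guarantee shows up in the types (paraphrasing the idea, not the exact Crossbeam signatures): the snapshot borrows the guard, so the borrow checker rejects any use of a snapshot after you unpin.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Pins the current thread to an epoch for as long as it lives (sketch only).
pub struct Guard {
    _private: (),
}

/// A snapshot pointer that is only usable while the epoch is pinned:
/// the lifetime `'a` ties it to the borrow of the Guard that produced it.
pub struct Shared<'a, T: 'a> {
    ptr: *const T,
    _marker: PhantomData<&'a Guard>,
}

pub fn load<'a, T>(slot: &AtomicPtr<T>, _guard: &'a Guard) -> Shared<'a, T> {
    Shared {
        ptr: slot.load(Ordering::Acquire) as *const T,
        _marker: PhantomData,
    }
}

// Once the Guard is dropped, any `Shared` derived from it no longer
// compiles: use-after-unpin becomes a borrow-check error instead of a
// use-after-free at runtime.
```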
I see 3 lines/expressions marked `unsafe` in http://aturon.github.io/blog/2015/08/27/epoch/#treiber%27s-s... and a tweaked `unlinked` API could cut that down to 2, and removing the unsafety from `cas_shared` (which doesn't strictly need it, but is a major contributor to any issues that could occur) would cut it to 1. Those lines are exactly the danger-points of the data structure, i.e. where misuse can cause use-after-free.
In any case, Rust really shines for the Owned and Shared types, where the affine typing & lifetimes allows sharing to be controlled so that most operations are perfectly (memory-)safe, with compile-time guarantees.
The number of individual lines marked "unsafe" isn't what's important. The safety of those lines relies on invariants put in play by the rest of the code. All the code behind an asserted-to-be-safe API has to be correct, or incorrect but lucky, in order for the data structure to be safe.
And yet Rust still provides tremendously more type safety in this application, thanks to the lifetime system statically verifying the accessibility of the guarded data. It's true that unsafety is more a stateful property than a local one, but unsafety still has to involve the unsafe block somehow, and that's enormously useful for auditing, debugging, and providing general assurance of code correctness.
Oh, it's also worth saying that only the `unlinked` method really has to be unsafe, since it's the root of any unsafety with this API. I chose to make the `_shared` methods unsafe as well because they present one of the ways that using `unlinked` could go wrong, but you could reduce the amount of unsafety to a single line in principle.
After discussing idioms with a few others on the Rust team, I've pushed this refactoring, bringing the example down to a single use of `unsafe`. Feels much cleaner, and draws attention to `unlinked` as the source of unsafety here.
The important thing is that the unsafe implementation satisfies the invariants of the type system. So the unsafety is contained within a much smaller surface area than in those other systems programming languages.
The presence of “unsafe” doesn’t make Rust unsafe. It’s memory-safe by default—you just use “unsafe” to assert to the type system that you have personally verified the invariants that the type system can’t verify. And if it turns out you were wrong and get a segfault or what have you, you can grep the code for those bits you need to verify. The enemy, as they say, is Murphy, not Machiavelli.
I believe that you've overlooked the section entitled "The Rust API", which goes into detail about how it uses lifetimes and the borrow checker to help ensure correct usage of the API. It's a shame that the API can't be 100% safe, but let's not pretend there's no merit to it.
Besides the fact that there isn't that much unsafe code in there as you say, Rust is more safe because the unsafety is confined to some of the implementation. Now, this library can be safely used with all kinds of pointer gymnastics without any worries.
I tried Rust about a year ago, and it was pretty horrible. Not intuitive and broken back then. Maybe it's better now, and benchmarks are great, but also show me its community and how practical and easy it would be to use.
I like Mozilla and would like to see Rust succeed, but I just see adoption as too risky right now. It's like when Go was the new-and-shiny; many jumped on the hype bandwagon because it was Google backing it and it was fast, then tried it out and realized it was a language for the <5% of teams that have projects that fit well.
Performance is an awesome thing and there is definitely a place for Rust. But, I don't see it taking the place of C, C++, or Java anytime soon for writing high performance or portable apps. All the same, I look forward to using it in a web browser!
Rust has changed a lot in the past year. It lost lightweight tasks in September, it lost its runtime in December (so you can write C-ABI libraries with functions that can be called from C or any other language that can call C libraries, without needing any setup), the IO and filesystem libraries got rewritten around March, and it hit 1.0 in May, so the riskiness, documentation, community, and brokenness sharply changed for the better. If you haven't looked at Rust since about last winter, forget what you know about it. If that means it's a tiny language you haven't heard of, that's fine -- that's more accurate than thinking it's closely related to what it was a year ago.
A year ago Rust was (to first order) an awkward compromise between Go and C++11, in personality. Today it is strictly a better C++ (although arguably Go has been shifting in the same direction). It's not just a better C++, since there are fundamental ways of thinking about things (which this post touches on) that aren't and never will be pervasive in C++. On the flip side, the library ecosystem is much younger, so that definition of "better" is not yet there. But I'd argue that any long-term projects you might want to start in C or C++ could be better done in Rust, and I'm way more confident of that argument for Rust 1.x than for a hypothetical continuation of Rust as it was about 12-18 months ago.
> It's like when Go was the new-and-shiny; many jumped on the hype bandwagon because it was Google backing it and it was fast, then tried it out and realized it was a language for the <5% of teams that have projects that fit well.
I don't see that at all with Go. Go is getting a good degree of traction, because it's a really useful language for a lot of people.
> But, I don't see it taking the place of C, C++, or Java anytime soon for writing high performance or portable apps.
> I don't see that at all with Go. Go is getting a good degree of traction, because it's a really useful language for a lot of people.
I agree that Go has the traction and community to stay around for the long haul as a niche language.
It's not a replacement for C, and never will be. I can't imagine more than 5% of developers ever using it.
Let's take Java, which (despite C#) is still the number one language in the world by number of developers, and just compare Go to Java. That's more than fair, since the number of Java developers in the world is less than the total number of developers in the world.
Similarly, in GitHut which compares language use across GitHub, you can see how the amount of Java code dwarfs the Go code currently shared: http://githut.info/
I know, I know - Go still has a great, growing community, but it just hasn't taken off the way they would like you to believe. I still think it's a great language, but it isn't making the inroads we'd all believe it has based on the hype. That's not to say it isn't useful - it is - or that developers shouldn't learn it or use it - they should. But it isn't one of the top languages, nor will it be at the current rate of adoption.
> Why not, specifically?
That's a much bigger question, but the answer is: Go is not AngularJS.
What?!! You may ask. Here's what I mean:
In today's development world of every decent language and framework getting 15 minutes of fame, you have to have something really trendy and seemingly cost-saving to get you there. Go is a practical language written for speed (similar to Rust). That is not the Toyota Camry or Honda Civic of languages, that is the high speed locomotive. A different use case, and tough to market.
You don't deserve the downvotes, but I'm still not sure what your concern is. The modern history of the programming languages ecosystem is marked by rapid specialization and fragmentation, in which 5% market share is an enormous quantity. Furthermore, relative percentage of mindshare isn't as interesting as absolute numbers: your language needs only enough active contributors to keep pace with the evolving trends of the industry, which can likely be achieved with less than 20,000 active users (well less than 1% marketshare).
The point of any given programming language is not to be crowned queen of the programming prom, it's to produce useful software and improve the state of software development. As long as your language meets a minimum threshold for notoriety (which both Go and Rust do), the fact that they have less marketshare than Java is irrelevant unless your sole motivation for learning a language is to get a job at an enterprise company (which is a fine reason to learn a language, but I think you might be on the wrong forum if that's your overriding concern :P ).
This man knows how to get right down to business.