Four years with Rust (steveklabnik.com)
376 points by steveklabnik on Dec 21, 2016 | 188 comments

Rust as a language is now realizing the benefits of borrow checking. As the article points out, the syntax doesn't have to distinguish between move and assign. The borrow checker will catch a reuse of something already moved away. This turns out to be effective enough in practice that the syntax distinction isn't necessary. That wasn't obvious up front.

Not having exceptions tends to generate workarounds which are uglier than having exceptions. Rust seems to be digging itself out of that hole successfully. Early error handling required extremely verbose code. The "try!()" thing was a hack, because an expression-valued macro that sometimes does an invisible early return is kind of strange. Any macro can potentially return, which is troublesome. The "?" operator looks cleaner; it's known to affect control flow, and it's part of the language, so you know it does that. Once "?" is in, it's probably bad form for an expression-valued macro to do a return.
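A minimal sketch of the difference, with hypothetical `double_expanded`/`double` helpers. `try!` is shown here as the match it roughly expanded to (the real macro also did an error conversion, and is deprecated in current Rust):

```rust
use std::num::ParseIntError;

// Roughly what `try!(expr)` expanded to: a match whose Err arm does an
// invisible `return` from the enclosing function.
fn double_expanded(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e), // the early return the macro hid
    };
    Ok(n * 2)
}

// The same function with `?`: identical control flow, but the early
// return is visible language syntax rather than a macro side effect.
fn double(s: &str) -> Result<i32, ParseIntError> {
    let n = s.parse::<i32>()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(double_expanded("21"), Ok(42));
    assert_eq!(double("21"), Ok(42));
    assert!(double("oops").is_err());
}
```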

There's still too much that has to be done with unsafe code. But the unsafe situations are starting to form patterns. Two known unsafe patterns are backpointers and partially initialized arrays. (The latter comes up with collections that can grow.) Those are situations where there's an invariant, and the invariant is momentarily broken, then restored. There's no way to talk about that in the language. Maybe there should be. More study of what really needs to be unsafe is needed.
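A sketch of the second pattern, using a hypothetical `filled_squares`: between the raw writes and `set_len()`, the Vec's length invariant is broken, and it is restored before any safe code can observe the buffer:

```rust
use std::ptr;

// "Partially initialized array" behind a safe interface: reserve
// capacity, write elements through a raw pointer, then declare the
// length once everything up to n is initialized.
fn filled_squares(n: usize) -> Vec<u64> {
    let mut v: Vec<u64> = Vec::with_capacity(n);
    unsafe {
        for i in 0..n {
            // Write into reserved but uninitialized storage.
            ptr::write(v.as_mut_ptr().add(i), (i as u64) * (i as u64));
        }
        // Invariant restored: exactly n elements are initialized.
        v.set_len(n);
    }
    v
}

fn main() {
    assert_eq!(filled_squares(4), vec![0, 1, 4, 9]);
}
```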

> There's still too much that has to be done with unsafe code. But the unsafe situations are starting to form patterns.

I think as long as those patterns can be abstracted out and moved into thoroughly-vetted libraries with a safe interface, unsafe code isn't really a problem. I expect that getting Rust's standard libraries to a place where regular applications very rarely need to create their own unsafe code blocks is going to be a major long-term effort.

(Of course, if there were some simple language feature that would make some common use case of unsafe code blocks unnecessary, by all means we should do that as well.)

Agreed. And considering that the most common uses of unsafe are for collections, I think a limited number of vetted libraries is completely realistic.

Rolling your own collections is typically not that great anyway; how much better is your linked list/hash table/skip list than everyone else's?

As long as there are 40+ collection data structures, like in Java.

Is there something today that you're seeing that causes applications to use a lot of unsafe? In general, it should mostly be in library code, not application code.

No, but I'm pretty new to Rust and haven't written enough code to know how often situations that call for unsafe code blocks come up.

(My current project I'm using to learn Rust is to port a ray-tracer I wrote in Haskell. I wouldn't expect functional-style code to require a lot of unsafe blocks and I haven't needed any yet, but who knows?)

I listed two things Rust doesn't handle well without unsafe code: doubly linked lists and multidimensional arrays. Here are examples of both from popular repositories with high download numbers:

- https://github.com/andelf/rust-adivon/blob/master/src/deque....

- https://github.com/andelf/rust-adivon/blob/master/src/queue....

These are all the doubly-linked list problem:

    struct Node<T> {
        item: T,
        next: Option<Box<Node<T>>>,
        prev: Rawlink<Node<T>>,
    }

Since this is templated code, it might be possible to break it by instantiating it on a type with unusual semantics.
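For what it's worth, the back-pointer can be expressed safely (at some runtime cost) with `Rc`/`Weak`. A minimal illustrative sketch, not the crate's actual representation:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Strong Rc links forward, Weak links backward, so `prev` can never
// dangle; the price is reference-count traffic and RefCell checks.
struct Node<T> {
    item: T,
    next: Option<Rc<RefCell<Node<T>>>>,
    prev: Option<Weak<RefCell<Node<T>>>>,
}

fn main() {
    let first = Rc::new(RefCell::new(Node { item: 1, next: None, prev: None }));
    let second = Rc::new(RefCell::new(Node {
        item: 2,
        next: None,
        prev: Some(Rc::downgrade(&first)),
    }));
    first.borrow_mut().next = Some(second.clone());

    // Walking backward goes through Weak::upgrade, which returns None
    // only if the target node has already been dropped.
    let back = second.borrow().prev.as_ref().unwrap().upgrade().unwrap();
    assert_eq!(back.borrow().item, 1);
}
```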

- https://github.com/BurntSushi/aho-corasick/blob/master/src/f...

Looks like unsafe code for "performance reasons". But there are no comments near "unsafe", so it's hard to tell.

- https://github.com/SiegeLord/RustAlgebloat/blob/master/algeb...

Matrix math. "Unsafe" all over the place, and unsafeness is exported, allowing the caller to do unsafe things. This is an example of why I occasionally stress the need for multidimensional array support at the language level. If the compiler knew about multidimensional arrays, it could optimize the subscript checks for them, avoiding code such as this.

The unsafe version. The caller can store anywhere in memory.

    fn unsafe_set_idx(&self, $mat: &T, v: f64)
    {   let $self_ = self;
        let (r, c) = $rc_expr;
        $mat.raw_set(r, c, v)
    }
Safe, but inefficient version. The compiler can't hoist those checks out of loops.

    fn set_idx(&self, $mat: &T, v: f64)
    {   let $self_ = self;
        let (r, c) = $rc_expr;
        assert!(r < $mat.nrow());
        assert!(c < $mat.ncol());
        $mat.raw_set(r, c, v)
    }
This is the sort of thing that leads to exploits in code that reads things like JPEG files. Yet you can't do much better in Rust. That's why doing multidimensional arrays in macros and templates isn't good enough.

The multidimensional array thing has nothing to do with "Rust not handling multidimensional arrays without unsafe code".

It's a mistake in the library to export that as safe, yes (filed an issue). But that unsafe code being unsafe has nothing to do with multidimensional arrays. It has to do with arrays in general.

Implementing multidimensional arrays in the language would not change the fact that they would have `get_unchecked()` and `set_unchecked()` methods. The only thing that would change would be that you might have slightly nicer syntax for them, and "one way to do it". It doesn't change the scope for optimizations, either. Subscript checks do get optimized out in 1D arrays and they should get optimized out in this case too. The way indexing works for 1D arrays is basically exactly the same; there's a pair of methods for checked and unchecked; the subscript operator is specified to do checked indexing via an Index impl, and the optimizer usually gets rid of the checks. This would not change if indexing for the 1D array type was not implemented by the language itself.

Sometimes when the invariants aren't easily seen by the optimizer it won't get optimized out, and that's when folks use unchecked indexing.
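The same checked/unchecked pair exists for plain 1-D slices. A sketch with two hypothetical summation functions showing the two forms:

```rust
// Checked indexing: every subscript is bounds-checked, though the
// optimizer usually removes the check when the loop bound proves it.
fn sum_checked(xs: &[f64]) -> f64 {
    let mut total = 0.0;
    for i in 0..xs.len() {
        total += xs[i]; // bounds-checked subscript
    }
    total
}

// Unchecked indexing: the caller of get_unchecked must guarantee the
// index is in bounds, which the loop bound does here.
fn sum_unchecked(xs: &[f64]) -> f64 {
    let mut total = 0.0;
    for i in 0..xs.len() {
        total += unsafe { *xs.get_unchecked(i) };
    }
    total
}

fn main() {
    let xs = [1.0, 2.0, 3.0];
    assert_eq!(sum_checked(&xs), 6.0);
    assert_eq!(sum_unchecked(&xs), 6.0);
}
```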

Now, there is a problem here, and that is that unchecked indexing is an unsafe operation in Rust, whether with 1-D or 2-D arrays. Could be fixed with dependent types, but that's a lot of complexity and 99% of the cases where dependent types would work would have been optimized anyway.

But this has nothing whatsoever to do with multidimensional arrays, and would not be helped at all by multidimensional arrays being in the language.

The compiler already knows enough about multidimensional arrays to be able to optimize things. Rust supports multidimensional arrays in the language. It just doesn't have syntax sugar for it; and syntax sugar can't really affect optimization.

> it might be possible to break it by instantiating it on a type with unusual semantics.

Can you give an example? A lot of the semantics are well-encoded in the marker traits, so as long as you correctly specify the right Sized/Copy/Send/Sync bounds these problems should go away.

(Panic safety could be an issue if you were calling methods on T, but you're not)

Why do you still think the compiler can't hoist those out of loops? LLVM is perfectly capable of doing so in many cases, and you've been told as much previously. Could you make your claim more specific?

Where does LLVM do that? LLVM does hoist invariant code out of loops in the Loop Invariant Code Motion and Loop Strength Reduction phases.[1] But that's not enough. This isn't an invariant situation. Consider a matrix multiply, the most common operation in number-crunching. You're indexing through three 2D matrices along both axes. The indices are usually controlled by FOR statements, so the compiler knows the range the indices can take. If the compiler knows about multidimensional arrays, it's easy to make those checks once at FOR loop entry.

But if those checks are in asserts, it's tougher. Is LLVM allowed to fail an assert early? If the array is 0..999, and the index is 0..1000, a subscript out of range condition will occur on iteration 1001. For best performance, you want to detect the subscript out of range condition at the point it becomes inevitable, rather than checking on every iteration and failing on iteration 1001. (Although technically you could generate a special case.)

But that requires special treatment of "assert". In Rust, "assert!" is just a macro. The compiler can't optimize it that aggressively and fail early. Especially since you can now catch assertion failures during unwinding.
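Concretely, the assert-before-the-loop pattern under discussion looks like this (hypothetical `sum_prefix`); whether the per-iteration checks then disappear is up to the optimizer, not the language:

```rust
// One explicit assert before the loop states the fact (len <= xs.len())
// the optimizer needs; the per-iteration bounds checks inside remain
// for safety and may or may not be elided.
fn sum_prefix(xs: &[f64], len: usize) -> f64 {
    assert!(len <= xs.len()); // checked once, at loop entry
    let mut total = 0.0;
    for i in 0..len {
        total += xs[i]; // still safe even if the optimizer removes nothing
    }
    total
}

fn main() {
    assert_eq!(sum_prefix(&[1.0, 2.0, 3.0, 4.0], 3), 6.0);
}
```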

If all those optimizations really exist, why is there code like this (at https://github.com/SiegeLord/RustAlgebloat/blob/master/algeb...)?

    MatrixMul<LHS, RHS>
    {   unsafe fn raw_get(&self, r: usize, c: usize) -> f64
        {   let mut ret = 0.0;
            for z in 0..self.lhs.ncol()
            {   ret += self.lhs.raw_get(r, z) * self.rhs.raw_get(z, c); }
            ret
        }
    }
If what you say is true, all that unsafe stuff is unnecessary.

[1] http://llvm.org/docs/Passes.html

In this example, the loop bound is from the lhs, so llvm can prove that it will never get out of bounds and elide the bound checks. But llvm can't be sure that the rhs has the same size, so it can't elide the bound checks for the rhs.

However, there's a trick. Take a look at https://github.com/rust-lang/rust/commit/6a7bc47a8f1bf4441ca... where the Rust developers had the same issue, and managed to avoid the bound checks without having to use unsafe code. You have to somehow make the optimizer see that both sides have the same length, so it'll elide both bounds checks. A slice is actually a pair of a pointer to the first element, and the length. Their trick copies the length from one slice to the other (after a bound check, obviously), so the compiler is sure that the lengths are the same.
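The same idea applied to a hypothetical dot product: after the assert, re-slicing both sides to the same `n` lets the optimizer see that the two bounds are identical, so it can elide both checks:

```rust
// The length-equating trick: the re-slices are no-ops at runtime, but
// they make the equal lengths visible in the IR.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let n = a.len();
    let (a, b) = (&a[..n], &b[..n]); // both slices provably length n
    let mut total = 0.0;
    for i in 0..n {
        total += a[i] * b[i];
    }
    total
}

fn main() {
    assert_eq!(dot(&[1.0, 2.0], &[3.0, 4.0]), 11.0);
}
```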

> If the compiler knows about multidimensional arrays, it's easy to make those checks once at FOR loop entry.

Or you could just use a little unsafe code to implement iterators and matrix multiplication on top of the array, and then never use unsafe with the matrices anyway. This is what gets done with regular arrays and vectors. Directly indexing these types is pretty rare. You would do the same with matrices. "Put it in the language" is not a solution for everything, especially when the tools are there to build it with almost the same level of ergonomics.
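For example, with a hypothetical minimal row-major matrix (not any particular crate): rows come out of `chunks()`, so user code iterates instead of subscripting and never needs unsafe:

```rust
use std::slice::Chunks;

// Minimal row-major matrix; `rows()` yields each row as a slice.
struct Matrix {
    data: Vec<f64>,
    ncol: usize,
}

impl Matrix {
    fn rows(&self) -> Chunks<f64> {
        self.data.chunks(self.ncol)
    }
}

// No indexing, no bounds checks in user code, no unsafe.
fn row_sums(m: &Matrix) -> Vec<f64> {
    m.rows().map(|row| row.iter().sum()).collect()
}

fn main() {
    let m = Matrix { data: vec![1.0, 2.0, 3.0, 4.0], ncol: 2 };
    assert_eq!(row_sums(&m), vec![3.0, 7.0]);
}
```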

Requiring unsafe for designing some kinds of abstractions is not a bad thing. There's no need to shove everything into the language if it can be implemented as a library with a smattering of unsafe code.

It's almost equivalent, really. Making a mistake when writing this unsafe code and making a mistake in the builtin optimization are mostly equivalent risks. There's nothing inherently worse about doing it as a library, aside from the minor issue that the indexing syntax isn't so great.

LICM is very relevant: it allows hoisting various dimensions of the check up to the loop with the relevant induction variable, meaning LLVM only has to handle things of the pattern:

  for i in 0..X {
    if !(i < X) { panic!() }
  }
It is capable of doing this with the various induction variable passes it has, showing that i < X is always true.

There's also the IRCE (inductive range check elimination) [0] pass—which isn't listed in that document—that should help even more, if/when it is enabled.

Hopefully you agree at this point that LLVM is perfectly capable of handling multidimensional arrays and matrices even if it doesn't have optimisation passes with names that obviously apply to them specifically. Most of the optimisations you keep talking about are basic consequences of other standard passes, a fact that has been pointed out many times to you before.

> Consider a matrix multiply, the most common operation in number-crunching

Also an operation you won't be implementing manually if you actually care about performance (better implementation: call a BLAS library). And, any high performance implementation will be doing more than the naive triple-nested loop (blocking, SIMD, etc.). Focusing on this operation is somewhat missing the forest for the trees.

> You're indexing through three 2D matrices along both axes. The indices are usually controlled by FOR statements, so the compiler knows the range the indices can take.

Yes exactly, the compiler knows the range of the indices.

> If the compiler knows about multidimensional arrays, it's easy to make those checks once at FOR loop entry.

This is a non-sequitur: the compiler can still make/move the checks based on what the code looks like, without having to have a hardcoded concept of arrays. Ensuring optimisation passes are powerful enough to handle these sort of relatively straight-forward cases is more general than relying on a language-level concept of multidimensional arrays, as it allows them to apply to cases which can't quite be implemented with an array directly.

> Although technically you could generate a special case

This is exactly what compilers do, see the IRCE documentation.

> The compiler can't optimize it that aggressively and fail early

It certainly can: these sort of assertions shouldn't ever trigger, and with appropriate top level assertions (e.g. asserting that the dimensions of the incoming matrices work for multiplication) the compiler can easily see this. Also, the compiler can hoist assertions early if this cannot be observed externally, e.g. the loop kernel only mutates locals.

> If all those optimizations really exist, why is there code like this (at https://github.com/SiegeLord/RustAlgebloat/blob/master/algeb...)?

That code is almost 2 years old, and so the compiler will have improved since then. Of course, the compiler may not have improved enough (e.g. IRCE still may not be enabled). Additionally, there's a lot of uncertainty/lack of clarity (as demonstrated by your own comments) about exactly what the compiler can do, and so people may use `unsafe` unnecessarily. I've certainly been guilty of this myself, and been glad when people have pushed back in code review, making me work a bit harder to get equal (or better) performance in safe code.

[0]: http://llvm.org/docs/doxygen/html/InductiveRangeCheckElimina...

Here's what the current Rust compiler actually does. Optimization level 3, checking enabled. The question is whether the Rust compiler can optimize out all the checks for a simple matrix multiply without loss of safety.

Rust doesn't know about multidimensional arrays, so they have to be supported in a library. This matrix representation was extracted from the "algebloat" crate:

    pub struct Matrix
    {   data: Vec<f64>,
        nrow: usize,
        ncol: usize
    }
Access functions, get and set written in the obvious way:

    pub fn get(&self, r: usize, c: usize) -> f64
    {   assert!(r < self.nrow);
        assert!(c < self.ncol);
        self.data[c + r * self.ncol]            // index
    }
    pub fn set(&mut self, r: usize, c: usize, v: f64)
    {   assert!(r < self.nrow);
        assert!(c < self.ncol);
        self.data[c + r * self.ncol] = v;       // set value
    }
Matrix multiply, written in the obvious way:

    pub fn mult(&self, other: &Matrix, result: &mut Matrix)
    {   assert!(self.ncol == result.ncol);  // out of the loop checks
        assert!(self.nrow == result.nrow);
        assert!(self.ncol == other.nrow);
        assert!(self.nrow == other.ncol);
        for r in 0..self.nrow
        {   for c in 0..self.ncol
            {   let mut tot = 0.0;
                for rr in 0..self.nrow
                {   tot += self.get(rr,c) * other.get(r,rr); }
                result.set(r, c, tot);
            }
        }
    }

Generated code for the inner loop. "rustc 1.14.0", optimization level "opt-level = 3", AMD64 instruction set. Debug mode, so asserts should be checked. (Not sure about this; it is possible that opt-level=3, "Aggressive" disables some checking. Documentation is unclear on this.)

    .Ltmp254:                      ; in Matrix::get()
    	.loc	1 122 0            ; self.data[c + r * self.ncol] 
    	movq	%rdi, %rax
    	mulq	%r11               ; doing the multiply for the subscript every time
    	jo	.LBB8_74           ; and checking it for overflow
    	addq	%rsi, %rax         ; doing the add. No strength reduction 
    	jb	.LBB8_76           ; another check 
    	.loc	17 1362 0
    	cmpq	%rax, %r9
    	jbe	.LBB8_72           ; and another check
    	.loc	17 1362 0 is_stmt 0
    	cmpq	%rcx, %r15
    	jbe	.LBB8_78           ; array overflow check
    	.loc	1 188 0 is_stmt 1
    	incq	%rdi
    	.loc	1 122 0
    	movsd	(%r12,%rax,8), %xmm1
    	.loc	1 147 0
    	mulsd	(%rbx,%rcx,8), %xmm1  ; The real work: floating multiply
    	addsd	%xmm1, %xmm0          ; and the add
    	.loc	18 746 0
    	incq	%rcx
    	cmpq	%r14, %rdi
    	jb	.LBB8_50              ; loop counter check - required
This is relatively decent code. The compiler got rid of multiple checks on the same values. There was some strength reduction of indices, too; only the subscript that's traversing the "wrong way" generated a multiply. The ones that are advancing one element at a time along the underlying vector are just adds. About five instructions could come out, but it's not bad code.

I've seen FORTRAN compilers do this kind of matrix multiply with a five instruction inner loop on a mainframe, incrementing pointers in registers for both dimensions. That's the advantage of multidimensional array support.

The point I made about multidimensional arrays stands - the compiler didn't strength reduce the multiply and eliminate the rest of the checks. That requires inferring too much from the user's code.

On the other hand, the code is good enough that using "unsafe" for performance reasons is very seldom justified.

I started writing a test case, and discovered that crate "algebloat" won't even compile on stable rust 1.14.0. (It uses features "rustc_private" and "test".) It looks like this crate was never finished.

This is the other problem with not having built-in multidimensional arrays. There are so many implementations to choose from. Here's the list of all 34 Rust matrix math packages.[1]

Looking at crate "matrixmultiply", it's all unsafe code.[2] That's because it's C written in Rust:

    pub unsafe fn sgemm(
        m: usize, k: usize, n: usize,
        alpha: f32,
        a: *const f32, rsa: isize, csa: isize,
        b: *const f32, rsb: isize, csb: isize,
        beta: f32,
        c: *mut f32, rsc: isize, csc: isize)
Arrays? What arrays? Raw pointers are good enough, right? What could possibly go wrong?

Still trying to find a package with a matrix multiply in safe Rust code with the subscript checks optimized out.

Update: checked crate "ndarray". Indexing is unsafe.[3]

Update: checked crate "matrices". Empty project.

Update: Checked "scirust" - more raw pointer manipulation.[4]

Not finding real-world matrix libraries in which all those fantastic checking optimizations are used and working. I'd like to see that stuff in action.

[1] https://libraries.io/search?keywords=matrix&languages=Rust [2] https://docs.rs/crate/matrixmultiply/0.1.13/source/src/gemm.... [3] https://github.com/bluss/rust-ndarray/blob/master/src/dimens... [4] https://github.com/indigits/scirust/blob/master/src/matrix/m...

Cool, just wondering. In general, unsafe code should be encapsulated, to isolate its possible effects, and once you've got it isolated, it's easy to break that bit out in a library.

From poking around in Rust for ~1 year, my common ones are uninitialized arrays for scratch buffers (>1 KB) and FFI (obviously).

Like you said, all the other cases are nicely wrapped into a library, but I'm not quite sure how you'd abstract the first one above. It's small enough to do the unsafe block inline that it's not annoying, but it definitely shows up (especially when you hit APIs that are copies of C interfaces).

Depends on what you're doing; see https://crates.io/crates/init_with as one example. It's not the exact same thing, and yeah, I would argue that's one area where it's okay to not totally abstract it. Rules are meant to be broken. (I was thinking about removing bounds checks with get_unchecked as well).

That's a much more involved use case that I usually have. Looks like a solid abstraction though.

Like I said, not really an issue, just a common pattern that I see.

Absolutely, understood.

Ironically, this is also a good example of how unsafe is complicated; "hey make an array (not vector) of copies of this thing" has lots of edge cases!

What I worry about with those "thoroughly-vetted libraries" is that their use will likely follow a power-law distribution in terms of code dependent upon them. When that happens, a disproportionately high amount of code will depend upon those libraries. So when a defect (or vulnerability) shows up in one of them, a disproportionately high amount of code will become vulnerable.

Reducing this unsafe surface area should result in outsized gains overall. I'm not familiar with the unsafe patterns, and to what extent they could be addressed/reduced through Rust language/compiler design, but that'd be one area where research focus might produce even more benefit.

The other weak point may be the LLVM itself, which I guess would be sort of a background hum of risk to Rust.

Outsider view: not sure of the relative levels of risk in each layer, and to what extent they can be addressed.

Can (or will) these unsafe patterns be capable of being addressed by future changes in Rust?

> Rust as a language is now realizing the benefits of borrow checking. As the article points out, the syntax doesn't have to distinguish between move and assign. The borrow checker will catch a reuse of something already moved away. This turns out to be effective enough in practice that the syntax distinction isn't necessary. That wasn't obvious up front.

Could you say a bit more about what you mean by "now"? You write like it's a recent realisation but (as linked in the parent article) it was "discovered"/described/promoted more than 4 years ago: http://smallcultfollowing.com/babysteps/blog/2012/10/01/move... .


Are your other two paragraphs related to the article, or are they just your general observations about Rust?

Never read "smallcultfollowing" before.

The "implicitly copyable" problem is amusing. That was dealt with by Wirth in Modula 1 with the rule "if the programmer can't tell, it's up to the compiler". Thus, non-writable objects could be passed either by reference or by copy, depending on object size. This was up to the compiler. The usual rule was that anything up to 2 words in size was copied. Since the called function couldn't change the value, it didn't matter.

This avoids philosophical gyrations. You want a rule that says you can copy ints and floats, for performance reasons. Trying to reach that via type theory makes it harder. It's an optimization.

We tried making copy/move an optimization. The number of useless copies that ended up in the resulting code was absurd. You think compile times are bad now…

You mean there were lots of copies that made it to the back end, only to be converted to non-mutable references late in the compilation process?

How do you convert copies to non-mutable references? That involves proving things about aliasing. Even in Rust that is not that easy…

The other way round. Sometimes you want to convert non-mutable references to non-mutable copies. This is almost always a win for int, float, etc., and usually a win for anything up to 8 bits on a modern processor.

I'm going to nitpick and suggest that you almost certainly meant 8 _bytes_.

Fortunately there's no need to have read all of the (very interesting) archives of Niko's blog in this case, as the parent article linked to that post in the relevant section too.

? is already in stable Rust.

Is it? According to [1] it should be available in 1.14, which is the current beta.

[1]: https://internals.rust-lang.org/t/all-the-rust-features/4322

Must have been a bug, it landed in 1.13. I just tried it myself to triple check https://blog.rust-lang.org/2016/11/10/Rust-1.13.html

(1.14 is tomorrow, so even if that was true, I'm off by a day.)

Good to know that this list must not be taken as Gospel. Thanks.

For those of us who are getting to the party 3 years late, thank you.

Just my 2p for others learning: for me, Rc::RefCell was what I was missing, even after I thought I was up to speed. I was fine using Channels for inter-thread communication and I never needed Arc, but use of Rc is common in the Rust ecosystem and a lot of my early fights with the borrow checker weren't fights I needed to have. In situations where it wasn't a trivial change, and I found myself banging my head against the wall, it was often because I was in a situation that called for a reference counted cell.

> Rc is common in the Rust ecosystem

I'm really surprised you think that's the case. Most rust codebases I've worked with use Rc very sparingly, if at all. If a codebase does use Rc there's usually one central thing that is Rc'd, with everything else using regular memory management.

I have noticed that beginners coming from GCd languages often tend to structure their code in such a way that paints them into a corner where they must use Rc. This might be what hit you. I'm not really sure how to teach idiomatic Rust though.

> there's usually one central thing that is Rc'd,


In my case, I was using the nphysics library, which for unsurprising reasons uses Rc for entity handles.

Similarly, in my app's code, I used nphysics' proximity and contact handlers to watch for events that indicated a change in status. Those handlers needed to initialize/set values in a HashMap accessible (and periodically cleared by) the main sim loop -- 'one central thing' that is Rc'd.

It's not sprinkled everywhere in your codebase, and for obvious reasons most libraries don't need to use it. But in applications, a central loop + library-provided callbacks + some Rc'd state doesn't seem too uncommon.

> I'm not really sure how to teach idiomatic Rust though.

Learners should read and write lots of Rust. That's a great way to learn what is idiomatic. There will be early missteps, but thankfully the language makes it more comfortable when you do things the right way.

Yeah, my advice in general is to pick a few rust codebases and start hacking on them.

But I would love to figure out how to teach this in a way that doesn't involve random codebases. Some folks learn by doing (I do!), but others fare better with tutorials.

> Just my 2p for others learning: for me, Rc::RefCell was what I was missing, even after I thought I was up to speed.

I started (and then stopped) learning Rust a few months ago (before their docs rewrite), and the borrow checker and its concepts weren't explained in a way that I could understand. Programs wouldn't work at all, and when I read about Rc::RefCell I was scared because I wasn't sure how much garbage collection Rust would do and whether or not it would actually be safe.

So yeah. Rust definitely has problems for beginners.

To be clear, Rust has no garbage collector, and unless you type "unsafe", it's safe. RefCell will panic when something goes wrong.

I should take this opportunity to mention that the book has been stagnating because I (along with Carol Nichols || Goulding) have been re-writing it out of tree: https://doc.rust-lang.org/stable/book/ The ownership/borrow checker stuff has been completely re-done. The existing book chapters explain it in the abstract; in the new book, we use String/&str as a motivating example, since that's an area that often trips up new Rust programmers.

That's great to hear. That's definitely an area that has made me slow down on learning Rust. Will definitely be checking out the new book once it's ready.

> I wasn't sure how much garbage collection rust would do

Rust doesn't do magical garbage collection.

Rc<T> does reference counting, which is a form of garbage collection, but you get to choose where it gets applied, so it's a linear cost with no magical GC pauses or whatever. Rc isn't unique to Rust, it exists in C++ too.

RefCell makes mutation within an Rc safe. It panics if you misuse it.

http://manishearth.github.io/blog/2015/05/27/wrapper-types-i... has more on the Rc<RefCell<T>> pattern
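A minimal example of the pattern:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Rc gives shared ownership; RefCell moves the "aliasing XOR mutation"
// check from compile time to run time. Misuse panics rather than
// causing undefined behavior.
fn main() {
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
    let alias = shared.clone(); // bumps the reference count

    alias.borrow_mut().push(4); // run-time-checked mutable borrow
    assert_eq!(shared.borrow().len(), 4);

    // Two overlapping borrow_mut() calls on the same RefCell would
    // panic at run time instead of failing to compile:
    // let a = shared.borrow_mut();
    // let b = shared.borrow_mut(); // panic: already mutably borrowed
}
```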

Reference counting is not a form of garbage collection. Both reference counting and garbage collection are types of automatic memory management.

This discussion crops up each time reference counting is brought up.

In academia reference counting generally falls under the umbrella of GC. In the industry "GC" usually means "tracing GC".

The terminology is irrelevant to the point I'm making. I did make a distinction between RC and the "regular magical kind" of GC.

Garbage collection requires a garbage collector, which reference counting doesn't have or need. This is true for both academia and industry.

It's still runtime memory management. That makes it garbage collection in my book. Sure it's not as involved as tracing, the prototypical form of GC, but it's still something that requires a runtime check each time that resource is used or a binding to it goes out of scope, similar to aspects in generational GC.

The point is that while both reference counting and garbage collection are types of automatic memory management, the reverse isn't true.

From a 1976 paper on automatic memory management https://www.cs.purdue.edu/homes/hosking/690M/deutsch.pdf

"Automatic reclamation of storage no longer in use is done by the following two techniques:

* Garbage collection
* Reference counting"

Note that these are two separate items and that one is not a subset of the other (and vice versa). It is simply incorrect to call reference counting "garbage collection" when the academic and industrial practices already have explicit meanings to these two terms.

And http://dl.acm.org/citation.cfm?id=356854 (pdf: http://citeseerx.ist.psu.edu/viewdoc/download?doi= treats reference counting as a "way to distribute GC overhead time".

I'm not saying the term always is used that way in academia. I'm saying that it's sometimes used that way, and that it's valid to call RC a form of GC.

Ironically I said that RC was a form of GC to avoid precisely this argument, because if I say "Rust doesn't have GC" I'll have a bunch of folks telling me that RC is a form of GC. This is largely irrelevant to the point I was making, which explicitly distinguished between RC and tracing ("magic") GC -- which I suspected (but wasn't sure) was what was being asked about.

> So yeah. Rust definitely has problems for beginners.

IMO Rust has challenges that are very similar to those of other languages. But if you are trying to make the leap from a GC'd language like Java or Python to Rust without ever having written C/C++, you should expect to have to learn not only new language concepts but new programming concepts.

I'm a programmer that has been using C for at least 8 years, and I've been developing in Go for the past 4 (as well as a couple of years of Python here and there). The point is that I have experience in quite a few programming languages, and I am familiar with the programming concepts you're referring to.

I do not agree that Rust's challenges are the same as other programming languages'. Rust will not even let you play with the language until you understand how to write safe programs (where "safe" is a concept Rust itself defines). So there's a whole bootstrapping problem of "how the hell do I play with this thing to understand it, if I can't play with it until I understand it fully?" C, C++, Python, Go -- none of them have this issue. They will let you write bad code and won't stop you from running it (which I admit is not a good thing; I'm just saying that Rust's strengths are not without their downsides).

I had a similar problem with the type system in Haskell. I'd look at Haskell code and it was obtuse to me. Even reading books, it was hard for me to understand some stuff until I started writing code... but writing code was difficult because I didn't understand it.

For me the key was just doing toy problems over and over again. Without understanding the idioms, it's hard to dive in and write something real. I think Rust is approachable if you take it like that (or at least it was when I looked at it about 2 years ago).

Reference counting can be a useful memory management technique. But reference counting may cause unpredictable pauses when large amounts of objects suddenly need to be freed (e.g. when dropping the last reference to a large array holding many references).

At least a carefully written garbage collector can free objects incrementally, and concurrently.

So memory management in Rust is certainly not a solved problem.

EDIT: Removed mention of RefCell.

> So memory management in Rust is certainly not a solved problem.

This looks like nay-saying just for the sake of it, but in case you're actually just confused:

Rust isn't reference counted unless you yourself add reference counting. Rust tracks ownership in the compiler so it knows statically, at compile time, when an object needs to be freed, and the compiler emits the code to do the freeing in that spot. Rc is a utility type: you can choose to opt into reference counting on a per-object basis. But if you're not actually typing Rc yourself then it's not happening. Rust is no more reference counted than the ability to build the same thing in C makes C a reference counted language.
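A minimal sketch of the difference (the function name is mine):

```rust
// Ownership is tracked statically; no refcount exists anywhere here.
fn take_ownership() -> usize {
    let v = vec![1, 2, 3];  // heap allocation, owned by `v`
    let w = v;              // ownership moves to `w`; just a pointer copy
    // println!("{:?}", v); // compile error: use of moved value `v`
    w.len()
}                           // `w` goes out of scope; the compiler emits the free here
```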

> But reference counting may cause unpredictable pauses when large amounts of objects suddenly need to be freed (e.g. when dropping the last reference to a large array holding many references).

Then again so does manual memory management if you're freeing a large object graph at once.

> At least a carefully written garbage collector can free objects incrementally, and concurrently.

You could get that with refcounting too, stashing refcount-zero objects in a list of items to free incrementally rather than freeing them all at once and synchronously.

Since refcounting only happens explicitly, it should be pretty clear where possible drops happen that could lead to large pauses.

It wouldn't be too hard to replace that Rc with one that performs deferred drops by putting them in a queue and performing a number of drops at an opportune time (e.g. using an idle hook in an event loop).

It might even be possible to send all drops to a thread dedicated to dropping, but that would probably only work if the contents are Sync or Send. This form of async drops should always be possible for Arc.
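A rough sketch of the dedicated-drop-thread idea, assuming the values are Send (the helper is mine, not anything in std):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: values sent into this channel are dropped on a
// dedicated thread, so the (possibly deep, recursive) free never pauses
// the sending thread.
fn drop_channel<T: Send + 'static>() -> mpsc::Sender<T> {
    let (tx, rx) = mpsc::channel::<T>();
    thread::spawn(move || {
        for value in rx {
            drop(value); // the expensive deallocation happens here
        }
    });
    tx
}
```

Instead of letting a large structure fall out of scope, you'd `tx.send(big_structure)` and return immediately.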

But big pauses should rarely occur: refcounting has to be added explicitly, Rust is a language where compound types are used extensively instead of building dynamic structures out of collections for trivial things, and the Drop trait is not something that can be fully relied upon (so most solutions try to go without it). This should make most drops pretty lightweight, even when dropping large numbers of items.

> But reference counting may cause unpredictable pauses when large amounts of objects suddenly need to be freed (e.g. when dropping the last reference to a large array holding many references).

That's still a predictable pause: it's the number of references contained within the array. If you know your array will hold at most ten references, it will take at most the time to free the ten references. Also, you know this will happen when the last reference is dropped - no earlier, no later. If your program can't handle the pause at that moment, you hold the reference until it can. If you know another component (perhaps in your own call stack) holds a reference, you know it won't be freed. And so on.

> If you know your array will hold at most ten references, it will take at most the time to free the ten references.

True. But if those references themselves contain references, it may become tricky to manage all of this. You basically don't want to think about it.

But, like others mentioned, you can put the objects in a queue and free them in the background. It would be interesting to know how such a solution stacks up, performance-wise, against an incremental, concurrent garbage collector.

Note that you don't need a GC to do that; the allocator can do it for you.

A lot of folks think that optimizations like having allocation arenas and spreading out deallocation pauses are unique to garbage collectors, but they're completely orthogonal to GC. You can have GCs with these optimizations, and GCs without. You can have regular allocators with these optimizations, and regular allocators without. Jemalloc does have arenas and stuff for allocation (I'm unsure if it spreads out deallocation loads). Of course, with a GC you can also defer the cost of iterating through large vectors and calling destructors.

But anyway, this only becomes a problem when you have large complicated Rc-trees in your application, which tends to not be the case in Rust.

An exception (large complicated Rc-trees) is the rope in xi-editor. If you load a very large file, the operation of letting go of the last reference is potentially a large enough pause to have an effect on UI responsiveness. I've considered adding a mechanism that moves the object into a deallocation thread (or a deallocation task running in a thread pool) for this reason, although I'm not sure how important it is in the grand scheme of things.

In a GC'ed language such as Go, this would not be a problem. The flip-side is that I can safely and efficiently do updates in place, because Rust's reference counted type has a way to determine when you're holding the only reference.
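That check is `Rc::get_mut`, which only yields a mutable reference when the refcount is exactly one. A sketch (the helper name is mine):

```rust
use std::rc::Rc;

// Mutate in place when we are the sole owner; report failure otherwise.
fn try_update(data: &mut Rc<Vec<i32>>) -> bool {
    match Rc::get_mut(data) {
        Some(v) => { v.push(42); true } // unique: safe to mutate in place
        None => false,                  // aliased: a caller would clone first
    }
}
```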

> I've considered adding a mechanism that moves the object into a deallocation thread (or a deallocation task running in a thread pool) for this reason, although I'm not sure how important it is in the grand scheme of things.

The one time I've done this in C++, delayed deallocation actually made major loads/unloads worse for us on mass batch operations - more cache thrashing perhaps? Since this involved GPU resources, perhaps some bad interaction with the driver? I did keep around the "optimization" conditionally for smaller operations as this let us get rid of some hitching.

> In a GC'ed language such as Go, this would not be a problem.

In C# this traditionally manifested as lengthy GC pauses you had to jump through hoops to work around. Modern GCs are much better these days, but it's not 100% solved.

Hmm, yeah, cache thrashing sounds like a likely culprit - agreed. Perhaps before dropping all the references they should be sorted by memory address, so that page faults are minimised.

I'm hoping for something like this a few years down the line:


Allowing Rust to generically adapt to an arbitrary Gc that it is nested inside of would be awesome.

Yes, that would be amazing.

To be clear, RefCell does not do reference counting, Rc does. RefCell does runtime borrow checking.
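A small sketch of the distinction (the function name is mine): RefCell moves the borrow rules to runtime; nothing is counted for deallocation.

```rust
use std::cell::RefCell;

// Runtime borrow checking: shared borrows block a mutable borrow,
// exactly like the compile-time rules, but enforced when the code runs.
fn demo_runtime_borrow(cell: &RefCell<i32>) -> bool {
    let shared = cell.borrow();                   // shared borrow is live...
    let blocked = cell.try_borrow_mut().is_err(); // ...so a mutable one fails
    drop(shared);
    *cell.borrow_mut() += 1;                      // now it succeeds
    blocked
}
```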

>Reference counting can be a useful memory management technique. But reference counting may cause unpredictable pauses when large amounts of objects suddenly need to be freed (e.g. when dropping the last reference to a large array holding many references).

How is it compared to other forms of GC?

More performant, generally, particularly considering the fine-grained control it offers. However, cyclic datastructures may be problematic (similar to how they are in typical GC, but often worse). Of course, exceptions abound; the "better" choice is extremely use-case dependent.

It's generally predictable pauses, fwiw.

Well, a different kind of unpredictability from other GCs.

And Rc is rare enough (IME) that this doesn't usually matter.

What kinds of applications would Rust be appropriate for? What languages is it largely meant to replace/improve on?

I've been using Rust to process fairly large amounts of streaming financial data. I am using Protocol Buffers (and evaluating Cap'n Proto) for network serialization and Kafka for buffering/queuing, and overall the library support is quite good.

Previously I was using JVM languages for this purpose, but grew weary of the resource footprint, and especially the unpredictable GC pauses. I am aware of the Azul JVM which removes GC pauses and of various Java techniques to avoid GC altogether, but switching to Rust provided a GC-less model from the ground-up, a powerful type system, and familiar functional programming facilities at no cost.

What Rust libraries are you using for protobufs?

I have been using https://crates.io/crates/protobuf. It generates pure Rust protobuf implementation files.

How easy is it to get Kafka and Rust working together?

Fairly straightforward overall using https://crates.io/crates/kafka. The client doesn't automatically handle Kafka node failures, so that's the responsibility of your application code.

I haven't had the chance to try https://github.com/fede1024/rust-rdkafka yet, but it looks promising and partially wraps the C/C++ library https://github.com/edenhill/librdkafka.

> What languages is it largely meant to replace/improve on.

I think it's great for any problem area where you'd instinctively reach for C. System utilities, bare metal development, etc. Stuff where you care about the precise layout of memory but would prefer that a simple but non-obvious mistake didn't end up as a high-profile CVE.

> What languages is it largely meant to replace/improve on.

C. It has C's straightforward machine model in mind; like C, its memory behaviour is entirely predictable, and it's entirely explicit about error handling (no hidden paths of errors exiting functions as in C++).

It improves on C by adding strict checking to make managing memory safely and avoiding race conditions tractable problems.

Here at ThreatX[1] we use it as a replacement for C. We've used it to write our web application firewall sensor and a real-time threat analytics engine. Compared to C it has enabled us to develop features rapidly and safely without sacrificing any performance.

1. https://threat-x.com/

It's meant as a C/C++ replacement.

Rust is intended for applications where you need speed, concurrency, and safety/correctness. In some senses, "use Rust where you would use C or C++" makes sense, but we're also seeing a lot of usage from programmers who have rejected C or C++ for various reasons.

One angle that we've been focusing a lot on lately is productivity. Think about some feature of Rust that provides safety, like our borrow checker, which ensures that pointers don't do bad things. One way to think about this is safety, but another way is productivity: if your application segfaults, you have to track down what actually caused it, and then what caused the cause: "oh this pointer was null because I did something incorrect in this other part of my code." While having the compiler do these checks can sometimes feel like a productivity _loss_ at the beginning, you do a lot less debugging later, so it's a productivity _gain_ overall. (Or at least, we feel that way.)

Other features of Rust make it feel productive as well: a focus on iterators and composable iterator adapters is often more productive than manual for loops, Cargo and crates.io enable wide-spread code re-use[1], we've been working on good tooling for IDE integration for those that use IDEs, and "zero-cost abstractions" help you have nicer interfaces while not having to pay the cost for them. Like this: https://news.ycombinator.com/item?id=13117608

One interesting area where we've seen lots of production Rust usage is embedding Rust in other languages. Since Rust can expose functions that look like C to the outside world, you could write a Ruby, Python, Javascript, or whatever extension in Rust instead of C. And Rust's safety features are appealing here, since you may not be the kind of person who writes C all the time: that's why you write Ruby/etc in the first place.
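A minimal sketch of such an export (the function name is mine); built as a cdylib, this symbol looks exactly like one from a C library to Ruby's FFI, Python's ctypes, and so on:

```rust
// No name mangling, C calling convention: callable from any C FFI.
#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a + b
}
```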

Soon, we expect to see more usage of Rust on the server: the "tokio" project is gearing up for an initial release, which provides a foundation for asynchronous IO. Even before tokio is released, people are playing with this: crates.io is Rust on the backend, and "npm recently began replacing C and rewriting performance-critical bottlenecks in our registry service architecture with Rust": https://medium.com/npm-inc/npm-weekly-73-no-love-for-http-ur...

Basically, lots of places. We'll see how things go into the future!

1: Anecdote time: my favorite package is https://crates.io/crates/x86, which provides low-level bindings to various x86 platform details. Writing an operating system? No need to define your own IDT entries, just grab the library and http://gz.github.io/rust-x86/x86/irq/struct.IdtEntry.html has you covered. Re-usable packages in operating systems is super cool. On a higher level, this kind of thing is enabling Firefox to re-use Servo components more easily; Firefox can just pull chunks of servo with (relative) ease.

Dream time, Microsoft gets to sponsor Rust on their stack and VS integration, and we can move on from C# + C++/CX to C# + Rust. :)

I did see this go by a while back: https://github.com/Microsoft/BashOnWindows/issues/258#issuec...

So someone over there is paying some degree of attention, at least :)

Have you tried VSCode with the Rust(racer,rustfmt) + lldb integration? I've been pretty impressed with it so far.

FWIW C# <-> Rust integration is really straightforward. You can actually pass delegates as C fn pointers and then treat them as closures in Rust. Much less painful than I initially thought.

It is not yet the same as Blend + Visual Studio (C#, F#, C++/CX, C++/CLI).

Yes, I do use VSCode, but only for dabbling on Rust during plane/train travels. The language is not yet at a level it just fits on MS stack and is requested by our customers on their Requests For Proposals.

Regarding C# <-> Rust interoperability, it is very badly documented. I gave up on searching for it, and just used C# <-> C++/CX <-> Rust instead.

Or I am very bad searching for it.

Have you seen the FFI omnibus http://jakegoulding.com/rust-ffi-omnibus/ which includes C# examples?

No, thanks for pointing it out.

C# <-> Rust is pretty straightforward. Just follow the C FFI side on Rust and C# has standard marshal mechanisms for calling C code.

extern/dllimport[1] covers most of it. There's automatic conversion for CString/string and delegates as function pointers. If you need to go deeper than that, there's the Marshal namespace[2]. Going through C++/CX sounds really painful.

[1] - https://msdn.microsoft.com/en-us/library/e59b22c5.aspx

[2] - https://msdn.microsoft.com/en-us/library/system.runtime.inte...

> C++/CX sounds really painful.

Not for someone who's known C++ since the C++ ARM. :)

I am pretty comfortable with C# and native interop, my issue was trying to map Rust strings with .NET UTF-16 ones, including passing ownership from Rust to .NET side.
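For what it's worth, the usual shape of the ownership handoff on the Rust side looks something like this (function names are mine; the UTF-8 to UTF-16 conversion still has to happen during marshalling on the .NET side):

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Rust allocates the string and leaks it across the boundary...
#[no_mangle]
pub extern "C" fn message_new() -> *mut c_char {
    CString::new("hello from Rust").unwrap().into_raw()
}

// ...and the foreign side must hand it back so the same allocator frees it.
#[no_mangle]
pub extern "C" fn message_free(ptr: *mut c_char) {
    if !ptr.is_null() {
        unsafe { drop(CString::from_raw(ptr)); } // reclaim ownership, then drop
    }
}
```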

If I remember right you can specify encoding as an attribute on the extern decl.

Wow, four years already. Maybe you can help settle this question I've had. I'm a rubyist (like you were/are, and wycats, and a bunch of rustaceans), and I think that sort of drove my interest in rust.

But after 4-5 years of ruby the dynamism which initially was super cool, has grown a little frustrating and I long for a more sophisticated type system and a compile step, since frustrating bugs crop up from time to time that would be caught by that, and which slip through a hole in our test suite.

I've dabbled in Haskell and Rust and the whole "if it compiles it's likely to work, or at least the bugs will be significant logic ones" aspect of them is very cool.

I wonder, though, if the flipside is true? Maybe people who have a compiler and strong type system get frustrated with its rigidity over a number of years, and when they see ruby for the first time are blown away with what it can accomplish. After all, it took a number of years day in and day out with ruby to start seeing its blemishes, I wouldn't be surprised if the reverse were true.

So as someone who's now spent 4 years in the other grass, do you still find it greener? (Or am I wrong from the start, and maybe you like Rust for other reasons and its type system wasn't something that attracted you to it over ruby?)

This is very thoughtful :)

So I _do_ like Rust for other reasons, or at least, I did initially. My opinion has changed over time. I used to say "I'd never write a web application in Rust," but soon, I'd prefer it slightly. Needs some more libraries to come out, and they're close...

There's two sides to "is the flipside true"? One is, there's a huge variety of static type systems. Some of them are more flexible than others. So for example, if I had to use Java 1.4, or even Java 1.5 (which are very old, but were two releases I am very familiar with), I'd wish I was back in Ruby. This is because it's _too_ rigid; I can't express enough things in the type system to make up for the lack of flexibility. In a language with a better type system (and Java itself has a better one today, even), it changes the value of the equation.

In Rust though, there's another axis: speed, memory usage, stuff like that. I don't miss Ruby's flexibility here, because even basic operations in Ruby are _so so so_ much more expensive than in Rust. So when I think "oh yeah, I could do that, but if I think about how Ruby does it under the hood... it's not worth the cost." This makes me miss Ruby less as well.

Ruby is an excellent language, and I will always love working with it. But there's only so many hours in the day, and I've spent so much time with it already...

Are you talking about strong typing or static typing? I've been using statically typed languages for 20 years (C++, Java, Objective-C) and I haven't gotten tired of it yet. Especially when working in a large code-base I didn't write, Python is rough. What's a "session"? Who knows, guess I'd better put `print type(session)` and figure out how to get that function executed.

Although, the thing that bothers me about Python (haven't used Ruby) is the fact that it is interpreted. It's really annoying to have to run my program to discover syntax errors and typos, when a compiled language would do that for me. With interpreted and dynamically typed languages, you have to have 100% test coverage, because you have no idea if your code is even syntactically valid until you run it, and of course if it isn't valid, then you throw an exception. (Hopefully that crashes the program, unless someone was brilliant enough to catch all exceptions and not print any error messages, then it looks like everything worked fine until you discover it didn't.) So every change is a potential for an exception, until you execute that line of code to make sure there wasn't a misspelled variable name or something.

So, yeah, I love the compiler. (I also love Python for scripty stuff.)

> With interpreted and dynamically typed languages, you have to have 100% test coverage, because you have no idea if your code is even syntactically valid until you run it, and of course if it isn't valid, then you throw an exception.

I think you mean _semantically_ valid, e.g. no undefined variables. Most dynamic languages provide file-level syntax checking, rather than line-level. But I completely agree that this is the easiest argument against dynamic languages. Computers are better at bookkeeping than humans, and we should use them as such.

> until you execute that line of code to make sure there wasn't a misspelled variable name or something

But isn't it the purpose of tools such as IDEs? PyCharm is pretty damn good at it. Of course, it's not always possible or convenient firing up an IDE. But I suppose editors such as vim should be fully capable of doing this via plugins as well.

The best use case I know of is avoiding the common Python & Ruby pitfall of "just stick everything in a dictionary and index it by key". Welp, forgot to update every instance of that string index (particularly if that name overlaps with anything else, including variable and function and class names... so, like, a lot of the time).

Yeah, no, I don't think that is the job for an IDE. Maybe for you on your desktop, but your software should work regardless of any IDE or editor.

Languages I use most are Rust and Ruby. I find the type system not frustrating, but challenging, on occasion. But only when I'm trying to write a library with a very convenient, flexible API. It's probably easier to write the first draft of this sort of library in Ruby, but in my experience what the library does under edge cases and bugs is often super unpredictable. Not so much in Rust.

I haven't had these frustrations in application code in Rust.

Have you tried crystal? Looks like it could be the best of both worlds.

I've worked primarily in Java and C++ for the past few years, but also with several dynamic languages. I sometimes miss the expressiveness of Python, but I can't say I'm ever frustrated by a strong type system nor does the compile step annoy me as long as my write-compile-test cycle is reasonably quick (usually achievable). With Java 8, and even more so with Rust, I'm less frustrated by the lack of expressiveness, so I find myself missing dynamic languages even less.

> but I can't say I'm ever frustrated by a strong type system

Never? I can't say it happens very often, but occasionally the type system gets in the way, usually when you want the function to take type T1 but sometimes it'd be convenient to do a little bit more if the value is also a T2.

Rust supports algebraic types and switching over the type, which handles your example quite efficiently. You would simply define an algebraic union of T1 and T2, and switch on the type into the desired code for each branch.

What does that look like in code? I was using c# so I did the dirty:

    T1 arg;
    var foo = arg as T2;
    if (foo != null) { /* T2-specific code */ }

Also, how do we format code here?

In Rust, you'd define an `enum` of two types.

    enum Either {
        Type1(T1),
        Type2(T2),
    }
let's say you get a value of type Either

then you can match on them:

    match value {
        Either::Type1(v) => func(v),
        Either::Type2(v) => func2(v),
    }

Does that let you work with the default type? What I was doing in c#:

  void Function(T1 thing) {
      // do normal T1 stuff here
      var foo = thing as T2;
      if (foo != null) { /* T2-specific stuff */ }
      // more normal T1 stuff
  }
If I'm understanding your example I would have to wrap all the T1 stuff in a match.

Yes, you do. There is no idea of a 'default' type here, because each variant of the enum is treated exactly the same. Your C# example assumes an inheritance relationship between T1 and T2 (i.e. T2 inherits from T1, so you can pass it to something expecting a T1 reference).

Rust doesn't have type inheritance, so the only way you can define relationships between types is to have something external tell you how to package them together, which is what the enum would do here. So it's not a parent-child relationship between the types, it's a new thing which says "here is a thing which can be a T1 or a T2 but not both at once".

Not having OO does require you to think somewhat differently about how to build your data structures in Rust. I've found my intuition from Haskell is a stronger guide when working with Rust, but since Rust's type system is much more like that in an ML-family language this makes quite a bit of sense! Sadly this does increase the learning curve for more mainstream languages.

(Rust does have trait inheritance, but all that says is that to implement trait T2 you have to also implement trait T1, and thus anything working with a T2 can assume that thing is also a T1).
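A tiny sketch of that relationship, reusing the T1/T2 names from above:

```rust
trait T1 { fn base(&self) -> i32; }

// "Trait inheritance": implementing T2 requires implementing T1,
// so anything bounded by T2 can also call T1's methods.
trait T2: T1 { fn doubled(&self) -> i32 { self.base() * 2 } }

struct Thing;
impl T1 for Thing { fn base(&self) -> i32 { 21 } }
impl T2 for Thing {} // only legal because Thing already implements T1
```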


The above example was actually using two entirely different interfaces. It seems like rust is quite similar to the OO subset I prefer to use. In this particular case it would have been a better match because traits can implement methods.

Rust is on my "to learn" list over the holidays.

Yes, you'd have to wrap it in a match. But maybe you don't have to.

If T1 is the normal case, then there's a type in Rust called Result that handles this kind of pattern.

    fn potentially_failing(thing: Result<Type1, Type2>) {
        thing.map(|foo| success_case(foo));
    }
this will only execute in the success case and ignore the error case

map_err does the same, but only touching the error case
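Put together, a self-contained sketch of that pattern (type choices mine):

```rust
// `map` transforms only the Ok case, `map_err` only the Err case;
// the other side passes through untouched.
fn classify(r: Result<i32, String>) -> Result<i32, usize> {
    r.map(|v| v * 2).map_err(|e| e.len())
}
```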

Thanks, that's pretty cool.

I expect Rust should be able to handle this nicely, but I'm more familiar with Pony... So here's how to do it nicely in it:

    fun f(x: (T1 | T2 | None)): ReturnType =>
      match x
      | T2 => /* do something */
      | T1 => /* do something else */
      | None => /* handle badly behaved input; optional to have, the compiler can enforce good behaviour if you want */
      end
Note: You can see formatting at https://news.ycombinator.com/formatdoc

It's nice to "take a break" from static types and hack together some kind of data-munging script in Python every once in a while. But every time I have tried to go larger in scope I ended up regretting it. I'm not sure I would want to do any kind of dynamically typed programming professionally again.

The main languages I use with work are:

* C

* Scheme (Gambit)

* Python 3.4

* Pony

I find all of them frustrating at times.

Python's dynamism is nice, but it's so damn inflexible, requiring me to follow the One True Way.

That can be good, and makes it easier to eliminate bad code in reviewing.

But it also means that you can end up fighting with the interpreter to do what you want.

Scheme gives me both dynamism and flexibility, with less speed tradeoffs. Yay!

However, there was someone on my team obsessed with turning everything into a macro.

What's the point of first-class functions if you just macro everything?

Also, Scheme's stdlib is purposefully small, so you sometimes need to reinvent the wheel. Thankfully Scheme makes it both easy and pleasurable to do so.

The rundown is, though fast and dynamic, code review can be painful unless you follow standards, and you might run up against, "Oh... I need to build my own FTP library", though SLIB (depending on your circumstances), can eliminate some of that.

C's compiler feels like a breath of fresh air after that.

Unless you hit a runtime error, it makes writing code a breeze, quickly and efficiently.

Unfortunately, it doesn't protect you against yourself, or lazy people on the team trying to use void pointers for everything.

So code review can be harder, and catching edge case segmentation faults can be quite difficult.

Enter, Pony.

Pony deals with the same areas Rust does, but for reasons I won't go into, when both Rust and Pony were young, my team started using Pony for a few little things.

It has been... An experience.

Pony is very type safe, it is exception safe, and data race free, all enforced by the compiler.

Which sometimes means the compiler will sit there bashing your code for a full day before you realise that you weren't writing it safe enough, and you can't just tweak it, the whole thing needs to be rewritten.

Also, the Actor Model being central to everything can be quite annoying, when you just need to fit a couple extra functions in somewhere, but you aren't sure where's best.


The compiler doesn't let you screw up.

What you do write can become massively concurrent fast programs, easily.

Pony also has some fairly good documents, for such a young language that really doesn't have the backing of Rust.


I get frustrated with any language I am forced to deal with over time.

I wish Python had compile-time contracts, but there is mypy to ease the pain now.

I wish Scheme had a better stdlib, but it goes against the ethos. (See the R6RS community breakdown.)

I wish C was safer, but its lack of safety makes it easier to do things like JIT.

I wish Pony was more flexible, but it'll never let me point a shotgun at my own foot.

If I spend too long in any world... It's time for a breath of fresh air.

Can you talk more about using Pony in a production environment? What are the performance aspects of it? How do you find writing 'non-actor' code - like just doing some string manipulation?

I find the language fascinating but I'm learning Erlang and I don't really want to get started with Pony at the same time. Do you have experience with Erlang?

what made you choose Pony?

Sorry for the bombardment of questions but I don't hear much about Pony. As a rust user, and a burgeoning fan of Erlang, I'm fascinated.

I've never met anyone else using Pony in production, so I'll try and give the best answer I can.

> What made you choose Pony?

We wanted a type-safe language to run a REST API frontend. That is to say, we wanted to have something that could redirect requests to the appropriate servers, at scale, whilst maintaining type safety in the server itself. We got hit by so many issues from JSON's weak/absent typing causing runtime errors that we wanted something which could sanely prove we wouldn't crash.

Whilst we were at it, the same language seemed great for the backend for a couple languages we develop in-house. For example, Owlang is a language developed for teaching with a group of teachers. The first iteration was written in Scheme, today, it runs on top of Pony.

We considered three languages:

* Rust

* Pony

* Erlang

Erlang was a bit odd to throw in the mix, but we couldn't ignore how amazing BEAM is, especially with recovery by dropping and creating thousands of processes without effort.

However, we found Rust was making too many breaking changes, and there was sort of a culture of using Rust Nightly, which doesn't give off a nice solid feel, or didn't back then.

Erlang has this sort of huge cognitive overload with its syntax, making it take longer to learn, for no obvious benefits.

Pony's Philosophy[0] however, felt damn good, and we investigated their mathematical proof of safety and the like.

> Can you talk more about using Pony in a production environment?

Pony's got a few gotchas. Thankfully, they spell them out. [1]

We have been burned by long-running procedures when we've had to call out to an in-house C library. We went from using around 400MB of RAM for a group of processes to running out of memory on a 16GB machine.

That was a stupid mistake. However, using a Timer, like the docs tell you to, obliterated that, and we dropped back down to 400-500MB.

The non-preemptive nature of Pony's scheduler has taken a few people, myself included, some time to get used to.

> How do you find writing 'non-actor' code - like just doing some string manipulation?

If you try and avoid the fact you're using Actors, it'll bite you. Pony is designed for concurrency.

However, if we're just talking about methods and classes and so on, and how they feel... Pony feels simple, and easy to use.

Traits and interfaces make subtyping a breeze, especially interfaces.

But, being interested in Erlang, how about some pattern matching?

    fun f(x: (String | None), y: U32): String =>
      match (x, y)
      | (None, _) => "none"
      | (let s: String, 2) => s + " two"
      | (let s: String, 3) => s + " three"
      | (let s: String, let u: U32) if u > 14 => s + " other big integer"
      | (let s: String, _) => s + " other small integer"
      else
        "something else"
      end
One more bonus that proved unexpectedly useful: Pony's ability to have multiple, or generated, Environments. In Pony terms, that means argc, argv, argp. Being able to isolate environment variables between processes has been useful for making some configuration easier.

[0] https://tutorial.ponylang.org/#the-pony-philosophy-get-stuff...

[1] https://tutorial.ponylang.org/gotchas/

> Erlang has this sort of huge cognitive overload with it's syntax, making it take longer to learn, for no obvious benefits.

That's just weird. How much of an effort did you even put into it? Erlang may have ugly syntax, but it's the simplest syntax of all popular languages, and the cognitive load while writing is pretty much the lowest I've come across. The language itself is tiny. Or did you mean you weren't used to a functional language and that was the cognitive overload?

I use Scheme extensively, which I would argue has the least syntax of the functional languages.

Erlang seems simple on the surface, but each type seems to have its own DSL, allowing things that seem the same to not be, which means you have to hold more context in your head.

e.g. Are these two the same?

  ensure(A, B) ->

  if A == B ->
Does -> always mean function? Or does it mean something similar to progn or begin from CL and Scheme? Or does it vary with context?

I had issues teaching where to use ; or . or end, and having people actually comprehend the reasons enough to do it without thinking.

At least with Scheme it's always )

We also investigated using Elixir, which has less overhead and which more people find easier to get up and running with. However, at the time Mix wasn't stable yet, which counted it out.

I think you mistook a barrier to entry (getting over the syntax) for a permanent cognitive load. At our company we don't care whether you know Erlang or not; we expect a developer to learn it on the job, and we've had no issues with either junior or experienced developers. All who've tried it are amazed at the productivity.

Awesome. Thanks a lot. I'll have a look at those links with this context.

As a relatively new Rustacean (since May 2016), I have to say that Steve has been one of the most influential people on my Rust coding style and understanding of the importance of community in a language. Prior to talking with Steve on IRC, I believed that community was secondary to language quality, but now I fully understand the importance of strong community in a programming language.

I've moved away from Rust lately (long story), but in my attempts to move to Rust i was blown away by Steve. His presence in the Rust community is staggering, and his efforts truly made a Rust newbie like myself feel very welcome.

Many thanks to the Rust team and the Rust community!

Also, retrospectives and posts where a core contributor summarizes the achievements and accomplishments of the previous few years are extremely valuable. These days software and tools evolve really fast and old resources fall off the face of the earth for various reasons, and it's important to understand how things used to be and how we got to where we are now.

I learned something new about Rust from the blog, e.g. irrefutable patterns as function arguments.

PS: I did not understand what irrefutable patterns meant, so I googled it.


In case anyone doesn't know yet.
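For anyone else who hadn't seen it, a minimal sketch of what an irrefutable pattern as a function argument looks like (a pattern that always matches, so no `match` arm is needed):

```rust
// An irrefutable pattern destructures a tuple directly in the
// argument list: `(x, y)` matches every possible `(i32, i32)` value.
fn add((x, y): (i32, i32)) -> i32 {
    x + y
}

fn main() {
    // By contrast, a refutable pattern like `Some(v)` can fail to
    // match, so it is only allowed in `match`, `if let`, and the like.
    println!("{}", add((2, 3))); // prints 5
}
```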

I really like the premise of Rust. Keep at it. For my purposes, it's still a bit too immature. Last time I looked at the big 7 things, most were still not done.


The RustDT plugin for Eclipse makes the edit/save/compile/flag-errors loop a lot tighter for me as a beginner. The autocomplete seems incomplete, though, and there are no hover docs or ctrl-click-through to source that I remember.

I look forward to the day I can replace JS with Rust. It would be nice to have a quick way to start doing Rust in the browser. I tried installing emscripten to give it a go, but the version in Ubuntu's deb repo was too old. I don't remember why, but building it from source didn't work out.

I also got a bit lost trying to get boilerplate together for that purpose. Something like maven archetypes would be really great for Rust... Project templates that get you started with a working hello world for whatever you're trying to do.

Just some suggestions. I basically like where Rust is going.

A quick summary of where these are at:

Internationalization / Localization / Unicode (ICU): yup, no real progress. Needs some domain experts to drive it.

Date/Time : chrono is the most popular.

HTTP: hyper has been good for a few years now, tokio will make it async soon.

Crypto: there's lots of interesting work in this space, see ring and rustls.

SQL: Diesel is the gold standard here.

So, some progress! You're absolutely right that growing the crates ecosystem is vital; we've doubled our numbers in the last year, but there's always more to do.

> project templates


Thanks for your reply, Steve. You're a stand-up guy. I'll look into some of these over the holiday. I want to give the new rustup a whirl.

For ICU we have http://github.com/unicode-rs/ which covers a lot of it. There are scattered libraries for other ICU things too.

Steve is one of the most amazing guys I have ever known. He answered all of my questions on IRC; he is such a nice guy. I truly like him a lot. And most importantly, the Rust team chose him because they realized how wonderful he is at communicating with people and leading the community.

As far as I can tell he answers every question, and corrects every inaccuracy, across at least these platforms: Twitter, Hacker News, Reddit, [users|internals].rust-lang.org, IRC; it's mind-blowing. Honestly, a talk on how he stays so connected without being awake 24/7 would be interesting ;)

Thanks for all the hard work, Steve. Your devotion helped turn me into someone who loves Rust.

Thanks both, and achanda358!

It's taken lots of work. I think I'm a nicer and better person in Rust than I was in Ruby, but that's all I'll say about that. I am very much not a perfect human being.

> Honestly, a talk on how he stays so connected without being awake 24/7 would be interesting ;)

I wrote a blog post on that a while back: http://words.steveklabnik.com/how-do-you-find-the-time it's still mostly accurate kinda.

This post is amazing! The best quote is ...

"For example, the biggest time I scrubbed out in my entire life was a choice that I made almost a decade ago: going to college."

A refreshing insight from a very intelligent individual!

I have more complex feels about it as a general thing, but for me, I'm not sure I made the right call. I did learn a lot, and I made some very dear friends, but it was extremely expensive. If you (or anyone reading) are someone who's making this decision, please give it serious consideration, and don't just copy me: this is a really, really big choice to make in your life, and one you deserve to make for yourself.

You have relentlessly pushed for better documentation. Thanks for that :)

> We removed the repl from the tree, as it never really worked, and was a maintenance nightmare. This one is the only one I know of today. Some people ask about one, but nobody has done the work to get a good one together yet. It’s tough!

This is a bummer. I remember when I first tried out rust the repl was still a thing. As someone who writes mostly Haskell and Python, interactivity with a language is a huge plus. I hope that eventually another repl comes around.

I'm highly curious about Rust as a potential introduction to systems programming. One thing that bothers me is the lack of proper classes. This seems to have become a trend among newer languages, as though traditional OOP is an outdated paradigm.

I have yet to be convinced that this is a good move. The attempted justifications I've seen hand-wave about problems with OOP, "composition over inheritance," etc. Meanwhile, in the real world, 99% of production code is OO and the paradigm continues to do its basic job--helping programmers model domain objects. Older languages like JavaScript and PHP have made their OO constructs more robust over time, which suggests to me that in the end a critical mass of developers will always clamor for this easily understandable and productivity-enhancing paradigm over some theoretically pure alternative.

So what is it that you need classes for that dearly?

In the real world 99% of vehicles burn petroleum and the paradigm continues to do its basic job.

Congratulations on the work to make our computing infrastructure safer.

It's interesting to see just how much the Rust language and libraries have evolved. Is there a wish list of breaking language changes waiting for a Rust 2.0 version?

We do have https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Ais...

It is unclear if there ever will be a Rust 2.0. Even those that want it agree that unless it's incredibly easy to upgrade to from Rust 1.x, it's a non-starter.

Given what I've seen, I expect Rust 2.0 to be like C99, with Rust 1.0 being C89 (pre-1.0 is K&R C in this analogy). Even though there were a few non-backward compatible changes, it didn't fragment the C community.

There is also the possibility that a Rust 2.0 could fragment the community much like with Python2.7 <-> Python3.x, which would be "undesirable" at best.

Yes, a split like this would be undesirable to say the least. Not to mention that systems people are used to near-total backwards compatibility; any sort of near-term timeframe for such a thing would destroy a lot of our credibility, in my personal opinion. I'm on team "never 2.0".

We still have some desire to indicate "epochs" of Rust development, as undoubtedly, things like idioms will change over time, new libraries will replace older ones, etc. But I'd prefer something like "modern C++", which could signify this kind of change, without giving up on backwards compatibility.

I sort of want to make an RfC for 2.0 preparation. I am mostly on team "never 2.0", but I recognize that I am sadly not overlord of all things Rust (if I were we'd have stable emoji identifiers already) and it's quite possible that 2.0 will happen some day.

The idea is to come up with a set of processes for 2.0. If the community decides to do a breaking 2.0 for some reason, we should:

- Document exactly what has changed.

- Write extensive docs on upgrading

- Write good tools that do the upgrade for you when possible, and point out areas where they can't help with links to docs.

- Make it so that the cases where stuff isn't machine-upgradeable are minimal

- Be wary of actually removing deprecated APIs.

- Be wary of unnecessary extra breakage.

Python 3 took the attitude of "Okay, we're going to be breaking some things anyway, so let's break more!". I think that they had good reasons for doing that; and it makes sense in a way. But we may not want to follow the same philosophy.

I feel like you could just add a version number to the Cargo.toml (cargo new would automatically set the latest one). The crate would then be built with the semantics of that version and all crates you use still use the version they were designed for (as long as the semantics can be properly translated between the individual versions).
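For illustration, such a per-crate marker might look something like this in the manifest (the field name here is invented; nothing like it exists in Cargo today):

```toml
[package]
name = "my-crate"
version = "0.1.0"
# Hypothetical: which language semantics this crate's source assumes.
# cargo new would fill in the latest value automatically.
language-version = "2"
```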

That wouldn't work in practice. Breaking changes need not be syntactic, and probably wouldn't be syntactic -- Rust isn't going to break the language for that. There's a good chance it would be semantic breakage that can't allow interoperation between crates.

What sort of semantic breakage do you have in mind? My impression is that there are few changes of that sort that couldn't be worked around somehow.

Changes to the orphan rules, for one.

Changing the behavior and representation of stdlib APIs. You would end up with e.g. two incompatible representations of String being used across the crate boundary.

Maybe compiler profiles? So that a project can say -rust2018 and disallow bad practices from pre-2018?

Yeah. Basically linting profiles. Reminds me of some proposals I've read about reducing the complexity of C++ without creating an actual subset or superset.

I wish the notion of Rust 2.0 was definitively buried to remove the fear of Rust doing its own Python 3.

Trust me, if there's ever a Rust 2.0, it will contain minimal breaking changes (ideally no breaking changes that can't be automatically and infallibly fixed by a migration tool), and will be preceded by years and years of deprecation warnings.

Python 3 broke Python because it completely changed how strings work. Nothing of that magnitude will ever come to Rust.

I would appreciate if anyone can share your dev setup for Rust. I tried Rust and Racer long time ago and it's not a pleasant experience.

I use IntelliJ-Rust with the IDEA vim plugin daily and I recommend it. I can jump to definition for functions, get some autocomplete and type inference (it's not perfect) and the project is actively maintained. Moreover, the maintainers are very welcoming and it's a pleasure to work with them to fix bugs and submit PRs.


VS Code with RLS is amazing. Still a work-in-progress, but some stuff works and it's only going to get better.

I'm mostly just used to sublime and vim with minimal tooling so it doesn't matter as much for me. But I'll probably eventually write a plugin for RLS for sublime.

> VS Code with RLS is amazing.

The only reason I have to install it.

A single tmux window, with vim on the left (with no special IDE plugins). On the right, autocall.zsh watches ./src and runs 'cargo build' whenever a file changes. (More specifically, I have autocall running in a small pane up top and have it run cargo build piped through less [but you need to fake a tty to get cargo to output colors] in a specific, larger pane on the bottom.)

> get cargo to output colors

cargo build --color=always

It might be fixed now but when I set it up either that didn't exist or even that couldn't convince cargo to output color control chars to a non-tty pipe.

Fair enough. I just know that it worked with tup, for whatever reason I was playing with that instead of just entr+cargo:

ls src/*.rs | entr -c cargo build

(entr is pretty awesome :)

Sublime with the Rust plugin. It works well for small projects but I haven't yet tried this with a large project.

Atom with a properly configured racer and rustfmt has been nice, though I have run into one or two very minor annoyances.

nvim with Rust syntax support works for me.

> In today’s Rust, this would be an error, you need to write (e.f)(); At least the error tells you exactly what to do!

If the programmer knows what the error is, and how to correct it, then their program should automatically do so, rather than molesting the user, as the purpose of computers is to speed up and automate. Why should the user have to foot the designers' bills?

Another cardinal sin and a sign of lack of thinking things through is breaking backward compatibility, like the example above: imagine you were an early Rust adopter, and wrote a non-trivial application in Rust; now imagine that the newest Rust compiler has several major performance and efficiency gains. It is understandable why one would want to re-compile one's application with the newest compiler, only to be thwarted by the authors' lack of understanding of just how important not breaking users' applications is. That is one of the core differences between engineering and hacking. What other land mines await potential Rust adopters from programmers who do not have any respect for their users' time?

> If the programmer knows what the error is, and how to correct it, then their program should automatically do so, rather than molesting the user

But then the meaning of that syntactic construct differs contextually. Even if it could be parsed efficiently, it still goes against Rust's principle of favoring explicitness over convenience.

> Another cardinal sin and a sign of lack of thinking things through is breaking backward compatibility,

The examples Steve cited span throughout Rust's history, way before any commitment was made to backwards-compatibility. These sort of changes won't happen anymore (it'd require a "Rust 2.0").

It's not the job of the compiler to speed up and automate _editing_code_. That's the job of an IDE. And we have an unfolding story for that: the Rust Language Server, or RLS for short, which is a package providing pluggable Rust IDE functionality for all kinds of text editors.

Besides, the change we are talking about happened 4 years ago, when Rust emphatically did NOT have any compatibility guarantees and was in a process of rapid experimentation. This has changed a whole lot – it's a stable language now, and has been for a year and a half.

> It's not the job of the compiler to speed up and automate _editing_code_.

Calling changing the syntax on unsuspecting users _editing_code_ is really a stretch of gargantuan proportions. And I don't really care whether it's the compiler, the linker, or mega_mojo-2.546-alfa5-preview18; whatever it is, it runs on a computer so that whatever is being done would be fast, automated, and therefore efficient. If the programmer knew what to do with the input, the computer should churn through it, instead of chastising the user and making them pay for the authors' oversight.

Otherwise, a computer makes no more sense than a toy does.

If I want to use a computer as a toy, I'll go play a video game. The rest of the time, it's a tool, a tool to do something an order of magnitude faster than a human could. If it slows me down because it requires me to babysit it, I have a problem.

That's all well and good if the context is unambiguous, but if it isn't, people will get used to having this done for them (or not even know that it happens at all), and then be totally perplexed. Worse, maybe then people want even ambiguous situations to be made friction-free, so the compiler must guess. Then you've got the possibility of very confusing behaviour, doubly so if the ambiguity involves any kind of polymorphism (different, possibly contradictory behaviour per type!)
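To make the ambiguity concrete, here is a small Rust sketch (the type and field names are invented for illustration) of why `e.f()` cannot silently mean a field call:

```rust
// A struct can have both a function-valued field and a method
// with the same name, so the syntax must disambiguate somehow.
struct Emitter {
    f: fn() -> i32, // field holding a function pointer
}

impl Emitter {
    fn f(&self) -> i32 {
        1 // method with the same name as the field
    }
}

fn main() {
    // A non-capturing closure coerces to a fn pointer here.
    let e = Emitter { f: || 2 };
    assert_eq!(e.f(), 1);   // dot-call syntax resolves to the method
    assert_eq!((e.f)(), 2); // parentheses select the field, then call it
}
```

If the compiler silently picked one interpretation, readers could no longer tell from the call site which one was meant, which is why Rust requires the explicit parentheses.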

What would be interesting if it would be possible to develop with GC on, but then if you benchmark and notice that it's too slow, turn on manual garbage collection for specific pointers.

I don't know if it's possible, though (considering libraries).

Rust doesn't have a GC in the first place, so there's nothing to turn on. Even then, the fundamental hurdle with GC isn't the effect on your program's runtime, it's the effect that it has on the lifetimes of your data. GC (and RC) are means of dynamic lifetime determination. Manual memory management is static lifetime determination. The latter requires you to structure your code in a specific way, which may be less convenient for the programmer depending on the application. Converting dynamic lifetime determination to static lifetime determination then requires changing the very structure of the program itself (including how APIs work) while preserving semantics, which is beyond any existing tools.

I was just thinking that most languages are either GCed (like Java or Go) or manually managed (C/C++/Rust).

My idea is that one could write in a language like Go (which has some kind of GC), but be able to say, when one wants, "I'll take care of this memory".

Rust actually started with a model like this. The idea was that when starting out you could use garbage-collected pointers for everything and then switch to the manual model when you needed it. The problem is that unless your standard library and all the major open-source libraries work perfectly with both garbage-collected objects and manually managed objects, you need to learn manual memory management early on anyway, so the garbage collector wasn't helping anyone. The language-design element of this is really tough to do in a usable way.

D has optional GC but their standard library is kind of fucked up.

Yes, my above paragraph doesn't imply the impossibility of intentionally mixing manual and automatic memory management. E.g. you can kinda do this in Go, but the language provides no facilities to help you get the manual bits right. And you can kinda opt-in to GC in C++ and Rust (via reference counting), but it's not super ergonomic.
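For reference, the opt-in reference counting mentioned above looks like this in Rust; it buys dynamic lifetime determination for one value without a tracing GC:

```rust
use std::rc::Rc;

fn main() {
    // Rc opts this one value into dynamic (reference-counted)
    // lifetime determination; it is freed when the last handle
    // drops, at runtime. Everything else keeps static lifetimes.
    let shared = Rc::new(vec![1, 2, 3]);
    let alias = Rc::clone(&shared); // cheap handle copy, bumps a count

    assert_eq!(Rc::strong_count(&shared), 2);
    drop(alias);
    assert_eq!(Rc::strong_count(&shared), 1);
}
```

The ergonomic cost is that every clone and every unwrap of the handle is visible in the source, which is exactly the friction the parent comments are describing.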


Steve? What? We've had totally opposite experiences with him, I guess. Actually, Steve Klabnik makes me want to use Rust (and I'm generally quite happy with C++ and Ada). Very helpful, knowledgeable guy in my experience.

I suppose the dog ate the substantiation of that comment?

The only comment I'll make in this thread is to link to https://news.ycombinator.com/item?id=13098744

Odd. It was indicated in that thread that "We've banned this serial troll, of course, but the immune response from the community was particularly healthy here." I guess they got un-banned?

They certainly didn't get unbanned! And btw I vaguely recall other accounts being silly around a similar derangement-point, so perhaps this is a serial troll, which is one of the few cases where we just ban accounts immediately.

The comment might have gotten unkilled because of a bug that I might have introduced yesterday. Looking into it now.

Edit: yep, it turns out I introduced a bug, with the dismaying but hilarious property that flagging a dead comment would revive it. We'll revert the change and try again later.

Thanks dang! I do believe that this is a serial troll, but discussing that is very much offtopic for here, you have my email if you want to talk about it.

I know we sometimes have differences, but I do want to reiterate that I think on the whole, you've made this site better, and am glad that HN is taking moderation seriously.

He's still posting, so not banned? Or that because of the bug you're referring to? https://news.ycombinator.com/threads?id=cyphreak

Pretty sure that was the bug, but if anyone sees weird behavior around dead comments and banned accounts, please let us know.

Okay, breaking my rule here: yes, I am wondering that as well. Here's my assumption: the mods straight-up banned this individual's previous accounts, but shadowbanned this one instead. But shadowbanning just kills your comments automatically; in recentish times, HN has added a "vouch" feature that lets you vouch for a comment and get it un-killed. So I imagine that's what happened, which seems like a weakness in the shadowbanning feature, though maybe it's good as well, I don't know. The right way to know is to email the HN mods via the "contact" link at the bottom.

That would definitely be the usual explanation but I'm happy to report that in this case there were no vouches.

I suppose I should be glad that nobody here leaps to 'programmer error' as an explanation for anything.

I hate to drag a thread like this out but I'm genuinely curious: why is there someone/are there people out there who have such an axe to grind? Or are they just trolls?

I wrote an extremely long comment here, but deleted it. I'd rather say this: like anyone who works in public, I am very open and vulnerable to criticism. When you work in public for a long time, there's a ton of reasons for people to decide that they hate you and have an axe to grind. Some of those things are things I still stand by, some of them are things that I agree that I was in the wrong. I am not a perfect person, and make mistakes like any other. It's up to you to see my actions, who I am, and decide for yourself.

Steve you're a good chap doing some solid, respectable and valuable work. These things are just troglodytes looking to inflict pain. The interesting thing is, they could totally transform themselves by doing constructive work and presenting it to the community. It's then, quite possibly, they realize, it is your attention they seek.

The heart is a complex place and the internet a good channel for its darker bits.

Getting rid of the syntactic difference between moving and copying seems like a huge step backwards. Ditto with abandoning mailing lists for some wanky "already solved-as-a-service" web app.

> Getting rid of the syntactic difference between moving and copying seems like a huge step backwards.

Did you ever use Rust back when you had to write "move"? I did. When you wrote stuff like:

    let (move x, move y) = (move z.a, (move z.b).append(move z.c));
It got old fast.

Admittedly there's still a syntactic difference between move and copy -- with Swift and C++'s definition of copy. Rust calls that Clone, though. :)
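A small sketch of how that plays out today: moves need no keyword because the borrow checker tracks them, while deep copies stay visible via Clone:

```rust
fn main() {
    let a = String::from("hello");
    let b = a; // a move; no `move` keyword needed, the checker tracks it
    // println!("{}", a); // compile error: value used after move

    let c = b.clone(); // an explicit, visible deep copy (Clone)
    println!("{} {}", b, c);
}
```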

There was a new Hurd version announcement a few days ago and somehow Rust reminds me of Hurd, but from a different direction: it seems to me that Rust simply changes way too much to be currently widely accepted at any scale, and has a high rate of attrition because of that.

Statements such as "But as of Rust 1.15, this restriction will be lifted, and one of the largest blockers of people using stable Rust will be eliminated!" and "It is unclear if there ever will be a Rust 2.0." (to avoid the "Python 2 / Python 3 effect") don't help.

I'm comparing this to Go (yes, yes, both Rustaceans and Gophers will yell "it's not the same!" - in vain), where the Go 1.0 syntax was set in stone in 2012 and has remained mostly compatible in the 5 years since, and still it feels that the community is somewhat sparse and reinventing wheels on a regular basis. The reality is that for complex projects and complex development environments, once the "Rust 1.0" syntax gets stabilized, it will probably take 5+ years for the community to feel comfortable with it and trust the language developers not to break things on a whim.

Remember that the still widely-popular Python 2.7 was released in 2010 (and it also sort-of set in stone the "Python 2 syntax"), and plan accordingly.

It's really quite unclear what you mean, why would Rust adding new features (syntactic or otherwise) be an issue exactly? You do realise the .7 in Python 2.7 is because every version before that added new features (and syntax) while remaining backwards-compatible right? (well mostly, new keywords broke old code).

> Remember that the still widely-popular Python 2.7 was released in 2010

I mostly remember that Python 2.7 is a straight descendant of Python 2.0[0], and as far as I'm concerned a much better language for all that was added in the meantime (though technically I only took up Python circa 2.3, by which time a fair number of syntactic and semantic additions had already been made).

[0] and actually older than that, while 2.0 added major features the main change was a switch towards a much more open and community-driven environment with less single-organisation control over the project, it introduced PEPs, sourceforge hosting of the tracker and source and a very large expansion in commit bits from ~7 at CNRI to ~27 people around the time 2.0 itself was released, technically 2.0 is a minor update to the 1.6 which had been released a few months earlier for contractual reasons.

Rust _adds_ new things all the time, but it does not break old things. Every language does this. We have put in a _tremendous_ amount of work to ensure this is true, from testing every commit before merge, to our stability infrastructure that ensures only stable features can be used in a stable release, to the "crater" tool that tests releases against all open source code, to our RFCs that specify exactly what constitutes something breaking.

> It is unclear if there ever will be a Rust 2.0.

I'm not sure how this implies that Rust is changing too fast to be useful; it's meant to state the opposite: we do not plan on making gratuitous breaking changes in the future.
