How Rust Is Tilde’s Competitive Advantage [pdf] (rust-lang.org)
215 points by steveklabnik on Feb 6, 2018 | 122 comments



So, I've been eyeing Rust for a while and have messed around with it a bit in my free time. Here are some things I'd want to know before trying it out on a large project:

1. What development environment are other developers using with very large code bases? Is the tooling responsive?

2. Using a language for a large project without some kind of async/await notation seems painful. How bad is it using only combinators for async (assuming you want to stay on stable)? And is there a date yet when async/await will be stabilized?

3. Most of the code I write is in F#. The nice thing is, there is basically a .NET library for everything under the sun. For example, over the weekend I was looking for a library that could decode OBD-II data from a car, and with a quick DuckDuckGo search I was able to find three or four. In a large-ish project, how often do Rust users typically find themselves coming up short for that kind of stuff? Obviously it varies a ton by domain, but maybe someone could tell me whether the answer is "you'll probably write all wrappers for 3rd-party APIs yourself" or "there's a decent chance there's a library out there that you can adapt."


>2. Using a language for a large project without some kind of async/await notation seems painful

And yet most of the software we use, billions of lines of code, from Chrome to Linux, from Photoshop to BIND, and from Nginx to Skype, has been created without such notation. Somehow we've managed.


"Managed" and "seems painful" don't disagree with each other.


That doesn't make the assertion that any large project needs async notation to avoid being painful any less ridiculous.


Of course you don't NEED it. It's just that I've become accustomed to the bells and whistles (and accompanying productivity) of a modern programming language. If I'm considering a new programming language I'd like the feature set to be comparable.


1. I personally use VS Code and Vim. IntelliJ has an officially supported plugin that I haven't used, but I've heard good things about it. The Eclipse project is considering making a Rust IDE but hasn't decided yet. VS non-Code should work now, but I haven't given it a go personally. On the "responsive" question, it's still actively being worked on, generally, so "it depends". We're actively investing in making things more robust.

2. We expect async/await to be stable before August, but as always, It Depends.

3. As you say, it varies a ton by domain :) What I will say is: two years ago, crates.io had ~3,000 packages; last year, 7,500; today, almost 14,000. So we're getting there...


Steve, just so you know (for future answers), the IntelliJ plug-in is stupidly good considering its relative newness. You shouldn’t hesitate to recommend it to existing IntelliJ users. :D


Understood :) I usually try to be really clear about what my opinion is based on, especially when it's second-hand, like with IntelliJ.

Have you tried debugging with CLion? I don't own a license so I haven't tried myself.


CLion debugging works pretty well for most cases, though there are some small issues every so often. I always make sure to report them to the appropriate place if they're not known already, though.


Great, thanks!


> Using a language for a large project without some kind of async/await notation seems painful. How bad is it using only combinators for async (assuming you want to stay on stable)? And is there a date yet when async/await will be stabilized?

Can someone explain to me what async/await is, in the context of a language like Go? I've used Rust a fair amount, but Go has been my native language for ~5 years now.

With that said, aren't both languages synchronous languages where you can drop into another thread (or green thread) whenever you want?

What UX is improved by async/await?

E.g., in Go I never felt the need for async/await (as I've used it in JavaScript, at least). If anything, Go's approach is better than async/await to me, because if I call `foo := bar()`, I don't have to know whether `bar()` is an asynchronous function. It just works. Is there something missing there?

Again, I use Rust and Go similarly in this post because Go is my frame of reference. I thought they were the same on this front, though. No?

What am I missing here?


Yeah, you are missing some details. But this is a huge area, and it's hard to explain in a single HN comment. Basically, yes, the convenience of not needing to write async/await is useful, but in order to accomplish that, you need a bunch of other supporting decisions. These decisions make sense for Go, but do not make sense for Rust.

The first thing to say on this topic is that virtually all of this (including async/await) is a library in Rust, not a part of the language. You can build whatever model you want on top. The one I'm describing is the futures/tokio model, which is becoming the de facto default one. But if you want something else, you can do it! This already is a divergence from Go, where you have what the runtime gives you, and that's it.

Fundamentally, Go's model is cooperative coroutines, where the runtime inserts the yield points for you. (You can also yield explicitly, but as you note, this isn't usual.) It'll do this when you request something from the network, when you sleep, or when you use a channel.

Tokio's model, on the other hand, is single-threaded. Instead of a ton of goroutines, there's an event loop, and you place a chain of futures onto it. Each chain of futures can sort of be thought of as a goroutine, but it's also different. For example, Rust can know at compile time exactly how big each future's stack is, and so can do a single, exactly correct allocation. Anyway, each future in the chain handles yielding for you.

Async/await, in Rust, is about making those chains easier to write.
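
To make the difference concrete, here's a rough sketch with hypothetical names; the combinator half is in the futures 0.1 style of the time, while the async half uses the notation that eventually stabilized:

    // Combinator style: every step of the chain is a closure.
    fn fetch_user_combinators(id: u32) -> impl Future<Item = User, Error = Error> {
        get_session(id)
            .and_then(|session| load_profile(session))
            .map(|profile| profile.user)
    }

    // The same logic written with async/await notation:
    async fn fetch_user_async(id: u32) -> Result<User, Error> {
        let session = get_session(id).await?;
        let profile = load_profile(session).await?;
        Ok(profile.user)
    }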

Anyway, I hope that helps. Maybe someone else will have a good comparison too. There are a lot of options in this space, and similar terminology that means the same, but subtly different, things, so it can be hard to get your head around.


Technically you can get the convenience of not needing to write async/await with the same runtime implementation decisions underneath as Tokio/futures. Kotlin is an example of this approach: its coroutines are state machines, but with implicit awaits.

This can even be extended to "awaiting across function calls" with effect polymorphism. I haven't seen this done in this context, but it would allow things like passing an async closure, or a generic type with an async trait implementation, to a normal function and having it automatically become an async function instead.

The real choice between explicit and implicit suspension points is thus syntactic, not implementation-driven. The reasoning there is more along the lines of "I like to see where my function might suspend" vs "I don't want to pepper my code with `await`."


> I haven't seen this done in this context, but it would allow things like passing an async closure, or a generic type with an async trait implementation, to a normal function and having it automatically become an async function instead.

I've done that a fair bit in Scala using HKT. E.g. superclass is written in terms of a generic monadic type, one subclass implementation uses Future (actually EitherT with Future), another uses identity, a third uses Writer to track statistics...


Ah, right. I guess I have seen this sort of thing in Haskell then as well. When implemented with HKTs it usually gets really messy with things like monad transformers, so I prefer to think of asynchronicity more in terms of continuations.


I find monad transformers much clearer than continuations; a monad transformer stack tells you exactly how your effects are going to be interleaved, which is necessarily complex, whereas a continuation could be doing anything at all, the way I see it.


No, continuations can be controlled similarly through effect polymorphism, which is what I was trying to describe.

The difference is not that continuations allow anything at all, but that your effects are commutative rather than layered like a monad transformer stack.


> continuations can be controlled similarly through effect polymorphism, which is what I was trying to describe.

Fair enough, but you're effectively talking about a secondary, novel type system, right? One of the things I like about monads is that they can be just plain old values of plain old datatypes.

> The difference is not that continuations allow anything at all, but that your effects are commutative rather than layered like a monad transformer stack.

Yeah, monad transformers do feel overly cumbersome for the cases where effects do commute. The trouble is that some effects don't commute, and I've yet to see a formalism that had a good solution for distinguishing effects that commute from those that don't. (I read a paper about a tensor product construct once, but couldn't really follow how it was supposed to work in practice.)


> you're effectively talking about a secondary, novel type system, right?

Yes, exactly.

The main reason I prefer language-level effects rather than monads is for composability with normal imperative control flow. Monads are written in terms of higher order functions passing closures around, and that gets really messy in a language where you not only have early return and loops with break and continue, but you also have destructors that need to run as you enter and leave scopes.

You can add that kind of stuff as monads but it gets really messy, and is basically completely untenable in a language like Rust that also cares about memory management. Even if Rust did have HKT, it would still be impossible to write a Monad trait that supports them all, for example.

This article makes some great points on the subject: http://blog.paralleluniverse.co/2015/08/07/scoped-continuati...


I have been meaning to dig into Kotlin's implementation, thanks for that! Is there any good reference documentation to dive into? Most of the stuff I saw was from a user's perspective.



Adding a recent detailed take on the situation, by another Rustlang contributor: https://manishearth.github.io/blog/2018/01/10/whats-tokio-an... . Hope that helps :)


> Can someone explain to me what async/await is, in the context of a language like Go?

As far as I know:

    c := make(chan int)
    go func() { c <- 1 }() // async
    i := <-c               // await


Goroutines allow you to write async code ergonomically. Without those, you'd have to write callbacks, which is very unergonomic.

Here's a before and after diff of some prototype async/await usage in Rust: https://github.com/mehcode/shio-rs/commit/8078a34c075bf2f52c...


async/await means you mark all your yield points explicitly; rather than having the runtime implicitly preempt you whenever it chooses, you mark the points at which task switching can happen, and at every other point it's impossible (equivalently it's as though any block of code that doesn't contain an "await" were a critical section). The syntax strikes a nice balance, making these markers as lightweight as possible but no lighter: you don't want the yield points to be completely invisible, but you don't want them to distract too much from reading through the straight-through happy-path control flow.
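
A minimal Rust sketch of that critical-section property (modern syntax; the Bank type and its methods are hypothetical):

    struct Bank;

    impl Bank {
        fn withdraw(&mut self, _from: u32, _amount: u64) {}
        fn deposit(&mut self, _to: u32, _amount: u64) {}
        async fn log_transfer(&self, _from: u32, _to: u32, _amount: u64) {}
    }

    async fn transfer(bank: &mut Bank, from: u32, to: u32, amount: u64) {
        // No `.await` between these two calls, so on a single-threaded
        // executor no other task can observe the half-finished state:
        // the pair behaves like a critical section.
        bank.withdraw(from, amount);
        bank.deposit(to, amount);

        // The only point where task switching can happen:
        bank.log_transfer(from, to, amount).await;
    }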

I find that having one implicit, pervasive, uncontrolled effect in a given piece of code is ok, but having multiple uncontrolled effects is very much not ok, because the ways those effects will interact can be very surprising. E.g. https://glyph.twistedmatrix.com/2014/02/unyielding.html gives an argument for using explicit async/await rather than go-style implicit (green) threads, in terms of how task switching will interact with state mutation. You can make similar arguments in terms of how task switching interacts with error handling or logging or database transactions or... - one uncontrolled effect is fine, multiple uncontrolled effects cause chaos when they interact. Unfortunately the kind of examples where one can see multiple effects in a single application tend to be quite big by their very nature (and if your app is small enough to only need to have one kind of effect, then using a language in which that particular effect is uncontrolled is probably fine, perhaps even a good idea).


> async/await means you mark all your yield points explicitly; rather than having the runtime implicitly preempt you whenever it chooses, you mark the points at which task switching can happen, and at every other point it's impossible (equivalently it's as though any block of code that doesn't contain an "await" were a critical section).

That's not how I understood it at all. I always assumed that it just promoted explicitly acknowledging when the effect of an async block would be present, but that the normal yield points (traditionally syscalls and IO requests in OS threads) generally still held for the actual execution of the other code.

I mean, if we aren't allowing separate code paths to execute (and thus reducing wait on resources such as IO), what's the point?


> That's not how I understood it at all. I always assumed that it just promoted explicitly acknowledging when the effect of an async block would be present, but that the normal yield points (traditionally syscalls and IO requests in OS threads) generally still held for the actual execution of the other code.

Put it this way: the language implementation won't preempt you except at your explicit yield points (some standard library functions for things like I/O will and should be yield points that you have to call with async/await). If the language implementation happens to be running as a userspace process on a preemptive multitasking OS then it will still be subject to the same preemption rules as any other userspace process on that OS, but if anything this is usually counterproductive (e.g. priority inversions are almost guaranteed); recommended practice when working in green-thread-based systems is to run with one thread per CPU core and maybe even pin threads to cores if the OS lets you do that, because the language's own M:N scheduling will keep those threads fully occupied and OS scheduling is only going to get in the way.
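
As a sketch of that recommended setup (assuming the third-party core_affinity crate; run_event_loop is hypothetical):

    use std::thread;

    fn main() {
        // One worker thread per core, each pinned, so the language's
        // M:N scheduler isn't fighting the OS scheduler.
        let cores = core_affinity::get_core_ids().unwrap();

        let handles: Vec<_> = cores
            .into_iter()
            .map(|core| {
                thread::spawn(move || {
                    core_affinity::set_for_current(core);
                    run_event_loop(); // hypothetical: drive futures here
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
    }

    fn run_event_loop() {}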


I think I see what's going on here. You're answering "in the context of Go" (which is correct, and what was requested) and I'm interpreting it as a general statement about what async/await means as a general concept (i.e. in other languages as well). My mistake, I wasn't paying close enough attention. :)


Having used both, I'd be pretty shocked if anyone found Twisted to be easier to reason about or less prone to bugs than Go...


Twisted predates async/await syntax in Python.


Isn't gevent still better than async/await? (If it had core support it would be, kinda like Golang.)


Async/await makes sense in Python or JavaScript because it’s syntactic sugar that improves readability/maintainability and mitigates concurrency errors stemming from the event loop (JS) or the GIL (Python). I haven’t worked with Go beyond playing with it a bit, but it seems async/await would be utterly useless in Go, since goroutines already accomplish the same thing in a much more performant manner.


C# was, as far as I'm aware, the first major language to implement async/await atop Task<T> (akin to Java's Future<T>). The CLR uses native threads.

Go channels are also orthogonal to async/await. Message passing is not a substitute for futures/tasks, though it can be used to achieve similar goals. I would be extremely cautious about claiming that Go channels would be "more performant" than an otherwise-equivalent futures implementation, too.


Goroutines are not the same thing as Go channels.


Sorry, yes. Fibers (Go didn't invent them) also aren't directly equivalent because they don't allow explicit yields and composition.


Fibers don't, but goroutines + channels + closures do. They permit composition using the same call/return syntax and semantics as normal function calls.

Futures and promises don't. Async/await is closer, but only by creating a second class of function incompatible with normal functions.
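
A tiny Rust illustration of that second class of function (the commented-out call is the one that won't compile):

    // A higher-order function over ordinary functions:
    fn twice(f: impl Fn() -> u32) -> u32 {
        f() + f()
    }

    async fn fetch() -> u32 {
        42
    }

    fn main() {
        // An async fn returns a Future, not a u32, so it can't be
        // passed where an ordinary function is expected:
        // let n = twice(|| fetch()); // mismatched types: Future vs u32
        let n = twice(|| 21);
        println!("{n}");
    }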


> Again, I use Rust and Go similarly in this post because Go is my frame of reference. I thought they were the same on this front, though. No?

No. Rust used to have a green threads runtime like Go (and Erlang/BEAM, Haskell, etc.) but that was removed before the 1.0 release. So today, a thread in Rust is a heavyweight OS thread, like in C/Java and most mainstream languages.


The intro Rust book says you can easily find crates for green threads if you really want them. Is this not the case?


I am not entirely sure; it appears that, due to the present state of std in Rust, any green-threads lib is prone to undefined behaviour whenever something uses thread-local storage.


Yea, I'm aware - but you can spin off an async execution and then send data back over channels (I forget what Rust calls them), as well as wrap that whole thing up in a function with a return value so that the caller has no idea of the threaded execution taking place. Right?

Sure, the implementation of the threading differs greatly, but I was mainly referring to the fact that I can spawn async behavior, and use synchronous blocking methods to get data back from that async behavior - AND encapsulate it.

All of these are very different from, say, what JavaScript goes through any time async is involved.


Rust also has channels.
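
A minimal sketch of the pattern described upthread, using std::sync::mpsc (the computation itself is hypothetical):

    use std::sync::mpsc;
    use std::thread;

    // The caller never sees the thread: it just gets a value back.
    fn compute() -> u64 {
        let (tx, rx) = mpsc::channel();

        thread::spawn(move || {
            // Hypothetical expensive work, done on another OS thread.
            tx.send(6 * 7).unwrap();
        });

        // Synchronous, blocking receive, as described above.
        rx.recv().unwrap()
    }

    fn main() {
        println!("{}", compute());
    }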


> 1. What development environment are other developers using with very large code bases? Is the tooling responsive?

It’s solid. If you’re coming from Visual Studio, you’ll dig IntelliJ with the Rust plug-in.

> 2. Using a language for a large project without some kind of async/await notation seems painful. How bad is it using only combinators for async (assuming you want to stay on stable)? And is there a date yet when async/await will be stabilized?

It’s painful. I don’t know if there’s a timeline. But I’m positive someone can say more.

Edit: See Steve's response. He would know.

> 3. Most of the code I write is in F#. The nice thing is, there is basically a .NET library for everything under the sun. For example, over the weekend I was looking for a library that could decode OBD-II data from a car, and with a quick DuckDuckGo search I was able to find three or four. In a large-ish project, how often do Rust users typically find themselves coming up short for that kind of stuff? Obviously it varies a ton by domain, but maybe someone could tell me whether the answer is "you'll probably write all wrappers for 3rd-party APIs yourself" or "there's a decent chance there's a library out there that you can adapt."

If you’re expecting the F# (and .NET) ecosystem, you’ll likely be satisfied. In fact, it might be better. If you’re expecting the Python ecosystem, you’ll need to wait (or contribute back).


> If you’re expecting the F# (and .NET) ecosystem, you’ll likely be satisfied. In fact, it might be better.

Not even close (100k unique packages on NuGet, 13k crates on crates.io). On the other hand, you have easy interop with C (which doesn't give you stuff like safety, idiomatic error handling or nice APIs, but it's occasionally better than coding it yourself).
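
For reference, a minimal sketch of what that C interop looks like, calling libc's abs so no extra bindings are needed:

    use std::os::raw::c_int;

    // Declared in libc, which Rust programs already link against.
    extern "C" {
        fn abs(input: c_int) -> c_int;
    }

    fn main() {
        // Crossing the FFI boundary is unsafe: the compiler cannot
        // check the C side's contract for us.
        let n = unsafe { abs(-3) };
        println!("{n}");
    }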


It's not about total packages, but about quality of packages and whether they provide good coverage of most problem spaces.

NPM claims to have close to 500k packages. Should we assume it solves 5x the problems that the .NET ecosystem does, or that perhaps, by the 50th implementation of leftpad, there's some cruft on there?

That's not to say that Rust's package ecosystem is large enough or sufficient in comparison to .NET, just that raw package count gets to be an extremely poor indicator of quality after a certain level has been reached.

As a point of reference, I approach this from Perl and CPAN, which is one of the oldest large fully featured package networks. At this point, the problem is usually not finding a package that provides a solution, but finding the right package out of what's available. Different solutions exist, but a good curated list of well implemented solutions to common needs can help quite a bit.[1]

1: https://metacpan.org/pod/Task::Kensho


Well, an order of magnitude of difference is a hint that, whatever the quality of the existing crates, there are definitely areas for which you are going to be hard-pressed to find a good solution (say, localization), or a solution at all.

It is not a sign that there is anything wrong with the Rust ecosystem; it has a very active community and things are moving in the right direction. It is just not as full-featured as more mature ecosystems out there.

See for instance https://github.com/ErichDonGubler/not-yet-awesome-rust

There are quite a few use cases not listed here, obviously.


> raw package count gets to be an extremely poor indicator of quality after a certain level has been reached.

Broadly speaking I think the best indicators are always going to be domain specific -- which set of packages is most used within the context of what you're doing? A million awesome packages for Excel automation aren't much help, per se, for huge file chunking or BigData work.

From that point of view it's more relevant to look at the size of the successful projects in that market that resemble your technical goals.


IntelliJ and a workspace-based project will get you a modern IDE experience. However, don't expect a responsive edit/compile cycle; compilation times just don't scale for large projects, even with incremental compilation. Linking against shared libraries for stable dependencies can help, though.

No, the ecosystem is not mature enough to have the "enterprise" libraries. Wrappers for C/C++ libraries are often needed.


I just switched from VS Code to IntelliJ and it's great. There have been huge improvements in IDE support recently.


Can you elaborate on the differences you've seen between VS Code + RLS and IntelliJ?


Sure. VS Code keeps randomly asking me to choose my toolchain even though a toolchain is chosen; closing it down and re-opening it fixes that.

IntelliJ seems a little snappier, and its code completion is a bit better.

The biggest benefit in day to day usage is the inline type hinting.


1. VS Code with the Rust plugin, along with rustfmt and the RLS, is great - when it works. I've had some issues now and then, but the tooling is pretty great.

2. Not qualified to answer this

3. There's a ton of Rust libraries out there. The whole cargo ecosystem is quite nice.

Coming from a C background (and dabbling in Lisp), I love Rust. It's C with batteries and concepts from functional programming - iterators, immutable data. The only times I've struggled with Rust are when trying to mess with pointer arithmetic and evading memory safety (e.g. trying to program a garbage collector for a Lisp interpreter).


I've been using VS Code and WSL to execute the Ubuntu version of Rust from the bash console in VS Code. I'm not sure if this is a good method or not... but it feels very natural.


Nightly Rust can use this async/await crate, but it will likely be deprecated once language support is available (later this year, IIRC):

https://github.com/alexcrichton/futures-await/blob/master/RE...
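
Roughly, per that README, nightly usage looks like this (feature gates omitted; Client and get_body are hypothetical):

    extern crate futures_await as futures;
    use futures::prelude::*;

    #[async]
    fn fetch_len(client: Client) -> Result<usize, Error> {
        // `await!` suspends this function until the future resolves.
        let body = await!(client.get_body())?;
        Ok(body.len())
    }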


Neat, what is your use-case for OBD-II data? I've been investigating ways to get data from a race car back to the pit crew via something like a raspberry pi.


For the radio component, I suggest that you build a "block" of useful data and send it several times using good Hamming-style error correction instead of mere error detection. If using wifi, then consider UDP.

You'll get better range out of something proprietary; I've had success in the 915 MHz band. A spread-spectrum approach is better if possible.

Ideal would be if the car assumes it cannot receive anything and attempts to send each block several times before beginning to transmit the next block. Another approach could be to send a window of blocks each time.

Receiving outside the car is likely to be much easier - a large antenna can be used away from all interference. An option could be a human operator holding a directional antenna, visually tracking the car.

You might also store the blocks within the car for later 100% accurate download.
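
As a sketch of the UDP idea (address and block size are hypothetical; the car assumes nothing is received, so every block goes out several times):

    use std::net::UdpSocket;

    fn main() -> std::io::Result<()> {
        let socket = UdpSocket::bind("0.0.0.0:0")?;
        let block = [0u8; 512]; // one encoded telemetry block

        // Fire-and-forget: repeat each block since there's no ACK.
        for _ in 0..3 {
            socket.send_to(&block, "192.168.1.10:9000")?;
        }
        Ok(())
    }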


The microcontroller component of that is going to be a walk in the park compared to the radio component.


Wifi with good antennas, or a cellular hotspot seems pretty easy.


Race engines are not known for their lack of EM emissions :)

Vehicles are not friendly environments to begin with, race cars are outright hostile territory. Vibration & G forces, interference, fluctuating power, huge temperature variation, aerodynamic effects on antenna and cabling are all factors that will make this quite a bit of work to get to acceptable reliability levels.

I like your idea of a cellular hotspot, that might be the easiest way to get to something that works. Compact, no external cabling. That's good.

Edit: I tried running a webcam in a racecar in the mid-90s, but it never got beyond the planning stage before the obstacles became insurmountable for the budget available. Obviously a lot has happened in those years, and F1 and other racing sports prove on a daily basis that the bandwidth is definitely there if you have the budget. You make me wonder how easy it would be to pull this off today; most likely you could just plug it together from some consumer components. A small tablet with a SIM card in a waterproof and vibration-isolated enclosure would probably get you 90% there.


Right now I just want to get more fine-grained data about my mileage. But maybe add some automation into the mix later on (remote start, etc.).


Ensure your remote-start doesn't turn into a remote-stop and a crash.


I've come to realize that the main thing making me procrastinate giving Rust a proper go is entirely superficial at this point. Basically, it looks like a "clever" programmer's dream language - which is usually not my cup of tea. I'm also worried that larger projects ("real-world" projects, so not projects like Servo) will inevitably become an illegible mess when clever programmers make the most of a clever language.

That said, I plan to push through this admittedly superficial barrier in the near future and hopefully calm this concern.


It may not be your cup of tea, but I don't think your concern about large code bases will be a problem. Ensuring invariants across large code bases is what Rust excels at. You'll more likely have the opposite problem: getting frustrated trying to use an API in a way that it was never intended, because the compiler won't let you.
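
A trivial sketch of what that enforcement feels like (hypothetical Connection type; the commented-out line is the one the compiler rejects):

    struct Connection;

    // Taking `Connection` by value means closing consumes it.
    fn close(_conn: Connection) {}

    fn main() {
        let conn = Connection;
        close(conn);
        // close(conn); // error[E0382]: use of moved value: `conn`
    }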


May you always work with clever programmers who make clever code simple.


Most "clever" programmers makes code a lot more complicated than it should be to show how "clever" they are. The more complicated the language, the more complicated code they will produce. I've seen this a lot of times.


At least some of those "clever" programmers will eventually become "wise" enough to avoid doing that.


What did you find provoking? My intention was just to say that I personally find it harder to work with a language that has a high cognitive load to interpret.


> while the Java server could use up to 5GB of RAM, the comparable Rust server only used 50MB

How...how is that possible? It just glosses over this interesting point. What does "comparable" Java code do to use 100x more memory?


Four things:

* Java tends to allocate things separately and manipulate pointers, while Rust allocates things inline. This introduces allocation overhead and eats up the space needed by the pointers themselves. (See the sketch after this list.)

* Java uses a JIT, which consumes memory for profiling state and for storing the resulting compiled code.

* A tracing garbage collector needs about five times as much memory as explicit freeing to reach the same performance. https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...

* Like vegetarians, Rustaceans pay more attention to efficiency as a whole. This may give the appearance of Rust being more efficient, even if it's actually not.
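
To illustrate the first point, a quick sketch comparing inline and boxed layout on a typical 64-bit target:

    use std::mem::size_of;

    struct Point {
        x: f64,
        y: f64,
    }

    // Inline: the Point is stored directly in the struct.
    struct Inline {
        p: Point,
    }

    // Boxed: only a pointer is stored inline; the Point is a separate
    // heap allocation, much like an object field in Java.
    struct Boxed {
        p: Box<Point>,
    }

    fn main() {
        println!("{}", size_of::<Inline>()); // 16
        println!("{}", size_of::<Boxed>()); // 8: just the pointer
    }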


The first three are not the deciding factors. Other platforms are still 10-100x more memory efficient even though they share the same disadvantages as the JVM.

Your fourth point is the primary reason.

The default GC settings in the JVM waste as much RAM as possible. If you tune the GC, you can get a big improvement; and the standard Java frameworks are bloated and slow as hell.

In my opinion it's a platform that is bad by default and simply isn't worth using but companies don't care and just buy more RAM.


> Java tends to allocate things separately and manipulate pointers, while Rust allocates things inline. This introduces allocation overhead and eats up the space needed by the pointers themselves.

Just a small nit, but the term you're looking for here is "value types". Rust does heap allocations (with the pointer overhead) via Box<T>; you just get explicit control over the overhead and cost.


Additionally, things like Spring and Dropwizard add some overhead for better developer UX, which Rust doesn't have yet.


Spring is the new EJB. It's bloated, and it always amazes me how complex it can get when you need to implement something not provided by Spring itself.

With Java 9 modularity and lightweight libraries like Guice you can get much nimbler apps.


Dagger is a more efficient, simple, debuggable design for DI, and I fully expect to see similar compile-time code-gen DI in Rust soon. (Adding ownership, mutability tracing, possible stack allocation, etc. does make things a bit harder to get right in this arena.)


None of these are likely the reason - it should be as simple as Java's own managed heap preallocating that space.


5 times the memory is a serious exaggeration. At least twice, but no more than 3x.


The linked Wikipedia page references an OOPSLA 2005 paper by Matthew Hertz and Emery D. Berger that presents data saying that with 3x memory you might reasonably expect a 17% slowdown compared to manual memory management (and a 70% slowdown when limited to 2x memory).

Of course there have been some advances in GC technology since 2005 (and that paper was deliberately measuring a relatively standard generational approach). But I don’t think I’d call the 5x figure a serious exaggeration.


Yes, I've read the paper. Advances in CPU microarchitecture alone have noticeably shrunk that gap. There have also been significant algorithmic advances in GC. For instance, a few simple prefetching hints speed up marking easily by 30%, and parallel marking scales almost linearly with core count, to mention but two simple performance improvements.

So like I said, maybe in 2005 a 3x memory overhead incurred only a 17% slowdown, but today a 3x overhead is at least on par with, and possibly even a little ahead of, manual memory management.


Java doesn't use system memory management by default, but rather keeps its own heap for garbage collection. If you start your Java daemon with -Xms5000m -Xmx5000m then Java will use 5GB of RAM and allocate new objects inside it. Depending on the garbage collector used, Java may wait until that 5GB is full before clearing it out, therefore using "up to 5GB of RAM" for the same 50MB dataset in a long-running process.

It's not really a useful comparison for most uses of java though.


It's hard to say without really diving into that code. The 100x disparity seems surprisingly high.

Oftentimes people will give the JVM a sizeable heap in the interest of reducing garbage collector pressure (sometimes this is counterproductive). Usually it's possible to run with a much smaller heap, but we're still talking hundreds of megabytes for most non-trivial apps. Even the VM itself will allocate a decent amount of memory besides that.

There's a number of reasons that typical JVM apps use more memory, but most of it seems to boil down to using the heap much more, lack of real value types (using the stack far less), and size overhead with small objects (see: https://www.javamex.com/tutorials/memory/object_memory_usage...).

It seems like the low-cost cloud instances could be a real driver for the adoption of leaner runtimes. My client couldn't even consider those 256m/512m Heroku instances (with a large legacy JVM app).


Almost certainly either bad design (their Java app had a way bigger working set than their Rust one), or bad tuning (they set the heap needlessly large). Automatic memory management _does_ require more memory for reasonable performance, but you'd be talking about 2x or 3x the working set. Not 100x. Java objects will likely end up a little larger than Rust ones, but again not enough to make up the 100x difference.

Having a way-too-big heap is usually actively undesirable; you'll get fewer pauses, but much longer ones.


> or bad tuning (they set the heap needlessly large)

The claim "for equivalent Java code", might be slightly exaggerated, but looking at the claim as a whole: "Rust is Tilde's Competitive Advantage", "bad tuning" is part of the point.

"bad tuning" usually means "no tuning". Which means they spent the time they could have been tuning and used it to build features instead, or optimize their code instead.

From that perspective I wouldn't use "bad tuning" as an argument for continuing to use Java.


Oh, yeah, it's certainly a problem. However, they could almost certainly have done a lot better with Java than they'd been doing, if they wanted to.


If they increased the heap until the JVM was exactly as fast as their compiled Rust code on the same/equivalent hardware, you could get those numbers as some kinds of GC significantly decrease overhead with increased heap size.

With that said, that's an almost entirely useless benchmark, except as a testament to the amazing JIT compilation the JVM does. That the JVM with GC is capable of matching a compiled language specifically made to have efficient memory management at ANY heap size is positively astonishing!


Java has its own heap so the memory usage, certainly for a server VM, doesn't necessarily correspond to actual application memory use.

(Also, of course, your main server uses "up to" 5 GiB? That's fucking great. Until it goes beyond 64 GiB, who even cares. Work on something that has a worthwhile monetary return..)


Assuming it is heap space, that just prompts the question: why is the heap so large?


The default sizing policies optimize for pause times and throughput, and only as a tertiary goal for footprint.

Additionally people often tune the JVM to avoid some startup costs which includes fixed heap sizes which prevents the JVM from returning memory to the OS.

On top of that there often are just plain bad programming practices like deserializing large files into memory instead of streaming them. I suspect that rust also attracts more performance-conscious developers that avoid such things. Or maybe it is applied in projects where "throw more hardware at it" is not a cost-effective measure and investing in more developer-hours is appropriate.


It answers the question: you can't just look at gross native memory usage and conclude the Java app is 100x less efficient, that value simply doesn't tell you a lot. Server VMs are commonly configured to use gigabytes of RAM from the get-go, and when the initial VM heap is full, Java will not make small adjustments but instead grab a big chunk, and when that is no longer needed, it is often reluctant to release that additional memory.

I have no doubt that a Rust app will generally be more memory-efficient, but ultimately for server applications that is really not something a lot of people care about because of the incredibly low cost of "just buying more".


I can believe it if the application is written properly using a functional paradigm to make it parallelizable. In Java this almost inevitably means making endless copies of objects to pass to functions and then immediately discard, except the discard isn't immediate. It doesn't happen until the garbage collector kicks in. So you end up with a memory use graph that looks like a sawtooth as the garbage collector is discarding hundreds of megabytes of temporary objects every time it runs.

I've seen this on a modded Minecraft server with a half dozen players connected. It would thrash over a gigabyte of memory every 5-10 seconds or so. I had to tweak the garbage collector to avoid performance hitches.


If I were to take a stab in the dark I'd guess optimized for latency by pre-caching everything under the Sun.


This most likely has to do with their respective memory models. Rust will try to allocate most things on the stack unless an item is explicitly wrapped in a Box (that isn't the only exception, however). I can't speak to Java's model, but I imagine it places more things on the heap.


Java caches the universe. ;)

(FWIW, it’s an apples-to-oranges comparison from a compilation, linking, packaging, etc. perspective, but from an end-user perspective I think it’s a reasonable claim.)


I call bullshit. Most probably their Java code was FUBAR.


> Most probably their Java code was FUBAR.

And so what? If a language's natural form leads to FUBAR code, what else is to blame?


Really? You can write FUBAR code in any language; language has nothing to do with this.


Of course you can. But if the same developers write performant code in one language, and something really slow in another, that's a mark against the second language. Maybe there's a tradeoff - maybe the slow codebase was written much faster, or something like that. But the kind of code a language makes you write is totally something you should judge it on.


Well, I am quite experienced with Java, and in my experience, if they have this kind of ridiculous discrepancy in memory use, it implies they had a bad design or made bad choices. A rewrite in Java with sane choices would also end up with much better results. So I don't buy the "language is responsible" argument.


Nice false premise there. It could easily be caused by other things.


> Rust is much easier to teach than C or C++,

Not sure about that.

If you know C, which is easy to learn, you already know a lot about C++, or at least the parts that are used most of the time.

Learning about the borrow checker and matching on enums doesn't seem very easy at first, although they surely seem elegant and much safer.


That doesn't square with my experience. I've taught beginning programmers more or less the "C subset" of C++, and it was a nightmare.

The biggest problem is that when they inevitably hit undefined behavior, there is no easy way to learn what is wrong. The program hits a segfault or prints garbage on the screen—where to begin? I instinctively know what is likely to be wrong, but that's only through years of experience; it's almost impossible to teach that quickly. Even if, by luck, the program segfaults instead of silently corrupting memory, the line number where the failure occurred doesn't necessarily correspond to what is wrong—often the problem occurred earlier.

Rust is in a much better position here, since the compiler can actually isolate precisely what is wrong in those cases and emit diagnostics at the exact location where the problem occurred. That's much easier to teach.
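
For example, this deliberately broken snippet fails to compile, and the error points at the exact borrow that outlives its data:

    fn main() {
        let dangling;
        {
            let s = String::from("hello");
            dangling = &s;
        } // error[E0597]: `s` does not live long enough
        println!("{dangling}");
    }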


Learning C via an interpreter that errors out on undefined behaviour seems like a good way to address those issues. I'm not sure any of the interpreters I've seen over the years actually does this, though I imagine altering something like PicoC would be straightforward enough.


Judging by the development of UBSan and ASan, I think such a thing would be anything but straightforward, even with a simple C compiler.


C appears easy to learn. C++ a bit harder, but learnable.

What isn't easy to learn is all the horrible ways C is broken. Don't return addresses of local variables. Remember that realloc can move the allocation. Don't use the C string functions.

C++ adds fun gotchas like using the c_str() of a temporary string object. Carelessly passing local objects by reference to a lambda thread function. All of the amusingly horrible ways to screw up data structures using shared_ptr and enable_shared_from_this.

Better yet, linking C or C++ objects together that have been compiled with different "#defines" set so the structs, objects and vtables are different sizes.

C is not easy to learn.


C is just some readable assembly. If you tell a computer to crash into a wall, it will crash into a wall. That is the best lesson you can teach a student.

It is great to have a language that teaches you to write better code. Failure is a good teacher too.

If you tell a student to follow some rules without telling them why those rules exist, it is not always a good idea.

It is an awesome idea to have the safest code, but not necessarily a good idea to teach programming using safety rules.


> C is just some readable assembly.

Here's one article that argues the other side. I like this part a lot, though the wording is maybe a little bit argumentative :)

"Be knowledgeable about what’s actually in the C and C++ standards since these are what compiler writers are going by. Avoid repeating tired maxims like “C is a portable assembly language” and “trust the programmer.” Unfortunately, C and C++ are mostly taught the old way, as if programming in them isn’t like walking in a minefield. Nor have the books about C and C++ caught up with the current reality. These things must change."

https://blog.regehr.org/archives/1520


I said it's readable, but it's just some simple translation. C still has all the issues of having to write assembly, so a C programmer has to check that their C code is not doing anything unsafe, like an ASM developer would.

Programming is not easy, and it always has been a minefield. You cannot write good code quickly. Even learning rust will have a cost, although it's a worthy cost. Writing unsafe code is cheap and was often a good enough solution.


> it's just some simple translation

I think this is the part that the article is arguing against. For sure, I agree that many of the common problems we run into with C code (out-of-bounds reads, uses-after-free, etc.) are the result of the compiler doing "exactly what you told it to do." But there are also new problems that come up as a result of aggressive C compiler optimizations, which are _not_ what would happen in a simple translation:

- Strict aliasing violations. If two pointers point to different types, the compiler is allowed to assume that the memory they point to doesn't overlap. It can reorder a write and a read based on this rule, even when they should be causally related within a single thread.

- Non-obvious undefined behavior. For example, https://www.imperialviolet.org/2016/06/26/nonnull.html. The C standard says that passing a null pointer to memcpy is undefined, even if the size argument is zero. If you pass a pointer to memcpy (assuming a size of zero makes it a no-op) and then check that pointer for null, the compiler is allowed to remove the check entirely.

These are cases where modern compilers do something very different from what a human assembly programmer would intuitively do. It's not the majority of code in any program that has to worry about this, but I think the majority of programs do have to worry about this somewhere.


I have lots of experience with C++, and a non-trivial amount with C. I've done everything from game development to writing drivers to scientific computing with these languages. Here's how I describe it: a good portion of the stuff C and C++ programmers learn with experience (e.g. best practices, memory management, etc.) is "front-loaded" in Rust by having to learn about borrowing, lifetimes and the like. One isn't harder than the other, but the path to proficiency is quite different. In my view Rust has a superior path.


As someone whose field is front-end development, I have to say that among those three, Rust was the easiest to learn - mostly because the compiler helps you understand what's wrong.

C seems easy at first glance, but has so much manual work (managing memory, dependencies) that working with it is to me - well - tedious.


> If you know C, which is easy to learn, you already know a lot about C++

Not sure I'd agree. If you know C, you know a very small subset of C++ (one growing smaller with each revision). Things like move semantics, exceptions + constructors, lambdas and templates are huge, and each one can easily be as complex as straight C99.


Many Rust programmers never learned C, or have only a cursory understanding of it.

Most C++ people I talk to would strongly disagree that in C++, you should do things the C way.


The C way in C++ should only be used the same way as unsafe in Rust.

Very well hidden in safe abstractions, and validated by a profiler that they are actually better than the safer C++ alternatives.


Yes and no: when you want to do real-time programming (either no allocation, or allocation only during an 'initialisation' phase), using C++ like a better C is a good idea IMHO.


Have you heavily used C++11 or newer on a "modern" codebase? I consider myself a more than competent C programmer but the complexity of modern C++ is non-trivial in my opinion.


It really depends on where you're coming from. I had no difficulties learning Rust, but I already knew Swift pretty well.


This is some cringe PR buzzword-heavy marketing. Not what I expected on rust-lang.org


"Training existing Ruby developers to maintain safe C++ code would take too much time."

Wow. Just wow.


Wow, that paper needs a tl;dr.


Really? For a 6-page white paper?

Fine:

Tilde built a server monitoring daemon with Rust; it's light on resources and doesn't crash. Tilde thinks the Rust community & its resources make it easier to teach to new team members.


Yes, really. It's not the length, it's the density of interesting content.


Yeah man, learning to scan a paper for interesting content is definitely a skill worth developing!


I suggested it because it'd help other people not waste the time scanning the paper for the same minuscule amount of not-that-interesting content. But at least it gave you the opportunity to contribute your interesting comment. Yeah, man!



