
Good questions, let me try to tackle them one by one.

> The article makes a brief mention of Go causing issues with RAM usage. Was this due to large heap usage, or was it a problem of GC pressure/throughput/latency? If the former, what were some of the core problems that could not be further optimized in Go?

We had many reasons for using Rust, and memory was one of them.

Primarily, for this particular project, the heap size is the issue. One of the games in this project is optimizing how little memory and compute you can use to manage 1GB (or 1PB) of data. We utilize lots of tricks like perfect hash tables, extensive bit-packing, etc. Lots of odd, custom, inline and cache-friendly data structures. We also keep lots of things on the stack when we can to take pressure off the VM system. We do some lock-free object pooling stuff for big byte vectors, which are common allocations in a block storage system.
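
To give a flavor of the bit-packing mentioned above, here is a minimal sketch. The `PackedEntry` type and its field widths (40-bit offset, 20-bit length, 4 flag bits) are made up for illustration and are not Magic Pocket's actual layout. The point is just that three fields that would otherwise take 16 bytes with padding fit in a single `u64`.

```rust
/// Hypothetical index entry packed into one u64 (illustrative layout only):
/// bits 0..40 = offset, bits 40..60 = length, bits 60..64 = flags.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PackedEntry(u64);

const OFFSET_BITS: u32 = 40;
const LENGTH_BITS: u32 = 20;
const FLAG_BITS: u32 = 4;

const OFFSET_MASK: u64 = (1 << OFFSET_BITS) - 1;
const LENGTH_MASK: u64 = (1 << LENGTH_BITS) - 1;
const FLAG_MASK: u64 = (1 << FLAG_BITS) - 1;

impl PackedEntry {
    /// Returns None if any field does not fit in its bit width.
    pub fn new(offset: u64, length: u32, flags: u8) -> Option<Self> {
        if offset > OFFSET_MASK
            || u64::from(length) > LENGTH_MASK
            || u64::from(flags) > FLAG_MASK
        {
            return None;
        }
        Some(PackedEntry(
            offset
                | (u64::from(length) << OFFSET_BITS)
                | (u64::from(flags) << (OFFSET_BITS + LENGTH_BITS)),
        ))
    }

    pub fn offset(self) -> u64 {
        self.0 & OFFSET_MASK
    }

    pub fn length(self) -> u32 {
        ((self.0 >> OFFSET_BITS) & LENGTH_MASK) as u32
    }

    pub fn flags(self) -> u8 {
        ((self.0 >> (OFFSET_BITS + LENGTH_BITS)) & FLAG_MASK) as u8
    }
}
```

Something like `PackedEntry::new(123_456_789, 4096, 0b1010)` round-trips through the three accessors, and `std::mem::size_of::<PackedEntry>()` is 8 bytes.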

It's much easier to do these particular kinds of optimizations using C++ or Rust.
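
As a rough illustration of pooling big byte vectors, here is a simplified sketch. It is an assumption-laden stand-in, not the real thing: the `BufferPool` name and its parameters are hypothetical, and it uses a plain `Mutex` for brevity rather than a lock-free structure. The idea it shows is the same, though: reuse large allocations instead of going back to the allocator for every block.

```rust
use std::sync::Mutex;

/// Hypothetical pool of reusable byte buffers.
/// Uses a Mutex for brevity; this sketch is NOT lock-free.
pub struct BufferPool {
    free: Mutex<Vec<Vec<u8>>>,
    buf_capacity: usize,
    max_pooled: usize,
}

impl BufferPool {
    pub fn new(buf_capacity: usize, max_pooled: usize) -> Self {
        BufferPool {
            free: Mutex::new(Vec::new()),
            buf_capacity,
            max_pooled,
        }
    }

    /// Hands out an empty buffer, reusing a pooled allocation when one exists.
    pub fn get(&self) -> Vec<u8> {
        let pooled = self.free.lock().unwrap().pop();
        pooled.unwrap_or_else(|| Vec::with_capacity(self.buf_capacity))
    }

    /// Returns a buffer to the pool. Contents are cleared, capacity is kept.
    pub fn put(&self, mut buf: Vec<u8>) {
        buf.clear();
        let mut free = self.free.lock().unwrap();
        if free.len() < self.max_pooled {
            free.push(buf);
        }
        // Otherwise the buffer is dropped here and its memory is freed.
    }

    /// Number of buffers currently sitting idle in the pool.
    pub fn pooled(&self) -> usize {
        self.free.lock().unwrap().len()
    }
}
```

A caller would `get()` a buffer, fill it with block data, and `put()` it back when done; the next `get()` then reuses the same heap allocation instead of allocating again.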

In addition to basic memory reasons, saving a bit of CPU was a useful secondary goal, and that goal has been achieved. The project also has a fair amount of FFI work with various C libraries, and a kernel component. Rust makes it very easy and zero-cost to work closely with those libraries/environments.
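
For readers who haven't seen Rust's FFI, here is a small sketch of what calling into C looks like. It calls `strlen` from the platform C library, which is just a convenient stand-in and not one of the libraries this project uses; the `c_strlen` wrapper is a hypothetical name. The `unsafe extern` spelling assumes Rust 1.82 or newer (older toolchains write plain `extern "C"`).

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Declaration of a function provided by the platform's C library.
// No glue code is generated: the call below is a direct C call.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper around the C call.
/// Returns None if `s` contains an interior NUL byte, which C strings cannot hold.
pub fn c_strlen(s: &str) -> Option<usize> {
    let c = CString::new(s).ok()?;
    // SAFETY: `c` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { strlen(c.as_ptr()) })
}
```

The usual pattern is the one shown: keep the `unsafe` call inside a small wrapper that enforces the C function's preconditions, so the rest of the code only sees a safe API.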

For this project, pause times were not an issue. This isn't a particularly latency-sensitive service. We do have some other services where latency does matter, though, and we're considering Rust for those in the future.

> Could you comment more generally on what advantages Rust offered and where your team would like to see improvement?

The advantages of Rust are many: really powerful abstractions, no null, no segfaults, no leaks, yet C-like performance and control over memory, and you can still use that whole C/C++ bag of optimization tricks.

On the improvements side, we're in close contact with the Rust core team--they visit the office regularly and keep tabs on what we're doing. So no, we don't have a ton of things we need. They've been really great about helping us out when those things have sprung up.

Our big ask right now is the same as everyone else's--improve compile times!

> Are there portions where the decision to use Rust caused complications or problems?

Well, Dropbox is mostly a golang shop on the backend, so Rust is a pretty different animal from what everyone was used to. We also have a huge number of good libraries in golang that our small team had to create minimal equivalents for in Rust. So, the biggest challenge in using Rust at Dropbox has been that we were the first project! So we had a lot to do just to get started...

The other complication is that there is a ton of good stuff that we want to use that's still being debated by the Rust team, and therefore marked unstable. As each release goes on, they stabilize these APIs, but it's sometimes a pain working around useful APIs that are marked unstable just because the dust hasn't settled yet within the core team. Having said that, we totally understand that they're being thoughtful about all this, because backwards compatibility implies a very serious long-term commitment to these decisions.




> On the improvements side, we're in close contact with the Rust core team

One small note here: this is something that we (Rust core team) are interested in doing generally, not just for Dropbox. If you use Rust in production, we want to hear from you! We're very interested in supporting production users.


Thanks very much for the detailed and thoughtful answers!

I've read before (somewhere, I think) that Dropbox effectively maintains a large internal "standard library" rather than relying on external open source efforts. How much does Magic Pocket rely on Rust's standard library and the crates.io ecosystem? Could you elaborate on how you ended up going in whichever direction you chose with regards to third-party open source code?


We use 3rd parties for the "obvious" stuff. Like, we're not going to reinvent JSON serialization. But we typically don't use any 3rd party frameworks on the backend. So things like service management/discovery, RPC, error handling, monitoring, metadata storage, etc. are a big in-house stack.

So, we use quite a few crates for the things it makes no sense to specialize in Dropbox-specific ways.


Cool. This might be getting into the weeds a bit, but are you still on rustc-serialize for json or are you trying to keep up with serde/serde_json? If you're using serde, are you on nightly? From your comment above I got the impression that only using stable features was very important, so I'm curious how your codebase implements/derives the serde traits.


We're on rustc-serialize. JSON is not really a part of our data pipeline, just our metrics pipeline. So the performance of the library is not especially critical.


Are you guys hiring Rust developers by chance? Asking for a friend :)


Very possibly :) Drop me an email at james@dropbox.com.


How do you do network I/O with Rust? Thread-per-connection, non-blocking (using mio or similar?), or something else?


We have an in-house futures-based framework (inspired by Finagle) built on mio (a non-blocking, libevent-like library for Rust). All I/O is async, but application work is often done on thread pools. Those threads are freed as soon as possible, though, so that I/O streams can be handled purely by "the reactor", and we keep the pools as small as possible.
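
The in-house framework isn't public, so the following is only a std-only sketch of the general shape described above: an I/O loop hands application work to a small pool and gets the result back through a completion channel, so the worker is free again as soon as the job returns. `WorkerPool` and `submit` are hypothetical names, and a real reactor would use mio and futures rather than a blocking `recv`.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size worker pool for application (non-I/O) work.
pub struct WorkerPool {
    jobs: Option<Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl WorkerPool {
    pub fn new(size: usize) -> Self {
        let (tx, rx) = channel::<Job>();
        let rx = Arc::new(Mutex::new(rx));
        let workers = (0..size)
            .map(|_| {
                let rx = Arc::clone(&rx);
                thread::spawn(move || loop {
                    // The lock is held only while dequeuing, not while running the job.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // queue closed: shut the worker down
                    }
                })
            })
            .collect();
        WorkerPool { jobs: Some(tx), workers }
    }

    /// Runs `work` on the pool and returns a receiver that yields its result.
    pub fn submit<T, F>(&self, work: F) -> Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (done_tx, done_rx) = channel();
        let job: Job = Box::new(move || {
            // Ignore the error if the caller stopped waiting for the result.
            let _ = done_tx.send(work());
        });
        self.jobs
            .as_ref()
            .expect("pool is running")
            .send(job)
            .expect("workers are alive");
        done_rx
    }
}

impl Drop for WorkerPool {
    fn drop(&mut self) {
        drop(self.jobs.take()); // closing the queue lets the workers exit
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}
```

With a pool of two workers, `pool.submit(|| (1..=10u64).sum::<u64>())` returns a receiver whose `recv()` yields the sum once a worker has run the closure. In a real reactor the completion would wake the event loop instead of being waited on.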


Any plans to open-source the futures-based Rust framework? :)


From a parallel conversation[0] on the Rust subreddit:

> Are you going to open source anything?

> Probably. We have an in-house futures-based I/O framework. We're going to collaborate with the Rust team and Carl Lerche to see if there's something there we can clean up and contribute.

[0]: https://www.reddit.com/r/rust/comments/4adabk/the_epic_story...


Did you guys use a custom allocator for Rust? And if so, how did it differ from jemalloc, and how does it compare to C++ allocators like tbb::scalable_allocator?


We use a custom version of jemalloc, built with profiling enabled so that we can use its heap profiling.

We also tweak jemalloc pretty heavily for our workload.



