As I said earlier on the mailing list, I suspect (without having looked into it) that the cause of the I/O performance slowdown relative to C++ is something related to buffering: perhaps the I/O is not being buffered, or the buffer isn't working correctly. This would be consistent with the serialization-based I/O showing the larger slowdowns, since serialization issues many small writes, and without buffering each one becomes its own call to write(2). If so, this should be fixable.
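For illustration only (this is modern Rust, not the 2013 standard library under discussion, and `send` is a hypothetical helper): the usual fix is to wrap the stream in a `std::io::BufWriter`, so many small serializer writes coalesce into few syscalls.

```rust
use std::io::{BufWriter, Write};
use std::net::TcpStream;

// Without buffering, every small write from the serializer becomes its own
// write(2) syscall. BufWriter coalesces them in memory and flushes roughly
// once per 8 KiB (the default capacity).
fn send(stream: TcpStream, chunks: &[&[u8]]) -> std::io::Result<()> {
    let mut out = BufWriter::new(stream);
    for chunk in chunks {
        out.write_all(chunk)?; // lands in the in-memory buffer, not the kernel
    }
    out.flush() // pushes whatever remains buffered to the kernel
}
```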
I'm pleasantly surprised to see the object mode slightly faster than C++.
I should note that the Rust version currently omits some features of the C++ version, such as read-limiting and the actual counting of the throughput. These things are cheap, but they may explain why Rust is faster in that one case.
We did some benchmarking on IRC today in light of this post, and we found that Rust's stdio is currently quite slow because a flag is not being set properly in libuv, which causes it to punt to a thread pool. A fix is currently in the queue: https://github.com/mozilla/rust/pull/10558
If you get some time, I'd appreciate a blog post or a comment describing what the issue was on the Rust side as well as on the libuv side. Upvoted in advance.
I had thought that serialization formats such as Protobuf, Thrift, BSON, and MessagePack compress their output as much as possible, on the theory that reducing the amount of I/O is the ultimate win for overall performance, more so than fast decoding via memory alignment.
Seeing Cap'n Proto, I'm confused about which approach is right. Alignment is not an exotic technique, so if saving I/O weren't worth it, why didn't those formats align their data?
Or am I totally misunderstanding these implementations?
Well, it depends on the environment. If you are doing interprocess communication, then I/O bandwidth is obviously not a concern at all. On the other hand, over the internet, it clearly is the biggest concern. For intra-datacenter traffic on a 10Gbit NIC, it's harder to say, but it _probably_ isn't the bottleneck. Cap'n Proto supports both cases by making additional packing optional, so you can choose the best trade-off for your application.
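As a sketch of what that choice looks like (using today's capnproto-rust crate API, which may differ from the 2013 code; `encode_both` is a hypothetical helper), the same message builder can be written either word-aligned or with the optional packing:

```rust
use capnp::message::{Builder, HeapAllocator};
use capnp::{serialize, serialize_packed};

// Serialize the same message both ways, assuming `message` was populated
// elsewhere via a generated schema type.
fn encode_both(message: &Builder<HeapAllocator>) -> capnp::Result<(Vec<u8>, Vec<u8>)> {
    let mut raw = Vec::new();
    serialize::write_message(&mut raw, message)?; // word-aligned, CPU-cheap

    let mut packed = Vec::new();
    serialize_packed::write_message(&mut packed, message)?; // zero-byte packing, smaller on the wire

    Ok((raw, packed))
}
```

For loopback IPC you'd pick the raw form; for a WAN link, the packed form usually pays for its extra CPU.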
Regarding the other formats you mention, I think you may be imagining that the designers of these protocols thought about them more carefully than they really did. Protobuf, for example, was designed pretty ad hoc to solve an immediate problem in Google's search infrastructure, and then stuck around mostly because, as more and more things used it, it was easier to keep using it than to start over. The designers readily acknowledge that it is not an ideal format -- in fact, there are other ways they could have done the encoding that would have taken no more space but would have saved significant CPU time.
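To make the CPU-versus-space trade concrete (an illustrative sketch, not code from either project): a protobuf-style varint spends a data-dependent branch on every byte it decodes, whereas Cap'n Proto's fixed-width, word-aligned fields can be read with a single aligned load.

```rust
// LEB128-style varint, as used by protobuf: small values take fewer bytes,
// but decoding must inspect a continuation bit on each byte.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80); // low 7 bits, continuation bit set
        value >>= 7;
    }
    out.push(value as u8); // final byte, continuation bit clear
}

fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate().take(10) { // a u64 needs at most 10 bytes
        value |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1)); // value plus number of bytes consumed
        }
    }
    None // truncated or over-long input
}
```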
(Disclosure: I was the maintainer of protobufs for a long time, though not the original creator. I am also the author of Cap'n Proto.)
I compiled libcapnp with the latest Clang that ships with Xcode. It would perhaps be fairer to also compile the C++ benchmarks with Clang, instead of the MacPorts gcc 4.8 that I'm using, but unfortunately Clang barfs on some template hackery in the benchmark driver.
That would be quite useful. Otherwise it's very hard to tease apart the differences between the GCC and LLVM optimization back ends versus the Rust and C++ front ends. I feel like I've seen benchmarks showing speed differences of around 5-20% between LLVM and GCC, so it will have an effect.