
Pb-jelly – A Protobuf code generation framework for Rust, developed at Dropbox - dochtman
https://github.com/dropbox/pb-jelly
======
haberman
I work on the protobuf team at Google, and I'm a big fan of Rust, though I
haven't written much actual Rust except a bunch of Project Euler solutions.

For protobuf in C++, we've been moving more and more in the direction of using
arenas for memory allocation. When you parse a protobuf, it creates a tree of
objects that are usually all deleted at the same time. Freeing an arena is
much, much cheaper than traversing the tree of objects and calling free() on
each one.

My dream has been that Rust protobuf could support arenas as well as C++, but
use Rust's type system to make it all provably correct at compile time (in C++
the lifetime management is inherently manual and unsafe). For absolute top
performance, arenas will always beat trees of unique pointers (which I think
corresponds to Rust's Box<> type).
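
As a rough illustration of the arena idea, here is a minimal Rust sketch using only the standard library. It is an index-based arena, not the lifetime-checked design described above: nodes live in one `Vec` and refer to each other through integer handles, so dropping the arena releases a single buffer instead of walking a tree of `Box` pointers. All names (`Arena`, `NodeId`, `Node`, `build_example`) are hypothetical and are not taken from pb-jelly or any protobuf library.

```rust
/// Handle to a node: just an index into the arena's backing Vec (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(usize);

/// A node holds only plain data and handles, so it needs no destructor.
#[derive(Debug, Clone, Copy)]
pub struct Node {
    pub tag: u32,
    pub first_child: Option<NodeId>,
    pub next_sibling: Option<NodeId>,
}

/// All nodes of one tree live in a single contiguous buffer.
#[derive(Debug, Default)]
pub struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    pub fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    /// Allocate a node with no children and return its handle.
    pub fn alloc(&mut self, tag: u32) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node { tag, first_child: None, next_sibling: None });
        id
    }

    /// Prepend `child` to `parent`'s child list.
    pub fn add_child(&mut self, parent: NodeId, child: NodeId) {
        let old_first = self.nodes[parent.0].first_child;
        self.nodes[child.0].next_sibling = old_first;
        self.nodes[parent.0].first_child = Some(child);
    }

    /// Collect the tags of `parent`'s direct children, most recently added first.
    pub fn child_tags(&self, parent: NodeId) -> Vec<u32> {
        let mut tags = Vec::new();
        let mut cur = self.nodes[parent.0].first_child;
        while let Some(id) = cur {
            tags.push(self.nodes[id.0].tag);
            cur = self.nodes[id.0].next_sibling;
        }
        tags
    }

    /// Number of nodes allocated so far.
    pub fn len(&self) -> usize {
        self.nodes.len()
    }
}

/// Build a root node with three children tagged 1, 2 and 3.
pub fn build_example() -> (Arena, NodeId) {
    let mut arena = Arena::new();
    let root = arena.alloc(0);
    for tag in 1..=3 {
        let child = arena.alloc(tag);
        arena.add_child(root, child);
    }
    (arena, root)
}
```

Calling `build_example()` yields an arena of four nodes; when it goes out of scope, the whole tree goes away in one deallocation because `Node` has no destructor. The trade-off is that a `NodeId` is not tied to its arena by the type system, so using a handle with the wrong arena is a logic error (or an out-of-bounds panic) rather than a compile error.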

I don't know Rust's type/lifetime system well enough to know if this is
possible. I was looking recently at arenas in Rust and I noticed that Rust's
version of placement new seems to be stalled:

"Unfortunately the path forward for placement new in Rust does not look good
right now, so I've reverted this crate to work more like a memory heap where
stuff can be put, but not constructed in place."

[https://docs.rs/light_arena/1.0.1/light_arena/](https://docs.rs/light_arena/1.0.1/light_arena/)

Does anyone know more about this?

~~~
nipunn1313
Hey @haberman! One of the authors here. We actually do have an arena-esque
implementation built on top of pb-jelly internally, as it was needed for Magic
Pocket.

It's built on top of the Blob traits exposed by pb-jelly. It's not yet open
source, but it would be a good candidate to do next! It also definitely has
unsafe code, to your point. We open-sourced the safe implementations that use
more standard types (Bytes/Buffer/Vec) first.

There's a decent amount of cleanup needed before we can open-source that as
well, as much of it was built years ago, when the Rust ecosystem was less
mature (e.g. Bytes/Buffer weren't around yet).

I like where you're thinking!

~~~
haberman
That's great, I'll look forward to seeing the arena-oriented code someday. :)

------
q3k
From pb-jelly-gen:

> The core of this crate is a python2 script codegen.py that is provided to
> the protobuf compiler, protoc as a plugin.

That's... surprisingly janky. Not only is Python tooling always painful to
deal with (compared to Go/Rust/...), but Python 2? And in a project that
otherwise has no reason to depend on Python? :(

In comparison, the Go protoc plugin is written in Go, the alternative
rust-protobuf protoc plugin is written in Rust, the TypeScript one is written
in TypeScript...

~~~
nipunn1313
Hi! One of the authors here. This was an oversight in the documentation. The
codegen is py2 and py3 compatible. Fixed!

See issues
[https://github.com/dropbox/pb-jelly/issues/37](https://github.com/dropbox/pb-jelly/issues/37)
and
[https://github.com/dropbox/pb-jelly/issues/40](https://github.com/dropbox/pb-jelly/issues/40)
for context.

------
deepsun
There are already 6 different protobuf libraries for Rust: [1]

I've chosen Prost for our project, but see the whole list:

[https://github.com/stepancheg/rust-protobuf#related-projects](https://github.com/stepancheg/rust-protobuf#related-projects)

~~~
q3k
It's 'only' really 4 (two of them are gRPC implementations).

But yeah - this is one of those things that makes me stay with Go instead of
moving over to Rust for my backend SOA/microservice work. In Rust, for
everything you need to do, there are at least 5 different libraries that
implement it, all competing with each other. This is especially annoying when
dealing with transitive dependencies. Meanwhile in Go, you generally get one
choice - it might be not great, but that's fine, it doesn't have to be.

EDIT: This is not intended to be mindless bashing of Rust. I do use Rust for
other things. It's a fine language.

~~~
woah
IMO the time spent choosing a library in Rust is about equal to the time spent
debugging a null pointer in Go, but a Rust project will have far fewer
libraries than a Go project will have null pointers.

~~~
saagarjha
Perhaps you're not auditing your code as much as you ostensibly should.

------
staticassertion
This is great. AFAIK this is the only protobuf library in Rust that supports
zero copy. Maybe this'll help some of the other libraries implement similar
features?

~~~
lightgreen
rust-protobuf supports zero-copy when using `bytes` feature and reading from
`Bytes`.
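
For readers unfamiliar with the term, here is a minimal sketch of what zero-copy parsing means, assuming a borrowed `&[u8]` input rather than the reference-counted `Bytes` type mentioned above: the parsed field holds a slice pointing into the input buffer instead of a freshly allocated `Vec<u8>`. The wire-format details (base-128 varints, wire type 2 for length-delimited fields) follow the protobuf encoding; the names `read_varint`, `BytesField` and `read_len_delimited` are hypothetical and not part of rust-protobuf or pb-jelly.

```rust
/// Decode a base-128 varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or None if the
/// input is truncated or longer than 10 bytes.
pub fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // The low 7 bits carry data; the high bit marks "more bytes follow".
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// A length-delimited field whose payload borrows from the input buffer.
#[derive(Debug, PartialEq)]
pub struct BytesField<'a> {
    pub field_number: u64,
    pub payload: &'a [u8],
}

/// Parse one length-delimited (wire type 2) field from the front of `buf`.
/// Returns the field and the unparsed remainder; nothing is copied.
pub fn read_len_delimited(buf: &[u8]) -> Option<(BytesField<'_>, &[u8])> {
    let (key, key_len) = read_varint(buf)?;
    if key & 0x7 != 2 {
        return None; // not a length-delimited field
    }
    let rest = &buf[key_len..];
    let (len, len_len) = read_varint(rest)?;
    let rest = &rest[len_len..];
    if len > rest.len() as u64 {
        return None; // declared length runs past the end of the input
    }
    let (payload, tail) = rest.split_at(len as usize);
    Some((BytesField { field_number: key >> 3, payload }, tail))
}
```

Given the input `[0x0a, 0x03, b'a', b'b', b'c']`, the returned `payload` is the three bytes `abc` at the same address as `input[2..]`, so no bytes are copied. The borrow checker then keeps the input buffer alive for as long as the field is in use.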

~~~
staticassertion
Very cool, TIL.

------
zxv
Cool. It would be interesting to compare benchmarks of RPC latency under
heavy load for pb-jelly against other RPC methods.

Rust has such great support for performant zero-copy serialization and
deserialization in various formats (bincode, MessagePack, CBOR, BSON). Seeing
this for protobuf feels very encouraging.

------
jl2718
Is there really any good reason for code-gen? Just because “google did it”
doesn’t mean it’s a good idea.

~~~
foolfoolz
Code generation is great for sharing data models and writing clients for
services without dependencies.

~~~
SOLAR_FIELDS
Exactly this. Imagine you work at BigCorp and use protobuf to pass data around:
now you have a unified data model you can share and everyone can use the
same client to access it without going through the trouble of maintaining all
those getters and setters. Rolling your own getters and setters is fine in a
small project but you really see the advantages of the code gen approach once
you are dealing with multiple different teams in an org working with the same
complex data model.

There are definitely some downsides to the approach though, mostly the typical
problems you would expect with machine-generated code: it’s verbose, and if
you have a super complex protobuf data model (hundreds or thousands of fields)
and want to ship a fatjar or similar bundle of dependencies, you can run into
size issues.

------
james412
I know it's common and perhaps even fashionable, but FWIW language like "We
take an opinionated stance" utterly puts me off caring about this package.

It's a piece of software, it has a design that is either fit for purpose or
not. When ego becomes entangled in that design process, it's a strong
indicator of the kind of experience one might have trying to get fixes or
enhancements merged, or even the kind of attitude you'd find when attempting
to report a bug.

~~~
ajkjk
That's not what the word 'opinionated' means here. It's not any one person's
opinion; it's that the project overall takes a stance on an issue rather than
leaving everything open for everyone else to figure out. It provides clarity
and direction compared to the more difficult situation where every library is
completely general. No ego involved at all.

~~~
james412
Perhaps I misunderstand the text in the README. Who is "we" in this case? Is
the software writing its own README?

~~~
ajkjk
'We' refers to the authors and their organization as a collective. That's
still the meaning of the word 'opinionated' in this context.

