
gRPC in Production - loppers92
https://about.sourcegraph.com/go/grpc-in-production-alan-shreve
======
_ydfu
Another downside of gRPC is that (de)serialization can be relatively slow. One
of my pet projects is a tagging server that takes a list of tags and returns a
list of matching objects retrieved from RocksDB. I tested gRPC (with both proto
and FlatBuffers) and Cap'n'Proto. For all of them, the process looked like:

Insert: Object comes in (built in RPC system for both), object is assigned ID
(which is added before serialization), object is written to RocksDB, object is
serialized and indexed.

Query: List of strings come in, strings are looked up in tag index,
intersection of results is done, objects are retrieved from RocksDB, objects
are deserialized, objects are added to list, objects are returned to client.
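The query path above can be sketched in a few lines, with plain dicts standing in for the tag index and for RocksDB (all names here are illustrative, not the commenter's actual code):

```python
# Minimal sketch of the query path: tag lookup, set intersection,
# then a key-value fetch. Dicts stand in for the index and RocksDB.

tag_index = {            # tag -> set of object IDs
    "red":   {1, 2, 3},
    "large": {2, 3, 4},
}
store = {                # object ID -> serialized object (RocksDB stand-in)
    1: b"obj1", 2: b"obj2", 3: b"obj3", 4: b"obj4",
}

def query(tags):
    """Return the (still serialized) objects matching all given tags."""
    id_sets = [tag_index.get(t, set()) for t in tags]
    matching = set.intersection(*id_sets) if id_sets else set()
    return [store[i] for i in sorted(matching)]

print(query(["red", "large"]))  # objects 2 and 3 carry both tags
```

The expensive steps the commenter measured (deserialization of each fetched object, and reserialization of the response) would happen on the values returned by `query`.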

FlatBuffers was unfortunately a no-go since it doesn't permit modification of
fields that haven't yet been set and it didn't have a sane way of making a
copy (which would also be slow).

Protocol Buffers worked but the pure Python driver is incredibly slow and the
C++ driver for Python was non-default and marked "experimental". I eventually
adapted to it but it was still quite slow even on the server side. From memory
a response with C++ client and server took ~9ms, 4-5ms of which were spent on
serialization/deserialization.

Cap'n'Proto eventually won for me. The Python driver is unstable and memory
usage soars like an eagle until Linux shoots it down with an OOM, but the
server was much faster. Typical response times were closer to 3ms, most of
which was spent in RocksDB or the actual index lookup. A downside, though, was
that Cap'n'Proto has its own odd built-in library for async work and doesn't
really support threading.

~~~
morecoffee
gRPC is slow with Python, but most people don't pick Python if they are
serious about performance. gRPC has continuous benchmarks running[1] to track
and improve perf. I agree the Python one has some serious problems, but don't
let it be the whole story. The C++ implementation is 100x faster.

[1] [https://performance-dot-grpc-testing.appspot.com/explore?das...](https://performance-dot-grpc-testing.appspot.com/explore?dashboard=5652536396611584)

~~~
Veratyr
I'm aware and I did try out a C++ client but there was still a definite upper
bound to its performance and latency was still ~2x Cap'n'Proto.

~~~
morecoffee
If the perf of gRPC is not good enough, you should file an issue on GitHub.
I'm kind of surprised that gRPC lost by 2x. That's about how much better
netperf is.

When you say you used Cap'n'Proto, did you mean the RPC and serializer, or
just the serializer? I ask because gRPC's serialization is pluggable, and
doesn't have a dependency on Protobuf.
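In gRPC's Python API, for instance, stubs are built with `request_serializer`/`response_deserializer` callables, so any bytes-producing codec can be plugged in. A stdlib-only sketch of that pattern, with `json` and `pickle` as stand-in codecs (the function names are illustrative, not gRPC's):

```python
import json
import pickle

def make_call(handler, serialize, deserialize):
    """Wrap a bytes-in/bytes-out handler with a pluggable codec,
    mirroring how gRPC lets you swap the (de)serializer per method."""
    def call(request):
        wire = serialize(request)      # request -> bytes
        reply_wire = handler(wire)     # pretend network round trip
        return deserialize(reply_wire) # bytes -> response
    return call

def echo_handler(wire_bytes):
    # A trivial "server" that echoes the payload back unchanged.
    return wire_bytes

json_call = make_call(
    echo_handler,
    serialize=lambda m: json.dumps(m).encode(),
    deserialize=lambda b: json.loads(b),
)
pickle_call = make_call(echo_handler, pickle.dumps, pickle.loads)

print(json_call({"tags": ["red", "large"]}))
```

Swapping `json` for a Cap'n'Proto or FlatBuffers codec changes only the two callables, which is the point being made about gRPC's serialization being pluggable.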

------
brango
The only thing holding back gRPC is JS web support. If it had that, it'd be
time to drop Swagger completely. As it is, you need to go protobuf -> Swagger
-> JS lib, but it's cumbersome and doesn't work 100% (e.g. adding auth keys,
etc.).

Will there be any progress on JS web, or can it just not be done at all with
HTTP1? Even a subset of features for basic GET/POST would be fine...

~~~
phamilton
I'd say load balancing is still a big gRPC hurdle. See
[https://github.com/grpc/grpc/blob/master/doc/load-balancing....](https://github.com/grpc/grpc/blob/master/doc/load-balancing.md)
for the current official stance on load balancing. It's pretty convoluted IMO.

~~~
YZF
I agree. You kind of have to roll your own for every language you use, which
is odd. And retries. Hopefully some more standard way of handling this will
emerge.

EDIT: I enjoyed the GopherCon talk... I don't think the video is up yet but
when it is I recommend watching...

------
stevvooe
Decent overview, but remember that when evaluating something, projects change
over time. Even better, you can be the change you want to see! Thus far, the
gRPC project has been fairly responsive in making solid changes, whether
through PRs or filed issues. ;)

Regarding the complaint about errors, there is already protocol support for
structured error handling:
[https://godoc.org/google.golang.org/grpc/status](https://godoc.org/google.golang.org/grpc/status).
[https://github.com/grpc/grpc-go/pull/1358](https://github.com/grpc/grpc-go/pull/1358)
should make this easier to use. In practice, the provided error
code set is very good, so give them a try before making things more complex.

Worst case, you can just stuff things in a header or trailer.
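The shape of such a structured error (a canonical code, a message, and optional detail payloads carried in trailers) can be sketched language-neutrally; the class and helper below are illustrative, not part of any gRPC library, though the numeric code values match gRPC's canonical status codes:

```python
from enum import IntEnum

class Code(IntEnum):
    # A few of gRPC's canonical status codes (numeric values per the spec).
    OK = 0
    INVALID_ARGUMENT = 3
    NOT_FOUND = 5
    INTERNAL = 13

class RpcStatus(Exception):
    """Sketch of a structured RPC error: canonical code plus message,
    with arbitrary detail payloads (gRPC carries these in trailers)."""
    def __init__(self, code, message, details=None):
        super().__init__(message)
        self.code = code
        self.message = message
        self.details = details or []

def lookup(store, key):
    """Raise a structured NOT_FOUND error instead of a bare exception."""
    if key not in store:
        raise RpcStatus(Code.NOT_FOUND, f"no object with id {key}",
                        details=[{"requested_id": key}])
    return store[key]

try:
    lookup({1: "a"}, 42)
except RpcStatus as err:
    print(err.code, err.message, err.details)
```

The client sees a machine-checkable code rather than having to parse a message string, which is what the status package and the linked PR aim to make ergonomic.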

------
joneholland
You can add Expedia to the list of production gRPC users. Our entire hotel
pricing backend is gRPC microservices written in Scala. Some of these services
handle > 100k TPS. We have found gRPC to be extremely scalable.

If that sounds interesting, I'm hiring engineers, shoot me a note at joholland
at Expedia.com

~~~
morecoffee
Do you use the Java generated stubs or generate your own? There are several
people using gRPC with Scala that could benefit from more idiomatic stubs.

~~~
joneholland
We use the Java stubs. We briefly looked at some Scala proto generators, but
most times you dump the proto class into a rich Scala type right away anyway.

------
tnolet
The point about operations is a very, very valid constraint of REST. Easy and
common stuff like "run this thing in the background" or "send off this one
ephemeral message" is very unnatural. Maybe a hybrid/bastard child of REST and
gRPC would be a good marriage of resource and operations modelling.

~~~
QuercusMax
The Google Cloud standard for async operations is the
google.longrunning.Operation service:
[https://github.com/googleapis/googleapis/blob/master/google/...](https://github.com/googleapis/googleapis/blob/master/google/longrunning/operations.proto).
This should be usable either via REST/JSON or gRPC.

It's designed to be generic for any kind of async operation, and to be "mixed
in" with your APIs. There are utilities for waiting for an operation to finish
(via polling); in theory it should be possible to use some type of server-push
to avoid polling, but I'm not sure if anybody's doing this.
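A polling client against such an Operations service might look like the sketch below, with a fake in-memory service standing in for the real `GetOperation` RPC; the `done`/`response` field names follow operations.proto, but everything else here is illustrative:

```python
class FakeOperations:
    """Stand-in for the google.longrunning Operations service:
    the operation reports done=True after a few polls."""
    def __init__(self, polls_until_done=3, result="export-complete"):
        self._left = polls_until_done
        self._result = result

    def get_operation(self, name):
        self._left -= 1
        done = self._left <= 0
        return {"name": name, "done": done,
                "response": self._result if done else None}

def wait_for_operation(ops, name, base_delay=0.01, max_polls=100):
    """Poll GetOperation until the operation reports done, with
    (simulated) exponential backoff between polls."""
    delay = base_delay
    for _ in range(max_polls):
        op = ops.get_operation(name)
        if op["done"]:
            return op["response"]
        delay = min(delay * 2, 1.0)  # back off; a real client would sleep(delay)
        # time.sleep(delay) omitted so the sketch runs instantly
    raise TimeoutError(f"operation {name} did not finish")

print(wait_for_operation(FakeOperations(), "operations/123"))
```

A server-push variant would replace the loop with a stream or callback, which is the polling-avoidance idea mentioned above.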

(I'm an Alphabet employee who's working on APIs delivered via gRPC /
REST/JSON.)

------
nemothekid
Thrift is Facebook's version of gRPC right? If so, I don't quite understand
the comparison, and how gRPC succeeds where thrift "fails". Wouldn't all
language implementations of gRPC have to be well documented, reliable, highly
performant and easy to install?

~~~
wrsh07
My understanding is that Thrift is Facebook's analog to Google's Stubby. Or
gRPC [which is the next generation of Stubby].

It's not implementing the gRPC interface. I don't think gRPC was open-sourced
[2015] early enough for Thrift [2007] to be built to its API.

Disclosure: Google employee who uses Stubby. [but is not on the team]

~~~
puzzle
Thrift was written at Facebook by an ex-Google intern trying to recreate
something close to protobuf + Stubby.

------
misterbowfinger
> Inefficient (textual representations aren’t optimal for networks)

REST APIs don't _have_ to be text-based, AFAIK. Why not just send/receive
binary?

~~~
ehsankia
Sure, but then you'd have to write a serializer/deserializer for your
messages. You could also use an existing one like BSON, but then why not just
use Protobuf?

~~~
cshenton
Because BSON, msgpack, etc. are self-describing? It means a way lower barrier
to entry.
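The distinction is easy to see with two toy binary encodings: a fixed-layout one (protobuf-style, where the decoder must know the schema) versus a tag-length-value one (BSON/msgpack-style, decodable with no schema). Both encoders below are hand-rolled for illustration only:

```python
import struct

# Schema-bound: the decoder must already know this is (uint32 id, 8-byte name).
def encode_fixed(obj_id, name):
    return struct.pack("<I8s", obj_id, name.encode())

def decode_fixed(data):  # useless without knowing the layout above
    obj_id, raw = struct.unpack("<I8s", data)
    return obj_id, raw.rstrip(b"\0").decode()

# Self-describing: each field carries its own name and length on the wire,
# so a generic decoder recovers {key: value} pairs with no schema at all.
def encode_tlv(fields):
    out = b""
    for key, value in fields.items():
        k, v = key.encode(), value.encode()
        out += struct.pack("<BB", len(k), len(v)) + k + v
    return out

def decode_tlv(data):
    fields, i = {}, 0
    while i < len(data):
        klen, vlen = struct.unpack_from("<BB", data, i)
        i += 2
        fields[data[i:i + klen].decode()] = data[i + klen:i + klen + vlen].decode()
        i += klen + vlen
    return fields

print(decode_tlv(encode_tlv({"name": "widget", "tag": "red"})))
```

The self-describing wire format costs extra bytes (every key travels with every message), which is the efficiency trade protobuf makes in the other direction by moving the field names into the shared `.proto` schema.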

~~~
adrianmonk
I'm not sure I see how self-describing is a lower barrier to entry. It seems
like generally the main reason why you'd want a self-describing format is if
you're writing client (and server) language bindings by hand. If you have a
tool to auto-generate those language bindings, you can skip that step, and
there's no need to make a step easier if it's done for you automatically.

I can see where self-describing is better when it comes to side issues like
debugging, exploration, or the hassle of configuring your build to generate
code from the IDL files, though. But if I had to choose only one, those are
lower priority to me.

