The state of gRPC in the browser (grpc.io)
203 points by simjue on Jan 9, 2019 | 96 comments



My team has been using gRPC + Improbable's grpc-web [0] + Typescript for a green field project and it has been amazing.

- Typing all the way to the frontend.

- gRPC/protobuf forces you to describe your interfaces, and the definitions are self-documenting.

- gRPC semantics like error codes and deadlines (if you propagate them through your stack, they're particularly useful - for instance, we cancel database transactions across service boundaries if a request times out).

- Performance is great (but we're far from seeing bottlenecks with JSON, it's not the reason we chose gRPC).

- We use grpc-gateway which auto-generates a REST proxy for our customers. We sometimes use it for interactive debugging. [1]

- Rather than importing database models for our management tools and one-off scripts, using the API is so frictionless that we even use it inside our backend code and for CLI utilities.
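To make the deadline point above concrete, here is a minimal sketch of passing an absolute deadline from hop to hop so that a downstream call (a database transaction, say) never gets more time than the original request has left. `CallCtx`, `childTimeoutMs` and `startDownstreamCall` are hypothetical names made up for illustration; only the two numeric status codes come from gRPC itself.

```typescript
// Sketch only: CallCtx, childTimeoutMs and startDownstreamCall are
// hypothetical names, not part of any gRPC library.

// Two of the standard gRPC status codes.
const GRPC_OK = 0;
const GRPC_DEADLINE_EXCEEDED = 4;

// The deadline is an absolute point in time, so it survives hops between
// services without each hop restarting the clock.
interface CallCtx {
  deadlineEpochMs: number;
}

// Time budget left for a downstream call, never more than the hop's own
// local limit and never negative.
function childTimeoutMs(ctx: CallCtx, nowMs: number, localLimitMs: number): number {
  const remaining = ctx.deadlineEpochMs - nowMs;
  return Math.max(0, Math.min(remaining, localLimitMs));
}

// Refuse to start work that cannot finish in time and report the same status
// code a gRPC client would see for a timeout.
function startDownstreamCall(
  ctx: CallCtx,
  nowMs: number,
  localLimitMs: number
): { code: number; timeoutMs: number } {
  const timeoutMs = childTimeoutMs(ctx, nowMs, localLimitMs);
  if (timeoutMs === 0) {
    return { code: GRPC_DEADLINE_EXCEEDED, timeoutMs: 0 };
  }
  return { code: GRPC_OK, timeoutMs };
}
```

The clock is passed in as `nowMs` rather than read inside the functions, which keeps the sketch deterministic and easy to test.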

The Google API design guide is helpful: https://cloud.google.com/apis/design

One piece of advice: Treat your gRPC calls like you would treat a GraphQL resolver - if you squint your eyes, they're very similar concepts.

Rather than specifying a GraphQL query, you specify a Field Mask.

https://developers.google.com/protocol-buffers/docs/referenc...
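As a rough illustration of the field mask idea: a mask is essentially a list of dotted paths, and the server returns only those fields. The helper below is a hypothetical sketch over plain objects, not the protobuf library's own FieldMask utilities, and the example paths are made up.

```typescript
// Sketch only: applyFieldMask and its helpers are hypothetical, not a
// protobuf API. A mask is treated as a list of dotted paths like "author.name".

type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Follow a path through the source object; report whether it exists.
function readPath(source: PlainObject, segments: string[]): { found: boolean; value: unknown } {
  let current: unknown = source;
  for (const key of segments) {
    if (!isPlainObject(current) || !(key in current)) {
      return { found: false, value: undefined };
    }
    current = current[key];
  }
  return { found: true, value: current };
}

// Write a value at a path, creating intermediate objects as needed.
function writePath(target: PlainObject, segments: string[], value: unknown): void {
  let current = target;
  segments.forEach((key, index) => {
    if (index === segments.length - 1) {
      current[key] = value;
      return;
    }
    const existing = current[key];
    const next: PlainObject = isPlainObject(existing) ? existing : {};
    current[key] = next;
    current = next;
  });
}

// Return a copy of `source` containing only the fields named by `paths`.
// Paths that do not exist in the source are ignored.
function applyFieldMask(source: PlainObject, paths: string[]): PlainObject {
  const result: PlainObject = {};
  for (const path of paths) {
    const segments = path.split(".");
    const lookup = readPath(source, segments);
    if (lookup.found) {
      writePath(result, segments, lookup.value);
    }
  }
  return result;
}
```

Calling `applyFieldMask(book, ["title", "author.name"])` on an object with more fields would keep just the title and the nested author name, which is roughly what a GraphQL selection set does for a query.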

Happy to answer questions on our experience.

[0]: https://github.com/improbable-eng/grpc-web

[1]: https://github.com/grpc-ecosystem/grpc-gateway


Were you able to find a good javascript or typescript gRPC client? The official google one seemed to have spotty documentation for the js module itself. It did not look like it supported promises or Nodejs streams either.


We use the Improbable grpc-web client (linked in the post), plus a thin promises wrapper. We haven't used gRPC from NodeJS (you wouldn't need the grpc-web detour since NodeJS can just speak the "real" gRPC protocol).
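For anyone wondering what a "thin promises wrapper" can look like, here is a minimal sketch. The callback shape (`UnaryCall`, `UnaryOutcome`) is a hypothetical stand-in for a callback-style client and is not the actual grpc-web signature; the fake transport exists only to exercise the wrapper.

```typescript
// Sketch only: UnaryCall and UnaryOutcome are hypothetical stand-ins for a
// callback-style client, not the real grpc-web signatures.

interface UnaryOutcome<Res> {
  code: number;      // gRPC status code, 0 means OK
  details: string;   // human-readable status message
  response?: Res;    // present only when the call succeeded
}

type UnaryCall<Req, Res> = (
  request: Req,
  onEnd: (outcome: UnaryOutcome<Res>) => void
) => void;

// Error type that keeps the status code available to callers.
class RpcFailure extends Error {
  readonly code: number;
  constructor(code: number, details: string) {
    super(details);
    this.code = code;
  }
}

// Turn a callback-style unary call into a function returning a Promise.
function promisifyUnary<Req, Res>(call: UnaryCall<Req, Res>): (request: Req) => Promise<Res> {
  return (request) =>
    new Promise<Res>((resolve, reject) => {
      call(request, (outcome) => {
        if (outcome.code === 0 && outcome.response !== undefined) {
          resolve(outcome.response);
        } else {
          reject(new RpcFailure(outcome.code, outcome.details));
        }
      });
    });
}

// Fake transport used only to exercise the wrapper.
const fakeGreet: UnaryCall<{ who: string }, { text: string }> = (request, onEnd) => {
  if (request.who === "") {
    onEnd({ code: 3, details: "who must not be empty" }); // 3 = INVALID_ARGUMENT
  } else {
    onEnd({ code: 0, details: "", response: { text: "hello " + request.who } });
  }
};

const greet = promisifyUnary(fakeGreet);
```

With this in place, call sites can use `await greet({ who: "lima" })` and handle failures with an ordinary `try`/`catch`.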


Does converting gRPC into a JS object have better performance than converting JSON into an object?


Is it still necessary to use a REST proxy when using protobuf from the browser?


No - it's just there for customers who want a REST API. We don't use it outside of tests and development.


lima - great points.

Questions:

1) If you have TS on the front and back end - you already have de-facto typing. Why did you opt for gRPC over just some kind of basic serialization, even JSON?

2) If you have TS classes, then you need to maintain a set of Protobuf service definitions as well - did you find this was a lot of duplicated overhead?

3) gRPC seems to me to be a little 'internal', not quite a web tech meant to be shared between parties. Did you have trouble with authentication, especially with 3rd parties? Or other 'cross-cutting' concerns?

I'm literally looking at doing what you're doing, any feedback would be greatly appreciated.


1) I forgot to mention - the backend is written in Go. Even if both were TS, all of the interoperability and other advantages would still apply.

2) The TS classes are auto-generated and the protobuf definitions are the only authoritative definition.

3) We're using JWT for authentication (in particular, short-lived tokens issued by Keycloak, using their JS library). gRPC has request metadata and it was straightforward to use it with the Keycloak library.

We generate client libraries for customers to use and give them the option of using the REST API if they don't want to deal with gRPC. We consider the gRPC API to be a competitive advantage - our customers are enterprises and like well-defined and typed interfaces.
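A minimal sketch of the JWT-in-metadata approach, assuming the token arrives from the identity provider's client library together with its expiry time. `AccessToken`, `CallMetadata` and `authMetadata` are hypothetical names; nothing here is Keycloak's or grpc-web's actual API.

```typescript
// Sketch only: AccessToken, CallMetadata and authMetadata are hypothetical.
// The token itself would come from the identity provider's client library.

interface AccessToken {
  value: string;             // the encoded JWT
  expiresAtEpochMs: number;  // expiry as an absolute timestamp
}

type CallMetadata = { [key: string]: string };

// Build per-call metadata carrying the bearer token. Returns null when the
// token is expired or about to expire, signalling that the caller should
// refresh it before making the call.
function authMetadata(
  token: AccessToken,
  nowMs: number,
  minValidityMs: number
): CallMetadata | null {
  if (token.expiresAtEpochMs - nowMs < minValidityMs) {
    return null;
  }
  return { authorization: "Bearer " + token.value };
}
```

The `minValidityMs` margin is there because short-lived tokens can expire between building the metadata and the server validating it.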


Hey, sorry if this is off-topic, but how have you been finding Keycloak? This is the first time I've seen it mentioned on here, and it looks like a great auth solution at first glance.


Keycloak has been great.

- Very complete OpenID Connect implementation.

- Back-channel logout.

- Short-lived tokens and lots of tooling to handle refreshing them. This solves many of the issues associated with JWT expiration.

- Documentation is very complete.

- Their data model allows for very fine-grained access control - you can issue sub-tokens with limited scopes, limit scopes per client and so on.

- Easy to deploy in OpenShift/k8s.

Only pain points are the build process (Java has got nothing on NodeJS as far as number of dependencies go!), data migrations and the fact that it's written in Java (start up time in CI).


Thanks a lot, great to hear!


I use it and it's great.


Except for JWT, this is what we are doing. Typescript has been a blessing when used this way, especially for merges.


To address all of your questions:

One of the biggest selling points of gRPC (or many of the protobuf based systems, like Twirp) is that there's a huge community of people building code-gen tools for the API and Client out of a .proto file. What this means is that you can auto-generate a typescript library in one command, then publish that to npm.

There's an implicit guarantee that, if the TS client library and, say, the Go backend, are both "the same version" of the API (meaning, generated from the same .proto file), you'll get full API compatibility. And it's not just "the frontend is assuming the response is of type T" or "doing schema validation on the response once you get it to make sure it conforms to a type the frontend defines independently"; the client lib and the backend lib are both generated from the same source of truth.

This exists with something like OpenAPI, but Protobuf is much simpler. And then you layer something like gRPC on top of that to get all the HTTP2 benefits, and it's obvious why it's gaining popularity.

That being said, to the third question: There's a huge advantage in those auto-generated client libraries. That's literally how libraries like the Google Cloud SDK are made; autogenerated from protobufs. But, if you use gRPC, the state of using that on the frontend natively sucks. So it's got pros and cons.

Personally, we've opted to use Twirp instead of gRPC for the time being. It's all the advantages of protobufs, but it creates an API that is easily callable by web clients because it's just a simple HTTP endpoint with a JSON payload. The only big thing you lose is the HTTP2 streaming, which might come one day.
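To show what "just a simple HTTP endpoint with a JSON payload" means in practice, here is a sketch that builds (but does not send) such a request. It follows Twirp's documented default routing of `POST /twirp/<package>.<Service>/<Method>`; `buildTwirpRequest`, `PlainHttpRequest` and the example service names are hypothetical.

```typescript
// Sketch only: buildTwirpRequest and PlainHttpRequest are hypothetical.
// The route follows Twirp's default convention:
//   POST <base>/twirp/<package>.<Service>/<Method>

interface PlainHttpRequest {
  method: "POST";
  url: string;
  headers: { [header: string]: string };
  body: string;
}

function buildTwirpRequest(
  baseUrl: string,
  pkg: string,
  service: string,
  rpcMethod: string,
  message: object
): PlainHttpRequest {
  // Drop trailing slashes so the URL does not end up with "//twirp".
  const trimmedBase = baseUrl.replace(/\/+$/, "");
  return {
    method: "POST",
    url: `${trimmedBase}/twirp/${pkg}.${service}/${rpcMethod}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(message),
  };
}
```

The resulting object can be handed to any HTTP client, which is the point: no special transport is needed on the browser side.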


> - Typing all the way to the frontend.

...where it is transpiled to ES5... :(


The web has come full circle. First we were disgusted by and hated SOAP, XML and friends. Then we went to REST and realized it's too loose! Then we invented stuff like Swagger and JSONAPI in order to put some interfacing in place. And then we brought in crippled substitutes that offer similar but watered-down versions of SOAP's features: gRPC, GraphQL and more friends.

Edit: I know I might get downvotes but think for a second we could have just taken some good parts from SOAP to begin with and have all the goodness.


Yes, it has come full circle - but SOAP XML and friends were rightfully despised. The spec is a mess and there are so many broken implementations all over the place. I have fond memories of debugging the PHP SOAP client in order to build a bug-by-bug compatible Python version :-)

gRPC and other modern RPC protocols/SOA frameworks are basically "the good parts from SOAP".


I feel that the difference is that SOAP was a proper spec made by multiple parties, for better or worse. And GRPC is just a published spec from Google with code from Google. Of course it won't be such a mess and it will be good enough. Is it good to rely on Google adopting its tech as an industry standard? That's a philosophical question.


I personally use this rule to decide whether I should use a Google service/library: it has to have more than 1 billion active users or be an important part of such services. gRPC, or more specifically protobuf, is, imo, very important to Google. So I won't be worried about its future.


gRPC was contributed by Google to CNCF. Google remains extremely engaged in its ongoing development.

https://www.cncf.io/projects/

You might also be interested that Google is hosting the first ever gRPC Conf, which is being organized by CNCF and the gRPC community: https://events.linuxfoundation.org/events/grpconf-2019/

(Disclosure: I'm executive director of CNCF.)


From a philosophical point of view, it seems like a design critique should be focused more on the design and less about where it came from? (Assuming it's open source and there aren't intellectual property issues.)

Industry standards tend to take on a life of their own, even if they came from one company. (Consider Docker.)


> I have fond memories of debugging the PHP SOAP client

It's no better now :) I use it frequently with SOAP endpoints from various companies (all with varying authentication measures/encodings/versions) and it's a nightmare.

Unfortunately the travel industry hasn't gone for anything else yet...


I used to use https://github.com/econea/nusoap back in the day, take a look


The SOAP specs do not even make sense. Which doesn't make much of a difference, because the only player I've seen even trying to follow them is Microsoft (go figure...).

SOAP has a broken type system that can not be extended (leading to many extension standards for things like lists (go figure, again) and complex formats that people didn't want to convert from and into). It had horrible error handling, with problems only becoming visible at the application layer. It inherited the XML's property vs. contents problem. It inherited the XML's bad DTD format.

Soap deserves to die. Taking the good parts of it would be good, but there was nobody in a position to do that, so we settled on fixing the problems and patching the good parts back with time. With gRPC, I think we are there.


We did REST but we completely skipped over "just using HTTP correctly." E.g. using media types in the Content-Type and Accept fields to specify that you're sending an "application/vnd.foo.myapp.myfooreq.v1+json" and looking for either an "application/vnd.foo.myapp.myfooresp.v1+json" or an "application/vnd.foo.myapp.knownerror.v1+json" in response.

Then, HTTP itself does the RPC message type management. Request message the server didn't expect? 415 error. Response type the server can't produce? 406 error.
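A small sketch of what that media-type based dispatch could look like on the server side. The media types are the ones from the comment; `negotiate` and `NegotiationResult` are hypothetical. In this sketch an unknown request media type maps to 415 (Unsupported Media Type) and an Accept header the server cannot satisfy maps to 406 (Not Acceptable), which is how HTTP defines those two codes.

```typescript
// Sketch only: negotiate and NegotiationResult are hypothetical names.

const REQ_V1 = "application/vnd.foo.myapp.myfooreq.v1+json";
const RESP_V1 = "application/vnd.foo.myapp.myfooresp.v1+json";

interface NegotiationResult {
  httpStatus: number;
  contentType: string | null;
}

function negotiate(contentType: string, accept: string): NegotiationResult {
  // The request body must be a message type this endpoint understands.
  if (contentType !== REQ_V1) {
    return { httpStatus: 415, contentType: null };
  }
  // Strip parameters such as ";q=0.9" and compare the bare media types.
  const accepted = accept
    .split(",")
    .map((part) => part.replace(/;.*$/, "").trim());
  if (accepted.indexOf(RESP_V1) === -1 && accepted.indexOf("*/*") === -1) {
    return { httpStatus: 406, contentType: null };
  }
  return { httpStatus: 200, contentType: RESP_V1 };
}
```

Versioning then becomes a matter of adding a `v2` media type next to `v1` rather than changing URLs.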


SOAP had a lot of the right ideas. It didn't fail because the concept behind it was bad. The main reason it failed was that, although the "S" stands for "simple", it wasn't. Maybe it was simple compared to CORBA or something.



> Maybe it was simple compared to CORBA or something.

Hell no! CORBA was a binary protocol, but it was much easier to reason about and debug than SOAP.

Somebody put that "simple" there with no concern about semantics.


At one point during grad school (after far too much RPC middleware implementation work) I was able to decode hex dumps of CORBA packets in my head


I was sure able to write them.

I was never able to write a complete SOAP response, and I don't think I was once able to predict one without running first (parsing after the fact is easier, that I can do).


It would probably be more accurate to call GraphQL the XSLT of our time.

But really, all this serialization crap is what programmers are legendary for: trading one form of busywork (reading bytes into object fields) for an even more obscure form of busy work (specifying a generator of the process).


This is how tech works. The old things are new again. (Let's not forget about CORBA, Sun RPC, and others.)


Well, if data serialization, transfer and deserialization with protobuf is faster then that's a big improvement over SOAP and XML.


We built out our gRPC services starting about 8 months ago. I pounded my fists on the gRPC-web issue board a bunch of times. Things started moving. In the end, it was too little and the effort is too divided. Also, the route they wanted to take to implement streaming was, in my opinion, poorly designed.

We have since switched to GraphQL and haven't looked back.


We are looking at this as well.

Can I ask a question?

Seems GraphQL and gRPC are to some extent complementary: one is graph/data oriented, the other more functional/service oriented.

I know GQL has its own ideas about transport, but would it make any sense at all to actually put GQL over gRPC? And by that I mean to have some gRPC services pass GQL requests as parameters? Or have I misunderstood entirely?


GQL and gRPC are absolutely complementary. Having resolvers fan out to gRPC-backed microservices has been great at our company. We initially used protos for all of our service contracts (including server <-> web UI communication). While this was nice for all the reasons other people have stated, protos kinda ended up sucking to work with on the front-end.

Protos are a serialization contract and should remain such. Too much proto-specific logic ended up bleeding into our web codebase (dealing with oneofs, enums, etc). GQL's IDL on the other hand ended up being a perfect middle-ground. It gave us a nice layer to deal with that serialization specific stuff, while letting the front-end work with better data models (interfaces, unions, string enums, etc.). GQL's IDL and TypeScript are a great match, since GQL types are ultimately just discriminated unions, which TS handles like a charm.


Can you comment more about your comparative experiences with "logic bleed"?

I find this fascinating, because it seems that a lot of the bleed should be the same (isn't 'oneof' roughly equivalent to 'union'?)... but it sounds like something is different in practice, and I'd really like to understand what the root cause of the difference is.


I appreciate you calling me out on my wording, because you are spot on. If anything GraphQL “bleeds” even more into my front-end code. I think the appropriate way to frame it is that GQL is a more targeted, holistic solution to the problem of fan-out data aggregation from the perspective of a UI client. Every UI codebase I had that depended on protos had significant chunks of code transforming the data into something more appropriate for our UI domain objects (that lived either in the front-end code base or one abstraction higher in a "BFF" [1]). Our UIs usually wanted to work with denormalized data structures, which was obviously in conflict with the proto models owned by small individual microservices.

GQL simultaneously addresses those two specific problems: resolution of normalized data, and giving UI consumers the power to declaratively fetch their desired data shape. It also has first-class TypeScript support through Apollo and the open-source community built around that.

I think it’s important to stress the tooling support, because you are correct… oneofs and union types are conceptually the same thing. A lot of it comes down to ergonomics in how you consume those types. In code generation GQL unions represent themselves as actual TypeScript union types, which means I can write type guards or switch on their discriminant to narrow to their correct member, whereas proto oneofs use a string value to access the correctly populated property. Small things in the day-to-day, but in how it manifested itself in code, it definitely felt like an improvement.
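A rough sketch of that ergonomic difference. Both shapes are hand-written to mimic what code generators tend to emit and are not the output of any specific tool; all type and field names are hypothetical.

```typescript
// Sketch only: hypothetical, hand-written shapes, not real generator output.

// GraphQL-style union: a discriminated union keyed on __typename.
interface CardPayment {
  __typename: "CardPayment";
  last4: string;
}
interface BankTransfer {
  __typename: "BankTransfer";
  iban: string;
}
type PaymentMethod = CardPayment | BankTransfer;

function describeUnion(method: PaymentMethod): string {
  switch (method.__typename) {
    case "CardPayment":
      return "card ending in " + method.last4; // narrowed to CardPayment
    case "BankTransfer":
      return "transfer from " + method.iban;   // narrowed to BankTransfer
  }
}

// Oneof-style message: every member is an optional property and a separate
// string says which one is populated.
interface PaymentMethodProto {
  methodCase: "card" | "bank";
  card?: { last4: string };
  bank?: { iban: string };
}

function describeOneof(msg: PaymentMethodProto): string {
  // The compiler cannot connect methodCase to the optional properties, so
  // each access needs its own runtime check.
  if (msg.methodCase === "card" && msg.card !== undefined) {
    return "card ending in " + msg.card.last4;
  }
  if (msg.methodCase === "bank" && msg.bank !== undefined) {
    return "transfer from " + msg.bank.iban;
  }
  throw new Error("oneof case does not match the populated field");
}
```

In the union version, adding a third member makes the compiler flag every `switch` that forgot it; in the oneof version the mismatch only shows up at run time.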

GQL unions also give you the power to do some really cool projection declaratively in queries [2]. Once again because of the nice compatibility of TS’ type system and GQL, the types returned from those queries code generate into really nice structures to work with.

I’m getting a bit rambly and don’t feel like I adequately answered your question, but it’s a bit late and I wanted to give you some response off the top of my head. I don’t want to knock on grpc-web or anything. A good deal of it has to deal with code ownership and team communication structures, and GQL ultimately felt like a better seam for our UI team to interact with our services.

I probably should write a blog post, because I have a lot of disconnected thoughts and need to have a more coherent narrative here. I think some code examples would better illustrate what seems like non-problems from how I've described them here. I’ll follow up once I’ve let it settle in my head.

[1] https://samnewman.io/patterns/architectural/bff/

[2] https://graphql.org/learn/schema/#union-types


I didn't mean it as a call-out at all! :) I've had similar intuitions but been struggling to put a finger on it and wordsmith well on the topic. Thank you for writing!

And yes please do continue to write more, will eagerly read :)


Damn, thanks for that write up.


This deserves a blog post :)


I think you can use both[1]; gRPC doesn't compete with GraphQL really.

[1] https://github.com/google/rejoiner


Could someone explain the difference between gRPC and Apache Thrift? Even though the Apache Thrift project is not as active as gRPC, it works quite well for us.


Thrift is what xooglers at Facebook implemented when they wanted Stubby. grpc is Google's own open source edition of Stubby.

They are very similar paradigm-wise.


Wait, what? IIRC Stubby is a lockserver, used for two things: coordinated locks, and small coordinated data. You mean protobuf, don't you? Am I misremembering and/or wrong?


You're thinking of Chubby. :) https://ai.google/research/pubs/pub27897


I think they're rather similar in goals. I found this overview informative: https://youtu.be/RoXT_Rkg8LA


gRPC is protocol agnostic. You can use any wire type with it, like JSON. There's even a codec for thrift messages on gRPC called grift.


The main difference: grpc can do bidirectional streaming, and not only request-response RPC.


Slightly off topic, but has anyone executed gRPC on a C platform for embedded? I'm willing to forgo the automatic code generation - but hell if there are any good resources I've been able to find on this.


By embedded platform, do you mean a microcontroller with an RTOS? Afaik there are no suitable implementations yet. It will generally be pretty hard to build something like this, since grpc requires HTTP/2, and that is hard to implement with no dynamic allocations and a low amount of memory. E.g. each substream requires a minimum amount of buffering, there is header decompression, etc.

For pure protobuf communication on embedded systems nanopb works fine. If one doesn't need concurrent streams, HTTP/1.1 or coap plus nanopb encoded payloads work fairly well.


The problem I have with nanoPB, is that the RPC comm between the app over BLE and the server over http is entirely up to me. There are no suggestions, standards, or even great examples of what the pb should contain and how it gets processed at each location.

It almost becomes RESTful where I have a proto that has a request field, response field, and all the optional fields for both, when the server or app sees one of these I need all custom implementation to decide what to do with these ‘states’. It seems wrong and complex to make it all arbitrarily handled at each of three places.

NanoPB is excellent. But it’s the handling of the data (calls, req/res, timeouts, errors, formats, patterns) that I wish there was something for.

And yea, you got it right that with no HTTP layer it wouldn’t be gRPC on the device. But my app and server could still use that, if only the device could process the intention of the gRPC format changes / extensions.


I am using the HTTP Gateway and generating OpenAPI from that for type-safe access [1]. The main downside is that this introduces an additional build step in your project and a small amount of additional run-time processing. The upside is supporting a JSON API and not requiring gRPC(-Web) for clients. Even for non-browser clients, gRPC tooling can still be heavy-weight to pull into your project. Not all languages have easy to use OpenAPI integration, but any language can always just send some JSON.

But with the HTTP Gateway you are actually supporting both, so a browser client can still use gRPC-Web if it is able.

[1] https://github.com/gogo/grpc-example


Hi, I'm the author of this blog post _and_ the author of this repository - so happy to hear you're finding it useful! I agree that the grpc-gateway is very useful still, but for greenfield projects I think it would be useful to consider the grpc-web on its own merits. I see the grpc-gateway as a way to integrate gRPC into existing environments.


One other thing that helps JSON/REST remain the de facto standard these days is not just web browser support, but also nice mobile clients like Retrofit.

Retrofit seamlessly converts JSON data into Java classes by means of a JSON deserializer. So we get typed semantics (albeit at run time only).

I have Retrofit integrated with RxJava, and when I looked at what it would take to 'plug in' gRPC, I found that I would have to change not just my backend, but all the wiring for the mobile frontends. And, at the time, it looked like too much work.

If Retrofit made the choice between JSON and gRPC transparent, it would be great for folks that have already invested in Retrofit/JSON.


gRPC is two things to me (please correct me if I'm forgetting something):

- A specification language for services and the RPC calls they take (except RPC response primitives include single message and stream)

- A binary object definition/packing scheme (aka protobuf)

Outside of the HTTP and what HTTP/2 makes possible, I feel vaguely like everyone is rushing to replace pure HTTP/HTTP2 (and HTTP3 in the future) with something that is less extensible and could be implemented inside HTTP for the most part.

Obviously you can't do anything about the inefficiencies of headers in HTTP/1 (this is better in HTTP/2 & 3), the lack of built-in stream semantics (again only HTTP/1 though you can make do with some longpolling/SSE/websockets scheme) but outside of the HTTP stuff you can absolutely transmit your content as a stream of tightly packed bytes and let consumers do whatever they need to...

Why is gRPC anything more than a content type? It's so weird to see gRPC evolve from (in some sense, I might be wrong) HTTP + HTTP/2, and now people trying to shoe horn it back into the browser.

It's really hard to find metrics (maybe I should do a comparison and post results I guess), but there are a few, like in this SO post[0]. The hype train is presenting gRPC as the next thing, but it shouldn't be, IMO. Most of the improvement is from use of protobuf for more efficient (de)serialization -- and I don't think gRPC is the best tool for declaring API schemas (RPC or otherwise) either.

[0]: https://stackoverflow.com/questions/44877606/is-grpchttp-2-f...


I think what you describe is the shortcoming of the initial grpc spec. It was for some reason decided to define it on top of features which are barely implemented outside of HTTP/2 and special libraries (especially: Trailers). But it would have been possible to just define things on top of common HTTP semantics. This is what grpc-web now fixes according to my understanding.
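To illustrate the trailers point: as far as I understand the grpc-web wire format, each message keeps gRPC's 5-byte prefix (one flag byte plus a 4-byte big-endian length), and the trailers are sent inside the response body as a final frame whose flag byte has the high bit (0x80) set, encoded like an HTTP/1 header block. The sketch below parses such a body. `splitGrpcWebFrames` and `parseTrailers` are hypothetical helpers, and the code assumes the whole body is already in memory and uncompressed.

```typescript
// Sketch only: splitGrpcWebFrames and parseTrailers are hypothetical helpers.
// Assumes the complete, uncompressed response body is available in memory.

interface WebFrame {
  isTrailers: boolean;  // true when the high bit (0x80) of the flag byte is set
  payload: Uint8Array;
}

function splitGrpcWebFrames(body: Uint8Array): WebFrame[] {
  const view = new DataView(body.buffer, body.byteOffset, body.byteLength);
  const frames: WebFrame[] = [];
  let offset = 0;
  while (offset + 5 <= body.byteLength) {
    const flags = view.getUint8(offset);
    const size = view.getUint32(offset + 1, false); // big-endian length
    const start = offset + 5;
    if (start + size > body.byteLength) {
      throw new Error("truncated frame");
    }
    frames.push({
      isTrailers: (flags & 0x80) !== 0,
      payload: body.subarray(start, start + size),
    });
    offset = start + size;
  }
  return frames;
}

// Trailers travel as an HTTP/1-style header block: "key: value\r\n".
function parseTrailers(payload: Uint8Array): { [key: string]: string } {
  let text = "";
  payload.forEach((byte) => {
    text += String.fromCharCode(byte);
  });
  const trailers: { [key: string]: string } = {};
  for (const line of text.split("\r\n")) {
    const colon = line.indexOf(":");
    if (colon > 0) {
      trailers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
    }
  }
  return trailers;
}
```

Because the status ends up in the body, a plain XHR or fetch response is enough to recover it, with no dependency on HTTP trailers reaching the page.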

I think even streaming should have been possible with HTTP/1.1. Bodies can be streamed there just fine, if libraries support it (for browsers the issue was up to now that the APIs don't support access to bodies as streams). The only thing I'm not sure about is whether there is an issue in HTTP/1.1 with request streams still running while the response stream has already finished.


> I think what you describe is the shortcoming of the initial grpc spec. It was for some reason decided to define it on top of features which are barely implemented outside of HTTP/2 and special libraries (especially: Trailers). But it would have been possible to just define things on top of common HTTP semantics. This is what grpc-web now fixes according to my understanding.

This is also my understanding (minus the Trailers bit) -- there's stuff like grpc-gateway[0] that I've considered using before so I know there's a mapping (whether it's easy to use is another thing) from grpc to HTTP/1.1 ...

> I think even streaming should have been possible with HTTP/1.1. Bodies can be streamed there just fine, if libraries support it (for browsers the issue was up to now that the APIs don't support access to bodies as streams). The only thing I'm not sure about is whether there is an issue in HTTP/1.1 with request streams still running while the response stream has already finished.

For streaming I was thinking mostly of the duplex streams, i.e. what Websockets brought to the table, everything else would be pretty hacky. I suspect that grpc translates to regular browser-ready HTTP/1.1 REST pretty easily (outside of trying to decide how), minus the streaming bit, since you'd have to do some sort of comet/long polling/sse/websockets approach.

[0]: https://github.com/grpc-ecosystem/grpc-gateway


Has anyone considered using this or something else for IPC – that is, communicating between memory boundaries in the same application (e.g., Node/web, Swift or C# to a web container, etc.)?

Even just for Electron the built in IPC API doesn’t really cut it long term since you don’t get typing nor proper request/response or topic based streaming support so bugs are more likely to sneak in.


The state of gRPC in the browser is the main reason why my company selected Twirp instead. Being able to natively call a simple HTTP endpoint with a JSON payload is fantastic; it's the benefits of protobufs when you can use them, with a graceful fallback for clients that need normal HTTP. Of course you can get this with grpc-gateway, but now you have two things to maintain.


I've had issues with npm and binaries of gRPC in firebase related to node versions and electron versions. It seems like this could be a good alternative if it was used in firebase since it's implemented in typescript - not sure I completely understand it, though.


I'm quite surprised they're not using websockets for streaming. Anyone know why?

EDIT: Better question: is anyone aware of a system like gRPC but built with bidirectional streaming for the browser from the beginning?


I think you've unwittingly asked for DDP and Meteor.

Meteor was seriously ahead of its time, much like all good systems that suffered under their complexity before the infrastructure was there to support them. At its core, it's a websocket-based RPC protocol that could return database cursors, with asynchronous downloading of database documents. The client ran a miniature instance of MongoDB inside the browser, which enabled clients to issue normal MongoDB queries client-side with optimistic evaluation. Data updates would hit the local mongo instance, be optimistically rendered, and then be sent to the server. Then the source-of-truth would be delivered back to the client via the mongodb oplog.
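The optimistic-update flow described above can be sketched in a few lines. This is a hypothetical in-memory illustration (`OptimisticStore` is a made-up name) and uses no Meteor, DDP, or MongoDB APIs; it only shows the idea of rendering local writes immediately and letting the server's version win once it arrives.

```typescript
// Sketch only: OptimisticStore is a hypothetical in-memory illustration.

interface PendingWrite {
  id: number;
  key: string;
  value: string;
}

class OptimisticStore {
  private confirmed: { [key: string]: string } = {};
  private pending: PendingWrite[] = [];
  private nextId = 1;

  // Apply a write locally right away and remember it until the server answers.
  write(key: string, value: string): number {
    const id = this.nextId++;
    this.pending.push({ id, key, value });
    return id;
  }

  // The server's version is the source of truth: store it and drop the
  // matching optimistic write, whether or not the server agreed with it.
  acknowledge(id: number, key: string, serverValue: string): void {
    this.confirmed[key] = serverValue;
    this.pending = this.pending.filter((w) => w.id !== id);
  }

  // What the UI renders: confirmed data overlaid with still-pending writes.
  read(key: string): string | undefined {
    let value: string | undefined = this.confirmed[key];
    for (const w of this.pending) {
      if (w.key === key) {
        value = w.value;
      }
    }
    return value;
  }
}
```

A typical sequence is `write`, render from `read`, then `acknowledge` when the server's authoritative value comes back over the socket.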

In other words, it's bi-directional mongodb oplog streaming. Absolutely fantastic considering it was released in 2012.


Websockets have head-of-line blocking, which is one of the main reasons HTTP/2 exists.

HTTP/2 (i.e. gRPC) is a bidirectional streaming protocol, and you can use the fetch API in JS to use it. The reason gRPC-web exists is because browsers artificially hide some of the headers, which have been part of the HTTP standard since the beginning. If that was fixed, gRPC would just be plain XHR or fetch requests and gRPC-web would go away.


I'm the author of this blog post and one of the maintainers of the Improbable grpc-web implementation.

I don't explicitly mention it in the post, but no browser has support for fetch request streaming yet, so true bidirectionality would not be possible even if you had control over the headers. This will come eventually, and then grpc-web will have proper bi-di streaming support. It is doubtful whether it will actually have access to raw HTTP/2 frames which would be required for the gRPC HTTP/2 protocol.


Correct me if I'm wrong, but my understanding is that HTTP/2 only half solves the head of line blocking problem. It's true that you don't have message-level blocking like websockets, but it's still TCP so streams can block each other if there's packet loss. QUIC seems like the full solution.

Also, is head of line blocking really that big of a problem for RPC? It only seems to really be a big deal if your latency tolerance is really tight.


Check out WAMP? It's websockets + RPC.

http://wamp.ws

And a server that implements it:

http://crossbar.io https://github.com/crossbario/autobahn-js


Both protocols implement their own framing scheme(s). Once you start looking at the details of such a suggestion, it becomes kind of a hack to use websockets for streaming gRPC, and it is likely to have scaling issues. One can do it as a proof-of-concept and it may work fine in low-traffic scenarios. I think it will not be a reality until browsers expose some sort of fetch-like http2 primitive to developers.


I guess a more implicit part of my question is why didn't they design it to work well on top of websockets from the beginning, rather than doing their own framing? If you assume you have a reliable in-order messaging protocol to build on top of, then you can swap that protocol out for different cases (ie websockets/WebRTC on the web, reliable UDP/SCTP/etc for other cases). I say this with very little background in gRPC. I'm working on a protocol that aims to do exactly this[0] (but for streaming only, not RPC), and I'd like to learn as much as possible from smarter people who have done similar projects.

[0] https://github.com/anderspitman/omnistreams-spec


Improbable's client does have experimental websocket streaming.


I'm surprised that streaming is not implemented for web client. Streaming seems to be a huge feature for gRPC. Why does it take so long?


I'm excited to have this as an option in the toolbox.


What's the advantage of using this over something like XMLHttpRequest?


grpc-web does indeed use XMLHttpRequest internally to make the request to the server. The main advantages are type safety of the overall communication protocol and some efficiency gains. It cannot be stated strongly enough how important compile-time type-safety guarantees between client and server are (particularly in a "microservices" environment where everything is disconnected at runtime): they eliminate an entire class of bugs and free developers to focus on more business-relevant tasks rather than worrying about serialization.


XHR is really just an HTTP client, despite the name. It doesn't enforce anything regarding inputs and outputs of an HTTP request.

gRPC gives you guarantees on those inputs and outputs, along with a bunch of extra features. It's basically a layer on top of XHR.


XMLHttpRequest does not have a remote procedure call (RPC).


I meant what's the advantage of using RPC instead of XMLHttpRequest to communicate with the server.


Well, XMLHttpRequest alone doesn't cover much so you're really comparing with HTTP requests combined with JSON and whatever you use to document your JSON API.

The advantage of protobufs is basically static typing with backward-compatible serialization (you can add new fields), that's compatible with many server-side languages.

It's honestly an awkward fit for web apps, though. Basic things are different, like int64's aren't native to JavaScript.
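The int64 mismatch is easy to demonstrate. JavaScript numbers are IEEE 754 doubles and are only exact up to 2^53 - 1, which is why the proto3 JSON mapping represents 64-bit integers as decimal strings. The helper below is a hypothetical sketch of one way to handle such a field, not something a protobuf library provides under this name.

```typescript
// Sketch only: decodeInt64Field is a hypothetical helper. It assumes the
// value arrives as a decimal string, as in the proto3 JSON mapping.

// 2^53 + 1 cannot be represented as a double, so the conversion silently
// lands on a neighbouring value.
const lossy = Number("9007199254740993");

// Keep values as numbers when that is safe, otherwise keep the string so no
// digits are silently changed.
function decodeInt64Field(raw: string): number | string {
  const asNumber = Number(raw);
  return Number.isSafeInteger(asNumber) ? asNumber : raw;
}
```

Returning `number | string` pushes the decision to the caller, which is awkward but at least makes the precision problem visible in the types.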


> An efficient JSON-like message encoding

how is this still "the future"?


Another team at my company developed a microservice that uses gRPC that my team depends on and it’s been an absolute nightmare. Protobufs are just too limited in what they can do to make them useful except in very specific, narrow cases.


That's surprising, since Google uses protobufs very (very!) extensively internally, and it'd have to be a very exotic use case for protobufs not to be useful. Which isn't to say they're the be-all-end-all, but I'd be curious to hear your use case.


I haven't looked into gRPC at all, but I know for a fact that Protobufs can do whatever is needed in regards to serializing data.

Being limited to types that Java supports is a huge limiting factor for some use cases, thankfully there are extensions to the spec to get around these limitations (but then you have to use libraries that all support the same extensions, and that limits your choices, so yes this can become an issue!)

On the flip side, compared to JSON with its anemic type support, Protobufs looks great!

Of course at the end of the day, a huge % of apps have to talk to the web browser, so everything gets dumbed down to strings and doubles. :(


> compared to JSON with its anemic type support

Am I the only one who wants to keep serialization as far from my type system as possible? I want to have my internal data model for my software, and when I want to serialize it, I should be able to do it in any number of different ways (redacting values, using HATEOAS links for REST routes, using a lossless format for storage) depending on the situation.

Protobuf couples data modeling and data serialization into a single inseparable concern, and it becomes difficult to do anything else once you start using it.


It is a performance trade off.

Sure you can use JSON and make everything strings and do something like

    { type: 'int8', value: '52' }
 
but then parsing out the data ends up being a huge overhead.

I once saw an XML variant of this, where someone decided to make the ultimate Distributed Computing System and serialize every single function call. It took up over half of a CPU core for de/serialization, for just the one app running on that machine.

While I could admire the purity of the design, it was utterly insane as an actual thing to bring into the world.

Protobuf is a good mixture of "strict typing so everyone is on the same page", "freedom to do stuff", and "perf isn't horrible."

There are other encoding systems out there that offer even more freedom, e.g. cap'n'proto, and better performance, but with other trade offs.

On the opposite side of things, my team had been serializing straight C structs out over the wire, but every field addition was a breaking change, and communicating the changes to our structures across teams was a nightmare of meetings and "has your team merged the changes Bob made so we can roll out our new format yet?" We needed 3 teams to roll out changes at the same time!

With protobufs, we were able to make changes to our wire format incredibly rapidly, we had a nice source control managed asset that defined our format, there weren't any confusions as to how data was laid out, and clients using older versions of our definitions just missed out on newer features, nothing actually broke.
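
The "older clients just missed out on newer features" behaviour comes from the wire format: every field carries its number and wire type, so a reader can skip fields it does not know. A rough TypeScript sketch of that idea, not the real protobuf library: the message layout (field 1 = id, field 2 = name, field 3 = a newer field) is invented for illustration, and only varint and length-delimited fields are handled.

```typescript
type OldClientMessage = { id?: number; name?: string };

// Reads one base-128 varint starting at `pos`; returns [value, nextPos].
function readVarint(buf: Uint8Array, pos: number): [number, number] {
  let result = 0;
  let shift = 0;
  while (true) {
    if (pos >= buf.length) throw new Error("truncated varint");
    const byte = buf[pos++];
    result += (byte & 0x7f) * 2 ** shift; // multiply instead of << to avoid 32-bit overflow
    shift += 7;
    if ((byte & 0x80) === 0) return [result, pos];
  }
}

// "Old" client: only knows field 1 (id) and field 2 (name).
function decodeOld(buf: Uint8Array): OldClientMessage {
  const msg: OldClientMessage = {};
  let pos = 0;
  while (pos < buf.length) {
    let key: number;
    [key, pos] = readVarint(buf, pos);
    const fieldNumber = Math.floor(key / 8);
    const wireType = key % 8;
    if (wireType === 0) {
      let value: number;
      [value, pos] = readVarint(buf, pos);
      if (fieldNumber === 1) msg.id = value; // any other field number is skipped
    } else if (wireType === 2) {
      let len: number;
      [len, pos] = readVarint(buf, pos);
      const bytes = buf.subarray(pos, pos + len);
      pos += len;
      if (fieldNumber === 2) {
        // ASCII-only decoding, good enough for this sketch.
        msg.name = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
      }
    } else {
      throw new Error("wire type " + wireType + " not handled in this sketch");
    }
  }
  return msg;
}

// Bytes from a "new" server that has since added field 3.
const fromNewServer = new Uint8Array([
  0x08, 0x96, 0x01,             // field 1, varint 150
  0x12, 0x03, 0x62, 0x6f, 0x62, // field 2, length 3, "bob"
  0x18, 0x01,                   // field 3, varint 1 (unknown to the old client)
]);

const decoded = decodeOld(fromNewServer);
```

The old decoder still recovers `id` and `name` and simply steps over field 3, which is the property that let the teams above stop coordinating simultaneous rollouts.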

It was an insanely large improvement, and honestly for the managed platforms, the performance wasn't appreciably worse than trying to convince Java to read in uint8s and write out uint8s.

My team was working on an embedded platform with RAM measured in kilobytes, and it was worth us eating the overhead just to get rid of the countless meetings we had to hold whenever we made a change to any of our structures.


Your internal data model should be separate from protobufs. Protobufs are for modelling the contract between parties of the serialized content so that the producer knows what to produce and consumer knows what data to expect. Only when you have a need to serialize your internal data model into protobufs would you touch protobufs, just like you would with any other serialization format.

The structures that are generated by the protobuf tool are simply there to help with transforming your internal state into the format needed to satisfy the shared contract, allowing your language's compiler to tell you if you have violated that contract. Theoretically you could produce/consume protobufs without them, but they are provided as a convenience so that you can deal with serialization transforms directly in your language of choice instead of banging bytes.
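
A small TypeScript sketch of that separation, under the assumption that `AccountMessage` stands in for whatever type a protobuf code generator would emit (both types and both function names here are hypothetical):

```typescript
// Internal model: owned by the application and free to change.
interface Account {
  id: number;
  email: string;
  passwordHash: string; // internal only, never part of the contract
  createdAt: Date;
}

// Wire contract: stand-in for a generated protobuf message type.
interface AccountMessage {
  id: number;
  email: string;
  createdAtUnixMs: number;
}

// The only place where the internal model meets the contract.
function toWire(account: Account): AccountMessage {
  return {
    id: account.id,
    email: account.email,
    createdAtUnixMs: account.createdAt.getTime(),
  };
}

// Consumers rebuild their own view from the contract, not from the producer's model.
function fromWire(message: AccountMessage): Omit<Account, "passwordHash"> {
  return {
    id: message.id,
    email: message.email,
    createdAt: new Date(message.createdAtUnixMs),
  };
}

const account: Account = {
  id: 1,
  email: "a@example.com",
  passwordHash: "not-a-real-hash",
  createdAt: new Date(1546992000000),
};
const wire = toWire(account);
const roundTripped = fromWire(wire);
```

If the contract changes, the compiler flags `toWire` and `fromWire`, while the rest of the code keeps working against `Account`.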


I think most people don't take the time to discriminate between their models and protobuf's. They see automagically created data models and think WELP LET'S USE THIS EVERYWHERE.


>Protobuf couples data modeling and data serialization into a single inseparable concern, and it becomes difficult to do anything else once you start using it.

This is only true if you decide that your storage proto and wire proto are the same. That's not at all necessary (and generally not recommended). More to the point, FieldMask exists in proto, so redaction is fully supported, though I rarely see it used.
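
For anyone who hasn't seen a field mask: it is essentially a list of dotted field paths to keep. The sketch below is a hand-rolled illustration in TypeScript, not the official FieldMask utilities; `applyFieldMask` and the sample object are hypothetical.

```typescript
type Json = { [key: string]: unknown };

// Returns a copy of `source` containing only the fields named in `paths`.
// Paths use dotted notation for nested fields, e.g. "profile.name".
function applyFieldMask(source: Json, paths: string[]): Json {
  const out: Json = {};
  for (const path of paths) {
    const parts = path.split(".");

    // Walk the source first; paths that do not exist are ignored.
    let value: unknown = source;
    let found = true;
    for (const key of parts) {
      if (typeof value !== "object" || value === null || !(key in value)) {
        found = false;
        break;
      }
      value = (value as Json)[key];
    }
    if (!found) continue;

    // Recreate the nesting in the output, then set the leaf.
    let dst: Json = out;
    for (const key of parts.slice(0, -1)) {
      const next = dst[key];
      if (typeof next !== "object" || next === null) dst[key] = {};
      dst = dst[key] as Json;
    }
    dst[parts[parts.length - 1]] = value;
  }
  return out;
}

const user: Json = {
  id: 7,
  email: "a@example.com",
  profile: { name: "Ada", ssn: "000-00-0000" },
};

// Keep only the id and the nested name; everything else is redacted.
const redacted = applyFieldMask(user, ["id", "profile.name"]);
```

The caller decides which fields come back, which is also why it loosely resembles selecting fields in a GraphQL query.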


Not true - protobuf has a JSON representation: https://developers.google.com/protocol-buffers/docs/proto3#j...


Really? Besides the fact that nearly every message across Google is protobuf-encoded (with a fairly rich schema), protobufs are general enough to encode any message (you could use them to just pass bytes around), so it's hard to see why this would be true. Can you be more specific?


What kind of limits are you seeing?


I'm also curious


It’s the repeated/oneof issue that’s causing me problems right now.


You mean like:

  repeated oneof actions {
    ActionTypeOne type_one;
    ActionTypeTwo type_two;
  }

or like:

  oneof thingy {
    repeated string first_option;
    string second_option;
  }

I can see why you'd want to do either of those things, but it doesn't seem like a huge deal to me to wrap those in another message.


  oneof thingy {
    repeated string first_option;
    string second_option;
  }

Not the most beautiful solution, but I've seen it in production and it works fine.


And why can't you wrap it in another message?


Can you go into detail on this?

Perhaps you would prefer something like Ion: https://amzn.github.io/ion-docs/guides/why.html



