
A new Go API for Protocol Buffers - zdw
https://blog.golang.org/a-new-go-api-for-protocol-buffers
======
dgellow
> The github.com/golang/protobuf module is APIv1.

> The google.golang.org/protobuf module is APIv2. We have taken advantage of
> the need to change the import path to switch to one that is not tied to a
> specific hosting provider.

That's such a weird choice, and will be quite confusing. Now every time you
see protobuf being imported, you have to be sure to pay extra attention to the
domain used, and correctly remember which one is v1 or v2. If only go modules
established a clear way to differentiate API versions using the import path!
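(For reference, the convention in question is semantic import versioning: under Go modules, major versions v2 and up carry the major version in the module path. A hypothetical example:)

```
// go.mod for the v2 major version of a hypothetical module
module example.com/mylib/v2
```

Consumers then write `import "example.com/mylib/v2/mylib"`, and v1 and v2 of the same module can even coexist in a single build.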

~~~
icholy
> If only go modules established a clear way to differentiate API versions
> using the import path!

It did, they just chose not to follow it :/ I wonder what rsc thinks about
this decision.

~~~
Insanity
Rsc?

~~~
BarkMore
Russ Cox
[https://news.ycombinator.com/user?id=rsc](https://news.ycombinator.com/user?id=rsc)

~~~
Insanity
Thanks!

------
AaronFriel
Not to be too Rust evangelist here, but why the emphasis on reflection instead
of further support for generics?

On the Rust side, macros can[1] transform .proto files into data structure
definitions at compile time, and instances of the structs passed around have
the benefit of strong typing, code completion in editors, lints and tests can
easily validate properties of protobuf objects and so on. And all without
adding more codegen steps which is nice - the macros can replace a build
system having to do codegen as a separate step.

How often do you need to mutate _any_ protocol buffer and remove (redact, as
they say) fields, generating a completely invalid protocol buffer object? Will
anything downstream match the ".proto" of an object that's missing a bunch of
required fields? Wouldn't it be easier to just put all of the sensitive data
into an optional "sensitive_data" field that's defined by another .proto, and
strip that?

[1] - [https://github.com/danburkert/prost#generated-code-example](https://github.com/danburkert/prost#generated-code-example)

~~~
stubish
A Go API for protobufs can't support generics because Go doesn't support
generics, yet. I guess the protobuf team didn't want to wait until Go v2;
probably a good call given Go v1.x is going to be around for years.

In Go, a code generator converts the .proto files into Go code, which gives
the same benefits you see with the Rust macros. It's arguable whether a code
generation step is better or worse than macros.

The reflection is needed if you have received a protobuf the code doesn't have
a definition for. You can now implement something like a gRPC proxy or gRPC
load balancer in Go without needing to compile it with code from the specific
.proto files. You also appear to be able to access annotations on the message
definitions, which are not embedded in the generated Go structs. Rust may well
have similar features in its API; Java certainly does. A gRPC proxy is a use
case for redacting sensitive data, e.g. when you use it to create an audit log
of the messages in the requests and responses.

~~~
jsnell
If you don't have the proto definition, there's nothing you can do except pass
the object through unmodified. And you should not need a reflection API for
that (unless the v1 API was totally messed up in other ways).

Without the definition (and knowing the type!) there won't be anything around
to tell the reflection API what the names, types and annotations are. All you
would have is field numbers mapped to opaque blobs of data.

~~~
uluyol
You can pass around a type descriptor (itself a proto message) along with the
opaque message.

------
guessmyname
> _The github.com/golang/protobuf module is APIv1._

> _The google.golang.org/protobuf module is APIv2._

> _We have taken advantage of the need to change the import path to switch to
> one that is not tied to a specific hosting provider._

> _(Why start at version v1.20.0? To provide clarity. We do not anticipate
> APIv1 to ever reach v1.20.0, so the version number alone should be enough to
> unambiguously differentiate between APIv1 and APIv2.)_

Stuff like this is what makes Go confusing to newcomers.

For some time I couldn’t understand why people were complaining so much about
_—the now deprecated—_ GOPATH environment variable, and later about the
confusion of how to use all the different and incompatible dependency
managers. Then, suddenly last year, it hit me… Go is confusing to people who
haven’t followed the language from the beginning.

I started programming in Go in 2013 so for me it was very easy to adapt to
every change in the language and the community.

I can only imagine all the ambiguous stuff people have to understand when
learning the language in 2019-2020.

~~~
breakingcups
I loved Go's simplicity when it came out.

However, it has become clear to me that it was a little _too_ simple to handle
all the new requirements and additions that came after v1.

That's how you get weird comments-as-code (// +build linux,386 darwin,!cgo),
strings-as-attributes (tags), tacked-on module versioning / vendoring,
non-typed generics (interface{} everywhere), somewhat-typed generics (go
generate), etc.

Go was not _quite_ ready to handle all these directions people wanted the
language to stretch to and as a result it's now already full of annoying
warts.

I'll still pick Go if I need a simple service with amazing performance.
Funnily enough though, I have no need for any of the (non-performance)
additions that came after 1.6.

------
zdw
Interesting that even Google decided to do complex versioning gymnastics and
introduce a new namespace in order to avoid how `go mod` adds the version as a
path component.

~~~
neild
Nah, we just really wanted to stop using an import path tied to a specific
hosting provider. Once we decided to change to an entirely different path, we
waffled a lot over whether to tag it v1 or v2. There were arguments for and
against either, so we eventually picked one and went with it.

~~~
jzwinck
Sure, but why version the new package as v1.20 instead of v2.x?

~~~
farslan
Because they would have had to add a `/v2` suffix to the import path, and it
seems like they didn't want to add that. The question is, why was this not
released as v1.0.0 then? Still trying to understand the reasoning for this.

~~~
neild
Reasoning went something like this:

We could tag the new API v2:

\+ Makes the "v1" and "v2" distinction very clear in the import path.

\- Confusing: google.golang.org/protobuf@v1 doesn't exist, but v2 does.

\- In ten years, hopefully nobody cares about the old
github.com/golang/protobuf and the confusion is gone.

We could tag the new API v1:

\- Less visually distinct in the import path.

\+ Seems to make sense for the first version of google.golang.org/protobuf to
be a v1.

\+ If we decide it was a terrible idea, it's easier to go from v1 to v2 than
to roll back from v2 to v1.

We waffled back and forth for quite a while on this, and eventually decided
that the first version of google.golang.org/protobuf would be a v1. Then as we
got closer to release (but with a certain amount of usage of v0 in the wild),
we decided not to second-guess that decision but to start with a version that
wouldn't overlap with any version of github.com/golang/protobuf to avoid
confusion when someone reports a bug in "v1.0.1".

Maybe it was the wrong choice. If it was the worst choice we've made in the
new API, I'll be happy!

~~~
tsimionescu
> The google.golang.org/protobuf module is APIv2. We have taken advantage of
> the need to change the import path to switch to one that is not tied to a
> specific hosting provider.

This seems like a good example of why tying the name of the package in the
code to the place where it can be downloaded was a very bad idea. Simply
changing hosting providers is now a backwards compatibility break for your
clients.

If Go imports didn't work this way, you could have simply offered the old v1
from both github.com and google.golang.org, and used a clear v2 for the v2.
Sure, you would have similar problems if you wanted to change the logical
package structure for some reason (say, when an open-source package moves to a
different organization), but that is a much rarer case than switching code
repos.

However, given that that ship has probably long sailed, you probably picked
the right choice.

I would also note that v2 in the import path feels a bit strange, since it
makes users of the newest code need to know about the history of the package,
but there are also clear advantages which probably outweigh this (I believe
this is absolutely not the case for the decision to include the hosting
provider in the import path).

~~~
neild
Indeed, tying the name of a package to a hosting provider is a bad idea!
That's why we changed to an import path which isn't tied to one.
google.golang.org/protobuf is not tied to any particular provider, and we can
redirect it wherever we want.

See "go help importpath" for details on how this works.

~~~
randallsquared
> _\- Confusing: google.golang.org/protobuf@v1 doesn't exist, but v2 does._

> _google.golang.org/protobuf is not tied to any particular provider, and we
> can redirect it wherever we want._

I agree, this _is_ confusing. Why doesn't google.golang.org/protobuf@v1 return
what's at github.com/golang/protobuf, then, if only to provide a tidy answer
for those who wonder what v1 looked like?

~~~
tsimionescu
Because the Go package name _is_ github.com/golang/protobuf. Serving it from a
different address is already a breaking change.

------
cube2222
Congrats on the big release!

One thing which saddens me a lot, though, is returning types you have to
type-assert because they had to avoid an import cycle. I don’t know if it
could be avoided (probably not) but it’s still unergonomic.

> (Why the type assertion? Since the generated descriptorpb package depends on
> protoreflect, the protoreflect package can't return the concrete options
> type without causing an import cycle.)

------
cbolton
Does someone know if this new version uses fast generated code for
(de)serialization? The previous version was pretty slow (because based on
reflection I think) so many people used
[https://github.com/gogo/protobuf](https://github.com/gogo/protobuf) for good
performance.

------
justlexi93
It is odd that the Go team is all in on semantic import paths, but basically
everyone else is like “I will blow up the moon if it means I don’t have to use
v2 in my import path.”

~~~
zadokshi
I wonder if we will start seeing people add v1.example.com, v2.example.com
etc... to their git servers to avoid having to change the paths in all of
their private projects.

~~~
Gibbon1
Seems like you need something like a DNS server for repos.

------
atombender
Hopefully this is good news. The old Protobuf package had ossified, with lots
of ergonomic issues that the maintainers seemed uninterested in addressing.

One perennial annoyance is how awkwardly some Protobuf stuff ends up being
represented in Go. For example, "oneof" types:

    
    
      message Event {
        oneof payload {
          Create create = 1;
          Delete delete = 2;
        }
      }
    
      message Create {
        string id = 1;
      }
    
      message Delete {
        string version = 1;
      }
    

You get these messy types (I've elided most of the yucky marshaling-related
stuff):

    
    
      type Event struct {
        Payload isEvent_Payload `protobuf_oneof:"payload"`
      }
    
      type isEvent_Payload interface {
        isEvent_Payload()
        MarshalTo([]byte) (int, error)
        Size() int
      }
    
      type Event_Create struct {
        Create *Create `protobuf:"bytes,1,opt,name=create,proto3,oneof"`
      }
    
      type Event_Delete struct {
        Delete *Delete `protobuf:"bytes,1,opt,name=delete,proto3,oneof"`
      }
    

There are multiple problems here. One: Why the extra struct? For example, to
create a single event, you have to do:

    
    
      Event{
        Payload: &Event_Create{
          Create: &Create{ID: "123"},
        },
      },
    

...instead of just:

    
    
      Event{
        Payload: &Create{ID: "123"},
      },
    

Secondly, isEvent_Payload is private! So given a `*Create` or a `*Delete`, you
can't store it in a single variable:

    
    
      func nextEvent() isEvent_Payload { // <-- not possible
        if something {
          return &Create{...}
        } else {
          return &Delete{...}
        }
      }
    

The reason for this appears to be so that the generated serialization can be
very dumb and not rely on type switches. But I don't buy that this is how it
has to be.

There are other issues. Overall, the whole package seems designed from the
bottom up for machines (_mumbles_ possibly also _by_ machines), not humans.
The upshot is that using Protobuf types as "first-class" types — meaning the
types you actually use internally in the meat of your app, as opposed to in
the controller glue that lives in your API and mediates between the API and
the internals — feels super messy.

As an aside, anyone know what issues this paragraph refers to?

    
    
      The google.golang.org/protobuf/encoding/protojson package
      converts protocol buffer messages to and from JSON using the
      canonical JSON mapping, and fixes a number of issues with the
      old jsonpb package that were difficult to change without
      causing problems for existing users.
    

Did they finally fix the zero value problem with jsonpb.go [1]?

[1]
[https://github.com/gogo/protobuf/issues/218](https://github.com/gogo/protobuf/issues/218)

~~~
lelandbatey
To quote one of the maintainers of the golang/protobuf implementation:

> The jsonpb package is intended to be the faithful implementation of the
> proto<->JSON specification, for which the C++ implementation is considered
> the canonical "reference" implementation.

And that C++ implementation does some kinda dumb things. For example, it
marshals the protobuf type of int64 into a string when marshaling from a
protobuf struct into JSON.

[https://github.com/golang/protobuf/pull/916](https://github.com/golang/protobuf/pull/916)

I believe that the package they mention there is meant to be a more
"canonical" protobuf <-> json marshaller than the existing jsonpb package.

~~~
al2o3cr
IMO marshaling int64 to string is required for interoperability; RFC7159
recommends not assuming that more than IEEE754's 53 bits of integer precision
are available:

[https://tools.ietf.org/html/rfc7159#section-6](https://tools.ietf.org/html/rfc7159#section-6)

------
DelightOne
Hmm. I think it would be quite simple to build an authentication
proxy/firewall based on this without having to know the exact underlying
message.

Maybe tagging fields with auth_userid and/or auth_token.

------
mastrsushi
Hackernews, where the highlight of mentioning common libraries is the fact
that they're done so in Go and Rust.

~~~
sethammons
I find that to be a feature. If I open the article and/or comments and find
out this was about a library for Ruby or Java then I wasted my time because I
don't use those. Likewise, a non-Go or non-Rust dev can ignore Go and/or Rust
tagged library articles.

------
7532yahoogmail
Umm ... When will pb allow me to decode a field in a buffer without having to
decode the entire object in the buffer? I'm talking about a field in a pb
message, NOT a byte stream I hand-encoded without reference to a message, e.g.
more like flat buffers.

------
chairmanwow1
This article doesn't do a great job explaining exactly what new features are a
part of this release. Seems to me like it's finally enabling some of the
useful type iteration stuff that is possible in other versions of protobufs.

------
sigzero
I thought the general consensus on protocol buffers was "Meh"?

~~~
myko
I think for large organizations they probably make sense. They're potentially
a lot less data per request than JSON.

For the average app? Overkill, sure.

~~~
iainmerrick
Overkill? It doesn’t take a lot of work to add proto buffers to an existing
project (if you’re already using a reasonable build system, anyway).

~~~
tmpz22
If you have weeks to spend optimizing build systems, linters/editors, CI/CD,
etc...

