

Google Protocol Buffers - Open Sourced - enomar
http://google-opensource.blogspot.com/2008/07/protocol-buffers-googles-data.html

======
gm
It's great to see Google has a brain about this. XML is sooo hyped up, and
there are so many fan boys that it's generally not worth getting into a
discussion about its merits lest you start a religious war. It's cool that
Google looks beyond the Dilbert-like mission statement of XML and recognizes
its failures as well as its strengths, not to mention the suitability to
purpose for their needs.

~~~
bprater
Agreed.

When I first saw Gmail without folders, I was like wtf!idiots. Then I used it
a bit and I was like wtf!sweet.

Same thing with Google Maps. I heard about them using Javascript and I was
like wtf!idiots. And then I finally got to see it and I was like wtf!whoa.

------
gcv
Similar to Facebook Thrift (<http://developers.facebook.com/thrift/>)?

~~~
sgk284
Thrift was inspired by ProtocolBuffers. Ex-googlers who went to Facebook
reimplemented PB. In fact Facebook was hiring people explicitly with
experience with PBs.

~~~
neilk
And by inspired, you mean "a complete rip-off of".

I talked with some of the guys who did Thrift, and while they didn't steal
code, the concept and config files are close to identical.

~~~
huhtenberg
Identical config files aside, the concept itself is rather simple and it has
been around for ages in the form of ASN.1 and its encodings. It is also
routinely used in custom network protocols, e.g. IPC or RPC ones. Anyone outside of the
XML fanboy club is aware of these things :-)

File-format-wise, it's hardly rocket science either. When developers
reimplement something, it's only natural to recycle an established
implementation element. Changing it _just_ for the sake of being different
from another vendor is frankly quite dumb.

So I'd be careful with bold "rip-off" statements. In the end " _Everything new
is a well-forgotten old_ ".

~~~
neilk
Well, I don't believe calling it a rip-off was necessarily criticism. As they
say: mediocrity borrows, genius steals.

Also, I must plead ignorance. I've never worked with ASN.1 or these other
formats. From my cursory examination, ASN.1 seems to be far more complex.

~~~
jhammerb
this is a complete fabrication. facebook's thrift was inspired by pillar, an
rpc library written in ml by former cto adam d'angelo.

~~~
neilk
I understand why Facebook might want to claim this in public, but don't call
people 'fabricator', okay? You never know when someone might just have
conclusive evidence that proves you wrong.

Don't get me wrong, I admire Facebook for making Thrift public, and it was a
great thing for the internet. It was probably foolish of me to focus on the
lineage of the thing.

------
dhotson
Is this something like modern day ASN.1?

~~~
huhtenberg
Yep, in the encoding part it's quite similar to PER -
<http://en.wikipedia.org/wiki/Packed_Encoding_Rules>

------
JabavuAdams
Neat!

There's an article in the 1997 Game Developers Conference Proceedings on a
similar technique that Naughty Dog used in their Crash Bandicoot tool-chain.

Given a binary file format, it's nice to be able to specify a reader or writer
code-generator with a declarative syntax that looks just like the file format
spec.
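
A minimal sketch of that idea in Python, assuming a made-up header layout (`HEADER_SPEC`, the field names, and `make_codec` are all hypothetical, and this is not the Naughty Dog tool). It builds reader and writer functions from a declarative field list at runtime rather than generating source code, but the spec still reads like the format description:

```python
import struct

# Hypothetical file header: each entry is (field name, struct format code).
HEADER_SPEC = [
    ("magic", "4s"),        # 4 raw bytes
    ("version", "H"),       # unsigned 16-bit integer
    ("entry_count", "I"),   # unsigned 32-bit integer
]

def make_codec(spec):
    """Build (read, write) functions from a declarative field list."""
    # "<" selects little-endian with no padding between fields.
    packer = struct.Struct("<" + "".join(code for _, code in spec))
    names = [name for name, _ in spec]

    def write(record):
        return packer.pack(*(record[name] for name in names))

    def read(data):
        return dict(zip(names, packer.unpack(data)))

    return read, write

read_header, write_header = make_codec(HEADER_SPEC)
blob = write_header({"magic": b"DEMO", "version": 2, "entry_count": 10})
header = read_header(blob)
print(len(blob), header)
```

Changing the format then means editing the list, not the reader and writer separately.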

------
jagjit
I am sure the scale of their applications demands optimizing as much as
possible. I use yaml for this purpose. And find it very flexible and easy to
use. It is also more readable than xml.

------
tptacek
Never trust an IDL whose performance is benchmarked against XML.

~~~
ashu
The fact that Google said this is much faster than XML does not mean that the
IDL is slower than other IDLs. In fact, the way it is implemented, I see
no reason why it should be slow. If you want to find a drawback, it's the fact
that the IDL is actually less _expressive_. Whether that matters or not in
real life is another matter completely.

~~~
tptacek
I'm not saying it's slow. I'm saying, "an order of magnitude faster than XML"
isn't saying much. They've chosen the wrong benchmark. How does it compare to
Pickle, Marshal, XDR, DCE, and Thrift?

For that matter, how does it compare to JSON?

I'm not really making a point about how fast Protocol Buffers are. I'm making
a point about how the article was written.

~~~
neilk
Pickle, Marshal, and JSON are all ways of describing the kind of data
structures used by Perl, Python, Javascript, and so on. They are way more
flexible than Protocol Buffers -- you can have arbitrarily nested structures
of arbitrary length. The format of the messages is basically self-describing.
By which I mean that if they want to put an array there, they just put a '['
in and away we go.
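
To illustrate "self-describing", here is a small sketch using Python's standard `json` module. The payload is invented for the example; the point is only that no schema is declared anywhere and the parser learns the types from the text itself:

```python
import json

# Hypothetical payload: nothing tells the parser in advance what to expect.
text = '{"user": "neilk", "tags": ["idl", "rpc"], "nested": {"depth": [1, [2, [3]]]}}'

data = json.loads(text)

# The structure was discovered from the '[' and '{' markers in the text.
kinds = {key: type(value).__name__ for key, value in data.items()}
print(kinds)
```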

I think XDR describes fixed-format structures. I don't know anything about
DCE.

As described above, Thrift is a clone of Protocol Buffers.

Thrift and PB are a lot like XDR or god forbid, CORBA, but with a few twists.
They define message templates for how to parse and emit relatively simple
binary structures, which are then compiled into RPC clients and servers. PBs
are cross-platform, at least by Google standards: they work with both C++ and
Java very well. So you get the speed and convenience of making RPC calls with
binary data structures in a mixed C++/Java environment.

But the best thing about PBs is that the message structure can evolve. If a
message suddenly has a new field the receiver can't use, the receiver doesn't
panic or read in garbage, unlike many other binary formats. So you can upgrade
clients and servers in a gradual manner, without any downtime. This is what
makes PBs especially suited to cloud computing.
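
A simplified sketch of how a receiver can step over a field it doesn't know, assuming the key layout from the published encoding notes (field number shifted left by 3, wire type in the low 3 bits). The `decode` helper, the field names, and the hand-built message are hypothetical, and group wire types are not handled:

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode(buf, known_fields):
    """Return {name: value} for known fields, stepping over the rest."""
    out = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        # The wire type alone says how many bytes to consume, so an
        # unknown field number is read past and simply not stored.
        name = known_fields.get(field_number)
        if name is not None:
            out[name] = value
    return out

# A newer sender wrote field 1 = 150, field 2 = "hi", and a new field 3 = 7.
message = bytes([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69, 0x18, 0x07])
old_schema = {1: "id", 2: "name"}   # the receiver predates field 3
decoded = decode(message, old_schema)
print(decoded)
```

The old receiver gets `id` and `name` and never trips over field 3, which is what allows clients and servers to be upgraded at different times.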

~~~
tptacek
You know, forwards compatibility is a feature of a great many binary
protocols, including DNS, RADIUS, SNMP, and DHCP (itself a forward-compatible
hack on BOOTP). So, PB is just another TLV encoding?

~~~
neilk
It seems like it. Before today I only had a vague idea how ASN.1 worked, and I
never really delved into the wire format of PBs either. But yes, it seems to
write a tag then a serialization of whatever the value is. Lengths are only
required for strings because the format doesn't define any other type with
variable length.
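
A rough sketch of that tag-then-value layout, assuming the key format from the published encoding notes (field number shifted left by 3, wire type in the low 3 bits). The helper names are made up for illustration. Note that in the published encoding the length-delimited wire type also carries raw bytes and nested messages, not only strings:

```python
def encode_varint(n):
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_int_field(field_number, value):
    # Wire type 0: key, then the value. No length is written.
    return encode_varint((field_number << 3) | 0) + encode_varint(value)

def encode_bytes_field(field_number, payload):
    # Wire type 2: key, then a length, then the payload bytes.
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(payload)) + payload

wire = encode_int_field(1, 150) + encode_bytes_field(2, b"hi")
print(wire.hex())
```

So it is TLV-like, except that the "L" is only present when the wire type calls for it.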

~~~
tptacek
ASN.1/BER (you really mean BER) is the example most protocol dorks bring up
when they argue against IDLs and structured protocols. It's incredibly
complicated (though not as hard to implement as CORBA/IIOP).

~~~
signa11
indeed. i have used ASN.1 for a bunch of SNMP and GGSN accounting stuff, it is
a pita. xdr on the other hand, is a breeze. also, xdr does support variable
length fields.

~~~
Maven
hey signa11, are you a software developer for cellphone networking equipment?

~~~
signa11
yes.

~~~
Maven
kool, I'm an SGSN/GGSN engineer by trade, but I don't do the programming aspect.

