
The story of Google Guava and patches - michaelneale
https://plus.google.com/113026104107031516488/posts/ZRdtjTL1MpM
======
stephen
I'll call bullshit. Either they care about external developers or they don't.
This is saying they don't.

Google's culture seems insular and elitist. Besides Guava, they did the same
thing with GWT (which, as much as I love GWT, didn't work out in the project's
best interests, IMO), and now are doing the same thing with Dart (AFAICT).

Maybe in the 90s you could get away with this. But now if you don't have an
active external community, whenever your old guard of Guava/GWT/Dart
developers gets bored and leaves, the new guys that come in behind them aren't
going to care nearly as much about the Google internal technologies vs. the
true open source technologies they've been hacking on before/after their time
at Google.

So the Google/internal technologies will eventually stagnate.

Perhaps internally-driven projects can get more stuff done in the short term
(thanks to dedicated resources), but I think in the long term the external
community out-innovates internal projects (due to the internal teams getting
burdened with legacy requirements (_cough_ GWT), politics, etc.). Dunno,
that's my impression.

~~~
cromwellian
I used to be an external contributor to open-source GWT, and then became a
Google employee to work on the GWT team and the situation really has nothing
to do with elitism or culture.

It basically boils down to a matter of resources. When I started as an open
source contributor to GWT, it was used by external developers, but not really
used internally by Google, so changes made by external committers couldn't
possibly break anything.

Slowly over time, more and more Google properties started using GWT, and
suddenly, you had the situation where an external user could submit a patch,
that passed all the GWT unit tests, but broke major Google properties (e.g.
AdWords, Google Groups, Wallet/Checkout, etc.). Google builds everything from
head, so when you do an internal commit, not only do your unit tests run, but
the unit tests from every project that depends on GWT, so you find very
quickly if your patch broke real applications. This happens all the time. I
commit, the pre-submit queue for GWT is green (all tests pass), then hundreds of
other projects get their chance, and there's always 1 or 2 that break, not
always because of GWT per se, sometimes because of bad code in those projects.
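
To make the mechanism concrete, here is a minimal Python sketch of a presubmit
that, for a change to one library, also runs the tests of every project that
directly or transitively depends on it. The dependency graph, the project
names, and the `run_tests` callback are all hypothetical; this only
illustrates the idea described above and is not Google's actual tooling.

```python
from collections import deque

# Hypothetical dependency graph: project -> projects it depends on.
DEPENDS_ON = {
    "gwt": [],
    "widgets": ["gwt"],
    "adwords_ui": ["widgets"],
    "groups_ui": ["gwt"],
    "unrelated_tool": [],
}


def reverse_dependents(changed, depends_on):
    """Return every project that directly or transitively depends on `changed`."""
    # Invert the graph: project -> projects that depend on it.
    dependents = {}
    for project, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(project)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for project in dependents.get(current, []):
            if project not in seen:
                seen.add(project)
                queue.append(project)
    return seen


def presubmit(changed, depends_on, run_tests):
    """Run the changed project's tests plus those of all its dependents.

    `run_tests` is a caller-supplied function: project name -> True if green.
    Returns the sorted list of projects whose tests failed.
    """
    targets = {changed} | reverse_dependents(changed, depends_on)
    return sorted(p for p in targets if not run_tests(p))
```

With this graph, a change to `gwt` whose own tests pass can still come back
with `adwords_ui` in the failure list, which is the situation an external
committer would never get to see.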

The problem is, there is no way for external committers to get notified of
internal (potentially confidential) apps breaking on their changes. This meant
that every external commit would need to be reviewed and proxied by someone on
the GWT team.

Back when we had 20+ people working on GWT, it wasn't hard. Now
there are only 5 full time committers, and it has become a lot more difficult
to keep up with the external community and support internal users.

I have been internally advocating that we "re-open source GWT". That is, we
make the "source of truth" be an external repository, possibly re-hosting it
on GitHub or Google Code, and fork it off from the internal version. We grant
all of our best and most dedicated contributors rights to administer and commit on
equal footing with Google employees, and we run external continuous build
systems for it.

On the innovation front, I think it's true for gwt-user, but for the compiler,
I've hardly gotten any external contributions for optimizations and almost all
of the improvements in speed and code size have arisen internally. This may
indicate that we should run separate open source projects for the
compiler/tools and the libraries, splitting them up and separately managing
them.

But the GWT team IMHO was never elitist, just saddened that contributions were
piling up and we lacked the bandwidth to review and commit them in a timely
manner. I feel bad about it, given the time people put in, and I've been
spending time recently trying to collect all outstanding patches for landing
into GWT 2.5.

~~~
stephen
Hi Ray. You make good points, more than I can justly respond to right now.

Briefly, I did not know GWT was initially not used widely within Google--I had
assumed it was from near-day-1.

The re-open sourcing sounds cool, although it would be interesting to see how
it could play nicely with the internal build-from-head system you guys have.
E.g. to avoid effectively forked projects.

Speaking of building from head, I'm sure it's a net win, but, as an outsider,
the handcuffs of backwards compatibility seem overly tight. More frequent
major-point releases that could clean up cruft might be nice. Not sure how you
guys handle major-point releases internally? If at all?

And, yeah, elitist was too strong, especially to apply to individual
developers. However, not just recently, but over the lifespan of GWT, it
hasn't spawned an external dev community (AFAIK), so it seems like something
is off.

~~~
cromwellian
The history of GWT was that it was started by Joel and Bruce and acquired by
Google. Internally, at the time, Google had been using Closure Compiler and
millions of lines of JavaScript code, so just from inertia, there would not
have been much use in the beginning, because it's not like the GMail team is
going to rewrite GMail in GWT overnight. Really, the first high profile
consumer facing project done with GWT was Wave. AdWords is also GWT, but not
very sexy.

I come from a background of using Maven to build my projects, and Google's
internal build system is somewhat Maven-like, but it doesn't let you specify
versions in dependencies, so you always end up depending on HEAD. To me, this
is the root problem making it hard for projects that live simultaneously in
the open and closed worlds.
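
As a rough illustration of that difference, the Python sketch below contrasts
a Maven-style lookup, where the consumer pins a version, with a head-only
lookup, where no version can be requested at all. The release list and both
function names are made up for the example; neither models a real build tool.

```python
# Hypothetical release history for one library, oldest first.
RELEASES = {"gwt": ["2.3", "2.4", "2.5-dev"]}


def resolve_pinned(name, version, releases):
    """Maven-style: the consumer names an exact version and keeps getting it."""
    if version not in releases[name]:
        raise LookupError(f"{name} {version} is not published")
    return version


def resolve_head(name, releases):
    """Head-style: no version can be requested; the newest state always wins."""
    return releases[name][-1]
```

A pinned consumer is unaffected when a new entry is appended to the list,
while a head consumer picks it up immediately, which is why every commit has
to be safe for every dependent project at once.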

It would be interesting to see how the Guice team handles it, but maybe
their patch velocity is small.

As for why GWT didn't get a huge external community of committers? I do think
it has something to do with the fact that it is a gated community, that people
feel like they don't "own" it, Google does. Maybe re-open sourcing it and
rebranding it as "Open Web Toolkit" or "Community Web Toolkit" would somewhat
remove those mental blocks.

I would love for the open community to be true owners of GWT, and Google as
just a contributor. I've been lobbying to make it happen, and I hope it does.
Too many external people have put in a lot of work, they deserve it.

------
vineet
Highlight: Stop submitting patches to Guava - it is too much work for us.

My Favorite response (from Martijn Verburg): "...could you guys work with the
community to teach them to submit better proposals/patches? Many open source
projects are able to do this from the Linux Kernel through to hobby projects
like PCGen. Perhaps talking to their committer teams might give you some
insights."

~~~
drats
Interestingly I heard that the Java people at Google rail against using Python
for large projects because they supposedly get out of hand.

~~~
jey
What does it mean for a project to "get out of hand" in this context?

------
spaznode
Maybe I'm alone in thinking this but having been in the position of reviewing
more than a few non-trivial bugfix patches myself I think I might tend to
agree with Kevin.

Sure it's great people are excited and want to contribute but all that
excitement is due to the love and care people sweated in to making every
single line in that codebase as perfect / performant / easy to understand as
possible.

Patches almost never add to something like that. Especially not on such a small
focused library. Truth be told most of the time on open source projects you're
accepting patches simply to get more community involvement and acceptance.
Guava doesn't need acceptance, it has been lovingly accepted already. If you
want open armed love go to apache commons.

If you want perfect performant code you can use and trust consistently go to
guava.

I'm grateful and happy that it exists and it is a pleasure and delight every
time I incorporate a little bit more into my codebase, slowly.

~~~
cheatercheater
Gee, how does Linux ever make it! All of it is contributed! Oh: Torvalds just
sits down and comes through on looking at the submissions.

------
dhanji
A rebuttal: <http://rethrick.com/guava>

------
st3fan
Am I the only one who is really annoyed by links to Google+ that can only be
seen after signing in? I thought it was considered bad style to do that for
NYT links here. Maybe the same holds true for G+ links?

~~~
dewitt
This post should be publicly visible without being logged in (at least it is
for me). But this is the second HN thread in a week where I've seen a comment
like this, so can you send me your details (web browser, etc), so I can debug,
please?

~~~
protomyth
Visit the link on an iPad.

~~~
dewitt
Thanks—will take a look.

------
peeters
You really have to take the good with the bad when it comes to Google
sponsored libraries.

I'm usually won over by them because Google does truly great Java API design,
their releases are relatively high quality (there is some assurance of quality
when it's used internally at Google) and their libraries almost always enforce
good design (they don't accept things that you would consider helpful if they
think it will be easy to use incorrectly or abuse).

But with that comes the bad. If something's not helpful to Google, it won't
have sponsorship to be added to the library. The library will always support
only versions of Java that Google internally uses (Kevin has said before that
it is unlikely that Guava will be expanded to cover even Java 6 any time
soon).

So I'm enjoying using their libraries while they are current, but am fully
aware that they might need to be forked eventually.

------
ary
The point that seems to be missed here is that Google is eating their own dog
food. Doing such they are hesitant to fix what "ain't broke." Were this merely
code that was being thrown over the fence from time to time I'm sure you'd see
a higher patch adoption rate.

------
robryan
It seems there is open source which is set up for community contribution and
open source which isn't. We tend to only really think of open source, ideally
at least, in terms of projects that allow community contribution.

From what I have read Android is pretty similar? Very hard for developers to
actually get some of their code merged.

Wondering now what other Google projects are like for outside contributions,
Chromium etc.

~~~
MatthewPhillips
Very few Google projects allow outside contributions. Go is one that does; no
surprise there of course. There is an attitude pervasive at Google that open
source is a great marketing tool, but that it's a one-way street. I think it's
because of the high employment standards at Google; they cannot fathom how
someone without an @google.com email address can do better than them.

Dart is the perfect example of this: developed in secret, in a dark room, and
then dumped on the open source community. When no one was excited about it
they shrugged their shoulders, confused about what they had done wrong.

I have no interest in Java, and I really hope someone forks this project and
treats it like a real open source project.

~~~
DannyBee
I'm not sure where you came up with any of these assertions, as they are all
simply false.

First, plenty of Google projects allow outside contributions. There are over
1400 open source Google projects (the number is actually much larger, but I've
only counted those that wanted to be identified specifically as Google
projects), and >98% of them allow contributions the last time I looked. In
fact, it's easier to simply list the ones that don't than the ones that do. I
have to imagine you have your own list of what you think are "Google
projects", and are working off of that when you made your assertion.

Second, there is definitely no "attitude pervasive at Google that open source
is a great marketing tool, but that it's a one-way street". As the guy
generally responsible for helping teams that want to open source stuff, I can
tell you that in 6 years, I've run into this attitude maybe 5 times out of the
(again) 1400+ projects that got released. That's not to say everyone open
sources stuff at Google for the same reason, but there is certainly no
pervasive attitude like you describe. The reality is a lot of folks at Google
have released open source projects for a lot of reasons, and of those reasons,
"marketing tool" is pretty far down the list.

BTW, none of this is to say I agree with Kevin's approach to running Guava; I
don't, for various reasons. But in the end, he (and the rest of the guava
folks) are the ones doing the work, and the issues here will work themselves
out in the normal way (either people will grudgingly accept it and keep using
guava, or some fork will become more popular eventually and take over). In
either case, Kevin making a clear statement on the situation helps move things
along one way or the other, and is a lot more than you can get out of other
OSS projects that do something similar.

~~~
bradleyjg
What do you mean exactly by "allow outside contributions"?

Do you have stats on the number of projects with at least one committer that
doesn't work for Google? The number of projects with non-trivial LoC written by
non-employees? The number of projects with an external mailing list as _the_
place where project decisions are hashed out?

~~~
DannyBee
1. I mean they are willing to accept outside contributions of code if people
submit them, and put them in the codebase if they are acceptable.

2. I have the stats, but given that the _vast_ majority of open source
projects (Google or otherwise) don't ever grow past a few people, I don't see
why it would be relevant. I also don't see why it's relevant whether they work
for Google or not. It's not like we require projects to follow a different
process for Googler committers vs non, so I don't see how it's any different
from a project where the committers are all really good friends who work on an
OSS project together. We also hire a lot of committers to our open source
projects. I'm guessing you want to make a distinction between "corporate open
source projects" and "non-corporate open source projects", but in reality,
making such a distinction would be a mistake, because the typical differences
are in policies and preferences, and they apply equally well to either. I.e.,
what matters is the policies the project applies to committers and
contributors, not whether they all work for the same company.

3. Again, I have stats, but for the vast majority of open source projects,
just because most are willing to accept them doesn't mean anyone ever
contributes. This has nothing to do with Google, of course. If you look at the
hundreds of thousands of projects on, say, SourceForge, you will find the
number that have either at least one not-same-email-domain-as-owner committer
or non-trivial LOC written by a not-same-email-domain-as-owner committer
is quite low.

4. This seems to be a governance and social issue; I don't track it formally,
as it would be quite difficult to do so. It would also be wrong for us to try
to force a model on folks. We give folks info about what we think are best
practices, and in fact, free copies of the Producing OSS book (Karl used to
work with us :P), but how they run their projects is generally up to them. We are
happy to give them advice when asked, and are happy to consult in general.

~~~
bradleyjg
Thanks for your reply. I appreciate the engagement.

---

I think you set up a false dichotomy between corporate open source and non-
corporate open source. A better division would be between single organization
projects and multi-organization projects. For example, OpenJDK is very much a
corporate project - but in addition to Oracle, IBM, Apple and RedHat all have
people heavily involved in development. That means that if Oracle were for
whatever reason to lose interest the project wouldn't necessarily die (leaving
aside patent issues).

On the other hand, look at GWT; see in particular your colleague cromwellian's
excellent comments upthread. Here was a technology that Google was at one time
devoting a great deal of resources to and was moving quickly and in exciting
directions. A lot of companies built businesses on top of the GWT library.
Now, for what seem like very good and valid reasons, Google has scaled back
its efforts on the project. That's fine, but there is no community ready to
pick up the slack. The reasons there is no one to pick up the slack are not
technical or legal, but as you describe it "governance and social issue[s]".

When there is no path to becoming a committer*, you are walled off from
discussions about the future of the project and your patches are accepted
reluctantly at best - why spend the time to deeply familiarize yourself with a
codebase?

* Except by being hired away by Google, which isn't exactly going to make your
company thrilled to sponsor your work on a project.

------
MatthewPhillips
tldr: It's more difficult to maintain a Java util library than it is to
maintain the Linux kernel, so patches are not welcome. They'd love the
community to do their bitch work though.

------
zem
a better point is the preface to the guava project docs:

"The Guava project contains several of Google's core libraries that we rely on
in our Java-based projects"

given that, it is fair to say "this is essentially an internal codebase, and
we would prefer to develop it ourselves so that it fits our internal practices
and standards; however, it is an extremely useful set of libraries, and we are
happy to share it with the open source community so that you can use it too,
if you like"

------
spaznode
Sounds like a good article, wish I could read it on my mobile iOS device.

~~~
michaelneale
For those who can't read the G+ post (yes, it can be a pain):

----------

The story with #guava and your patches

Guava users,

Many of you, when you request a feature for Guava, have submitted a patch to
us with the implementation (or even pasted code directly into bug reports).

And we have almost never accepted any of these patches, let alone even read
them. And I know this makes us look all manner of self-absorbed, arrogant and
unappreciative. That's what I'd think in your shoes. So it's time I tried to
explain to you more fully why it's like this.

I realize that from your perspective, you're handing us a shiny new feature on
a silver platter. It should be making our decision easy, since the work is
already done. It's a gift of your time and effort and you've already solved
the problem and all we need to do is just accept it! Looked at that way, we're
either idiots or jerks for not being interested.

But here's the part that I don't think many of you understand: the work you've
done to produce that patch is actually minuscule compared to the total amount
of work we have to do to put it in Guava. I know that it feels to you like
you've certainly gotten us more than halfway there, but trust me, it's only
scratched the surface.

- We have to work out whether the problem it's trying to solve is truly the
  right problem.
- We have to work out whether the solution presented is truly the best
  solution we can come up with.
- We have to find evidence in the internal Google codebase that users will
  actually use the proposed feature if we create it. If we are adding methods
  to our libraries that don't get used, it hurts our case when we try to argue
  to management that we're doing important work and need more staff.
- We have to figure out how it relates to the piles of legacy code we have
  floating around our libraries (that you, lucky folks you are, don't even
  see!), and how we would deal with migrating those users if they exist.
- We have to decide the best name and location for the new API. This is hard!
  We spend a lot of time in our API review meetings just batting names around.
- We have to review the code deeply. Our code reviews are grueling and go on
  for many rounds. When you look at the code in Guava it tends to look
  "obvious", but we work very hard to achieve that quality. It's only obvious
  in hindsight.
- In almost every case we have to completely rewrite the javadoc that first
  gets submitted to us. And this is very hard. Writing good documentation is
  probably the biggest challenge we ever face.
- The tests that were first written are rarely sufficient; we're going to need
  to add more. When we do, some usually fail.
- If the change touches on any existing functionality, we have to submit it to
  Google's global submit queue and analyze test results from many thousands of
  projects to make sure we won't break any internal users with it.
- If the change goes in, we have to deal with the machinery that gets that
  change integrated out to you in Guava.
- We then become responsible for fixing any bugs with it that come up over
  time, and dealing with the related feature requests it will touch off.
- And the code never "stays finished" in general; we are constantly performing
  various maintenance tasks over our whole library (or even the whole codebase
  of Google), to make various cross-cutting improvements, and every bit of new
  code added increases that burden.

There's more I'm leaving out, but you get the idea. Guava did not get to the
level of quality it has by accident, but by us being really obsessive about
these things.

Now, when the patch comes from outside Google, we have additional mechanical
overhead. One of us has to sponsor the patch as if it's their own, converting
it into an internal patch that can merge correctly (which isn't always as
trivial as it sounds), and sending it for review to another member of the
team. And because we are the ones most familiar with our own style,
conventions, practices and pitfalls to avoid, etc., sometimes just doing that
plus "cleaning up" the code to get it ready for review is already more time-
consuming than if we had written it ourselves from the start. That doesn't
even mean that the code sent to us in the patch was bad. It can be very good
by most standards but still need a lot of rework for our purposes.

Remember, if your feature is valuable, then we're going to want it in Guava
whether you provided a patch or not. Providing the patch doesn't make it more
likely that we'll decide it's a good fit for Guava -- if anything it just puts
us more on guard against that seductive temptation to think "but it's already
mostly done anyway, might as well!"

And here's the last thing. Be honest: if you were going to sign yourself up
for doing all that work above... wouldn't you at least want to have the
pleasure of writing the code for it yourself? I love writing code -- that's
why I do this! -- but such a large majority of my time goes into activities
like those described above. If my job were all about just applying other
people's patches, I would inevitably start hating it after a while. Let me
have some fun sometimes, okay? :-)

I really hope this helps to understand why your patches seem to go into a
black hole. I know that no matter what I say it will probably continue to seem
unappreciative and condescending, and I apologize. I do recognize that you are
just trying to help. But, if you really want to help, then keep an eye out for
the times when we will ask for help on a particular issue, because that's
where your time and energy will really do the most good!

Rantingly yours, KB

------
vilda
Branching.

A simple feature everyone is trying to avoid, but sometimes it's like an open
door from a golden cage.

Guava is the "from inside out" project presented as "take it or leave it". I
would not personally beat Google because of their attitude towards changes.
But they have to state it explicitly, otherwise more contributors - after so
much work they invested - will feel betrayed!

------
cpeterso
> _If the change touches on any existing functionality, we have to submit it
> to Google's global submit queue and analyze test results from many thousands
> of projects to make sure we won't break any internal users with it._

Is there any public info about "Google's global submit queue"? I would love to
learn more about such a huge automated test system.

------
nolliesnom
Guava is a work of art created by the team that maintains it; what's the big
deal if you can't add your own code to it? Merely a lost opportunity to
advance your own vanity?

Respect their boundaries and let your experience inform your feedback to them.
If you read what Kevin is saying, it is clear they are interested in hearing
if, how, and why their library is helping or hurting your own project. They
will probably listen if your feedback provides the answers they seek.

------
spullara
The hard part of guava is deciding what to include and exclude and they base
those decisions mostly on what is useful at Google. The code is easy and
handing them a patch is pointless.

------
steinbrenner
"Sam Berlin - Disclaimer: I don't know what patches to the Linux Kernel are
typically like, nor PCGen. But, +Martijn Verburg, there's a pretty big
difference between submitting a patch to a library and submitting a patch to a
"project". Patches to libraries are typically changes/additions/removals to
the API, whereas patches to 'projects' are typically changes to the internals.
It's a whole lot easier to change the internals of something than it is to
change the API. Changing the API means the effects can bubble outwards.
Changing the internals is usually just optimizations or bug-fixes."

haha wow. this is why I love java programmers

~~~
jey
How is this Java-specific? Same problem exists in any language.

------
codeonfire
The simple solution for dealing with Google-scale bureaucracy is to fork and
continue pushing forward. Then when that fork gets locked down, create a new
one. Not every project is going to be bureaucracy impaired, but there is a
correct procedure when it is. Multi-round "grueling" code reviews, and API
review meetings? W.T.F.

------
1010011010
Java is stupid.

------
eta_carinae
Guava is one of the best Java libraries available today, and the fact that the
bar for submitting patches is so high is a simple consequence of that.

You can't have it both ways.

