
1. Why does Google develop their own tools?

2. Why doesn't Google open source them?




For point (2), Google did open source Bazel https://bazel.build/ which is essentially just the internal build system Blaze.


> 1. Why does Google develop their own tools?

Considering the age of the articles and the tools themselves, they probably predate everything you're using these days: all the GitHubs and other shiny SV startup things.

IIRC they also don't ever want to share code with other companies which makes most of the SaaS offerings a no-go for them.

Hence why it might not make sense to use the same tools as they do :)


(googler here)

For 1, I guess it also boils down to the mindset with which you approach a problem. At my past companies, when you had a problem, the first thing you did was look for an OSS project or product that solved it for you, and even when you didn't find one, you tried to reframe the problem so it would fit the solution you already had in mind. At Google it normally involves solving it yourself or reusing much lower-level abstractions.

This doesn't mean that you will need to always create your own database, but when you really need to you have the skills to do a decent job.


Many of the things that are trendy in the industry today were invented by Google years ago, and Google had to invent them because they didn't exist at that time.


> 1. Why does Google develop their own tools?

Often there's no OSS version yet.

Usually they have better performance.

They want cluster and multi-cluster services.

> 2. Why doesn't Google open source them?

Mainly dependencies.

But they release academic papers that OSS implements.


> 1. Why does Google develop their own tools?

Because it benefits Google.

> 2. Why doesn't Google open source them?

Because that would benefit the competition.


Disclaimer, I work for Google, but what follows is my personal opinion.

Google open-sources a lot: TensorFlow, K8s, Apache Beam, and more. And even if something doesn't end up as a fully open-sourced project, Google still releases white papers on the subject that allow startups to create something similar (CockroachDB, for instance).

However, while I admit that some decisions might be made to avoid benefiting the competition (I think; that kind of stuff is way above my pay grade), some things cannot be open-sourced for purely technical reasons (without a complete rewrite, that is). For instance, within Google everything is a protocol buffer, and tools rely heavily on that assumption to work. Outside Google people don't use protocol buffers nearly as much, so the usage of those tools would be very low.

Other tools are tied to Google having many datacenters and multiple fibers between each of them for redundancy. Like Spanner, which also requires atomic clocks to work properly.


What Google open sources and doesn't open source is strictly a business decision. The technical details don't matter. Things like Spanner remain proprietary because Google thinks they can make money with it. They charge $10/hour just to have the replicas up and serving traffic; letting Amazon install it and charge $9.95/hour is not something they think is going to make GCP a lot of money, so we don't get the source code. Things like Kubernetes, on the other hand, are open source because Google wasn't "winning" in that area -- to get people to use GCP, they had to get people to break the dependency on proprietary things like CloudFormation. Otherwise, people would just stay at the market leader and not switch to the second place option.

Plenty of people outside Google use protocol buffers, for example; I've run into them at every job I've had since Google, and in plenty of strange places that probably never cross-pollinated with Google (the most surprising place I found them was in Hearthstone). They're pretty popular and people aren't really surprised to see them anymore.

I think there is also a middle ground where people inside Google don't think there's interest, but there is. For example, I very much miss Monarch. I don't think the code is making them a lot of money; my understanding is that the Cloud monitoring stuff is completely different. But it is way better than Prometheus or InfluxDB. Queries that are trivial in Monarch you simply can't do with those products. (The one thing I found most valuable in Monarch was that pretty much every query started with an "align" step. I just haven't seen that anywhere else, so with other products it's hard for me to reason about what the query is actually doing.)
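
To make the "align" idea concrete, here is a minimal sketch of what such a step could look like, assuming it means resampling each time series onto a shared regular grid before combining them. This is not Monarch's actual API or semantics; the function name `align`, the averaging rule, and the sample data are all made up for illustration.

```python
from collections import defaultdict


def align(points, period, start=0):
    """Resample (timestamp, value) points onto a regular grid.

    Hypothetical helper: each point falls into the window
    [start + i*period, start + (i+1)*period), and every window is
    reduced to the mean of its values. Real systems offer other
    reducers (sum, rate, last value, ...).
    """
    buckets = defaultdict(list)
    for ts, value in points:
        if ts < start:
            continue  # ignore points before the grid begins
        buckets[(ts - start) // period].append(value)
    return {
        start + index * period: sum(values) / len(values)
        for index, values in sorted(buckets.items())
    }


# Two made-up series sampled at unrelated, irregular timestamps.
series_a = [(1, 10.0), (4, 20.0), (11, 30.0)]
series_b = [(3, 1.0), (12, 2.0), (18, 4.0)]

aligned_a = align(series_a, period=10)
aligned_b = align(series_b, period=10)

# After alignment both series share timestamps, so joining is trivial.
total = {t: aligned_a[t] + aligned_b[t] for t in aligned_a.keys() & aligned_b.keys()}
print(sorted(total.items()))
```

Once both series sit on the same timestamps, adding or joining them is a plain key lookup, which is presumably why an explicit align step makes the rest of a query easier to reason about.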

As other people mention, the mere task of picking the transitive closure of dependencies out of google3 is hard. In fact, maintaining a bunch of non-monorepos is a huge chore compared to monorepos once you have the right tool. It's thankless work, literally, so I can believe that's one reason why there aren't more internal Google tools open sourced. But, it can be done if there is some thanks for doing the work. When Google split into Alphabet, work was done to let companies leaving Google take their chunks with them. There just had to be some sort of business reason to justify the tedium.
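
For readers unfamiliar with the term, "picking the transitive closure of dependencies" means collecting everything a target depends on, directly or indirectly. A minimal sketch, assuming the dependency graph is available as a plain mapping; the target names below are invented and have nothing to do with real google3 contents:

```python
from collections import deque


def transitive_closure(deps, roots):
    """Return every target reachable from `roots` via dependency edges."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        target = queue.popleft()
        for dep in deps.get(target, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Hypothetical dependency graph: target -> direct dependencies.
deps = {
    "tool": ["rpc", "logging"],
    "rpc": ["proto", "auth"],
    "auth": ["proto", "internal_identity"],
    "logging": ["flags"],
    "unrelated": ["proto"],
}

closure = transitive_closure(deps, ["tool"])
print(sorted(closure))
```

The traversal itself is the easy part. The hard part described above is that every target in the resulting set has to be released, replaced, or stubbed out before the tool can live outside the monorepo.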


Same reason stuff gets deprecated. Just maintaining the code against the dependency tree is work. Trying to make it functional in the real world that doesn't have the 50 billion libraries it relied on that are also internal only (or entire systems) is work ^ work.

Unless it's low-level infrastructure or based on a research paper, it's not an easy task.


(Googler here)

> Because it benefits Google.

Developing internal tools does not always benefit Google. Sometimes there is just no alternative at the time it's needed, so Google has to develop something that might become a liability later, accumulating technical debt and stagnating compared to a newer, shinier open-source thing. Sometimes it's NIH syndrome, or, in other words, wariness of adopting external solutions that Google has no control over and that do not fit it very well. However, Google does have a healthy internal ecosystem with a clear product life cycle and a balance of planned and organic change.

> Because that would benefit the competition.

Usually Google benefits from its protocols/tools/projects being used out in the open (TensorFlow, gRPC, Kubernetes, Angular, Android, Chrome), and that includes competitors.

There are plenty more mundane reasons this does not happen:

- Some of the tools are so dependent on the internal ecosystem that it would make little sense to open source them in a standalone way. There are many things at Google that only exist in one single deployment in the world, and turning them into a deployable product for another setting would be a huge task without a clear purpose. Also, it's hard to open-source operational knowledge and expertise.

Google Cloud is an example (positive or negative, depending on your point of view) of efforts to repackage many internal services in a way that is well-documented, supported and accessible to anybody.

- Open-sourcing is a spectrum: just throwing code over the wall (the worst option), controlling a project while allowing external contributors, cooperating with other big companies on a standard solution, or supporting a more hobbyist-oriented project in line with its own needs. Every option has its own set of challenges and coordination problems. It's not always easy to reconcile the internal development model with an open source workflow. It's not always easy to keep the delicate power balance of working with a project instead of taking it over through engineering weight and influence.



