An ex-Googler’s guide to dev tools (sourcegraph.com)
459 points by azhenley on Nov 26, 2020 | 211 comments



> Many years ago, I did a brief stint at Google. A lot has changed since then, but even that brief exposure to Google's internal developer tools left a lasting impression on me.

Perhaps the operative phrase being "Many years ago". I currently work at Google, previously I worked at Square. Of the two, I generally prefer the OSS and off-the-shelf tooling at Square. Some things really are better at Google (code search and Blaze are definitely a big improvement over what we had), but many of our monitoring and CI tools feel antiquated and cluttered with inscrutable debris. Or take Gerrit, where each change has 3 different IDs and the developers decided that every workflow status should be expressed as an integer in the range [-2, 2] for some reason [1].

They were probably amazing (and much simpler) tools back in the day, but the world has moved and we're somewhat constrained by what's familiar.

[1]: https://gerrit-review.googlesource.com/Documentation/images/...


> I currently work at Google, previously I worked at Square. Of the two, I generally prefer the OSS and off-the-shelf tooling at Square.

I wrapped a 2-year stint at Facebook recently and my experience was similar to yours.

Some tools were amazing and cool and I really appreciated how they evolved over time to manage huge amounts of resources. But most were subpar when compared to OSS tooling - outdated, deprecated functionality, very little documentation, very few people working on them. The common approach was to learn about "the duct tape" way to make something work, then pass it on to new engineers.

An example would be tool X for working with diffs (PRs). It's the latest and greatest, except that it only covers 75% of its predecessor's, tool Y's, functionality, so you end up learning both. Tool Y has been "deprecated" for the past 3-4 years. Some of its features don't work, but you'll only know when you try and execute them.


You have to make some allowances for first movers. The designs of new tools are often ill conceived because they have no templates to steal ideas from. But the emotional investment often keeps them from switching to better tools because We’ve Always Done Things This Way.

The thing that bugs me most about my company’s tools are the ones that are enough younger than OSS solutions that someone either didn’t look very hard, or didn’t want to find anything (so they could write their own). Other priorities come along and those tools eventually can’t keep up with OSS alternatives, but the apologists take over.

You will not continue to get accolades for tools you wrote three years ago. The only “value” you derive is the time and effort it saves, offset by the effort expended. The time and effort expense of external tools is often lower. And, when the tools are annoying, you can commiserate with your coworkers instead of being the target of their criticisms. Which is often under-appreciated.


> in the range [-2, 2] for some reason [1].

This was actually pretty useful in OpenStack back in the day. The reviewers who couldn’t yet approve code could only put +1s on there (and there were a lot of low quality reviewers who slapped these everywhere) so it was very obvious when a patch still needed attention from a commit-privileged dev.

Then -1 was standard review feedback of stuff that needed improving and -2 would come from commit-privileged devs when the patch fundamentally didn’t fit with the project direction.


Yes, that range looked odd to me at first glance. But your explanation made me realize that I actually use it all over the place. Except on a 1-5 scale. Starting with good enough at 3 (or 0 in this case):

1 = WTF?

2 = Needs Improvement

3 = Good Enough

4 = Better than Good Enough

5 = Almost Perfect

Daniel Kahneman recommends a scale like this for evaluating job candidates across core competencies. It's a surprisingly powerful little heuristic.


The [-2,2] scale made immediate sense to me because it's zero-indexed:

-2 = WTF

-1 = Needs Improvement

0 = Good enough

1 = Better than Good Enough

2 = Almost Perfect

It's even intuitive, since anything above zero is 'above and beyond' and anything below zero isn't good enough yet.


The system makes sense but it feels like there is probably a more intuitive way to express those than with numbers. Naming is hard, of course, but off the top of my head maybe something like

-1: Changes Requested

-2: Declined

1: Approved

2: Accepted


Why would that matter? This is for developers, not English majors.


I didn't down vote you, and your question deserves a serious reply because the same principle is broadly useful.

Communication is hard. People actively in the team probably have the context to understand the differences, but that implied context is what tribal knowledge is made of.

OP indicated that there were non-trivial "non-linear" differences going from 1 to 2 and -1 to -2. The use of numbers implies a relationship that doesn't really exist. The use of accurate language makes the actual meaning immediately obvious without the need for implied context. For example, I had no idea that only privileged devs could give 2s. "Accepted" and "Declined" do imply a finality that more accurately mirrors the intended usage.

This may seem overly analytical for something that is easy to just explain, but this kind of thing builds up in layers.


Exactly this. It's not as if two +1s equals a +2 (which one might intuitively think based on the nomenclature).
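To make the non-additive behaviour concrete, here's a minimal sketch. It assumes the semantics described upthread: any -2 blocks, at least one +2 is required, and +1/-1 are advisory only. The function name `is_submittable` is made up for illustration and is not Gerrit's API; a real Gerrit instance can be configured differently.

```python
def is_submittable(votes):
    """Return True if a change could be submitted under the assumed rules.

    votes: list of integers in [-2, 2], one per reviewer.
    """
    for v in votes:
        if not -2 <= v <= 2:
            raise ValueError(f"vote out of range: {v}")
    if -2 in votes:      # a veto from a privileged reviewer always blocks
        return False
    return 2 in votes    # votes are never summed: only a +2 approves


print(is_submittable([1, 1]))      # False: two +1s do not add up to a +2
print(is_submittable([2, -1]))     # True: -1 is feedback, not a veto
print(is_submittable([2, 2, -2]))  # False: a single -2 blocks
```

So the numbers behave more like four named states than like a score you can add up.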


Note that Critique (Google's main code review UI) doesn't have the same issues that Gerrit apparently has.


I work at Google and mainly use Gerrit because of the products I work on and yes, Critique is miles better.


Critique seems a well-guarded secret inside Google. Can't find more information on it. No images. Interesting


Not really. The "Software Engineering at Google" book[1] has a whole chapter on it with screenshots.

Gerrit, Reviewable and Phabricator are very similar, except that Critique is much more polished.

Having used all three, Gerrit works best and the new UI is starting to look pretty good. Reviewable has great UX, but inherits some of the disadvantages of GitHub PRs.

[1]: https://books.google.de/books/about/Software_Engineering_at_...


Your link just goes to the Google Books homepage in my browser


Same for me. I think it's https://books.google.com/books?id=WXTTDwAAQBAJ. It has no preview of the Critique-related fragments.


It's a reimplementation of an earlier system called Mondrian, which you can find some information on.


I used Mondrian for the last couple months it existed, and I recall Critique being fairly different.

I think they kept the same keystrokes, though.

One nice thing about Critique is that, although it's a webapp, you can drive it pretty much entirely by keyboard. It has the feel of an old usenet newsreader like nn, in a way.


Here I am, sitting here thinking: could I draw it from memory?


It went through a re-design recently. I don't think I would call it a well-guarded secret though, if you look at Phabricator's Differential, it is very similar [0].

[0] https://www.phacility.com/phabricator/differential/


Reading through that page it seems like every single component is an unrelated, seemingly random, “cute” name. Are cli, triggers/actions, and rules too generic? Out of context, what meaning do Arcanist, Maniphest, or Herald have?


Related answer from one of the developers: https://www.quora.com/Phabricator/Phabricator-Why-is-the-Bro...

But yeah... even I find the names difficult to handle and remember, even after using Phabricator for more than 4 years now. Nevertheless, it is a great piece of software, especially the code review process is top-notch.


I agree - grafana is wayyy better than viceroy and the other monitoring dashboards that were available when I worked at Google.


Interesting. Viceroy can put way more pixels on the screen than grafana with the same cpu and memory usage, and panopticon is specifically written to send little traffic over the wire, so it’s useful to an oncall engineer over a poor mobile backhaul like GPRS. Grafana is a massive resource hog in all dimensions.


Hasn't pcon been deprecated for a year and a half now? :P

More pixels != more information. The visuals and variety of ways to present data in grafana always seemed more intuitive to me; being able to lay stuff out in different ways was great. I worked on a G-on-G project at Google and used both grafana and viceroy. I never had an issue with grafana on mobile, but I was never able to see viceroy on a mobile browser because of the Corp network.


On viceroy if you unintentionally put 2000 lines on a graph, you get a visual mess. If you do that on grafana your tab will hang for ten minutes. Big difference.


True, but that only happens when setting up graphs. Graphs are not changed as often as they are viewed.

I'm sure viceroy is more performant, and I'm not denying that, I may also be scarred a bit because I found writing mash queries an arcane nightmare, and those are related to viceroy.

The default (rpc latencies and etc) views and metrics you get are amazing however.


You can also get tons of lines when modifying group by expressions, which you can do ineffectively.

Mash is indeed arcane, although gmon (viceroy) makes it worse than usual.


Gerrit is configurable and it is common to use it like Critique with blocking comments and LGTMs.

It's still a big step up from, say, GitHub PRs, and the UX has improved significantly.


> They were probably amazing (and much simpler) tools back in the day, but the world has moved and we're somewhat constrained by what's familiar.

Exactly my feelings about Phabricator. GitLab simply runs circles around it if you consider the issue tracking, CI etc. as part of the problem.


We moved from Phabricator to JIRA, Github and CircleCI. We couldn't be happier. It's one thing to have a single tool with 70% of the functionality you need, it's another to have 90%+. That 20% makes such a huge gulf it's not even funny. We were able to ditch several thousands of lines of cobbled together glue that moved us 10% forward but added overhead for any new project which set us back a week or more.


I would've loved to have Gitlab at our company, but unfortunately we're on a tight budget so they didn't want to go for anything paid.


Gitlab CE would be perfect, then, since it's free. If you're on a tight budget it actually seems like one of the best options available.


Tried self hosting once and it was challenging just to get set up. That said, their free hosted solution is feature-rich enough for most of my projects.


I self host it at home. Haven't had problems but probably only because of how small my instance is (10 people). The omnibus installer makes it relatively simple to maintain!


> integer in the range [-2, 2] for some reason

This sounded awesome to me (imagining some Fortran code behind it), but disappointingly the cited screen shot only has the range [-1,0,1] which is hardly the same.


I think this might come from a typical feedback on patches in Apache (or all of OSS) - -1 rejected as is, +1 approved. I've also seen -0 (don't approve but won't block/reject) and +0 (ok with it but didn't review/won't approve)


-2 and 2 can be disabled, which I guess is the case on the screenshot you saw.


Yes please, Github PRs are much, much better than Gerrit or Rietveld (urgh), and the +1/+2 quirks of it

No, I'll take Github PRs any day. They could be better, of course.


Github PRs (at least on the enterprise version) still have the problem of collapsing comments where the associated line of code changed in some way that had nothing to do with the comment. They also require a lot of scrolling if the overall diff is rather long. And, they still don't provide a way to comment on commit messages themselves.


Collapsing comments when the code changes is the correct thing to do 90% of the time in my experience, and for that other 10% you can manually mark it as unresolved. It's the correct default IMO.


> Collapsing comments when the code changes is the correct thing to do 90% of the time in my experience

Why not just leave them uncollapsed and allow one to mark them as resolved when the change is actually resolved? The other problem is that it's difficult to see what actually changed when a line is marked out of date without having to scan through the entire updated diff and then jump back and forth between the diff and conversation view.

Collapsing them by default makes it difficult to find the comment by scanning. If they were uncollapsed until I marked them as resolved, then I could at least use the find feature in the browser and search for my username to find my comments.


Agreed on the comment collapse, about comment on the commit message I'd say it's a commit on the main PR (since by the time the PR is done the commit msgs are pretty much not changeable, except for the last one)


> about comment on the commit message I'd say it's a commit on the main PR

Except that if the overall diff line I choose to comment on about the commit message changes in some way, then the comment is collapsed, which makes it difficult to keep track of what needs to change.

> since by the time the PR is done the commit msgs are pretty much not changeable, except for the last one

They can be changed via a git rebase.


Stacked reviews are the crucial thing missing from GitHub pull requests. Truly, once you get used to working with those, going back is intolerable, to the point where I now have a hacked together workflow where I live in git rebase -i and have it automatically push specially named branches for each commit in a stack.


So a few months ago I actually found a tool for this! The tool makes it easy to manage stacked reviews on GitHub. It's working super well for me. I also showed it to a bunch of people I work with, and a lot of them have taken it up.

Link: https://github.com/ezyang/ghstack

If you're interested, I'm happy to talk you through it - just book some time here: https://calendly.com/ericyu3/15min


At my place of work, we've sort of implemented that feature by using fixup or squash commits. That is, if a comment is made on a diff line in a PR, use git blame to determine which commit was responsible for that line of code, update the line and commit it with a commit title like: fixup! <original commit title>. If we want to update the commit message, we use "squash! <original commit title>" and put the updated commit message in the commit message body (using the --allow-empty switch when committing so that we don't have to change any code to make a commit).

Then, we can run git rebase -i --autosquash --keep-empty to handle applying the changes requested to the correct commits at the end of the review process.
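For anyone who hasn't seen what --autosquash does to the todo list, here is a rough sketch of the reordering. It's a simplified model, not git's implementation: it matches on the exact commit title only and ignores things like matching by hash or chained "fixup! fixup!" commits. The function name `autosquash_plan` is hypothetical.

```python
def autosquash_plan(subjects):
    """Build a rebase todo list from commit subjects (oldest first).

    Returns (action, subject) pairs where action is "pick", "fixup" or "squash".
    """
    plan = []    # list of groups; each group is a pick followed by its fixups
    groups = {}  # target subject -> the group that fixups get appended to
    for subject in subjects:
        for prefix, action in (("fixup! ", "fixup"), ("squash! ", "squash")):
            if subject.startswith(prefix):
                target = subject[len(prefix):]
                if target in groups:
                    # Move the fixup right behind the commit it amends.
                    groups[target].append((action, subject))
                    break
        else:
            # Ordinary commit (or a fixup whose target was not found).
            group = [("pick", subject)]
            groups.setdefault(subject, group)
            plan.append(group)
    return [entry for group in plan for entry in group]


history = [
    "Create Thing framework",
    "Implement A for Thing",
    "fixup! Create Thing framework",
    "squash! Implement A for Thing",
]
for action, subject in autosquash_plan(history):
    print(action, subject)
```

The fixup ends up directly after "Create Thing framework" and the squash directly after "Implement A for Thing", which is why the review-time commits can pile up at the end of the branch and still land in the right place.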


What are stacked reviews?


You have a task to build a Thing, which involves creating the basic "Thing framework" and then implementing subfeatures A, B and C into/on top of it

Maybe you'll build the framework & subfeature A together, because you need some bit of meat in there to properly figure things out and be able to test things

On Gerrit you'd probably create two commits which get pushed as two CLs

* CL1 Create Thing framework

* CL2 Implement A for Thing

They're now pushed for people to review. While they're looking at them, you continue working, making CL3 implementing B

* CL3 Implement B for Thing

One of your coworkers points out an issue in CL1, so you fix it (by amending the commit) and repush the stack. Now your stack is

* CL1 Create Thing framework [v2]

* CL2 Implement A for Thing

* CL3 Implement B for Thing

CL1 & CL2 get approved so you merge them (though typically with Gerrit they're cherry picked - CLs are individual Git commits). You push up CL4, an implementation of C, so your stack now looks like

* CL3 Implement B for Thing

* CL4 Implement C for Thing

The important point is that CLs are atomic (and are reviewed atomically) even if they depend upon each other (i.e. are a part of a stack). When you're working in Git you typically (unless for whatever reason you have multiple unrelated stacks on the go, which is relatively rare) just work off of the master branch, so all the commits between `origin/master` and `master` - i.e. the set that automatically show up in `git rebase -i` and similar tooling - are your stack. When you pull, you `git pull --rebase` (or rather set the config for your machine or repo to default to that). When you've revised something in your stack or want to add something to it, you just do `git push origin HEAD:refs/for/master` to update Gerrit with the latest version of your stack

It takes a little time to get used to (and you have to learn how to hold `git rebase -i` properly), but when you're used to it it's immensely more productive than the Github/clones PR flow (and doesn't involve manual branch juggling, etc). I can't express just how badly they deal at handling reviews of stacked PRs - either you create your stacked PR targeting master (and it gets all of the changes from the underlying PRs merged into its changes list, which makes reviewing it harder) or you target it at the branch of the underlying PR (which is non-obvious and painful and you have to manually remember to shift it across when the PR beneath it gets merged)

When I'm in this situation when dealing with something which is PR based, I tend to end up merging CL1 & CL2 (because it's easier to review things when you have an example) and hang on to CL3 & CL4 on my local machine until the first one gets merged
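If it helps, the CL1..CL4 walkthrough above can be modelled in a few lines. This is only a toy: `Change`, `amend` and `merge_approved_prefix` are made-up names, and the "merge bottom-up only" rule is my assumption about how dependent changes get landed, not a statement of what Gerrit enforces.

```python
from dataclasses import dataclass


@dataclass
class Change:
    title: str
    patchset: int = 1
    approved: bool = False


def amend(stack, index):
    """Revise one change; reviewers have to look at the new patchset again."""
    stack[index].patchset += 1
    stack[index].approved = False


def merge_approved_prefix(stack):
    """Merge changes from the bottom of the stack while they are approved.

    A change is only merged if everything beneath it was merged too, which
    keeps the commit order intact. Returns (merged, remaining).
    """
    merged = []
    while stack and stack[0].approved:
        merged.append(stack.pop(0))
    return merged, stack


stack = [Change("Create Thing framework"), Change("Implement A for Thing")]
stack.append(Change("Implement B for Thing"))  # keep working during review
amend(stack, 0)                                # fix the issue found in CL1
stack[0].approved = True
stack[1].approved = True
merged, stack = merge_approved_prefix(stack)
stack.append(Change("Implement C for Thing"))

print([c.title for c in merged])  # the framework and A have landed
print([c.title for c in stack])   # B and C are still under review
```

Each change keeps its own review state and patchset counter, which is the part that PR-per-branch workflows make you track by hand.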


> The important point is that CLs are atomic (and are reviewed atomically) even if they depend upon each other (i.e. are a part of a stack).

What prevents a situation like CL2 getting approved and merged prior to the resolution of CL1? Or more generally, what ensures that the ordering of commits in a set of CLs in a given stack is preserved prior to merging?

> I can't express just how badly they deal at handling reviews of stacked PRs - either you create your stacked PR targeting master (and it gets all of the changes from the underlying PRs merged into its changes list, which makes reviewing it harder) or you target it at the branch of the underlying PR (which is non-obvious and painful and you have to manually remember to shift it across when the PR beneath it gets merged)

Rather than stacked PRs (which essentially doubles the number of commits since Github (and possibly Gitlab) introduces a merge commit for each PR even if it's a fast-forward merge), would it not be better to just combine CL1 through CL4 into a single feature branch where each commit corresponds to a CL? It would make for a large PR, but it can be reviewed on a per commit basis.


Yes, stacked reviews absolutely are a game-changer for a sane commit history and iterating patches over a few rounds of review.


(Googler here) The article is a good description of the tools, and I was not aware of the blaze clones plz and pants.

One thing that is maybe not obvious: For an API author, code search in combination with a monorepo and the somewhat hermetic universe which is Google's code base provides immediate access to all uses of a library. You can see what worked well and what didn't, and it enables effective refactorings. That also means when writing a client, you can quickly figure out from other client code how stuff is supposed to be used. All this makes code search such an effective tool in Google's development environment (in the broader sense).

If anything like this were possible for open source (indexing code that depends on a library/API, across subrepos in any version control system, in a way that gets near complete coverage), it would enable similar possibilities of systematic improvement.

Alas, it does not seem realistic except in a few niches where the number of clients is bounded and code owners are willing and able to follow a protocol (approve changes to their code that unblock such global improvements.)


> code search in combination with a monorepo and the somewhat hermetic universe which is Google's code base provides immediate access to all uses of a library. You can see what worked well and what didn't, and it enables effective refactorings. That also means when writing a client, you can quickly figure out from other client code how stuff is supposed to be used. ... If anything like this were possible for open source (indexing code that depends on a library/API, across subrepos in any version control system, in a way that gets near complete coverage), it would enable similar possibilities of systematic improvement.

gets me thinking — this is totally doable for open source code. someone just needs to build some giant indexes across git repos + package repos (maven/npm/pip/etc)...

Then write plugins for VScode/intellij, and when developing you can right-click a symbol and “Find uses in open source code”

Does something like this already exist?
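The core data structure would be something like the sketch below: a purely textual usage index over several repos. Everything here is an assumption for illustration (the `build_usage_index` name, the repo and file names, matching on bare identifier tokens); a real service would need proper parsing and dependency resolution.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[A-Za-z_]\w*")


def build_usage_index(repos):
    """Index every identifier token across repos.

    repos: dict mapping repo name -> dict mapping path -> source text.
    Returns dict mapping symbol -> sorted list of (repo, path, line number).
    """
    index = defaultdict(set)
    for repo, files in repos.items():
        for path, text in files.items():
            for lineno, line in enumerate(text.splitlines(), start=1):
                for token in TOKEN_RE.findall(line):
                    index[token].add((repo, path, lineno))
    return {symbol: sorted(hits) for symbol, hits in index.items()}


repos = {
    "app-a": {"main.py": "import requests\nrequests.get(url)\n"},
    "app-b": {"cli.py": "from requests import get\n"},
}
index = build_usage_index(repos)
print(index["requests"])
print(index["get"])
```

An editor plugin's "Find uses in open source code" would then just be a lookup into that index.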


Author of the post here. We're actively working on this at Sourcegraph. Up until recently, we've focused on working well on large private codebases, but looking forward, we have two big efforts that will make this a reality on sourcegraph.com:

1. Vastly expanding the size of our global search index to cover every public repository on github.com, gitlab.com, and bitbucket.org. 2. Enabling LSIF indexing (https://lsif.dev) in every major language for compiler-accurate code navigation (go-to-def, find-refs) across repository/dependency boundaries.

The latter is already working on a subset of languages for private Sourcegraph instances, and we want to scale it to the entire open-source world. We think a single search box that covers all the visible code in the world and allows you to seamlessly walk the reference graph is super super powerful and someday will be a thing that most developers use every day.


Neat. Are you considering adding other large source repositories such as the Debian / Ubuntu / Fedora / RHEL / CentOS source package archives? At least the main section of Debian should offer reasonably high assurances that using the sources in that way is legal.


Don’t get me wrong, I think this is very cool, but is there a way to opt out aside from making a repo private?

Do OSS licenses distinguish between corporate usage of repository source as code vs as data?


Licenses that comply with the Open Source Definition or any of the similar definitions from the free-as-in-freedom software community inherently must allow this as a legal use.

That said, the community often takes authors' wishes into account.

Out of curiosity, why would you want to restrict this? It's not like Sourcegraph has any exclusivity to the concept or the ability to implement it.


I don't think so. Consider that you'd have to index every library and every version of it. That's a huge index. Google has the "live at HEAD" [1] mentality so in reality you don't need as much. I don't remember if xrefs for old versions of the code were retained.

1: https://abseil.io/about/philosophy#upgrade-support


It would be huge but storage is cheap, you could build in phases based on popularity. Maybe paid subscriptions for the full index?

But yeah, it could get expensive to keep the indexes current as things grow.


Yeah but you also have to understand the dependency management systems for every open source project on GitHub to get the right dependency, which is a Sisyphean task. Google has the advantage that there is only one way to build stuff.


There are only so many package managers out there. Most languages have a "preferred" one which makes things even easier. Stuff built with ad-hoc systems might be more difficult, but I don't think that's the norm.


A while ago I came across Nx (https://nx.dev/) from Nrwl, which sounds a lot like this. It's open source and has a lot of tooling for managing monorepos. I think they even worked with Google for it. Although I think the main focus was JS projects (as far as I know).


> gets me thinking — this is totally doable for open source code. someone just needs to build some giant indexes across git repos + package repos (maven/npm/pip/etc)...

Sounds a lot like sourcegraph.


GitHub is moving in this direction (they now show interpackage dependencies at the package level, and intra dependencies at the symbol level).

However, neither works great—maybe 30%-80% coverage?, and far from "in a way that gets near complete coverage" that the parent stated about the benefit of Google's monorepo.

It may be a case where you need 90 or 99 or even higher coverage before it becomes a game changer. But I think GitHub is headed there quickly.


Hi there! I manage the GitHub team responsible for the Code Nav part of this. Re 30-80% coverage, can you elaborate on the biggest missing piece for you? Is it the number of languages that are covered, or too many false positives in the results we show? (Or something else?)

For the latter, it's a known limitation in that we're only doing "fuzzy" or "c-tags-like" Code Nav — defs and refs are only matched by their textual unqualified symbol names. For some languages that's not a big deal, because symbol names tend to be unique. But for a language like Go, you might have dozens (or hundreds!) of methods named "Parse", and we don't currently distinguish those in our Jump to Def UI.

That's something we're actively working on, though! One benefit to the current fuzzy approach is that it's incremental (we only have to analyze changed files in a commit), and does not depend on tapping into a build or CI process. That makes it much faster — we have a long-running indexer service live and hot, and typically have new commits indexed and live in the UI within 1-5 seconds of receiving a push. And it requires no configuration on the part of the repo owner — no need to tell us how to build your project, or to configure an Actions workflow to generate the data. That makes it easier for us to support entire ecosystems when we roll out support for a new language. We've also tried to make it fairly easy for external communities to add support for their languages — this is all driven by the open-source tree-sitter parsing framework. This file, for instance, is where the symbol extraction rules for Go are defined: https://github.com/tree-sitter/tree-sitter-go/blob/master/qu...

We've been working on a new approach to generate more precise symbol data, while keeping those incremental and zero-config properties. It's not ready for public launch, yet, but it's close!

For those who want to learn more, I gave a talk at FOSDEM 2020 (back in the Before Times) going into more detail on some of these constraints and design decisions. (Though please note that this talk references a legacy way of defining symbol extraction rules, which relied on writing code in our open-source Semantic project.) https://dcreager.net/talks/2020-fosdem/
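To illustrate the trade-off described above, here is a toy version of "fuzzy" code nav. It is a sketch under loud assumptions: a regex stands in for the tree-sitter queries, the `FuzzyIndex` class and file names are invented, and it is not how our service is implemented. It shows the two properties mentioned: lookups match on the unqualified name only, and re-indexing touches just the file that changed.

```python
import re

# Toy stand-in for a real parser: matches Go-style "func Name(" and
# "func (recv) Name(" declarations at the start of a line.
DEF_RE = re.compile(r"^func\s+(?:\([^)]*\)\s*)?(\w+)", re.MULTILINE)


class FuzzyIndex:
    def __init__(self):
        self._defs_by_file = {}  # path -> set of symbol names defined there

    def index_file(self, path, source):
        """(Re)index a single file; other files are left untouched."""
        self._defs_by_file[path] = set(DEF_RE.findall(source))

    def jump_to_def(self, name):
        """Return every file defining `name`; the result may be ambiguous."""
        return sorted(
            path for path, names in self._defs_by_file.items() if name in names
        )


index = FuzzyIndex()
index.index_file("json/parser.go", "func (p *Parser) Parse() error { return nil }\n")
index.index_file("url/url.go", "func Parse(raw string) (*URL, error) { return nil, nil }\n")
print(index.jump_to_def("Parse"))  # both files: the name alone is ambiguous

# A push that only touches url/url.go only re-indexes that one file.
index.index_file("url/url.go", "func ParseURL(raw string) (*URL, error) { return nil, nil }\n")
print(index.jump_to_def("Parse"))
```

Precise navigation would need to know which `Parse` a call site resolves to, and that normally requires type or build information the fuzzy approach deliberately avoids.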


Yes, definitely. And what they already have is often very useful, I find myself spontaneously traversing code bases by going to definitions and references. Quite an impressive (and non-intrusive) addition from a UI perspective.


There are a couple of search engines that do this. Sometimes when I get stuck on an underdocumented Python library, the big G will take me to one via a search on a method or constant. I can then see the code from other open source libraries/projects.

I don’t have any links saved, since they feel kinda sketch (ads all over the place). But it has been very helpful, unlike searching within GitHub for a string that I know is in the repo.


Can I ask a monorepo question: what do you do when you’re trying to upgrade dependency XYZ, and a moderate or large refactor is required in another team’s code?


(Google employee here)

If the change is small or can be automated (i.e. changing # of parameters or function names), we run a script to make the change over the entire monorepo. This enormous CL is then approved by one of the Global owners.

If the change is complex (or non-obvious), you generally introduce the new API in one CL, change each use manually (say one CL per team), and then remove the old API in a final CL. In that case, you need each team to sign off. This isn't too hard in practice, teams are generally expected to approve cleanups.
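As a rough illustration of the "introduce the new API, migrate callers, remove the old one" sequence, here's what the intermediate state might look like. The function names are hypothetical and the deprecation warning is just one way to nudge callers; this is a sketch, not how any particular Google library does it.

```python
import warnings


def fetch_user_v2(user_id, *, include_profile=False):
    """New API (hypothetical): the option is keyword-only."""
    user = {"id": user_id}
    if include_profile:
        user["profile"] = {}
    return user


def fetch_user(user_id, include_profile=False):
    """Old API, kept as a thin shim until every caller has migrated."""
    warnings.warn(
        "fetch_user is deprecated; use fetch_user_v2",
        DeprecationWarning,
        stacklevel=2,
    )
    return fetch_user_v2(user_id, include_profile=include_profile)


# CL 1 adds fetch_user_v2 and turns fetch_user into the shim above.
# Per-team CLs then switch call sites one by one.
# The final CL deletes fetch_user once no callers remain.
print(fetch_user(7))
```

Because the old function delegates to the new one, both behave identically while the migration is in flight, so each team's CL is small and safe to approve.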


Another Googler here.

We usually have a 3-step process. 1st, a new API interface/service using the new dependency is added and runs in parallel with the existing one; 2nd, we blast an email to users of the old one to migrate to the new one; 3rd, when usage of the old one hits 0 it is removed. When a team slacks off in migrating (usually we have months between receiving the first notification and service shutdown) things escalate and a director might get involved.

Anyway, this means that migrating to the new interface is clients' responsibility.

However there's an exception: if the changes are trivial (like changing the name of one of the API endpoints), we have an automated tool that basically performs this change across the whole codebase. These kinds of changes still need to be approved by the codebase owners and shouldn't break any test, so they must be really trivial. In this case it is usually the team owning the API/service that performs the change.
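For a feel of what such a trivial automated change amounts to, here is a deliberately naive sketch: a whole-word rename over an in-memory set of files. The real tooling is not public and presumably works on syntax trees rather than regexes; `rename_symbol` and the file contents are invented for the example.

```python
import re


def rename_symbol(files, old, new):
    """Rename identifier `old` to `new` in every file.

    files: dict mapping path -> source text.
    Returns (updated files, sorted list of paths that changed).
    """
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    updated, changed = {}, []
    for path, text in files.items():
        # A callable replacement avoids backslash escapes in `new` being interpreted.
        new_text = pattern.sub(lambda match: new, text)
        updated[path] = new_text
        if new_text != text:
            changed.append(path)
    return updated, sorted(changed)


codebase = {
    "team_a/client.py": "resp = get_user(42)\n",
    "team_b/report.py": "name = get_user_name(42)\n",
}
updated, changed = rename_symbol(codebase, "get_user", "fetch_user")
print(changed)
print(updated["team_a/client.py"])
```

The list of changed paths is what you would then split by owner for approval, and the existing tests are the safety net that tells you the change really was trivial.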


In practice it's optimized to a 2 step process, since step 3 almost never triggers.


Fair enough



Also note that "upgrading a dependency" is a relatively rare occurrence in the sense that I understand the question. Google definitely uses third party code but it is outstanding how much is proprietary Google owned and authored. For this code you don't release new versions and update, instead you evolve APIs gradually (as the other commenters have explained).

For third party dependencies there are two main options.

1. Do it all at once. This is good for relatively small numbers of users or small API changes.

2. Introduce the new version, migrate users, then delete the old version.

Note that for third party libraries Google does maintain a fairly strict one version policy so 2 is always a temporary measure.

https://opensource.google/docs/thirdparty/oneversion/


Usually that problem is dealt with by a combination of:

* Automation.

* Ability to change other people's code safely, possibly gated by code review approval. Cultural and technical (e.g. unit tests must exist and be maintained) barriers may apply.

* Making incremental changes that are backward compatible (at least until you migrated all clients so you can cleanup later)

I recommend this talk https://youtu.be/tISy7EJQPzI (even if the scope is broader than your question)

(ex-googler)


> I was not aware of the blaze clones plz and pants.

At BazelCon 2020, Borja Lorente from Twitter gave a talk [0] about how they were migrating away from Pants to Bazel. Seeing as Twitter was heavily involved in the development of Pants, not sure what that means with respect to adoption and evangelism once they complete their migration.

Lately it seems that Bazel has basically won in the space of open source Blaze clones as more and more companies switch over.

[0] https://youtube.com/watch?v=0l9u-FIaGrQ


The one tool everyone missed was marketing, with being able to call yourself an ex-Googler being the biggest tool in the toolbox. Let me share the other side of the diaspora (I dislike this term). Tons of former X employees (pick your popular Silicon Valley company) have little else to sell other than that they worked for X. The vain CEO hires them hoping that they’ll sprinkle some magic fairy dust on their dysfunctional company. Never mind that the reason it’s dysfunctional is that the CEO hired his buddy from high school with a degree in poly sci to run things and is having an affair with the head of HR. Our intrepid developer constantly mumbles “well, at X we used to do whatever.” Afraid that the CEO will realize that the fairy dust is just a combination of Kool-Aid and glitter, the developer starts throwing people under the bus and the knives come out.


I have one experience with that. Guy, late 20's or early 30's at best, ex-Googler, joins a bank and becomes CTO. Pushes for them to replace Angular (which they went all in on, migrating from a set of jQuery UI components) with, drumroll, Polymer.

Nothing finished yet, performance issues (because guess what, Polymer's routing turned out to be pretty much the same as a 'tab panel'), aaand Polymer 2 rolls around, backwards incompatible. Scramble to make all their components suitable for Polymer 2, and they're just about to breathe a sigh of relief aaaaand Polymer 3 rolls around and becomes deprecated in favor of lit-html at the same time.

Moral of the story: Ex-Google doesn't mean shit, and don't force the use of experimental technology for your whole fucking multi-billion company.

They should've stuck with Angular, it's fine, or migrated to React instead, which has become / is becoming an industry standard.


One of my worst decisions as a developer was to build upon a component framework maintained by Google.

Material Components for the web is constantly introducing breaking changes that make building upon the framework a nightmare. Lately they've even set this expectation in the project description. The problem is that the components are rather buggy, so you must regularly update them to have upstream bugs fixed.

> Material Components Web tends to release breaking changes on a monthly basis, but follows semver so you can control when you incorporate them. We typically follow a 2-week release schedule which includes one major release per month with breaking changes, and intermediate patch releases with bug fixes.

https://github.com/material-components/material-components-w...

The development team's decisions have been questionable [1][2][3], so consider this a warning in case you are tempted to use the project.

[1] https://github.com/material-components/material-components-w...

[2] https://github.com/material-components/material-components-w...

[3] https://github.com/material-components/material-components-w...


I feel like the breaking changes must be because they're used to working in a monorepo, where you can just change the code and the clients at once. No need to worry about compatibility


When someone says "ex-Googler", usually at the beginning of a call, I immediately multi-task, let them finish with their chest pounding, and move on, trying to avoid them going forward. If you can't convince people with the technical merits of your plan, I guess you just try to name-drop (company-drop). Some people buy it, most don't, and they sit in some enterprise architect role with maybe a few internal fans. If you want to waste tons of money, scale your company for billions of users when you have 2000. Hell, don't even scale, just say you can and claim success before you even do it. It works: there are tons of these guys and gals ripping apart companies for their perfect "designs" that worked at Google.

(I never worked at Google but did interview, argued with the interviewer and hung up. He thought computers worked differently than they do in reality and just spoke with this air of authority, without having a speck of it.)


The funny part of the story is that Angular is from Google.


It's surprising that the fact "Something that works for Google doesn't necessarily work for company X" is not common knowledge.


One of the things that amuses me most is ex-FAANG engineers who are absolutely stymied when they don't have access to the tools and support systems available at their former employers. They've never had to deal with some of the very rough systems the rest of us deal with on a daily basis just to make the business run.

Hubris runs rampant in the ex-FAANG crowd, and businesses who aren't tech companies who hire these folks do so precisely because they don't understand that what worked at ex-FAANG won't necessarily translate to their business.


There are many companies where ex Googlers have made an impact as founders or senior engineering leaders. Your anecdata has many counters.


At this point there are probably 100k ex Google employees.


It wasn't an absolutist statement, but it's interesting that you've tried to make it so.


It is called cargo cult culture [0]. The idea is to succeed by replicating what other successes have done. The problem is that from the outside, we only see how the successes got where they are in a superficial way.

[0] https://en.m.wikipedia.org/wiki/Cargo_cult


I agree that it isn't a perfect predictor of future success, but it is a free indicator that carries a decent amount of signal.

And IMO anyone who loves the field should have a goal of doing a stretch at one of the big ~5 (an internship may be enough). I speak as someone who was very anti-big tech until 2014 when someone convinced me to get my head out of the sand.

Your technical abilities can still improve (as long as you keep practicing yourself, because dev speed will slow down at a big org), you will gain tons of practice working with a lot of talented people, but irreplaceably you will understand more about how the industry is piloted.


> CEO hired his buddy from high school with a degree in poly sci to run things and is having an affair with the head of hr.

That's pretty close to the Google story.


didn't understand.


Yes... It's a huge pet peeve of mine when recruiters, companies, tools, say something like "written by ex-googlers", "work with ex-googlers", etc. Like it means anything at all that you have an engineer from a company w/ 100k employees. It's absolutely meaningless. And you see it all the time, especially in link-bait Hacker News headlines.


Did you even read the article? It's incredibly well written, with pros and cons of each tool. I know you have an axe to grind with Google, but why don't you read the article first?


Why are you carrying water for Google?

I did read the article and yes it does have some good parts.


I carry water to douse the flame you hold.


For me, the logs processing system and Dremel were probably the most marvelous tools that I've ever seen in my SWE career. Millions of machines are producing trillions of log records every day. Then Dremel runs some arbitrary query over those records on the order of minutes (seconds for simple queries), without indexing.


There are a few OSS Dremel alternatives, with ClickHouse being my favorite.

To me their absolute best technology edge is still their storage system (colossus)


Nice, but tools are only one thing to miss about development at Google (or any other of the FAANGs).

I found the more meaningful thing is the ecosystem of smart engineers, and the ability to find others who face similar problems, exchange ideas and solutions. It's a skill of its own to find these people and learn their "language", but once you do, it's a huge multiplier that is hard to find elsewhere.


I would say there are lots of smart engineers at non-FAANG companies too; especially engineering-first orgs.


I think what the OP means is the size and diversity of the talent pool. Obviously, there are many smart engineers outside FAANG, but it's also true that having world-class experts on cryptography, networking, distributed systems, web technologies, machine learning, microprocessors, compilers, etc. in a single company is something very rare.


I work at a FAAG-adjacent and the single most valuable thing for me personally is being able to hang out in the Slack channel where all the old timers idle. If I need context on something and nobody on my team knows, 9 times out of 10 I can ask there and someone will at least point me in the right direction.


The article is based on the assumption that what developers absolutely need are the overengineered and bureaucratic tools of a big monopoly company wealthy enough to hire the absolutely best for even the most menial task.


What are the tools used in the most cash-strapped companies facing fierce competition yet prevailing?


Vim, gcc, notepad++, python, mspaint, libre office, gimp,


For some of the things mentioned, git pull. If I want to jump around to see how a function is defined or used, I don’t need to be able to click in the web interface like you can in Critique. Pulling the code locally and using the open source tools I’m already used to works even better.


I agree with what I think is your argument that “bureaucratic” tools used at a company with many thousands of engineers aren’t necessarily appropriate for a much smaller company. For example, the system of ownership by directory in a monorepo alluded to in other comments is unnecessary if you have a handful of engineers or a few dozen. Or the system like their “readability” reviews that you can read about elsewhere online (eg https://www.pullrequest.com/blog/google-code-review-readabil...)

It’s when you get to a bigger size that you (probably) need to worry about these cases.


You're assuming the egg comes after the chicken. There's a lot to be said for orgs creating tools to increase the velocity of their devs.


Refreshing article! For people with less time check out https://github.com/jhuangtw/xg2xg which tracks open source implementations of Google internal tools (also linked in the article)

Also, I have been looking to move away from Makefiles and bash scripts, and Bazel does come up more often. What are your experiences working with it, and how does it compare to Blaze?


I'm a big fan of bazel. It's very similar to blaze, which outside of Google is a good and bad thing.

The good is that I find BUILD files easier to read/reason about than Makefiles and it also is multilingual.

The bad is that external dependencies are hard, especially if you want to take advantage of all of Bazel's features like hermetic builds. Most larger C and C++ projects don't use Bazel to build and don't have BUILD files, so you end up maintaining your own build system for that project OR use something like [this][1].

1: https://github.com/bazelbuild/rules_foreign_cc


Can't say much about Bazel. I understand it's used at Google for their increasingly polyglot builds. I just needed to isolate the incantations to get j2cl + closure-compiler (= their migration target for gwt apps) running together in a somewhat quality-assured way such that I don't run into problems related to non-aligned versions since closure-compiler, while very powerful, has always been something of a bitch to setup and run. Ended up analyzing Bazel logs to get at the needed command-line parameters, configs, etc. Would have preferred a doc about running j2cl/c-c outside Bazel rather than bundling it, since both of these programs are still just Java command-line apps. As I understood more of Bazel, I didn't find it half bad. That says a lot since my tolerance for over-reaching build tools is near zero - a couple months ago I even did the opposite of what you're suggesting and migrated my old Java builds from maven (and ant) to straight Makefiles, for a significant win in clearness, robustness, performance, scriptability, and network traffic.


I'm a big fan. In order to accomplish its core mission of making build & test so much faster, it has to have near complete knowledge of your build graph (the source files and precisely how they depend upon one another). This means that to get the main benefit, you need to go all-in.

Plug: for those interested in codesearch within bazel repositories, my vscode plugin for bazel has a special feature for this: https://stackb.github.io/bazel-stack-vscode/searching


I also think that part of the success is that an insane amount of system config and other data is also source-controlled, or provides some other means of textual access for search (BigQuery, etc.).

This makes code search a super power.

(googler)


Isn't this...common? IaC is good practice these days. I work at an eCommerce co and we do this as well.


Perhaps, but what's notable about Google is the extent to which everything is a text config file, and the fact that everything is in a monorepo.

As an example that surprised me, even our oncall rotations are kept in a text file along with a list of upcoming assignments. The rotation tool checks out that file and appends the next few names in the list to the end of that file to prepare the "calendar" for the coming weeks.
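
A toy Python sketch of that kind of rotation tool, for illustration only. The function name and the data layout (a roster list plus a schedule list, which a real tool would read from and write back to the checked-in text file) are assumptions, not the actual internal format.

```python
def extend_schedule(roster, schedule, count):
    """Return `schedule` plus the next `count` names, cycling through `roster`.

    Assumes the last scheduled name (if any) appears in `roster`.
    """
    if not roster:
        raise ValueError("roster must not be empty")
    # Continue after whoever is last on the schedule; start at the top if empty.
    start = (roster.index(schedule[-1]) + 1) if schedule else 0
    upcoming = [roster[(start + i) % len(roster)] for i in range(count)]
    return schedule + upcoming
```

Because the result is just more lines in a text file, the change shows up in version control and code search like any other edit.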


Unsurprisingly, this reminds me of upstream Kubernetes where code reviewers/approvers and project maintainers are all just text files named OWNERS, and the tools just read the files to enforce the rules. Very simple and elegant!
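
Roughly how such a lookup can work, as a simplified Python sketch. Real Kubernetes OWNERS files are YAML with more fields and options; here the parsed contents are assumed to already sit in a hypothetical dict mapping each directory to its approvers.

```python
from pathlib import PurePosixPath


def approvers_for(path, owners_by_dir):
    """Collect approvers from the file's directory up to the repo root (".")."""
    found = []
    directory = PurePosixPath(path).parent
    while True:
        for name in owners_by_dir.get(str(directory), []):
            if name not in found:
                found.append(name)
        if directory == directory.parent:  # reached the repo root
            break
        directory = directory.parent
    return found
```

The closest directory's owners come first, and anyone listed at the root can approve changes anywhere.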


That’s a google-introduced thing :)


Yeah, which is why I said it’s unsurprising. I imagine Kubernetes gives you a glimpse of what development processes at Google are like.


That fact was pretty useful when making the tooling that ensures that xooglers are not on-call in the future. Please don't ask me how I know.


Yes, but on the other end of the spectrum, GCL itself has turned into another half-baked programming language. Editing these configs is the bane of my life.


GCL is the thing I was looking forward to never seeing when I left Google. Now, GCL is one of the things I actually miss.

Pretty much all other "configure things that are at least moderately complex" systems are actually bad enough that they make GCL/BCL look like a really good idea.


The company I work for uses Google Calendar, which is worse in every way than the approach used at Google for calendaring.


What does google use for calendaring?


Google Calendar.


Good practice, yes. Common... it depends.


I found this super useful. It has a few more suggestions than the article.

https://github.com/jhuangtw/xg2xg


Thanks this is the table I was hoping the article would have


It's linked in the article, and the next sentence explains the motivation for not just starting with that list: "This list is comprehensive, but a bit overwhelming. So where do you start?"


Isn't code search more table stakes nowadays? Both github and gitlab have it.

There isn't anything said about end to end, integration or functional testing here. I'm in a world where everyone hacks their own system together onto the same runtime, leading to some wonky outcomes and lots of operational support. Would be interesting if there was a 'google' way to do it.


Last I checked, GitHub's code search was so bad that it's useless. There's definitely a use for good code search.


Search on GitHub is very bad IMO.

The recent semantic references stuff they've added is helping, but that only seems to be available in certain languages/setups and doesn't work cross repo. Google's xref system allowed you to browse essentially the whole Google codebase - it was amazing. Third party code was indexed too; I remember my team used code search to track down a bug in NGINX once.

GitHub's normal search feature is bad. I can't even quote stuff for exact matches. I usually end up using BigQuery for GitHub-wide searches [1] or just pull down the repo and grep locally.

[1]: https://www.google.com/amp/s/cloudblog.withgoogle.com/produc...


I use Sourcegraph to search GitHub code most of the time because GitHub's search is awful. Since it has most popular repos indexed already, and it'll clone new ones that you point it at, it's quite handy.


It’s not useless but it leaves much to be desired.

I was recently able to use it to find all repos in my org using git-lfs by searching for .gitattributes with certain properties.

And I was able to search all projects for a particular secret string.


(Disclaimer: Googler)

Internal code search is miles ahead of GitHub/GitLab search and is super fast and reliable. I have not used Sourcegraph (which everyone seems to talk about), but in my past experience at other companies nothing comes close.


I don't know what Google's internal code search is like but if you want to see Chromium's code search it's here

https://cs.chromium.org

also android's

https://cs.android.com/

So you can compare


Nice. It looks like a subset of the internal tool. Even my favorite shortcut is working (l r - reference to current line in current commit)

https://source.chromium.org/chromium/chromium/src/+/master:i...


> Both github and gitlab have it

If you want to search code sitting on your local hard disk try this tool (built on top of Lucene):

https://github.com/Rajeev-K/eureka

I made it when I was frustrated by existing tools such as sourcegraph and opengrok.


Google used to offer code searching for public repos but then, expectably, they abandoned it. Now it's only used for google-owned/run projects. But when it was around it was really amazing. Github's code search is quite lame in comparison. I often just run ripgrep locally instead.


regarding google-associated projects: that's true enough, but note that you can e.g. navigate a version of LLVM at

https://cs.android.com/android/platform/superproject/+/maste...


Google’s internal code search is far ahead of GitHub, Gitlab, and even sourcegraph. But you can use it, because they open-sourced it. I don’t know why more people don’t use it.

https://kythe.io/


Author of the post here. I would say that technically Kythe is the open-source version of part of Google's internal code search (specifically, the component that provides precise code navigation), but it doesn't include the search index or the UI. So it's not on its own an end-user product.

That having been said, I love Kythe, and we've actually considered using it as a semantic backend for Sourcegraph (and still might in the future). For the time being, we're using indexers that emit LSIF (https://lsif.dev). This allows us to build on top of the substantial body of work provided by the many open-source language servers (https://microsoft.github.io/language-server-protocol). But Kythe has a far richer schema that can capture all sorts of useful relationships in code. It's awesome and I wish more people were building indexers for it.


Maybe sourcegraph just doesn't exploit the real power of language server indexes, but the C++ language server seems pretty impoverished compared to Kythe. If I ask sourcegraph to find all references to absl::string_view::string_view(const char* str) it instead finds the substring `string_view` in any context, which is quite a useless result. Kythe gives me the actual call sites of that function signature and not the other forms, and Kythe knows the difference between absl::string_view and std::string_view.

Is it just a case of the visible implementation being a bit behind the ultimate capability of the system?


(kythe googler here)

Like beliu said, the Kythe schema is far richer; it has a fully abstract semantic layer in the graph, and is a superset of what can be represented with LSIF. It's not tied to specific text regions -- there are representations of symbols/functions/classes/variables/types that do have pointers to/from text regions.

Note that because of the richness and abstractness, it's theoretically feasible to drive much more than code navigation from the Kythe graph.

And yes, the open source is just part. The large scale pieces are basically (1) do instrumented build (2) run through Kythe indexers (3) post-process output for serving.

The Kythe OSS project offers solutions for (2) for C++/Java/Go/Typescript/protobuf (and early Rust support). We do have plans to open source support for at least some other languages at some point in the future. (Hedging as best I can here.) Note that the best candidates for Kythe indexing are those languages that admit solid static analysis.

(1) is inextricably tied to the build system. Bazel support should be nearly turnkey; other systems require more (maybe significantly more) work.

There's not-full-scale support for (3) available. (Clearly we use something far more sophisticated internally.) While we'd like to see this fleshed out, expansion of that will depend on non-trivial community contributions.


We need to stop treating "ex-Googler" or "Googler" as some form of elite credential.

It's not, and Google is hardly any authority on good software.


Did you even read the article? The guy never claimed some elite stance. Google is well-known for its internal tools. It would help to educate yourself a bit.


If you read between the lines, the article sounds a lot like "I know better", implicitly "because I worked at Google" (as per the "ex-Googler" qualifier in the title).

Consider this:

> Introducing code search and monitoring doesn't require asking anyone on the team to change existing workflows. Changing the code review tool, however, does.

It basically sounds like the author wants to take over and tell everyone how to do things.

After time at Google, comparing them to other organizations all the way from 8-person start-ups to other FAANG+Microsoft, there are a LOT of downsides to their tools and their isolated ecosystem. A lot of things are more difficult and take longer to just get done than elsewhere at the same quality.

> Google is well-known for its internal tools. It would help to educate yourself a bit.

This may also be exactly the attitude OP was talking about.


I think what's missing from the tools discussion is the infrastructure that the tools sit on top of:

1) CitC (Client in the Cloud). Mounts your development environment on FUSE filesystems that exist in the cloud. The entire monorepo is mapped into your CitC directory. You can access it from your desktop shell, your home laptop, or a Web Browser based IDE. Any edits you make are overlaid onto the (readonly) source repository, looking seamless and creating reviewable changelists on the fly. Effortless sharing of editing between multiple machines. ObjFS, which also sits in your client, allows blaze(bazel) build artifacts to be shared as well, between clients, even between users. In other words, if I work on 3 machines, I don't need to "check out" my work 3 times. In fact, I almost never "check out" anything at all. I work in a single monorepo with Mercurial, edit files, which produce reviewable changelists against the main repo. I don't need to decide what files to check out or track, nor decide which machine I will work on, and I often switch between IntelliJ locally, IntelliJ via Chrome Remote Desktop on my office computer, and a VS Code-like Web IDE.

2) Skyframe (https://bazel.build/designs/skyframe.html). Imagine parsing the entire monorepo and every single BUILD file into a massive pre-processed graph that knows every possible build target and its dependencies. This allows ultra-efficient determination of "what do I need to rebuild? what tests need to be re-run" across all of Google. I guess the closest thing to this is MvnRepository.net or BinTray, but Skyframe doesn't just parse the stuff and give you a search box, it informs CI tools.

3) CitC/Critique extensions to Mercurial -- take a chain of commits and make them a single code review, or take a chain of commits and make them into a stacked chain of code reviews.

4) Critique presubmit tools (e.g. errorprone, tricorder, etc). Google has a huge number of analysis tools that can run on every review update, for bugs, security problems, privacy problems, optimizations, data-races, etc. Yes, these are usually available outside, but it's just so easy to enable them internally compared to doing it on GitHub. Lots of other codehealth tools, for automatically applying fixes, removing unused code, auto-updating build files with correct dependencies.

5) Forge -- basically Blaze's remote build execution (what Bazel calls RBEs). Almost every build at Google is extremely parallelized, and if you need to run flake tests, running a suite of tests 10,000 times is almost as fast as running it once.

6) Monitoring's been mentioned, but monitoring combined with CodeSearch hasn't been touched on. Depending on configuration, you can often see from Critique or CodeSearch what release or running server code ended up in and what happened to it (did it cause bugs?). CodeSearch has an insane number of overlays, it can even overlay Google's Sentry-like exception logger being able to tell you about how many times some line of code produced a crash.
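
The "what do I need to rebuild?" question in point 2 comes down to walking a dependency graph backwards from whatever changed. A minimal Python sketch of that idea follows; it is not how Skyframe is implemented, and the target labels in the usage note are made up.

```python
from collections import defaultdict, deque


def affected_targets(deps, changed):
    """Return the changed targets plus everything that transitively depends on them.

    `deps` maps each target to the list of targets it depends on.
    """
    # Invert "target -> its dependencies" into "dependency -> its dependents".
    rdeps = defaultdict(set)
    for target, target_deps in deps.items():
        for dep in target_deps:
            rdeps[dep].add(target)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in rdeps[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

For example, if `//app:bin` depends on `//lib:ui` and `//app:bin_test` depends on `//app:bin`, a change to `//lib:ui` marks all three as affected, while an unrelated `//tools:fmt` is left alone. A CI system can then rebuild and retest only that set.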

A lot of Googlers use maybe 25% of all of the features in CodeSearch and Critique.

Here's an in-depth article from Mike Bland https://mike-bland.com/2012/10/01/tools.html


> 5) Forge -- basically Blaze's remote build execution (what Bazel calls RBEs). Almost every build at Google is extremely parallelized, and if you need to run flake tests, running a suite of tests 10,000 times is almost as fast as running it once.

This is available outside Google now - start at https://github.com/bazelbuild/remote-apis or https://docs.bazel.build/versions/master/remote-execution.ht... . It's a standardized API, supported by a growing set of build tools (notably Bazel, Pants, Please) with a variety of OSS and Commercial implementations. At this point almost anyone can set up Remote Execution if they wish to, and remote caching is even easier.
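
For a feel of why remote caching is the easier half: the core idea is a cache keyed by a hash of the command and the content of its inputs. The Python sketch below shows only that idea, with hypothetical names (`action_key`, `ActionCache`); it is not the REAPI wire format or any real client.

```python
import hashlib


def action_key(command, input_files):
    """Hash the command line plus the name and content of every input file.

    `input_files` maps file names to their bytes.
    """
    h = hashlib.sha256()
    for arg in command:
        h.update(arg.encode() + b"\0")
    for name in sorted(input_files):
        h.update(name.encode() + b"\0")
        h.update(hashlib.sha256(input_files[name]).digest())
    return h.hexdigest()


class ActionCache:
    """In-memory stand-in for a shared cache of action results."""

    def __init__(self):
        self._results = {}
        self.executions = 0

    def run(self, command, input_files, execute):
        key = action_key(command, input_files)
        if key not in self._results:
            # Cache miss: actually run the action and remember its result.
            self.executions += 1
            self._results[key] = execute(command, input_files)
        return self._results[key]
```

Two users building the same inputs with the same command get the same key, so the second one gets the stored result instead of re-running the action.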

Minor terminology correction: RBE generally refers to Google's own implementation of the same name; Remote Execution (RE) and the REAPI are used to refer to the generic concept.

(Disclaimer: I work on this at Google.)


I used Docker to install sourcegraph yesterday on my MacBook. It took a while to locally cache about 2 GB of data from my 130 GitHub repos. I think this is a game changer, but I need a few weeks with it to make sure that the ceremony is worth the effort. I should probably be running it in a VPS rather than my laptop. I worked as a contractor at Google in 2013 and I do miss their internal dev environment.


Ex Googler here

I really liked this article. It makes great points about the order in which to try to improve things. I feel seen about trying to push a build system before having built social capital. Great to learn about other build systems. I am still having trouble explaining "what's wrong with just using Makefiles"... working on it.


In the table for "Write Code" / "Inside Google", I think you mean Cider, not Critique.


I suspect the author meant to mention "cider", not critique, for writing code.


While it is always interesting to read what a successful company is up to, I have always wondered what the motivation is behind ex-employees writing memoirs about what was going on in the bedroom. For one, the possibility of such a thing happening in the future keeps companies in check, so the company must ensure they don't do evil, but it also reveals any weak points to the competition. Google is of course one of those companies that are too big to fail, but what would such an article mean for a small company operating in a dog-eat-dog environment?


I think it's just useful as an idea what others are doing and why. Bigger companies had more time to work on their processes and approaches. They usually also have an interest in increasing their developer productivity.

It's more to see the lay of the land than anything else - might be something very cool for you to pick up. Most of it will be an overkill for smaller teams.


Sourcegraph seems interesting but the pricing seems off for me. The “Teams” plan costs $150/month for 25 people and that works out to $6/user/month.

Using GitHub it’s between $4-21/user/month.

So this means that insight from my source costs sometimes more than actually managing my source.

I’ve never worked at Google but it seems like one benefit is that they’ve figured out how to scale the costs of this kind of functionality so you don’t have to run into conundrums like this.


It only needs to save your developers a few minutes of time on average to be worth it.
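
As a back-of-the-envelope check, using the $150/month for 25 people quoted above and an assumed (hypothetical) loaded developer cost of $100/hour:

```python
def price_per_user(plan_price, seats):
    """Monthly price per seat."""
    return plan_price / seats


def break_even_minutes(per_user_month, hourly_cost):
    """Minutes saved per developer per month at which the tool pays for itself."""
    return per_user_month / (hourly_cost / 60)


per_user = price_per_user(150, 25)            # 6.0 dollars per user per month
minutes = break_even_minutes(per_user, 100)   # about 3.6 minutes per month
```

With a different hourly cost the break-even point scales inversely, so at $50/hour it would be about 7.2 minutes per month.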


I get that, and understand value-based pricing. But when it comes down to relative value, it’s hard to line up.

Let’s say Sourcegraph saves a few minutes and it's worth it. GitHub saves way more time, so its relative value is much higher.

I think this is a problem with many SaaS products in that it’s more “efficient” to just be software. It also seems like the per user costs are relatively low and the effort is in the code base indexing and whatnot.

I think it’s a good idea and think it’s great if users are happy. But I’d rather have an OSS product that I could install as part of the tons of other stuff I use than be on the hook for a monthly charge for every user.

I expect that they’ll be bought by GitHub or GitLab and rolled in at some point and will make more sense to me value wise when it’s like $.17 or something of an overall source management cost.


It also needs to compensate the users for the time spent setting it up, learning it, and playing with it until it actually solves a value-added use case.

We developers often overestimate the value vs. cost of the tools we use ... many tools have a negative value vs. cost even though they are free.


GitHub's $4/user/month pricing is very hard to compete against and actually make money as a business. Charging half the price of GitHub for example isn't sustainable.


There must be a name for this psychology:

“Product A is great value and product B is more expensive than (or same as) product A therefore product b is bad value and I won’t buy it.”

The ROI or otherwise of a product is not related to the first one. Unless your budget is so limited it’s either-or, why is this fallacy so prevalent?


First, anchoring is very real and humans judge things in comparison to others, especially when compared to unclear things.

I think that fairness is a factor in pricing and it’s not just an ROI calculation.

It’s like those epi pens that went up in price, right? From an ROI perspective it’s great because you pay $400 and get to live. But knowing it costs little to make makes me think it’s less just.

Relative ROI is also important as we have to prioritize. GitHub seems a much better value to cost than this tool.

It’s perfectly fine for them to price however they like and it seems from their site that lots of companies pay this.


> First, anchoring is very real and humans judge things in comparison to others, especially when compared to unclear things.

This isn't anchoring in the true sense, though. That's my point. It's like saying "my car costs £10000 and my house is £300 000 and my car goes everywhere so this house is a rip off". It's apples and oranges. People are being irrational about software purchases because MS etc. are cheap.

> It’s like those epi pens that went up in price right?

it isn't, because the price of the SaaS hasn't changed.

> Relative ROI is also important as we have to prioritize. GitHub seems a much better value to cost than this tool.

For something that costs tens of dollars per month for a company that employs software developers, there is zero debate about whether they can afford it. Provided ROI > cost, why on earth wouldn't you?

This just seems like irrational behaviour caused by the fact that software has near zero marginal costs and most people don't understand that.


Permissions for repos (which is not a necessity but has many many uses) immediately kicks you from Teams into "$Contact Us" Enterprise territory, too.


I boomeranged back to Google after working at a few other large tech companies. Google tooling was miles better 5 years back. Now the OSS tools have caught up. They are much more stable, have better UX, and are a transferable skill.


Did Google copy Google in the first place? You can use it as your case study, but think about making it better, put your progress on the table, innovate.


very interesting article. how do people stay updated with latest tools outside of google's ecosystem and do they feel like they are missing out.

lastly I am just curious to know why don't companies adopt opensource tools that are being used everywhere so skills can be transferable easily


How many ex-Googlers are there in the world now? There must be a few hundred thousand.


Article author interned there for "less than a year" -- LinkedIn.


Plenty of time to get an idea of the engineering workflow.


Yeah I agree, however it's still worth pointing out. OP has been working for their own startup for 7 years and interned at Google almost 10 years ago. It's basically using "ex-googler" for clicks.


Really? Maybe to know of it, but to know the ins and outs? As an intern? Probably not. Also it's years out of date. Practically just clickbait.


1. Why does Google develop their own tools?

2. Why doesn't Google open source them?


For point (2), Google did open source Bazel https://bazel.build/ which is essentially just the internal build system Blaze.


> 1. Why does Google develop their own tools?

Considering the age of the articles and tools themselves, they probably predate everything you're using these days. All the GitHubs and other shiny SV startup things.

IIRC they also don't ever want to share code with other companies which makes most of the SaaS offerings a no-go for them.

Hence why it might not make sense to use the same tools as they do :)


(googler here)

For 1, I guess it also boils down to the mindset with which you approach a problem. At my past companies, when you had a problem, the first thing you did was look for an OSS project or product that solved it for you, and even when you didn't find one, you tried to reframe the problem so that it fit the solution you already had in mind. At Google, it normally involves solving the problem yourself or reusing much lower-level abstractions.

This doesn't mean that you will need to always create your own database, but when you really need to you have the skills to do a decent job.


Many of the things that are trendy in the industry today were invented by Google years ago, and Google had to invent them because they didn't exist at that time.


> 1. Why does Google develop their own tools?

Often there's no OSS version yet.

Usually they have better performance.

They want cluster and multi-cluster services.

> 2. Why doesn't Google open source them?

Mainly dependencies.

But they release academic papers that OSS implements.


> 1. Why does Google develop their own tools?

Because it benefits Google.

> 2. Why doesn't Google open source them?

Because that would benefit the competition.


Disclaimer, I work for Google, but what follows is my personal opinion.

Google open-sources a lot: TensorFlow, K8s, Apache Beam, ... And even if something doesn't end up as a fully open-sourced project, Google still releases white papers on the subject that allow startups to create something similar (CockroachDB, for instance).

However, while I admit that some decisions might be made to avoid benefitting competition (I think, that kind of stuff is way above my pay-grade), some things cannot be open-sourced for purely technical reasons (without a complete rewrite, that is). For instance, within Google everything is a protobuffer, and tools rely on that assumption heavily to work. Outside Google people don't use protobuffers nearly as much and the usage of those tools would be very low.

Other tools are tied to Google having many datacenters and multiple fibers between each of them for redundancy. Like Spanner, which also requires atomic clocks to work properly.


What Google open sources and doesn't open source is strictly a business decision. The technical details don't matter. Things like Spanner remain proprietary because Google thinks they can make money with it. They charge $10/hour just to have the replicas up and serving traffic; letting Amazon install it and charge $9.95/hour is not something they think is going to make GCP a lot of money, so we don't get the source code. Things like Kubernetes, on the other hand, are open source because Google wasn't "winning" in that area -- to get people to use GCP, they had to get people to break the dependency on proprietary things like CloudFormation. Otherwise, people would just stay at the market leader and not switch to the second place option.

Plenty of people outside Google use protocol buffers, for example; I've run into them at every job I've had since Google, and in plenty of strange places that probably never cross-pollinated with Google (the most surprising place I found them was in Hearthstone). They're pretty popular and people aren't really surprised to see them anymore.

I think there is also a middle ground where people inside Google don't think there's interest, but there is. For example, I very much miss Monarch. I don't think the code is making them a lot of money; my understanding is that the Cloud monitoring stuff is completely different. But it is way better than Prometheus or InfluxDB. Queries that are trivial in Monarch you simply can't do with those products. (The one thing I found most valuable in Monarch was that pretty much every query started with an "align" step. And I just haven't seen that anywhere else, so it's hard for me to reason about what the query is actually doing.)
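To make the "align" idea concrete, here is a minimal plain-Python sketch. It is not Monarch's query language or API; the helper names `align` and `add_series`, the 60-second period, and the sample data are all made up for illustration. The general idea is that raw samples arrive at irregular timestamps, and an alignment step snaps them onto a regular grid so that separate series can be combined point by point.

```python
from collections import defaultdict

def align(samples, period):
    """Bucket (timestamp, value) samples onto a fixed grid, averaging each bucket.

    Hypothetical helper for illustration only; not Monarch's API.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Snap each timestamp down to the start of its period.
        buckets[ts - ts % period].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

def add_series(a, b):
    """Sum two aligned series on the timestamps they share."""
    return {ts: a[ts] + b[ts] for ts in a if ts in b}

# Two made-up series sampled at different, irregular times (in seconds).
cpu_a = [(1, 10.0), (31, 20.0), (62, 30.0)]
cpu_b = [(5, 1.0), (58, 3.0), (65, 5.0)]

aligned_a = align(cpu_a, 60)
aligned_b = align(cpu_b, 60)
total = add_series(aligned_a, aligned_b)
print(total)
```

Without the alignment step the two series share no timestamps at all, so a pointwise sum would be empty; after alignment they share the bucket starts 0 and 60.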

As other people mention, the mere task of picking the transitive closure of dependencies out of google3 is hard. In fact, maintaining a bunch of non-monorepos is a huge chore compared to monorepos once you have the right tool. It's thankless work, literally, so I can believe that's one reason why there aren't more internal Google tools open sourced. But, it can be done if there is some thanks for doing the work. When Google split into Alphabet, work was done to let companies leaving Google take their chunks with them. There just had to be some sort of business reason to justify the tedium.
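For readers unfamiliar with the term, "picking the transitive closure of dependencies" means collecting everything a target depends on, directly or indirectly. A rough sketch in Python, assuming a toy dependency graph with invented target names (nothing here reflects the real layout of google3):

```python
from collections import deque

def transitive_closure(deps, root):
    """Return every target reachable from `root` by following dependency edges."""
    seen = set()
    queue = deque([root])
    while queue:
        target = queue.popleft()
        for dep in deps.get(target, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

# Made-up graph: target -> direct dependencies.
deps = {
    "tool": ["flags", "rpc"],
    "rpc": ["proto", "auth"],
    "auth": ["proto", "crypto"],
    "flags": [],
}

closure = transitive_closure(deps, "tool")
print(sorted(closure))
```

Here "tool" lists only two direct dependencies, but extracting it means carrying five targets along. In a real monorepo that set can be far larger, which is where the tedium comes from.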


Same reason stuff gets deprecated. Just maintaining the code against the dependency tree is work. Trying to make it functional in the real world that doesn't have the 50 billion libraries it relied on that are also internal only (or entire systems) is work ^ work.

Unless it's low-level infrastructure or based on a research paper, it's not an easy task.


(Googler here)

> Because it benefits Google.

Developing internal tools does not always benefit Google. Sometimes there is just no alternative at the time it's needed, so Google has to develop something that might become a liability later, accumulating technical debt and stagnating compared to a newer open-source shiny thing. Sometimes it's NIH syndrome, or, in other words, wariness to adopt external solutions there's no control over and that do not fit Google very well. However Google does have a healthy internal ecosystem with clear product life cycle and balanced planned/organic change.

> Because that would benefit the competition.

Usually Google benefits from its protocols/tools/projects being used out in the open (TensorFlow, gRPC, Kubernetes, Angular, Android, Chrome), and that includes competitors.

There are plenty more mundane reasons this does not happen:

- Some of the tools are so dependent on the internal ecosystem that it would make little sense to open source them in a standalone way. There are many things at Google that only exist in one single deployment in the world and turning them into a deployable product for another setting would be a huge task without clear purpose. Also, it's hard to opensource operational knowledge and expertise.

Google Cloud is an example (positive or negative, depending on your point of view) of efforts to repackage many internal services in a way that is well-documented, supported and accessible to anybody.

- Open-sourcing is a spectrum: just throwing code over the wall (the worst one), controlling a project and allowing external contributors, cooperating with other big companies on a standard solution, supporting a more hobbyist-oriented project in line with its own needs. Every option has its own set of challenges and coordination problems. It's not always easy to reconcile the internal development model and open source workflow. It's not always easy to keep delicate power balance of working with a project instead of taking it over by engineering weight and influence.


Honest opinion on FAANG tools is that the big ones are now antiquated and slowing these big tech companies down. Eventually they will need to rip out all these tools that once gave them an edge and replace them, but they can't because of how much depends on them, and whala, you have a large, failing, slow company.


Declaring a FAANG company failing because its toolset is antiquated is the rare HN elitist opinion you see here more and more. Are startup people in the Valley more and more becoming blind to the real world?


There is a certain kernel of truth to it. I work at Google, and doing anything at all is a chore - it's almost as if we're back to programming with punch cards.

It's not surprising that Python doesn't work well in such a high-delay dev environment.


I haven't worked at Google or any other big tech company, but I understand that Google's culture is far more like academia anyway. Compared to Facebook, Google prioritizes code quality over development speed. Other companies seem to fall somewhere in between.


> Google prioritizes code quality over development speed.

That's putting it nicely. Based on experience, I'd say it's arguing about preferences while demanding that code be coupled because "that's how we do things here".

> culture is far more like academia anyway

Academia is actually based on research and correctness to a much higher degree. The lack of focus on correctness in code worked on at Google is, honestly, appalling. Until, of course, you realize that all other FAANG employees still write loads of concurrency bugs and pronounce themselves gods for doing so.


> That's putting it nicely. Based on experience, I'd say it's arguing about preferences while demanding that code be coupled because "that's how we do things here".

That's certainly not been my experience (and I get the exact opposite impression from people involved in the formal readability process).

> Academia is actually based on research and correctness to a much higher degree.

This depends, greatly, on what part of academia you're in. In many ways, Google is much better about correctness than much of academia (reproducibility, for example, is often near trivial at Google but uncommon in most non-theoretical areas of academia).

> The lack of focus on correctness in code worked on at Google is, honestly, appalling

There are tradeoffs here. On the one hand, you have tens of thousands of engineers, and you're not going to be able to enforce perfection from every single one with the tools available today at a reasonable cost. On the other hand, I see evidence that Google is willing to invest huge amounts into improving software correctness at the lowest levels (like proposing and upstreaming changes to languages to improve correctness-by-default).


Hi, can you elaborate on the high-delay part? Is it the case that all code is accessed through the internal network using FUSE or something? Or is it because of the monorepo?


I'd say once you take a peek behind the curtain at some of these FAANGs, you see that their once greatest assets are becoming liabilities. The magic of "FAANG" goes away. The innovative systems they all built in the 2007-2012 period are now aging.


What great revolution do you suppose happened in the past decade that made these tools obsolete?

To my eyes, we still develop software in plain text and the same languages are still dominant. It's still Linux and http.


So right, but it won't make you any money or get you any LinkedIn/Twitter fans if you speak the truth. We have AI being tossed around like it isn't just database queries most of the time.


A lot has improved over the past decade.


The revolution was the Internet, bringing together a huge collaboration of open source devs vs the aging ivory tower in each proprietary company.

It's not just Linux and http anymore. It's frameworks towering up to the heavens.


I don't think I agree with you. Systems have been evolving, some are being deprecated, rewritten, etc. It might be true that the OSS world has been catching up so it is worth less to have a custom made solution. But the fact that systems are aging does not mean that those systems are becoming irrelevant.


An analogy, if I may share:

On the one hand I agree with you, in that these liabilities you describe are much like the legacy systems I see in our local banking sector.

On the other hand, these legacy systems just work. And adopting newer / more modern systems might definitely have advantages, but doing so creates a bunch of other liabilities - as I've seen take place in our new startup banks.

Stuck between a rock and a hard place.

Of course that's just in banking. I most certainly don't have the experience to comment on whether the same liability tradeoff would happen at FAANGs.


FAANGs often don't have paying customers; banks do.

It's not being stuck between a rock and a hard place; you just need strong management that understands that changing whole systems is never a good idea without a clear benefit other than "technical debt", which is such a bad term for reliable software. Software doesn't age. I'm not sure how software engineers don't understand this, or maybe they do and want to write more s/w.


Software does age. Security ages, network protocols age, human related configs age (timezones, for example).

And that's besides tribal knowledge that goes away as people leave or retire.


New orgs are forming at big companies all the time. In my time at Microsoft, I’ve seen the tooling go from horrible monorepos with Perforce to really nice Git repos in Azure DevOps. I also don’t think “failing” is the right term for these megacorps. Slow maybe, but even that varies per team.


What does it mean to have plural monorepos?


Org-wide instead of company-wide, maybe?


Correct, I was working in Exchange, where all of the teams and orgs working on it worked out of a single large repo and never touched other products.


Unless I’m missing your point, I disagree. Many Microsoft teams I’ve worked with use Azure DevOps or GitHub + whatever OSS tooling they need. They have all done solid testing, infra as code, CI/CD, etc. The tooling is not at all antiquated.

It’s actually been pretty refreshing to be around.


Is “whala” an adaptation of the French “voilà” or of a different origin?


It's a boneappletea-ism.


Not clear. Look at https://en.wiktionary.org/wiki/wallah#Etymology_2

"Wallah" is Arabic slang for "by God".

"Voilà" is French for "look there", meaning "just look at that".

Originally they are semantically quite different interjections, yet one fits almost everywhere the other does, and if Wiktionary is to be believed, in Danish one is a spelling of the other.


It's me not caring about the spelling haha...


Could be confused with "Vallah" though in this case: https://www.urbandictionary.com/define.php?term=Vallah


Few things nerdsnipe the HN crowd like a misspelling, lol.


Right, if only they could get on such cutting-edge tech as npm, CMake, or just GNU Make with a pile of bash, like most OSS projects out there.


Wouldn't the central planning committee include a great leap forward in the next five year plan? This is capitalism, after all, and it's far superior to the socialism of open source software where it's survival of the fittest in a dog-eat-dog world.


Google's internal tooling, and by extension the developers who blog incessantly about it after doing a brief stint there, are highly overrated.



