Hacker News new | past | comments | ask | show | jobs | submit login
Stacked changes: how FB and Google engineers stay unblocked and ship faster (graphite.dev)
467 points by tomasreimers on Nov 17, 2021 | hide | past | favorite | 279 comments



At my last job, I found our code review process was terribly slow and set about analyzing it and coming up with solutions. I found a lot of developers were blocked on reviews. They either task switched (expensive) or babysat their review (expensive). Even though we were using Phab, few did stacked commits because of how unfamiliar the tools were which made them feel brittle.

Improving the stacked workflow is one part of the equation but I'd recommend being careful of letting that mask other problems

I identified a series of busy-waits including:

- Post, wait, CI failure, fix, post

- Post: wait for reviewer, ping reviewer, wait, find feedback was posted, fix, post

Some of the root causes included:

- Developers weren't being pinged when the ball was in their court (as author or reviewer)

- Well, some notifications happened, but review groups were so noisy no one looked

- The shared responsibility of review groups is everyone assumed someone else would look.

- The tools we were using made it hard to identify active conversations

My ideal solution:

- Auto-assign reviewers from a review group, allowing shared responsibility of the code base while keeping individual accountability of reviews

- Review tools that ping the person the review is blocked on when they are ready to act

- Review tools that give a reminder after a set period of time

- Azure DevOps' tracking of conversation state with filtering (and toggling between original code and the "fixed" version)

- Auto-merge-on-green-checks (including reviewer sign off)

With this, responses are timely without people polling the service, letting machines micro-manage the process rather than humans.


Gerrit does many of the "ideal solution" items, as it replicates the internal Google workflow.

It's a code review system with browsing as an afterthought, as opposed to GitHub which is a code browser with review as an afterthought.


Gerrit is nice, but to me feels a bit dated, UX-wise. Before joining Google, I used Azure DevOps and I feel that it was easier to review code over there (less jumping around the page). At first I thought it's a matter of (in)familiarity with the tool, but I still feel like I'm fighting with it after 3 years of use...


The Gerrit UI is just bad (confusing). I don't like the "dated" criticism for work tools because whether it works well is mostly independent of it being fashionable - cf. Steve Jobs quote "Design is how it works". Gerrit's UI is both dated and confusing, but confusing is the major thing here. For example, "+1" and "-1" meaning "I like / don't like it" and "+2" / "-2" meaning "I approve / block it". The numbers are not summed up, the names are misleading. Or the workflow of writing a comment and then publishing it. Probably makes sense when making several comments (on different blocks of code), but it's too easy to forget the publishing, especially when writing a top-level comment where there isn't anything else to publish together.

Another thing I don't like about Gerrit is that they seem to play Tetris with UI elements, i.e. "wherever there's room to put it is where we put it". It's not as bad as it used to be and improving, and it looks less dated than it used to, too. There are also newish features like patch editing in the web interface.

All that said, patch queues / stacks and the rebase based workflow are nice.


Definitely, "feels dated" is very subjective. The other problem ("I feel like I'm fighting with the tool", "I'm jumping all over the place to do what I want") may be related to your "UI tetris" point, and I may be unnecessarily tying it to the UI feeling "dated".

ISTM that what is needed is to observe how people interact with the tool and then try to figure out how the workflow can be improved based on those observations. Or maybe I'm trying to interact with gerrit in some strange way - that's also possible.

The +1 / +2 for different labels aren't that confusing to me - additionally, in Chromium I found that gerrit is very descriptive when you on-mouse-over the buttons, but IDK if it's specific to this deployment or if that aspect got improved.


I really like that with Gerrit I'm not encouraged to force-push a PR branch. There's a couple other things that are nicer on Gerrit but GitHub/GHE is great that it has markdown in comments.


Gerrit does support some Markdown-like formatting:

     Some basic markdown-like syntax is supported which renders indented lines preformatted, lines starting with "- " or "* " as list items, and lines starting with "> " as block quotes
from: https://gerrit-documentation.storage.googleapis.com/Document...


Gerrit only has force push for the PR equivalent.

The nice thing it has is really easy to rebase off other people's in-progress work (for GitHub you need to to fetch the other person's commit by url and hash -- I guess after writing that down it's not a huge advantage).


There has been a lot of investment in Gerrit in the last few years and the UX is improving dramatically. It's still not perfect, but I've been very happy with it lately.


Totally totally totally agree. I think a lot of the slow down comes from not understanding when the ball is in your court. And Github doesn't prioritize this b/c I think this problem is less prevalent in open source, where there is usually no expectation of a reasonable time until review.

A lot of what you described is exactly what we've built / we're planning for the code review portion of the site - you should sign up and check it out :)


Agreed 100%

Making it easier to stack PRs feels like entirely the wrong direction - "when you're in a hole, stop digging" - stacking is giving the people in the hole better shovels to keep digging with, instead of getting PRs flowing through more easily so that stacking is not needed so much.


"Stop digging" is not an option when working on a project that isn't entirely internal to one (sub)organization. These are not only open source projects.


Github has been improving on this. I felt like they were rolling out the features I wanted right as I was enumerating them!

They have

- Auto-assign reviewers

- Pings for people (unsure how sophisticated)

- Auto-merge

The review conversation itself is the big issue.


To add to your list:

> Reviewer able to make tiny changes to the code without a further round trip to the developer. Things like "missing full stop here" or "Please put a comment warning about X here". Developer should get a heads up that their code has been merged with changes, just so the tiny changes don't go unreviewed.


I feel the hard part is putting automation around this.

I loved it when I worked on a small, high trust team where we were comfortable with each person making the decision on whether it was safe to do a post-merge review or that it needed attention first with a pre-merge review.

EDIT: Something I have liked at a couple of my company's was a "cowboy commit audit" where we allowed people to directly push but doing so would ping the code owners and they would have to review to clear the state. People understood they shouldn't but also understood when it happened.


Github suggestion comments usually do this pretty easily. You can prompt the submitter to accept a diff. It does have two annoying problems though:

- You must use a browser editor, no attaching a diff you generated elsewhere. Anything more than 1 line is hell because Github's editor gives no fucks about running gofmt or whatever. - If you request a change, approve the PR, and then the submitter accepts your change, it dismisses your approval.


i'm a bit ambivalent about this sort of idea - reviewers adding "tiny" changes without extra reviews could be a place where bugs leak in. what's tiny or insignificant might turn out to be big - that's why code is reviewed by multiple people!

I think for style formatting, making it an auto-format step during commit/push is the "better" way. Codified formatting means you argue and hash it all out once and be done with it forever.


Nothing stops an organisation having standards what is sufficiently tiny to use the process...

Eg.: "Touches comments and whitespace only". "Resulting compiled binary does not differ". "Variable and function renames are okay". etc.

Remember that with the original developer still able to review, they can always rollback a change they weren't happy with.


i've heard it said that pair-programming can alleviate the need for doing code review... or at least the typical in-depth reviews that usually become bottlenecks...

any experience with that?


My primary experience with pairing was not ideal. It was being used in a mentoring capacity rather than when people are more equal. I also found that the person doing the mentoring usually drove which meant that the mentee just saw things whizzing by on the screen and had no idea what happened and when they did they didn't have the first hand experience to commit it to memory.

I do remember a situation of hot seat programming where we had a critical problem right before release that crossed domains and three of us worked at one desk consulting the others or swapping out as needed. Again, it was worth it because of the urgency and priority but not something worth doing generally.


> the mentee just saw things whizzing by on the screen and had no idea what happened

That's a bad pairing experience, and I've had a handful of those in my last five years of pair programming.

On occasion I've said, "Look, why don't you tell me what to do, and I'll drive?"

But I think that the larger issue is that some developers are better at solo'ing than pair'ing, and there's nothing wrong with that, and at the company I worked (Pivotal), we'd find positions for the people who were more inclined to solo.


An easy fix is to have the mentee drive.


Now navigate to the FOO service

FOO is the background event scheduler, it's connected to BAR via RPC

Ok don't worry about what that is for now, just have a look at the service logs

So it's a daemon that running on port 5000

You'll have to tail the logs in /var/foo/logs

So type T. A. I. L, and then dash f

No, the hyphen character, not literally dash

And so on and so forth. It's not that a person more senior than you can't drive, but I have found a lot of time the implicit knowledge of so many systems, and topics and accreted domain knowledge can really be a productivity drain. The best pairing usually comes from equally matched individuals.


A software engineer at a tech company that doesn't know about commandline flags? That doesn't seem very realistic.

Anyway, this person will learn that dash means - , and not make that mistake again. It's a one-time mistake. So I think the pairing helps even in this case.


> A software engineer at a tech company that doesn't know about commandline flags? That doesn't seem very realistic.

Why not? 95% of most peoples work can be done in an IDE/text editor, and that last 5% is probably very exploratory anyway. I work in games, and gameplay programmers work almost exclusively out of IDEs with GUIs for source control and most other dev tools etc.

> Anyway, this person will learn that dash means -, and not make that mistake again. It's a one-time mistake.

Disagree that it's a one time mistake. On any given day/task there might be 5/10/15 of these. dash, ctrl + r, tab complete, !!, different shells, shebangs, flaky backend services, etc etc. If you want to teach someone how you work then yes this might be appropriate.

> So I think the pairing helps even in this case.

That said, I agree :) It depends though, is your goal to bring two programmers up to speed or is it to crunch through tasks right now? Because pairing helps with one of those goals but not the other


Okay, solution:

teach the person to use a computer first


Have you ever paired with a junior before? This is definitely the kind of thing you would encounter with a fresh grad/intern.


Sounds good -- I know that you're a great teacher. You'll still have those features done by tomorrow, right?


I read your parent as "that guy should not be anywhere near programming, he needs to learn how to use a computer first".

Okay I might exeggerate a bit but if someone literally writes 'dash' at that point, don't pair on something that has a deadline like that.

Do take him aside and let him know he gotta brush up on the damn basics and that you are there for him through it but you're gonna finish that feature with a deadline first while he takes a stab at that easy bug over there.


Exactly. The easy fix is not to have the mentee drive.


Tell the PM that we need to go slower to get faster.


Pair programming can't alleviate the need for code review. Good pair programming may reduce the amount of changes needed, and thus speed up the review phase, as a secondary effect of increased code quality.

Bad pair programming will suck up time by occupying two people at reduced efficiency, and won't lead to better code.

I've had some really great pair programming experiences, but we as a team aren't mandated to use it. We see it as a tool to pull out when cross-domain knowledge is needed, or when facing a particularly thorny problem. If you get the right two people together, they can fill in the gaps of each other's knowledge. The person not driving has more mental capacity available to think through potential alternatives.

However, the success of pair programming depends as much on the personalities of the individuals doing the pairing as anything else. They need to function as teammates, without any ego issues or any other thing that may prevent either party from feeling fully comfortable sharing their knowledge. Trying to pair program with someone who is defensive about their code/abilities is a true nightmare.


I've had great experiences with pair programming (and also bad experiences, but this is assuming a great experience), and I'd still want code review. The benefits can actually stack, if not compound, often enough.


I suspect something missing in pair programming that you get in a code review is someone stepping back and looking at the change as a whole, rather than the incremental bits as you go along. It can be helpful! I know at least when reviewing, there are things I miss when I only look at the update vs looking at the PR as a whole.


Pair programming doesn't scale for all the purposes of code review. Pair programming can be great for bursts and unblocking when there is good alignment, but code review serves as an enduring artifact for the whole team including future team members.


The way I'm used to pair program is that the pairing counts as a much more intense review than the typical corporate review case.

So the whole waiting aspect is completely gone.

Sure, this can have bad results with less skilled and/or serious people. But that's always true.


Once a review is done, it can help to sit with the person to avoid an endless back & forth via whatever app's hosting your repo. In my experience developers usually just put aside 1-2 moments per day to spend on reviews, so if there's 3 back & forths that can mean the changes don't get accepted for several days. Sometimes for just a few lines of code.


Well that's obvious, because the person who did the pairing is already pretty familiar with the code, so even if you do perform a review it will go fast.


that's like saying meetings alleviate the need for emails (documented decisions). Ideally you need both, but pair programming/meeting needs to be scheduled, with all the associated problems, async communication doesn't. If you have to pick one over another, then obviously that's async communication. A meeting without minute is a meeting that didn't happen.


At my company, we do exactly that using Github+Slack integration. Github Codeowners + Github Teams let's you to autoassign teams as reviewers. When the team is assigned then a random person from the team is pinged using round robin. Furthermore, with Github Scheduled Reminders you can get notifications in Slack according to some schedule and/or when some event happens, e.g. your PR was approved, got a comment or a mention, etc.


>Auto-merge-on-green-checks

ADO has that.

Post, wait, CI failure, fix, post

This is the worst, especially when the CI system is homegrown, rickety, and difficult or impossible to reproduce locally.


When I was doing product evaluation at my last company, I was torn between AzDO and Github. After being burned by the lack of progress on Phab (before it went away), one of our criteria was what I called a "living platform" meaning that it either has a vision that aligns with us and progress is being made or there is a vibrant community where we can collaborate. AzDO has a lot of good features but it doesn't seem like its getting much attention and everyone has been keeping their integrations with it private, killing off collaboration.


> This idea isn't new, and if you've worked at a company like Facebook, Google, Uber, Dropbox, or Lyft (among others), you've either had first-class support for stacked changes built into your code review platform (i.e. Phabricator at Facebook...

Phabricator got a lot of stuff about code review right (I may be biased from working at FAANG). I'm happy to see someone trying to improve the code review situation on GitHub. When I did use GitHub at the startups I worked at, we were never fully satisfied with the code review experience.

I hope they surface "Phabricator versions" as first-class objects. I never liked it how GitHub just clobbers the history left behind from force-pushes from to branch.


Gitlab got this right, Merge Requests are versioned and you can look at the differences between versions of a MR.

It's somewhat amazing and shocking how little the review experience on GitHub has improved over the last 10 years.


Honestly what i want is for github to purchase https://reviewable.io/ and make it native.

Maybe gate sone of the more advanced features behind an opt-in so beginners are not too overwhelmed (and because you probably don’t need them on smaller projects), but the current state of things is just appalling, and more importantly has not at all kept up with workflows on large projects or long / complicated PRs.

The complete lack of support for reviewing over force pushes is insane.


The ability to quickly manage branches and reorder commits within them in Mercurial is super powerful, something that Git and `rebase -i` could only dream of.

Doing the same in Phabricator is awesome. One blocking commit PR on the bottom of a stack can easily be moved to the top or a new stack, unblocking the rest of your changes. Re-writing and entire stack due to one small change is mostly eliminated.

(Good) Tooling is a major gain in efficiency.


> The ability to quickly manage branches and reorder commits within them in Mercurial is super powerful, something that Git and 'rebase could only dream of.

This is a git CLI wart more than anything. Magit (git interface in Emacs) makes juggling branches/worktrees and rewriting history intuitive to humans. I feel super-powerful in Magit and absolutely feeble using the CLI.


I completely agree. When Google decided to invest in a mercurial front end to their version control system a few years ago, I was confused, but after using it for a while, I can honestly say it has much better ergonomics for common workflows. Creating a tree of dependent commits is simple and trivial. I sometimes wish I could create a dag instead, but that's probably more trouble than it's worth.


One thing Phabricator did/does worse than GitHub is the interface for browsing the individual commits that make up a PR. I like being able to tell a story with the sequence of commits and their messages.


I honestly don't feel like Github did a good job there either, I mean at least on the web UI because the VSCode Github extension helps a lot (when it's not buggy).


Gerrit is the only thing I've seen do change exploration well.

Github is annoying as crap. Can't even comment on a line in the file that needs changing but was missed in the PR, can only comment on a tual changes.

Phab is god awful, it's like someone built that whole system with a purist mindset, everything is perfectly designed with perfect referential IDs. But every screen is a view into the db table, no screens join information and make it usable. This is worse when you have n:m relationships because they expose those tables via screens too, it's just plain awful.


> Phab is god awful, it's like someone built that whole system with a purist mindset, everything is perfectly designed with perfect referential IDs. But every screen is a view into the db table, no screens join information and make it usable. This is worse when you have n:m relationships because they expose those tables via screens too, it's just plain awful.

I actually like Phabricator, but the DB schema is insane. Like, literally insane. I spent hours trying to figure it out once before giving up in disgust.


The actual observed problem in the article is that the team's SDLC pipeline can't handle the current flow it's being subjected to. Work is getting "blocked" in code review.

Devising a coping mechanism for developers so that they can increase the inbound flow to an already overloaded system is not actually a solution.

The real solution to this rather common problem is to introduce a culture where "done" means "shipped" and not "I created a bunch of PRs for my team mates to read and while they all do that Im gonna create some more" and to use tech where you can to increase the flow the SDLC pipeline can handle.

This involves things like moving as much "code review" into the compiler/linter as you can, do a little more design/consensus building upfront, streamline all acceptance testing and deployment steps to their bare essentials etc...

https://trunkbaseddevelopment.com/


None of which works well if your reviewer sits in another timezone and your next business day is their public holiday. Waiting 3 days or more is not an option when I could just comfortably stack my changes and have the whole stack reviewed whenever my reviewer is available.


Why would reviewing have to be done by a specific individual? If reviewing is the blocker, you need to raise the priority and allocate more resources to it.


For me code reviews are falling in two categories:

1. Just pass it by a second pair of eyes to make sure I'm mot doing anything really dumb or nefarious. Here I don't care who reviews it, and in my team the person currently on-call is usually the choice. 2. I'm working on something complex or that I'm not yet familiar with. In that case, I want a specific person, who I know is familiar with the technology or the project.


I work for FAANG and making changes to code owned by a team in the US is very common. However I'm from Europe and nobody else around me can review that code.


> Waiting 3 days or more is not an option

Waiting for more than 20-30 minutes for a review is ridiculous and means some combination of three problems are present:

* commits are too big to be reviewed quickly

* Tooling/testing is not catching a lot of blatant problems before the pull request happens.

* Review process has turned into some bizarre management ritual


Or:

* Developers on the team value their sanity and timebox chunks of time for deep focus work, leaving only parts of the day for reviews. Some developers may even take vacations or go to talk to customers, leaving a small team with less review bandwidth.


> Waiting for more than 20-30 minutes for a review is ridiculous

Does your entire team just sit around doing nothing waiting to review your code? That timeline is asinine.


I don't know that it is optimal, but our team generally batches PRs once per day, to avoid interrupting them. So typically it's going to take about a half day for a review. As different people end their day at different times.

Exceptions of course, but this seems close to optimal to me (from the reviewers perspective).


If reviewing is the bottleneck for your team then having some slack where people sit around waiting to do reviews may well be the most efficient use of time (e.g. having a reviewer rotation where each week one person is a designated reviewer).


Do you have a dedicated review team?


I suspect the parent is saying something like "much code review is not necessary, can be replaced with automated checks". So once you remove the 80% of code review that isn't actually needed, the review capacity is 5x more.


That is kind of my point. Review capacity is meaningless if external factors require a certain review _latency_.

Even if the actual review takes 30 seconds, it does not matter if my reviewer is asleep for the next 8 hours.


I've worked at both FB and at smaller companies. The "stacked diff" workflow is quite good, although it can be replicated via Github's PRs via "logical commits." The basic workflow is:

1. Each commit is a Phabricator-style diff. Put the whole idea into a single commit via `git add -p` and rebasing. 2. Multiple ideas/diffs for the same feature? Split them into multiple commits on the same PR. 3. Review the PR commit-by-commit: you can click on the links in Github to the individual commits and see+comment on the changes there. Comments on commits automatically propagate back to the PR FWIW, so this works pretty well.

The mental model becomes: the PR is the overview of the full feature, but not how reviewers are expected to review. Reviewers review commits.

It works pretty well! Assuming you have buy-in from your team to operate this way. In fact, I actually somewhat prefer it: with "stacked diffs," it can be harder to get the birds-eye view of the entire feature's set of changes. Git and Github are very flexible, so in practice I find you do need some established workflows that everyone agrees to follow, or else it's chaos.


This should describe commits for a normal PR workflow too though - each commit is the smallest deployable unit with a meaningful title and description. I guess squashing merges by default allows / encourages people to put less thought into individual commits since they aren’t expected to become part of the permanent history?


But, when using GitHub, isn't the missing puzzle piece here (which Graphite and other "stacked review tools" mentioned here attempt to solve) that you then cannot just merge those individual PRs one by one – you either have to merge the whole PR or nothing? Which leads us to the second option: to use multiple dependent PRs, which causes its own issues explained in other comments.


Stacked diffs in practice were often deployed as a unit at FB, in my experience, with the "Ship stacked diffs" button. There's maybe a little value in merging individual commits in a PR, but I think not a lot: if all the commits aren't in master, the feature doesn't work anyway. Technically you get a little bit of an edge from merging individual commits if earlier ones are acceptable and later ones aren't, in that those commits land in master faster and are less likely to have to deal with merge conflicts later — but in practice I didn't find this to be a dealbreaker. Most PRs weren't enormous, and most files weren't subject to many engineers modifying the exact same lines concurrently, so overall review time didn't usually cause too many conflicts — and conflicts usually weren't hard to solve anyway.


The "stacked diff" workflow means each diff lands as a separate squashed commit on master, even you click Ship Stack to land all of them in one CI run. This has advantages for code archaeology reasons.


I think his/her perspective, and mine, is that we don't want to merge individual commits.

My personal feeling is that I need to review each commit and then I need to review the whole PR (list of commits). If both are good then the PR can be merged but each individual commit is not OK to merge unless it's part of a PR.


> you either have to merge the whole PR or nothing

That's not true, though. You absolutely can merge a single commit, via git push, there's just no button in the GitHub UI for it.


> You absolutely can merge a single commit, via git push, there's just no button in the GitHub UI for it.

Well, I'm asuming we are talking about PRs against some protected branch where manually pushing changes directly is forbidden.


But you can't amend a commit without destroying the old version of it.

Stacked diffs run in two dimensions. In one dimension is the stack of diffs -- logical incremental changes to the code, which make sense to code historians from the future. But each diff in the stack also has a history as it progressed through code review.


Second.

Squashing and amending commits is equivalent to the workflow described in the article.


I really didn't understand what the big deal was with Stacked Commits when I was just working in vanilla github, and it used to cause friction with members of my team who knew the stacked workflow, but didn't know the Github workflow. After moving to a company that uses stacked commits I will never go back to the pull request workflow. I think there are some famous people saying that Pull Requests aren't Code Reviews, and that becomes obvious once you start using stacked commits for code reviews.

This is essentially how cli git is set up to function, and mailing list driven workflows use this concept to have small incremental reviews. I am excited to see this tool, but honestly I have soured so much on github because of the lack of code review, that I would just as soon it didn't try to shoehorn into pull requests, and just did what people do with Phabricator, optionally use GH too host the git repo if you need to, but do everything else in a different tool


I still don't understand the big deal so maybe you could help: what is the difference between stacked commits and pull requests targeting branches in other pull requests?

I've used GH for code review so I'm not sure what's missing there, either. Reading this whole comment section leaves me a bit confused, like I'm the only one using a hammer to pound in screws when everyone else has moved on to a screwdriver.


The big win for me is when someone reviews my code I can easily rebase the entire stack and push it back up for review, and thats all done with `git rebase -i HEAD~n`. That means when someone is actually reviewing my code its not uncommon for me to change something somewhere near the base of the stack and push up changes to the entire stack. You in theory can do that with branch based workflows but its a pain. As I understand this is what graphite is automating.

Once you can easily do that, it naturally leads to being able to do review per commit which it makes it so easy to unblock your work because your not fighting against git, and you aren't worried about getting immediate reviews because I know if any changes are requested its trivial to make those changes, and I still have branches, so if I do have unrelated work I can still use branches to isolate them. The last thing is another tool feature, when I rebase and push up changes I have to give a comment on that new commit so reviewers can see the changes that caused the new revision.

Honestly I was a heavy user of Github and GitLab having instituted one of them at three different companies, and built out CI pipelines and development procedures for multiple large engineering teams, and I now feel like I was using a hammer to pound in screws and I will never go back.


Could you elaborate what you mean with Stack? I'm confused. Is it some kind of logical grouping of commits that aren't branches?


Correct it is just the group of sequential commits. You can keep multiple stacks on a branch, but the big idea is the branch is not the unit of review, each commit is the unit of review


I'm curious, how is this achieved?

The way I logically group commits is by stacking branches which are merged without fast forward. Works well with interactive rebase.


I think I would need to see it in action to truly understand. Reviewing commit by commit sounds good to me, though. I do try to keep my commits focused, reviewable, and "complete" (basically shippable) so that part would fit with my current workflow.


GH will also automatically retarget the dependent PRs to the `main` branch after the dependency PR is merged. I usually just make a note in each stacked PR, "Depends on #N", mostly so that the review and merge sequence is followed. But even this would be clear on its own by the PR number and target branch.

I agree that the PR code review process leaves a lot to be desired, but this particular workflow doesn't seem like a big problem. Though I've never used Phabricator and similar tooling, so there must be something we're missing.


Only for branches in the main repo, not those in forks.


It's equivalent but horrible to actually use. Try managing a stack of 40 that way.


That sounds miserable no matter the tooling tbh!


It's actually fantastic with the right tooling. Keep in mind that the Linux kernel works this way too, with long series of patches.


I figure I can probably keep two or three problems in my head at once before I start making mistakes, and my branches stay in my head until they're merged or abandoned. 40? It'd be impossible. Maybe I just have a weird brain though (or am just not a good engineer, based on what I've read about stacks elsewhere since reading this post.)


You wouldn't do 40 completely unrelated things, they would be 40 aspects of a single thing. I would think 40 is getting a bit big as I have trouble getting stacks of more then 15 past a reasonable review, but the good news is ( at least with some tooling ) you can commit partial stacks, so I can let my team review the first 10, while I am working on 20-30 and everything still just works ( even after I merge the first 10 into the main branch )


The stack of 40 is usually quite related and often would be a 2-3 PRs in GitHub -- it also lets you perform pipelining where you can add new commits to the end of the stack at the same time as merging commits earlier in the stack.


Even when the tool support it, I'm not sure it's a good practice.

Small code reviews often miss the forest for the tree. When you make the extent of the feature changes even harder to mentally trace back and put together by having a diff over another set of code that is itself an unapproved diff, the code review becomes pretty low quality.

I've also done this where my change that depend on other changes get approved, but then the other set of code gets critiqued so I have to change it and that breaks the code that depends on it so I have to change that as well, and now I still need to get it all reviewed again, so really I just wasted someone's time the first round.

What I do now is that I make commits over it in my branch, so I keep working while they review my first commit, and when that's approved I send the next commit to be reviewed, and so on. If there are comments I do an interactive rebase to fix them.


Google had a good paper on the benefits of smaller changes: https://sback.it/publications/icse2018seip.pdf

> A correlation between change size and review quality is acknowledged by Google and developers are strongly encouraged to make small, incremental changes

Heard though about wanting to maintain context within a stack. I think the best balance is tooling that both lets you create a stack of small changes, while also seeing each change in the broader context (forest).

The benefit of breaking out PRs instead of commits, is that each small change gets to pass review and CI atomically. I like your strategy of gating commit pushes until the first is reviewed, but I think the dream is to decouple those processes :)


Having worked in startups for most my career, and now recently joining a BigTechCo, I think some of the cross-talk comes from the commingling of concerns within a PR. PRs are used to review architecture, enforce security/code-conventions/style, check for logic bugs, catch potential conflicts with other efforts, and validate the logic of a feature.

In BigTechCo you have more coordination problems, so you want to merge code to trunk as soon as possible to prevent integration risks. But that means you drive discussion of the architecture and features to other venues than than the code review. Code review becomes more of a "will this break things" check, and enabling a feature-flag is more the review stage for the feature itself.


Thanks for that comment, now I understand why BigTechCo's like stacked changes.

If reviews are only "will this break things" I would gadly review their big PRs :D


Hum, thanks for the study link:

> Previous studies have found that the number of useful comments decreases [11, 14] and the review latency increases [8, 24] as the size of the change increases. Size also influences developers’ perception of the code review process; a survey of Mozilla contributors found that developers feel that size-related factors have the greatest effect on review latency [26]

I'll have to dig into those. It makes sense that it decreases latency, but I'm curious how they assessed useful comments.

That said, I'm not saying to make massive code reviews, because reviewers are lazy, they won't want to spend more than an hour to review no matter the size of it. But when the reviewed code is made on top of code that itself has not been reviewed, the quality of the review suffers in my opinion.


That's what we do as well but we've got merge block labels. If a pr depends on another PR, we put a merge blocked label on it and reference the PR in the comments. We've got a GitHub action which doesn't let a merge happen unless zero labels are present.


Usually when feature-branch-1 is still under review and feature 2 relies on feature 1, I just open a PR from feature-branch-2 onto feature-branch-1 to get the review process started, knowing that feature-branch-1 might still change which I can deal with by rebasing.


I came from Airbnb (vanilla github) and some friends converted me to a stacked workflow. I think the challenge with vanilla git is that rebasing and re-pushing becomes a brutal process if you stack 3+ changes. (At facebook people regularly stack over 10). Beyond recursive rebases, Github offers no support for navigating your stack, merging in a stack of approved PRs together, adjusting merge bases, etc.

Most of these practices are just inspired by what Phabricator and a few of the big companies are already doing well :)


>(At facebook people regularly stack over 10).

Meaning branch 1 depends on branch 2 which depends branch 3... all the way to 10?

That just seems crazy to me. What happens if branch 4 changes their implementation and borks everything depending on it?


Glue code is pretty easily split up into a stack. 90% of what I’ve worked on has been glue code.

To ship your change you need to update what’s there already. Tool A needs an integration point. Tool B needs a hook added. Tool C finally joins A to B.

Implementing the bits for A and B found two useful refactorings in each codebase. At the end of it all there was also some dead code.

That’s 8 commits. The downside is you end up yak shaving a lot.


You have to fix a lot of merge conflicts. I used this workflow at AWS -- code reviews can take some time, you're writing a lot of code, and you want your CRs to be small.

If someone leaves a comment on the CR for the first branch you fix the issue and rebase. It's not fun, but this is the only effective way to do it (that I know of).


So are the 10 branches mostly yours then? That seems slightly less crazy.


Yup. I rarely even pushed the branches — they’re local branches


Many FB devs don't use branches or maybe use one branch with the entire stack.

What you're describing is sometimes a problem but often not really one.


Are you sure about merge bases? I regularly use GH to specify a branch that is part of a different open pull request.


Yeah this shouldn't be an issue when following standard git branching practices. It's kinda the point


One problem with GitHub is that you can't do this with branches from a fork.


Can you be more specific what you think can't be done?

I just tried it, and I can without problem open a PR from a fork against an arbitrary branch of an arbitrary fork (or main repository).


You can't open a PR in the parent repository against a branch in the fork. You can create that PR inside the forked repository, but that doesn't allow the maintainers of the parent repository to do anything with it. So if you've got two stacked branch in a fork, you can't open two PRs for them in the manner GP described.


but if everyone works at the same company they can create the branches in the same repo


Zuul is an open-source CI/CD/project-gating-system that implements this for Github, GitLab, Gerrit, and pagure. It actually lets you stack dependencies between any of those systems. Just say "Depends-On: <url>" in the PR.

See https://zuul-ci.org/ and https://zuul-ci.org/docs/zuul/discussion/gating.html#cross-p...


Zuul is great. After using it for about a year in production, I must say Zuul is everything I hoped for and more (coming from a Jenkins deployment previously.) Some of the concepts take a bit of effort to wrap my head around but once I understood them they make a lot of sense. Leveraging of Ansible is great. Making jobs cheap and inheritable is awesome.


As the parent poster well knows (as they invented it) the trick here is that you write your jobs to install from a checkout the CI system does for you.

https://opendev.org/opendev/system-config/src/commit/0a27974...

is really nice practical example. What it does isn't even really that important; but it runs a simulation of deploying production code in OpenDev. This is a "devel" job, we deliberately test any changes against all the latest HEADs of the master/main/development branches of projects we use. This is a non-voting job -- a notice that what you're introducing might be fine now, but there is trouble brewing when our dependencies release their next version. Sometimes that's fine and a known issue, or upstream is broken, and other times it's something totally unique and needs to be fixed (this is why, despite AI being able to write code for you, so far the implications of using that code still need a human in the loop :)

The "required-projects" in the job definition tells Zuul what repositories this test needs.

You can clearly see how it handles various projects having "devel", "main" or "master" branches to pull from to get their latest versions.

In the "vars" section you can see we're setting variables that get passed to the job roles flagging "this is the devel job, don't install the latest release but use the source Zuul will checkout for you from here".

The amazing thing? If I make a change and this job fails, I may debug it and find that it wasn't actually my fault, but something in upstream Ansible committed recently. I can propose the fix upstream. They do all their CI, and that's fine. But I now put in my change comment

Depends-On: https://github.com/ansible/ansible/pull/1234

and magically Zuul will recognise that I want to apply that pull request to the Ansible tree in testing and set it up for me (as noted, Zuul can do this for all sorts of systems, not just github). Additionally, Zuul will not merge the change until the dependency is satisfied -- I can NOT commit broken code!


This is awesome! We should add support for this to the CLI, I took a note


For anyone interested, I've been collecting notes on various tools in this space:

https://github.com/epage/git-stack/blob/main/docs/comparison...

(granted the page doesn't mention git-stack since that is assumed)


Do you have a favorite?

I've struggled with this problem for a long time, but never knew that there were tools to solve it.


I'm biased as the author of git-stack but struggling with this problem is why I wrote it. When I started it, I didn't know there were so many tools in this space. However, if I knew then what I knew now, I would have still written it. Too many tools require adopting a specialized workflow which doesn't play well with everyone's situation or other git tools. Some do but they require manually defining each change which is too much boilerplate for me.

My goal with git-stack has been to automate my PR workflow and sit alongside other git tools. I've even tried to keep it open so you can use parts of it with Phab or other tools. You should be able to start using it out of the box with a standard PR model. It'll have better performance and behavior if you configure which branches are shared branches and mark them as "protected" in git-stack. If you don't use a PR model, then that might just don't run `git stack --push` but the rest should still do a reasonable job.

If I were to pick a tool besides git-stack, it would either be git-branchless or Graphite.


> How do you tell reviewers that your reactions PR depends on the comments PR?

It's at the top of pr.

"someone wants to merge 1 commit into random-branch from another-random-branch"

> How will reviewers see the relevant diff for the reactions PR (excluding the comments PR changes)?

It will show it by default? If they want to see comments PR changes, they will need to go to comments PR.

> How can you propagate changes to the reactions PR if you ever need to update the comments PR before you land it (i.e. to address review comments)?

Use git rebase --onto.

It would be nice if the marketing page told prospective users what exact difference their tool provides - since I'm already using that workflow and never seen one of those internal tools.


Hi, sadly no marketing people at this company. Just engineers and product people.

We were trying to get this to work on Github for almost a year before we decided to just build this ourselves https://twitter.com/TomasReimers/status/1325647290128850950

Going through everything that's different is hard, but I'll try: When was the last time you made a stcak of changes that was over 30 branches tall? Why?

Smaller PRs are generally agreed to be strictly better, and you've probably written a 2000-4000ln pr at least once (which could have been 30 100ln PRs). So why did you not break it out? What part of the tooling was broken?


> When was the last time you made a stcak of changes that was over 30 branches tall?

Is this a thing? I think the highest stack I've ever had was 2. I suppose it could depend on company culture/tech stack/code organization (monorepo vs. services).

> 2000-4000ln pr at least once (which could have been 30 100ln PRs).

I can't say I have. Maybe deleting a lot of code where it's very obvious what is being deleted and why. What sort of work takes 2000 lines that can't be broken down into smaller, incremental pieces?


You can break down a big feature into smaller bits of code, but what's the point of releasing a some new code that's not being called by anything yet, when you might change it as you converge on the final solution?

What's the point of a PR that's just an HTTP handler + adapter that's not exposed via an endpoint yet, and might change once we implement related the domain layer methods? What kind of context will a reviewer have on whether this is a decent handler + adapter when there is no other context about which methods the handler will call and what data will be returned?

If we release the domain layer first, we may want to adjust the domain data model in order to better serve the application layer, or work better with the infrastructure/datastore layer.

And so on.

IMO the PRs need to be big enough to encompass an atomic, complete unit of functionality or else it's basically impossible to tell if the approach and code is any good.

"It doesn't break anything" is not good enough, you can still create tech debt by releasing code that may need to be changed to complete the feature. Or someone will just implement the rest of the feature in a worse way in order to integrate with the non-ideal code that you already released when you didn't have a good enough view of the final solution


The point is that others see that it's there, they can start to take it into account when they develop their solutions.

Of course if there are problems with people depending on not yet final code that seems more of a higher level communication issue than git workflow issue.

Again, if atomic unit of functionality needs huge towers of stacked PRs that indicates that they are not really atomic and the design phase (communication) seems to be missing.

After all the reviewer should know about the design, the HTTP handler + adapter should not be a mystery to the reviewer. And so on...

...

Now, that said, sometimes looking at a big PR as whole takes up less review time than in daily/weekly increments. But if the reviewer only sees the thing late in the dev process plus the design is not agreed upon then that means the developer writes a lot of code that might go straight to /dev/null when the reviewer disagrees with the fundamental design of it.

The assumptions needed for these huge dangling WIP PRs to make sense seem very strange (risky and suboptimal) :o


It doesn't have to be an either/or. The method that I recommend to my reports is to develop the feature in a large WIP PR branch, get some rough reviews on the macro architectural choices in that PR, and then once you have a greenlight on the larger approach, break it out into smaller PR's that each represent a smaller change necessary to effect the feature.

There are definite benefits to shipping less code to production in one atomic deploy. When something breaks (and it will), you want to wade through as little code as possible while things are burning in order to identify what went wrong. Sure, you could just revert everything that went out with the last deploy, but sometimes things aren't that straightforward. Say when there are other sibling changes that are not easily roll-backable that you'd have to clobber to do a pure rollback. In these scenarios, it may be safer (albeit slower) to revert your individual change and deploy the revert by itself.


I get what you mean, but rolling out in chunks can make it even harder to pinpoint and revert issues.

If you roll out chunks A, B, then C, and things broke after C, the issue can still be in A, it just doesn't get exposed until it is accessed in a certain way by C.

How quickly can you determine which one(s) to revert?


You revert C, because that's what caused the issue. Then you investigate.


Better yet when possible, you turn off the feature flag that enabled the code path.


I’m working on having a large scale open source project support a new SQL database (Postgres) and the migration work is about 2500 lines currently since I have to touch basically every part of the project. I don’t want to do the migration piecemeal since turning on the Postgres unit tests for only sections of the code base and keeping track of that and worrying about all the dependencies of code changing as other people touch the project is a lot more overhead than doing it all waterfall style.

Now, I will say that these type of things are not everyday kind of PR’s, but they have certainly happened several times over the course of my career. Probably once or twice a year if I had to take a rough guess.


I think you have an useful feature, but needing to get someone else answer me what it is in comment (https://news.ycombinator.com/item?id=29256481) isn't a good way to explain the product.

Regarding your questions, the most I've done was 4, usually at most 2. My PRs are single commits that I amend and force-with-lease after getting useful code review feedback. I don't think rebasing one after another is particularly annoying, but I think not having to do it is a nice feature. I wonder whether it's a product tho. I wish you success anyway.


Parent did not imply that you have or should have marketing people. Your first sentence comes across as defensive. You should express thanks for this feedback and avoid sounding defensive, as that is alienating to people, stifles dialog, and casts you in a weak light. This is your moment in the HN sun, soak it up!


That did not come across as defensive at all to me. It was pretty clearly tongue in cheek. Your comment, in fact, comes across as judgmental and condescending


Thanks. I can see that I could have worded my feedback more tactfully, and will do so in the future.


Oh hey, I'm sorry! :(

That sincerely was not my intention, and a lot of context and playfulness can get lost in comments. For what it's worth, I was simply trying to make a joke about why our marketing materials may not be as good as they should be and continue the conversation. I apologize if that came off as defensive and will make sure I double check my future comments so that they don't come off in the same light.


Ironically, this reply is also an apology!

No judgment, I just found it funny.


No worries! My intention was just helping you all look your best -- which you are doing with this friendly reply.


Thank you for the feedback :)


The docs for the CLI and dashboard go into some more depth - might be what you're looking for?

- https://docs.graphite.dev/guides/graphite-cli

- https://docs.graphite.dev/guides/graphite-dashboard

Re `git rebase --onto`, the open source CLI offers a recursive implementation to prevent you from having to carefully rebase each branch in your stack (https://github.com/screenplaydev/graphite-cli/blob/main/src/...)

One of the reasons you cant use a simple rebase --onto is that you dont want to accidentally copy all downstack commits between the merge base and what you're restacking onto. The CLI tracks branch bases commits through git refs to ensure that restacking never duplicates commits.


>Re `git rebase --onto`, the open source CLI offers a recursive implementation to prevent you from having to carefully rebase each branch in your stack

Thanks - you've answered what I've been looking for and I'd prefer that this information was in the marketing copy.

I agree that this would be an useful feature. I don't think this is a product though - I don't see myself doing more than 3 or 4 stacked diffs when this functionality would really shine.


As far as I’ve used it, git rebases ignore duplicate commits.


They are not duplicate if you change something (like, using amend) in one of the preceding branches.


yes that’s a possibility too, but then the typical merge resolution rules should apply. If there are common areas modified that is going to incur a manual review. There’s no magic automation I’d really entrust to fixing that.


I second that. It's a bit surprising to me that teams could have such big PRs and branch stacks. I can only say from own perspective but I imagine this kind of nightmare would be our reality now if we didn't pay attention to improve the we work and use PRs along the way.

PRs work very well for us and if we find ourselves stacking them, then we keep an eye on getting them into master (yes, we don't use the 'main' bs) in reasonable order. Then Github is actually doing job for us changing the target branch once the underlaying changes are merged into it.


Merging a branch into another branch is fine, but then you have to merge that other branch up into main. And that shows effectively two branches' worth of diffs, which really muddles things if some of the changes have already been reviewed.


As long as you review the 'comments' branch first, merge it, then move on to reviewing the 'reactions' branch it should be fine.


Yeah one pain point is if you squash your commits on PR merges, you'll have a situation where the branches look like:

    master: 1, 2, 3
    feature-1: 1, 2, 3, 4, 5, 6,
    feature-2: 1, 2, 3, 4, 5, 6, 7, 8
Then we merge feature-1 into master (numbers in parentheses indicate squashing):

   master: 1, 2, 3, (4, 5, 6)
   feature-2: 1, 2, 3, 4, 5, 6, 7, 8
Then say we have another commit to master:

   master: 1, 2, 3, (4, 5, 6), 9
   feature-2: 1, 2, 3, 4, 5, 6, 7, 8
And now we need to rebase feature-2 onto master. We can't do a normal rebase because git will try to do:

   feature-2: 1, 2, 3, (4, 5, 6), 9, 4, 5, 6, 7, 8
Instead we have to do a slightly tricky maneuver where we rebase just 7, 8 onto master, getting:

   feature-2: 1, 2, 3, (4, 5, 6), 9, 7, 8
It's not impossible and you can memorize the process pretty easily, but I'd love a natural solution to this problem.


This is exactly the problem that graphite-cli solves (https://github.com/screenplaydev/graphite-cli)

It keeps track of branchs and their parents by storing a tiny bit of metadata in the native git refs. It uses that information to perform recursive rebases: https://github.com/screenplaydev/graphite-cli/blob/main/src/...

It ends up working seamlessly - you just modify some branch, and then run `gt stack fix` to recursively rebase everything. (and then `gt stack submit` to sync everything to github :)

docs here: https://docs.graphite.dev/guides/graphite-cli


Sounds great! I signed up for the waitlist. I’m always for ways to make version control more ergonomic and feature full


to my knowledge the easiest version of this manoeuvre, if you still have feature-1, is

    git checkout feature-2
    git rebase feature-1 --onto master
rather confusing. is there a better way?

edit: oh, you mentioned extra commits on master since the squash of feature-1. does that require two rebases?

    git rebase feature-1 --onto master-at-feature-1
    git rebase master


> does that require two rebases?

No. Just git rebase --onto=master feature-1 will do the right thing.

(Personally I merge rather than rebasing to avoid this whole problem).


so assuming you have merged feature-1 onto master and squashed, you check how many commits have been made since feature-1 on feature-2 (in my prior example 2 commits of 7, 8) and do:

    git rebase --onto master HEAD~2


I amend my commits. My PRs are always exactly one commit, so this problem does not exist.


Great? I don't get comments like this. It's like responding to someone saying their Ford's mileage isn't great with "I use a Honda so this problem does not exist". Good for you but saying the problem doesn't exist for you isn't helpful.


> The blocker here isn’t Git. In fact, the reason you can’t go with option #4 is that your code review tools probably don’t support it

I do stacked commits (branches off branches) all the time. Github certainly supports PRs based on such.


I still haven't found a reliable way to prevent merging branches that aren't the first in the stack :/


(Graphite engineer) We had the same problem. Coded in a feature that prevents you from casually merging branches that are not bottom of the stack :)


What I usually do is if I need to stack more PRs I open them as a draft to avoid accidental merges. I then convert each PR to a non-draft as I “pop” the stack.


How do you deal with rebases when you have to?


Like so:

    git switch offshoot
    git rebase --onto feature branchpoint^
Will replay your offshoot branch onto its underlying feature branch after the feature branch has been rebased. The branchpoint is the first commit that diverges from the feature branch.

Finding the branchpoint commit is the only real annoyance here -- if there's a ref or a better trick for finding it, I'm all ears.


My usual workflow to rebase a stacked set of feature branches off of main is:

    git checkout main
    git pull
    git checkout feature-branch-1
    git rebase -i -
    git checkout feature-branch-2
    git rebase -i -
    ...
There's most likely a more efficient way to do this, but I find that referring to the most recently checked-out branch with a single hyphen is a lesser-known feature, so perhaps it will be helpful :)


Nice, didn't know about `-`. What's your process once you're in the interactive rebase? The `git rebase --onto` I use is noninteractive and could be an alias, were it not for the business of finding the `branchpoint` commit.


I have an extra script in the CI job to detect rebased/squashed commits and fail the build if they're present.


The article manages to stay entirely free of useful information, despite such a captivating title.

See previous HN discussion for more background and substance: https://news.ycombinator.com/item?id=26922633


Some of the other blog posts and docs have some good deeper info:

- CLI docs: https://docs.graphite.dev/guides/graphite-cli

- Dashboard docs: https://docs.graphite.dev/guides/graphite-dashboard

- Storing stack metadata in refs: https://graphite.dev/blog/post/y6ysWaplagKc8YEFzYfr

- Tracing stacks of branches: https://graphite.dev/blog/post/hGn1nt8nFr5cMhvFJs87

- Lengthy conversations in the community slack :) https://join.slack.com/t/graphite-community/shared_invite/zt...


While the stacked code review process at Google is pretty smooth - if you're using the default version control system, keeping your chain in sync is a nightmare. If C depends on B depends on A and you made a modification in A, there's no automatic way to rebase anything and you have to start patching B then C manually. It's much much easier with the internal version of Mercurial, but the default VCS doesn't support this method.


I'm surprised Reviewable[0] hasn't come up in this discussion. It does a great job of allowing stacked code reviews and even handles rebases nicely; the reviewer sees the diff between commit #1 and commit #1' (prime = after rebase).

CockroachDB[1] has been using it since very early in the project.

[0] https://reviewable.io/

[1] https://github.com/cockroachdb/cockroach


I also switched my dev team to it once it grew sufficiently large (at a now acquired startup). The primary thing it enables is actually having a structured conversation about your changes, and having several _revisions_ of your changes through which the reviewer can navigate, much like Gerrit or its progenitor - Critique. Once you experience that, there's simply no going back to anything else.

The UX makes me want to gouge my eyes out with a dirty spoon, but it's still worth it, in spite of the UX. I'm baffled why MS/GitHub can't just go ahead and rip it off at this point, to replace its own utterly atrocious PR review workflow.


The part of Graphite that helps manage stacked changes locally is open source - you can see how it handles the recursive rebases here: https://github.com/screenplaydev/graphite-cli/blob/main/src/...


I have definitely had this problem before and wanted this solution.

Not sure if I want it enough to introduce a third-party tool to my relatively small team(s).

I wish Github just had it built in.


it is, it's just not exposed well. in git create two branches stacked on top of each other, and then in the PR creation window, the first PR goes from master to branch-1 as normal. Then create a second PR for branch-1 to branch-2. it's a bit of futzing around with git and GitHub but it's there/works without another tool. Or you could just use the tool, up to you.


I have had trouble with GH displaying the correct diff.

I guess in most cases I don't really want to merge branch-2 into branch-1 either, I want to merge them both into master -- but first branch-1, and only then branch-2.

I don't know if that's the use-case the third party tool is focused on?

When I try that with github alone, which I have... it just gets really messy. Sometimes I need to switch the PR to a different random base branch and then back to the actually desired one to get github to update the diff when the base branch has changed since PR opened.


I worked at Google and have no idea what they mean by stacked change. You could push multiple changes to a single CL, but those were rarely reviewed independently.

Maybe they mean keeping a stack of CLs ready, while waiting for a review? I often had 1-2 CLs based on.. rebasing was fun sometimes.

Or they mean diffbasing these subsequent CLs, for an early review? That was useful, but due to volatility dependent on the first review, a bit risky.

Plug for a code review project: https://reviewpad.com/


Stacked diffs is Facebook terminology for diffbased CLs (Google). Except Facebook has invested a lot of tooling into it to make it possible without many of the downsides (see the comments here about 10+ diffs stacked on top of each other).


Worth noting that this is coming to GitHub: https://github.blog/changelog/2021-10-27-pull-request-merge-...


So merge queue, while cool - is slightly different than Stacked changes. Stacked changes are a better client-side git workflow for dealing with branches on top of branches.


After seeing so much noise and struggle at the past couple places I've worked regarding people having to beg and make noise about getting reviews, I half jokingly think someone should make a "begging for PR reviews as a service" startup.


A few jobs ago I put a PR in a few days into our two-week sprint. Our team of purportedly full-stack developers was actually a team of two front-end and seven back-end developers which meant that the Jira tasks were written such that tasks in the same sprint were all interdependent on one another. This always resulted in long cycle times and a mad rush to get everything merged the day before sprint review. This was a bit painful as our team lead was obsessed with Git commit history and required branches to be rebased onto the main branch, so every PR that got merged ahead of yours meant that you had to immediately jump in and rebase yours to make it eligible for review again.

The PR that I put in before everybody else's was summarily ignored because everybody was busy getting their own stuff in. I spent cram day rebasing my PR at least a half-dozen times as commits flew into the main branch. A few minutes before cut-off I asked in a team call if I could get a review; my team lead told me that there was no time and that it would have to wait.

Needless to say I've since found a team that does the opposite (reviews each others' code in a timely and constructive manner). I really enjoy working with people that care about delivering value and helping others... it's vastly improved my job satisfaction.


Agreed. In my opinion, just goes for show that you need a good team organization/culture more than another tool.

While I'm sure a lot of people will find this tool useful, I believe it will also help disfunctional teams last longer, which just means that the real problems (like, why people don't review code in timely fashion) will take longer to resolve.

As always, just my two cents based on my experience :)


The problem is that in most teams code review is considered an auxiliary task that's off the books (from a ticket tracker perspective) where as in reality it should be a ticket in its own right that someone can assign themselves to and review the associated PR.


This is a team culture thing. Prompt and professional reviews have to be valued. If you're a senior engineer (or at least senior on a code base), you can add at least as much value to the development process by giving others timely and fair feedback as you can writing your own code—it's a force multiplier.

(Edit) For big changes I've seen people schedule 30-minute review meetings. We all hate meetings but I don't think it's a bad idea.


Everyone says they value good reviews, and I think that's actually true to some extent, but 'prompt' comes at a real cost as you have to stop what you're doing to take a look.


The strategy I used for smaller changes was twice-a-day queue review; noonish and end of day. "Prompt" doesn't have to mean instant, it just means not sitting on a change for days.


Not prompt reviewing has a real cost too. If it takes a long time to get a review, the submitter loses context and has to regain it when the review comes in.

The longer the cycle, the more contexts to be juggled and the more effort to juggle them.


Also, if you are building upon the code from the prior review, and you need to shuffle things around based on feedback from the review, then your more recent work also must be redone, resulting in waste.


Exactly - and waiting for that review leaves you in this holding pattern where you'd really like to be moving the ticket along, but you kinda sorta have time to do something else.


There's just a lot going on with them. The way they pop up and you can choose to either wait or interrupt your own work to process one. If everyone waits, then the person needing the PR gets hassled by their manager that they need to "go out and get it", which means more aggressively hassling other people, which is stressful and unpleasant for everyone.

There's some kind of game theory dynamic going on where I don't think the outcome is really ideal, even if most people involved are trying to make it work.


In theory time should be split into 30% research and planning, 30% code reviews and 30% writing code.

In practice most devs spend 90% coding and 10% hastily rushing through reviews.


Right, each one of us needs to finish the work before the deadline. I really don't like how we perpetuate assignments to individuals. I suppose it stems from our education system and ticket assignments but I would really like to see more teamwork built into our daily workflow.

Some people assign timeslots for reviews, but when deadline is pressing, all the timeslots go away.

The only solution that I have seen, and am happy to be a part of currently, is to have a close coworking team that's responsive. In short, it's not the tooling, it's the people.


We've actually started doing this, though it's working around the fact that I couldn't figure out how an author can send a PR back to a reviewer for another round of review so that it shows up in a queue nicely.

It's really nice. We have a kanban column 'Needs Review' and I've dumped tasks there for PRs, documents, and all kinds of other stuff.


Hmm it could work like those LinkedIn emails. Automatically send after two days, "Hey I know you're busy, aren't we all? And the last thing I want to do is give you more code to review. But you really should check out..."



You really don't need a bunch of complicated infrastructure for this. I did this in the past by just creating my own high level feature branch and having PRs into said branch. This was all my own branch, but the purpose of the PRs was to get feedback before the whole thing was ready to be merged back to main. While the final PR will be large, everything has already been reviewed so you aren't losing anything in terms of review quality.


What is the difference between stacked changes and branchless?

https://github.com/arxanas/git-branchless

based off of the branchless Mercurial workflows at large companies such as Google and Facebook


git-branchless doesn't (yet) aim to integrate with hosted Git providers in any particular way. You just use remotes and push as normal.

I think the CLI is mostly at parity in terms of features. git-branchless has `git undo` while gt has `git repo sync`. (Tracking issue for git-branchless's sync: https://github.com/arxanas/git-branchless/issues/174)

git-branchless integrates more directly with Git, so you don't need to learn any new commands if you don't want to. If you ruin your stack by amending an old commit, a warning will appear telling you to run `git restack`, which will fix it.

Besides that, git-branchless does rebase operations in-memory, which makes it faster.


A bunch of small things, but the biggest change is we also have a website to help with the authoring/review of PRs experience :)


I think this is a notable try at a strangely underserved space. Really interesting to see how code review is done in-house at FAANG+ and yet GitHub doesn’t quite get at the level of the tools of these giants.


I've been waiting for this forever. Stacked diffs is how I've always worked even before working at FANG, and github just doesn't support them right out of the box.

Kudos for getting this right!


how do you typically stack diffs?


Currently waiting on 3 PRs to get reviewed just so I can start developing on top them. This is a major pain point in my current development. My workaround is to have two separate copies of our monorepo locally and develop on both of them since our internal git wrapper doesn't allow two PRs from the same file. It probably wouldn't be that hard to extend our git wrapper to call graphite. Just signed up for the waitlist!


So dope! Been watching Tomas and Merrill work on this for the past few months, and now watching it spread through my dev team like wildfire.

Congrats Graphite team!!


Thank you for the kind words!


While I know to well that a lot of tools don't support it well aren't stacked changes the standard??? (Which now that I think about it makes it even more strange that github doesn't support them.)

I mean yes, e.g. on github it's somehow pain-full as you have to manually select only the last "n" commits (excluding commits from the previous stack) and then the "file viewed" function might not work correctly anymore :(, oh and the "changes since last review" function also like won't work well anymore (not that it does work well normally).

Still even with the drawbacks stacked reviews tend to be faster in small teams (e.g. startups) with a lot to do.

Through often you only start reviewing thinks "later in the stack" after a initial review of the first PR and any "major change requests" are done. Similar you have often a less strict security model/more trust into your co-worker to not maliciously abuse it to sneak something in.


Sooo, like Gerrit?


Gerrit is probably high up on the "Most underrated dev tools" list. People don't like it because it really uses git (which people don't really like) and doesn't have a flashy UI, but it is so good.


I do this with Bitbucket all the time. I just create a pull request for the "stacked" change and target it to the branch containing the original work. Once the original work is reviewed and merged, I retarget the stacked pr into the release branch.

If the stacked change gets reviewed first for whatever reason, the diff is displayed correctly.


Doesn't Gerrit already allow this?


> The blocker here isn’t Git. In fact, the reason you can’t go with option #4 is that your code review tools probably don’t support it:

No, git is not the blocker... but neither are the tools. The big blocker is a lack of confidence that someone won't ask me to drastically rework my PR.


> The big blocker is a lack of confidence that someone won't ask me to drastically rework my PR.

Another big blocker is the confidence to delicately review a PR and demand it to be drastically reworked.

Actually, I think they're the same problem.

When your livelihood is tied to other humans, you generally want to minimize conflict.


Not really the same thing. The point is that if someone wants you to totally rework the base PR then the follow-on work will need to be redone.


There are also control freak psychopaths in our midst who just love telling people what to do and exercising control. They don't really know anything about producing quality software, but they sure know how to tell people they're wrong.


Looks fantastic! Critique (at Google) makes my life so much easier, glad to know that if I ever move companies that there’s a reasonable alternative! Also finally something to recommend to friends that are jealous of Google’s dev tools


I don't like the fact that I need to grant read/write access to all public and private repositories to use the dashboard or create PRs. I know that github has no decent solution to limit access scope to 1 repository but this is the reason why I haven't tested graphite on any real work. I guess that the only sane way would be to create a separate github user which only has access to a single repo where I'd like to use/test graphite.

At the same time I'd probably be ok with the CLI having wider access, as long as the website does not have it. I can control/monitor CLI to some degree but not the web part.


When I moved from Facebook to a startup, I remember being surprised by lots about our GitHub-based review process: the prevalence of large PRs, the ambiguity in how PRs related to each other, and the frequent pings you'd get to review other's recently pushed work. I initially encouraged others to use stacked changes to help with all this, but after realizing how clunky it is in GitHub, I stopped. Based on what I've seen so far of Graphite, I think it'll go a long way to making stacked changes simple such that they should be the norm on engineering teams, excited to use it more!


Does this do anything beyond helping you and your team keep track of which branches your feature branch is based off of, and which order they should be merged in? I think you can already do this in git / GitHub. I usually do it by starting a PR with "based on #n which should be merged first". You can stack any number of PRs that way. You can even set the PR target to the base branch to narrow down your diff for review, and change the target to your main branch any time you want. What does Graphite add to this workflow?


Great question! It does lots of things. The CLI (https://github.com/screenplaydev/graphite-cli) lets you:

- Recursively rebase changes to keep your branches correctly stacked

- Allow you to shift half of a stack onto a different branch

- Open up PRs/push changes for all the branches in a stack

- Offer to delete merged branches from local, rebases the remaining branches, and adjust github pr merge bases.

The web dashboard lets you:

- See an inbox of PRs spanning any number of repos, based on which ones need action

- Navigate between PRs in a stack

- Modify Prs, review, all of which is synced to Github

- Shortcuts, client side caching for fast loading, Phabricator style interface, macros, landing a stack of PRs together, and much more :)

Give the tool a try, we'd love to hear your feedback in the community Slack! https://join.slack.com/t/graphite-community/shared_invite/zt...


What I sometimes do in this situation is just branch the second feature off of the first one's branch while it's in review, and then when the first one gets merged and/or updated, I rebase the second one. This is a bit of a pain because you have to keep the moving pieces in your head, but it usually works out cleanly enough. Of course you can't really put the second one up for review until the first one has been merged, but you can at least finish the implementation.


We tried manually rebasing too, before building Graphite. The challenge you face with manual rebasing is two parts:

1) If you update the bottom branch, you need to manually rebase each branch above it. That becomes brutal if your stack is >3 branches.

2) You cant perform a simple rebase-onto, because you'll copy all commits between the higher branch and trunk. You'd have to perform a three-way rebase, specifying the range of commits you'd like to copy onto the destination. This becomes infeasible by hand.

Graphite-cli gets around this by tracking branch metadata and storing it in native git refs (https://graphite.dev/blog/post/y6ysWaplagKc8YEFzYfr). When you rebase a stack, it recursively performs the three-way merge to fix things up smoothly.

On top of this, git provides no good mechanisms for submitting the stack. Graphite cli can submit/sync your whole stack as individual PRs, and can prune merged branches from the bottom of local stacks. Ends up coming together as a really powerful workflow :)

The cli is open source here: https://github.com/screenplaydev/graphite-cli, with docs here https://docs.graphite.dev/guides/graphite-cli. There's also an active Slack community which helps provide input on new features and adjustments.

Please let me know if you have any other questions!


> 1) If you update the bottom branch, you need to manually rebase each branch above it. That becomes brutal if your stack is >3 branches.

I do this relatively easily with `--rebase-merges` flag.


this is the thing I've missed the most about development at Facebook. It seems so small, but is a huge productivity boost. Especially if applied to an entire engineering team


If you have a stack of 5+ branches, the easiest way I've found to deal with this is to take the top branch with all the commits and rebase it. Then list all the commits in one window and list all the origin commits. You can then do a

  git branch -f branch-name commit
for each branch-name without having to switch branches or do any more rebases. It helps if you have short, easily distinguishable commit messages. Then `git push --force-with-lease` each branch.


This is awesome, I can’t wait to try it out! I remember when stacked changes first become a thing at FB, and it noticeably improved engineering output and code review ease


I'm confused. The solution seems easy. I have PR #80 in review and my next task depends on it. So I branch off of #80 and just starting working on #94. If there is a holdup on #80 and I want feedback on #94 I just create a draft PR with a note about #80. When #80 gets approved then I'll merge it and merge those changes into PR #94. If it gets messy, I'll just torch #94 and use my IDEs local history and do it manually.

What exactly do stacked changes do?


Right but your diff for #94 will include all of the changes in #80 as well, because it's being compared to the main branch. When you do this a few layers up your diff is suddenly massive and hard to navigate for review.

Also if you need to change anything about #80 you have to rebase #94. Which also means the same for any other branches that you or someone else created on top of #94 that depended on #80 landing


Make feature branch off main, push, make a PR, checkout main again, make a different feature branch with a different PR, repeat. Why would I continue using the other feature branch for the 2nd?


Because if the second feature requires the code changes from the first PR you can't do that...

If your team doesn't have a use for stacked diffs that's fine. Teams of one also don't use pull requests and might not understand why we need those.


It's very rarely come up for me that multiple open issues depend on each other. Maybe it happens more in some codebases than others.


I think there’s a real use case here, especially considering how so many of FAANG+ have decided to build their own code review platforms to improve code quality.


Ugh. I'm annoyed when homegrown is the first reaction (something else to support, fewer people to fix it), rather than community &&/|| industry collab.


Not a fan of phabricator. To me the whole stacked changes thing is at best a necessary evil but not something I would recommend for normal size codebase


This is just an ad for the startup. It literally doesn't tell you want the thing does. But FB / Google uses it so it must be perfect.


Looks like a great improvement to my workflow, does it require write access to GitHub or can it be read only? Any plans to support GitLab?


The CLI does not require any access to Github, and if you want to start doing code review on the platform, we ask for write access just so we can post comments and reviews :)

Gitlab is actively on our roadmap; we currently have a few CLI users that have made it work, but feel free to reach out if it's blocking you.


This has been the biggest pain point for me since leaving big tech! Really excited that there is a solution out there. Well done team!


That pain was what inspired us to create Graphite as well :) Thank you


Been playing around with this for the last couple of weeks and now I'm getting the whole team to use it at OfficeTogether.com.

Really nice!


Tiny commits seems normal. I'm wondering if this can squash regions of sequential commits by the same author where it makes sense.

The naming clash might be problematic because when I think of graphite, it means: [0]. A unique name would really help avoid confusion.

0. https://graphiteapp.org


The website lets you merge in a stack of approved PRs together, (similar to Facebook's land all). Helps when merging in tons of small changes together.

As far as the name, at least the CLI command is clean/similar to git - `gt` :)


I haven't been on the eng side for a while (PMs just get to pretend like we are technical) but when I was an eng, I basically did stacked changes but with infinite amounts of unnecessary complications. This tool looks fuego. Maybe I'll start coding again....

Kids these days are getting luxuries we never knew back in the day (in 2015 lol).


"Stacked changes" is just branching from your PR branch and rebasing when the PR branch changes, no? I am not so sure. I see what makes this idea so revolutionary. Seems pretty obvious to me that after your first PR is ready, if the reviewer isn't, you start working on the downstream topic anyways...


I proposed a feature along these lines as GitHub feedback. Would love to see more upvotes there if you support the idea! https://github.com/github/feedback/discussions/6125


Love to see tech devoted to improving the lives of developers. This has been a pain point for me for so long.


How is this different from ghstack? https://github.com/ezyang/ghstack (which is what Edward Yang for PyTorch developers to mimic the stacked workflow, although it works with other repos).


Hi! We love ghstack here and used it for a while before building this.

There are a bunch of small differences between this and the various open source projects that support this workflow (ghstack, git town, spr, etc.), but the biggest reason we chose to diverge was we felt that our code review tooling also had to support stacks in order to have a first class experience. That's why we built out the web dashboard to work in concert with our CLI.


Cool, thanks!

Yeah, stack-based PRs are awesome - excited to perhaps try it out.


I don't agree with the premise. There's option 4, start off a new branch for the new feature, make that a new PR. That solves most problems, you can push that new PR for review as soon as your earlier PR is merged in which will allow you to rebase on master.


That does not work if your second PR builds upon the first.


im an idiot but can someone help me understand the following questions?

- Doesn’t this mean graphite can access non-public codebases of large organizations that authorize it?

- What prevents graphite or its employees from stealing intellectual property or maliciously viewing code for their own purposes?

- are there risks in allowing graphite access to codebases in any way?

- any and all information regarding privacy with regards to code are welcome, i simply cannot give access to certain codebases to third party organizations without some proof that the data is not going to be maliciously used or accessed.

all this comes into question when simply trying to login to graphite website.


Having worked with several large commercial monorepos including Google's I can recall using stacked changes a few times but I don't remember them being the secret sauce by any means. They were more of an edge case.


Even if you're not a fan of stacking your changes, I think the web dashboard has a lot of fan favorites from the now deprecated Phabricator, such as seeing the PR comment timeline side-by-side the code, macros, showing lines that were simply copied, etc. We've also tried to make it as snappy and modern as possible, resulting in something that feels a lot faster than github for reviewing PRs :) Would love to get your feedback if you try it out!


Stacked changes are an essential QOL feature when working with large monorepos. Kudos to the Graphite team for expanding the visibility of this development pattern and making it more accessible to teams of all size.


GitHub allows stacking PRs you send PR1 to master and PR2 is a pull request to PR1's branch, the diffs look good, and when PR1 is merged GitHub automatically changes its target branch to master.


Yea, I don't quite understand how the article's product is different from this. I also don't understand how anyone uses Github in a professional context without using the workflow you describe...

EDIT: Ah, from comments further downthread, it seems that the level of stacking enabled is much deeper, as a lot of the manual process is automated. I stopped using Critique over half a decade ago so I no longer remember the details of how it handled this.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: