Hacker News new | past | comments | ask | show | jobs | submit login
Stacked Diffs versus Pull Requests (jg.gg)
104 points by cocoflunchy on Oct 4, 2018 | hide | past | favorite | 73 comments

The idea of "stacked diffs" looks like a good one, but this article doesn't do a great job of presenting them. It's way too long for the actual claims it makes. It combines two things: Working on a single branch and reviewing commits instead of branches.

Reviewing commits instead of branches makes a lot of sense, and would definitely result in a cleaner history than pull requests seem to do in practice.

The article also makes the claim "They’ll get reviewed in some order and committed in some order". If the changes in the second commit depend on the first commit, that's just as false as with pull requests that depend on each other.

The article names one concrete downside of the use of many branches: If one branch depends on the changes in another branch that's not in master yet, you end up with an annoying bunch of rebases or ugly merges. Committing everything to a single branch turns that into a single rebase.


So that's the content. Then there are the case studies, which are odd. Charlie the free spirited hacker magically turns from someone who...

> starts the day by branching master and spamming the branch with five commits that are only vaguely related.

into someone who

> blasts out five commits and five Diffs back to back. Each one addresses something specific.

All thanks to the magic of Stacked Diffs.

You are missing the subtle and hard to explain point that he tried to illustrate with the Tale of the Tree Icon. It's that our tools and their defaults shape the way we work. I watched a team switch from branches to stacked diffs (mercurial queues, way back in the day) and it was almost exactly how this article describes.

It's the tools and their defaults that leads me to my biggest criticism with the article. On darcs something like "stacked diffs" makes a ton of sense because the UI was built for it from top down, and cherry-picking is _magic_ (when it works), including that it tracks your dependencies between commits/patches for you.

In git, we have a dependency tracker in the DAG, but it requires the work of branching to make use of its power. In git, cherry-picking is a footgun with all sorts of potential problems. I've seen far too much work lost to bad rebases to trust junior developers with cherry-picking and rebase-heavy workflows in git. (Yes, I realize that a lot of people disagree and a lot of people use very heavy rebase workflows, this is just my opinion that in general rebase workflows are more danger than I want in my source control, as the resident source control guru on many of the teams I've worked on.)

Pull Requests actually work really well with this dependency tracking. If I create a PR for the database changes, a PR for the REST API changes, and a PR for the UI work that depends on both of the previous two, and all of those PRs are well merged branches (no rebases) using the DAG to encode dependencies, my reviewers have the option to review those in any order. If they review the two smaller subset PRs first, as that content merges to master most good PR tools simplify the remaining UI PR to just the UI stuff not in master. If they review and merge everything all at once in the giant UI change PR, then most good PR tools will automatically mark the other two PRs as merged and complete, no work needed for individual review of the two subset PRs. The choice there is left to the reviewers, and the tools just do the right thing, whatever the preference the reviewers have.

What PR tools are you referring to? We just use github pull requests, but stacking PRs seems impossible because merging the parent PR doesn’t seem to update the children PRs automatically.

I've used GitHub, Azure Repos (nee VSTS nee VSOnline), and Bitbucket (Cloud and Server), and I've seen some version of support for it in all of them.

This is a great model that scales very well to the largest projects and most complicated refactors. In fact, it forces you to think about your refactors differently, and that is actually great. You learn to boil the ocean incrementally more.

We got the decentralized workflows alongside concepts like pull requests... but there's no real reason why you can't use your decentralized workflows with trunk based development. The secret, is combining the two: decentralized workflow for local development, with modern CI and code review, that all lead into the trunk.

I've personally been veering toward this model for a while, and it's also somewhat similar to what Google is doing[1], and it works great across a lot of different circumstances.

[1]: https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...

This is silly. I use Phabricator and I don't commit my changes directly to `master`. I always make a branch, exactly the same as you would for pull requests. And I have multiple commits on that branch.

Why? Firstly it means I can still pull master and rebase my changes to update them fairly easily. Secondly it means I can work on more than one thing at a time (who only ever has one task in progress?), thirdly it means I can reorganise my commits easier (`git rebase -i master`), and finally having a change split into several commits can make code review easier.

As far as I can tell the only significant difference between these two approaches is that with `arc` the history in the repository ends up nice and linear, without loads of branches.

I also use Phabricator like you do, but the author makes a compelling argument to use a stack of master commits rather than individual branches.

> Firstly it means I can still pull master and rebase my changes to update them fairly easily.

Like the author says, you can do this, but it's easier when your commits are stacked on master (you only do one rebase vs. rebasing each of your branches).

> Secondly it means I can work on more than one thing at a time (who only ever has one task in progress?)

Read the article, the author uses interactive rebases to edit commits.

> finally having a change split into several commits can make code review easier

How? The reviewer doesn't see those commits, unless you submit them as separate revisions, in which case they wouldn't be on the same branch.

> As far as I can tell the only significant difference between these two approaches is that with `arc` the history in the repository ends up nice and linear, without loads of branches.

Either approach would have the same result, it's all about how things are organized in your local repository.

I'm not fully convinced (the tooling isn't exactly built for this workflow and you lose "arc feature"), but I like the idea.

Isn't this basically how gerrit is supposed to be used? At least that's how we use it (but it's a very small team so anything would work I guess).

You work on master, you rebase -i your commits to make them nice to review, you push them to gerrit, someone reviews them and pushes them to master or returns with stuff to fix. You usually have a few commits on top of remote master in your local master, and when any of them is pushed to remote master from gerrit you just git fetch && git rebase.

Yes, the Phabricator and Gerrit workflows are very similar.

It's basically the same workflow the Linux kernel uses, with Phabricator or Gerrit taking the place of git format-patch and git apply-patch.

> How do you git bisect a codebase where every 6th commit doesn’t build because it was jammed into the middle of a Pull Request?

This I think is the main problem with pull requests. A better default would be squashing the commits into a single one after review.

I have always been of the mind that every commit should be a working state. My coworkers who often commit with the message “WIP” disagree. I doubt I’ll ever sell them on it.

Yeah...WIP commits are fine in your local feature branch. I wouldn't ever want to see a WIP commit in my origin/master.

The stacked diff approach says: You get to have both, because it pushes you towards a rebase-heavy workflow, where you keep updating the WIP commits until they're ready to merge. This is a pain in the arse with git, but `hg absorb` makes it 100% painless

WIP commits in gerrit are perfectly fine. You can even push them to gerrit with all the rest commits. Just filter "WIP.*" out in gerrit view that lists commits to review.

This is also suboptimal. I’d rather each commit be an atomic unit: no larger and no smaller than it needs to be.

Edit: in other words, concurring with TFA.

Sure, but the best way to get there in my mind is `rebase -i`, which means you've made a few and revised a few commits along the way.

yes, but often on a feature branch you have WIP, change your code a few commits later and so on. I use rebase -i a lot and it make better code.

`rebase -i` is fine (`commit --fixup` and `rebase --autosquash -i` is better), squashing all commits down into one is not (at least doing it all the time is not).

Bisect on first-parent? Seems like a script exists (haven't tried): https://gist.github.com/ayust/2040290

Wouldn't this mean you just find the merge that caused the regression? The merge itself might bring in an awful lot of changes (of a single topic).

I think that’s desirable. First and foremost, you want to know whose work (in this case, a merged branch) broke yours. Afterwards you might want to drill into what specific piece (in this case commit) of that work was responsible, which can be achieved by bisecting the branch’s non-auto-merge/non-rebase history; this usually gets you close enough to “where the bug was actually written” in my experience, without requiring a switch to a specific workflow of what a commit is supposed to represent/how committed code should behave.

The merge won't bring more changes than if the PR was squashed in a single commit though. Hopefully if you have a test broken and the PR merged two commits: one that refactored the code and the second one that leveraged the refector to change the behavior, you can easily try to revert the latter to check if it fixes the problem. The commit is smaller, easier to re-review, and the bug likely easier to spot.

As others have pointed out indirectly, git bisect is designed with the patch stack in mind and could be made to support pull requests better.

You can start bisecting and then skip every commit on a feature branch that was merged in. Painful to script and I wish bisect had a --first-parent flag to do it for me

I think I agree with what the author is saying, by and large, but it doesn't matter: if it's a mental model that works for you, you can live in the "stacked diffs" world right up until you need to "curate" all your commits into something that can be considered for incorporation into the parent project.

Then you still need to make a branch, put all the related commits into it in order, and adequately test that branch before sumbitting it.

At that point, if changes are requested, you can make a choice about whether to work by rebasing that branch, or work on your "master" branch and then re-curate those commits. Now, working in the stacked diffs world seems like more work, but you do you.

A place I disagree with the author is about rebasing frequently onto master. Assume you are working in a healthy project with frequent and largely bug-free releases, but occasional trouble during development cycles where tests end up red for awhile despite everyone's best efforts. In this situation, you are better off starting your work from a quality release tag with no red; occasionally you'll end up dealing with more merge conflicts when you're ready to merge the branch, but IMO not so much that you don't derive benefit from avoiding all those rebases onto a new upstream revision that just might be totally bonkers.

If you do this, you should still validate that your branch, merged with or rebased onto master, doesn't make any new test failures or interact destructively with other work; deferring this to the end DOES have a cost, but in my experience it is outweighed by the benefits.

(And if you're in one of those now-rare projects where it's the norm to be making point releases of one or more old releases, PLUS having new development on a master branch, basing your branch at the right place means you can merge it into the old releases with ease...)

The Phabricator project itself uses two branches - a fast-moving master and a stable branch. Changes are commited to master and then promoted to stable.

Cherry-picking patches is a lot easier when the development team submits to a "one commit per idea" and "review single commits" workflow.

Hm, I thought the idea is that the only way something will land in master is if it reviews OK and CI passes all the tests on top of the current master?

"on top of the current master" is limiting: how do you handle concurrency? I.e. two developers kicking tests on their PR on top of current master, everything green, but they merge and the result of their merge is not building (or passing tests) anymore.

Most CI systems serialize builds on master, at least at the very end of the build pipeline. That can slow things down, but, if combined with a decently fast/thought-through build process with tears, I think the benefits of not risking untested states outweigh the costs.

This is not about serializing builds on master, but builds on the PR before they get merged on master. If you have a few hours of test plans, you can't serialize every PR one after each other.

If your testing is time consuming to the point that you can only test on your PRs and never on what master would be after each one merges, then you have to choose between somehow forcing PR authors to serialize their merges somehow (external convention or tooling or something) or not testing the code that will actually be released. The latter isn’t always the wrong choice, though.

*tests, not tears, but I guess it works either way...

Say two PRs get sent in by two different developers. CI builds both, passes them.

Reviewer comes and approves PR#1, it is merged into master. Depending how much CI power you have, PR#2 (and any other PR in queue) can be automatically built again on top of new master to verify it wont break anything.

You can do it lazily too and just re-check PR#2 once it gets approved by human - like other comment mentioned, CI coordinates and serializes all merges into master making sure master is always green.

That's not very scalable: with a team of 100 developers merging on average 100 PRs a day, you have lot of concurrency. Imagine everyone want to merge at the same time, only one can, the 99 others have to re-run the build/tests. Then only one merges and 98 others have to re-run the build tests, etc. On a project where the build/tests can take hours, this becomes difficult to justify...

And then, there is the "I merged a fix into the release branch, which auto-merge into master, the fix built and tested OK in the release branch but not when propagated to master".

"I think I agree with what the author is saying, by and large, but it doesn't matter: if it's a mental model that works for you, you can live in the "stacked diffs" world right up until you need to "curate" all your commits into something that can be considered for incorporation into the parent project"

That's funny because he spent the whole blog post pointing out why it does matter. The tools we use and their defaults structure the way we work. Having lived in both worlds I can add my testament to that.

I think GitHub's great, and I'm mostly okay with having to rely on it heavily at my day job, but I dislike how code reviews + PRs are necessarily tied to branches.

What are the chances of GitHub introducing stacked diffs (or a similar abstraction) as an option? It'd be great to try out this workflow on a non-trivial project, but it seems like a hard sell organizationally when it involves migrating to and supporting a completely different tool like Phabricator or Gerrit.

I'm not totally buying the Phabricator way, but yes, it would be so nice to be able to individually greenlight individual commits in a pull request, and when reviewing, be able to a: see all commits on top of each other on the same page without mixing their changes and b: be able to comment on commit messages.

In RhodeCode we have a similar concept that we call diff ranges. It's not available in pull-request view directly, however, users can see this on a source repo if they select a range from changelog view.

example: https://code.rhodecode.com/rhodecode-enterprise-ce/changeset...

I agree with the concept and it's sometimes very important to see how each commit produced the final combined diff.

You can do stacked diffs on GitHub by always using squash merges and rebasing on top of master, and it's a great way to get started.

It's just a lot nicer with the extra tooling on top that Phabricator and Gerrit provide (dependencies, command line tools, and so on).

It's true that you can do that indeed. Annoyingly, you have to manually rebase if you want to use the "merge & squash" feature. I don't understand why there's no "merge into one commit, rebase with master, merge onto master" feature. It's tedious to manually keep rebasing branches that were approved for review but have gone behind.

If you squash merge you just lost all the small commits that do one thing well that you spent time creating, unless you review them one by one. Bleh.

That's the entire point of a stacked diff workflow - you would submit smaller PRs which are the equivalent of one commit.

My pull request branches are of the size that it only makes sense together.

So basically what Gerrit offers as the "cherry-pick" submit type? You work on the master branch, then push commits as changesets to Gerrit. After review they are cherry-picked by the system onto the target (master) branch, in whatever order they happened to be reviewed. Or am I missing something?

So basically the author proposes replacing Pull Requests with "Cherry-Pick Requests", where each commit is a single reviewable unit (replacing separate branches with the PR model), the local master branch represents a list of unreviewed commits, and the remote master branch represents reviewed, "merged" commits.

It's an interesting idea, but even after reading this article I'm still not convinced it offers any serious advantages over a more traditional workflow. It also has the downside of requiring extensive use of more advanced features like interactive rebasing. (Which I personally have no issue with, but seems like a potential nightmare for developers who aren't as experienced with git.)

" It also has the downside of requiring extensive use of more advanced features like interactive rebasing."

This is true. In the early days of mercurial (back before it was clear that git was winning the popularity contest) there was an extension called Mercurial Queues[1] that made it really easy to work this way. The downsides were: it didn't handle conflicts well, and it could totally destroy your work if you weren't careful. Since then someone has written the Evolve extension which is Mercurial Queues done right. That, sadly, came about after git had "won" the popularity contest.

1. Best explanation of Mercurial Queues: http://stevelosh.com/blog/2010/08/a-git-users-guide-to-mercu...

How does this workflow work when there are 3 commits that need to be applied in order (A, B, and C), when C gets approved from a code-review standpoint before A and B?

You are prevented from merging commits at the top of the stack until the commits below are accepted.

The advantage is that each commit can be reviewed separately. You cannot achieve this with pulled requests as far as I know. I often see people creating a pull request and writing "this depends on PR #1233", the second PR is unreviewable because it contains changes from. The first PR. So the first PR has to merged first... It's a mess. But maybe I am just missing a big part of the PR workflow that solves this problem.

If the branches are properly merged so that the dependency is represented in the git DAG, most good PR tools just handle this correctly: if you merge the dependent PR first the dependee updates to show the commits as already merged and "simplify" their diff views appropriately; if you merge the dependee PR first, the dependent PR simply gets marked as completed/merged for you.

I find it is still useful to include helpful hints like "this PR includes all of PR #1234, if you prefer to review a piece at a time", but that's for the reviewers' sake, not the tools'.

Yes, that's nice. But the problem is that these are still shown as separate PRs, GitHub for example doesn't tie these together unlike Phabricator.

GitHub has enough information that it could surface that better. It's a useful feature request.

One way I've solved this is to open the second PR against the branch for the first PR, and explain what's going on in the PR message. Then, once the first PR is merged, update the second PR to be against the real target branch.

This works better with a system that requires the PR owner to actually hit merge (so it's not accidentally merged into the first PR's branch) and where approvals stick across PR changes (which is undesirable in general, though).

In the PR model, we can easily hook up a build system to automatically trigger builds on remote branches. How would that fit into this?

And that can be (and is) done with the stacked diffs model as well. CI is orthogonal to whether you use the PR model of the stacked diffs model.

But what commit do you run the CI against? If you run it against a commit that's stacked on top of 3-4 other unreviewed changes, then the CI will include errors from all those other commits, not just yours.

You don't stack against random commits. Only commits that have passed tests before and so are on master, or your own commits.

That's not what the article says:

> Yang sees that the Diff for the “fix” is out for review. Yang uses the Phabricator command line tool to patch that commit on top of master. This means that it’s not a branch. It’s just a throwaway local commit. Yang then starts working on the first change. Yang submits a Diff for review from the command line. Later, the “fix” has changed, so Yang drops the patch of the old version from the Git history and patches in the updated one via interactive rebase.

I guess the CI would have to cherry-pick the PR onto master before running the tests?

It seems that topgit can be used to achieve this workflow. The beauty of it is that it is built on top of git, so it has the familiar mechanics internally while providing a new way of doing patch series / stacked diffs.

Can someone explain why this is better than branch, commit and rebase from master if needed, then submit the pull request?

The case studies articulate the difference pretty well.

This is best workflow I've used so far (but usually I had separate branch for really unrelated changes).

Is this similar to Mercurial Queues?

Very much so. The difference is that it uses regular git commits, and you push and pop by doing interactive rebase.

Sometimes I really miss mq

This seems like a good approach. However, it's unrealistic that my team is going to move off GitHub. Is there any way I can do something like this locally, but then have it fit into a GitHub PR flow?

This is very long and i'm tired, is there a more condensed resource?

I think the tl;dr is “stacked diffs require running different commands than git pull --rebase”

The author complains about wasting time branching and merging, but other examples at the end still have the same number of operations and still hit the same merge conflicts.

They make the point of how dependant work requires multiple branches but I feel it’s a workflow constraint, or a technical one. It can be easily avoided by creating branches as late as possible! Usually, when I’m working on a problem that requires multiple merges, I

1. Create a branch to fix the problem

2. Work on it

3. Rebase it so I can merge the first few commits

4. Push just those commits with a different branch name

5. Get it reviewed

6. a. Merge the reviewed branch into master and rebase the problem branch on top of master

6. b. Update the branch under review with feedback fixes, push it & update the review & rebase the problem branch onto the updated branch under review & go back to 5

7. Go back to 2

Just don’t do what I end up doing on personal projects and let 6 months go past between 2 and 3

As the article pointed out, the problem here is you can't easily deal with multiple branches stacking on each other.

You may argue that it is possible to put all these changes into one branch, but "asking 6 people to review a 500+ lines change, in which only ~100 lines is meaningful for each reviewer" is not a nice thing to do and it obviously delays the code review process.

I feel like my workflow address that exactly: by making a separate branch per reviewable change from the base of the problem branch.

Another code review tool that lets you work this way is Review Board: https://www.reviewboard.org/

Interesting, does anyone know of any scripts to allow you to use stacked diffs locally when your team is using tools based around pull requests (i.e. GitHub)?

We recently shifted from Gerrit to Bitbucket. I don't miss Gerrit. Perhaps I am missing something.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact