Reviewing commits instead of branches makes a lot of sense, and would definitely result in a cleaner history than pull requests seem to do in practice.
The article also makes the claim "They’ll get reviewed in some order and committed in some order". If the changes in the second commit depend on the first commit, that's just as false as with pull requests that depend on each other.
The article names one concrete downside of the use of many branches: If one branch depends on the changes in another branch that's not in master yet, you end up with an annoying bunch of rebases or ugly merges. Committing everything to a single branch turns that into a single rebase.
So that's the content. Then there are the case studies, which are odd. Charlie the free spirited hacker magically turns from someone who...
> starts the day by branching master and spamming the branch with five commits that are only vaguely related.
into someone who
> blasts out five commits and five Diffs back to back. Each one addresses something specific.
All thanks to the magic of Stacked Diffs.
In git, we have a dependency tracker in the DAG, but it requires the work of branching to make use of its power. In git, cherry-picking is a footgun with all sorts of potential problems. I've seen far too much work lost to bad rebases to trust junior developers with cherry-picking and rebase-heavy workflows in git. (Yes, I realize that a lot of people disagree and a lot of people use very heavy rebase workflows, this is just my opinion that in general rebase workflows are more danger than I want in my source control, as the resident source control guru on many of the teams I've worked on.)
Pull Requests actually work really well with this dependency tracking. If I create a PR for the database changes, a PR for the REST API changes, and a PR for the UI work that depends on both of the previous two, and all of those PRs are well merged branches (no rebases) using the DAG to encode dependencies, my reviewers have the option to review those in any order. If they review the two smaller subset PRs first, as that content merges to master most good PR tools simplify the remaining UI PR to just the UI stuff not in master. If they review and merge everything all at once in the giant UI change PR, then most good PR tools will automatically mark the other two PRs as merged and complete, no work needed for individual review of the two subset PRs. The choice there is left to the reviewers, and the tools just do the right thing, whatever the preference the reviewers have.
We got the decentralized workflows alongside concepts like pull requests... but there's no real reason why you can't use your decentralized workflows with trunk based development. The secret, is combining the two: decentralized workflow for local development, with modern CI and code review, that all lead into the trunk.
I've personally been veering toward this model for a while, and it's also somewhat similar to what Google is doing, and it works great across a lot of different circumstances.
Why? Firstly it means I can still pull master and rebase my changes to update them fairly easily. Secondly it means I can work on more than one thing at a time (who only ever has one task in progress?), thirdly it means I can reorganise my commits easier (`git rebase -i master`), and finally having a change split into several commits can make code review easier.
As far as I can tell the only significant difference between these two approaches is that with `arc` the history in the repository ends up nice and linear, without loads of branches.
> Firstly it means I can still pull master and rebase my changes to update them fairly easily.
Like the author says, you can do this, but it's easier when your commits are stacked on master (you only do one rebase vs. rebasing each of your branches).
> Secondly it means I can work on more than one thing at a time (who only ever has one task in progress?)
Read the article, the author uses interactive rebases to edit commits.
> finally having a change split into several commits can make code review easier
How? The reviewer doesn't see those commits, unless you submit them as separate revisions, in which case they wouldn't be on the same branch.
> As far as I can tell the only significant difference between these two approaches is that with `arc` the history in the repository ends up nice and linear, without loads of branches.
Either approach would have the same result, it's all about how things are organized in your local repository.
I'm not fully convinced (the tooling isn't exactly built for this workflow and you lose "arc feature"), but I like the idea.
You work on master, you rebase -i your commits to make them nice to review, you push them to gerrit, someone reviews them and pushes them to master or returns with stuff to fix. You usually have a few commits on top of remote master in your local master, and when any of them is pushed to remote master from gerrit you just git fetch && git rebase.
It's basically the same workflow the Linux kernel uses, with Phabricator or Gerrit taking the place of git format-patch and git apply-patch.
This I think is the main problem with pull requests. A better default would be squashing the commits into a single one after review.
Edit: in other words, concurring with TFA.
Then you still need to make a branch, put all the related commits into it in order, and adequately test that branch before sumbitting it.
At that point, if changes are requested, you can make a choice about whether to work by rebasing that branch, or work on your "master" branch and then re-curate those commits. Now, working in the stacked diffs world seems like more work, but you do you.
A place I disagree with the author is about rebasing frequently onto master. Assume you are working in a healthy project with frequent and largely bug-free releases, but occasional trouble during development cycles where tests end up red for awhile despite everyone's best efforts. In this situation, you are better off starting your work from a quality release tag with no red; occasionally you'll end up dealing with more merge conflicts when you're ready to merge the branch, but IMO not so much that you don't derive benefit from avoiding all those rebases onto a new upstream revision that just might be totally bonkers.
If you do this, you should still validate that your branch, merged with or rebased onto master, doesn't make any new test failures or interact destructively with other work; deferring this to the end DOES have a cost, but in my experience it is outweighed by the benefits.
(And if you're in one of those now-rare projects where it's the norm to be making point releases of one or more old releases, PLUS having new development on a master branch, basing your branch at the right place means you can merge it into the old releases with ease...)
Cherry-picking patches is a lot easier when the development team submits to a "one commit per idea" and "review single commits" workflow.
Reviewer comes and approves PR#1, it is merged into master. Depending how much CI power you have, PR#2 (and any other PR in queue) can be automatically built again on top of new master to verify it wont break anything.
You can do it lazily too and just re-check PR#2 once it gets approved by human - like other comment mentioned, CI coordinates and serializes all merges into master making sure master is always green.
And then, there is the "I merged a fix into the release branch, which auto-merge into master, the fix built and tested OK in the release branch but not when propagated to master".
That's funny because he spent the whole blog post pointing out why it does matter. The tools we use and their defaults structure the way we work. Having lived in both worlds I can add my testament to that.
What are the chances of GitHub introducing stacked diffs (or a similar abstraction) as an option? It'd be great to try out this workflow on a non-trivial project, but it seems like a hard sell organizationally when it involves migrating to and supporting a completely different tool like Phabricator or Gerrit.
I agree with the concept and it's sometimes very important to see how each commit produced the final combined diff.
It's just a lot nicer with the extra tooling on top that Phabricator and Gerrit provide (dependencies, command line tools, and so on).
It's an interesting idea, but even after reading this article I'm still not convinced it offers any serious advantages over a more traditional workflow. It also has the downside of requiring extensive use of more advanced features like interactive rebasing. (Which I personally have no issue with, but seems like a potential nightmare for developers who aren't as experienced with git.)
This is true. In the early days of mercurial (back before it was clear that git was winning the popularity contest) there was an extension called Mercurial Queues that made it really easy to work this way. The downsides were: it didn't handle conflicts well, and it could totally destroy your work if you weren't careful. Since then someone has written the Evolve extension which is Mercurial Queues done right. That, sadly, came about after git had "won" the popularity contest.
1. Best explanation of Mercurial Queues: http://stevelosh.com/blog/2010/08/a-git-users-guide-to-mercu...
The advantage is that each commit can be reviewed separately. You cannot achieve this with pulled requests as far as I know. I often see people creating a pull request and writing "this depends on PR #1233", the second PR is unreviewable because it contains changes from. The first PR. So the first PR has to merged first... It's a mess. But maybe I am just missing a big part of the PR workflow that solves this problem.
I find it is still useful to include helpful hints like "this PR includes all of PR #1234, if you prefer to review a piece at a time", but that's for the reviewers' sake, not the tools'.
This works better with a system that requires the PR owner to actually hit merge (so it's not accidentally merged into the first PR's branch) and where approvals stick across PR changes (which is undesirable in general, though).
> Yang sees that the Diff for the “fix” is out for review. Yang uses the Phabricator command line tool to patch that commit on top of master. This means that it’s not a branch. It’s just a throwaway local commit. Yang then starts working on the first change. Yang submits a Diff for review from the command line. Later, the “fix” has changed, so Yang drops the patch of the old version from the Git history and patches in the updated one via interactive rebase.
Sometimes I really miss mq
The author complains about wasting time branching and merging, but other examples at the end still have the same number of operations and still hit the same merge conflicts.
They make the point of how dependant work requires multiple branches but I feel it’s a workflow constraint, or a technical one. It can be easily avoided by creating branches as late as possible! Usually, when I’m working on a problem that requires multiple merges, I
1. Create a branch to fix the problem
2. Work on it
3. Rebase it so I can merge the first few commits
4. Push just those commits with a different branch name
5. Get it reviewed
6. a. Merge the reviewed branch into master and rebase the problem branch on top of master
6. b. Update the branch under review with feedback fixes, push it & update the review & rebase the problem branch onto the updated branch under review & go back to 5
7. Go back to 2
Just don’t do what I end up doing on personal projects and let 6 months go past between 2 and 3
You may argue that it is possible to put all these changes into one branch, but "asking 6 people to review a 500+ lines change, in which only ~100 lines is meaningful for each reviewer" is not a nice thing to do and it obviously delays the code review process.