* a detailed walkthrough https://upte.ch/blog/how-we-should-be-using-git/
Just stack up some very small MRs in dependency order and people can easily follow the changes across them, without wasting time on merge conflicts you already knew you'd otherwise have to deal with.
> You should never be going off and work on huge changes without keeping in sync with the base branch.
I agree, but the stacked PR approach doesn't preclude this. If you periodically `autorebase` the stack against `origin/master` right after a `git fetch origin`, your stack stays in sync with `master`.
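For anyone without an `autorebase` command handy, here is a minimal sketch of what "rebase the stack against `origin/master`" can look like with plain git. `restack` is a hypothetical helper of mine (not the tool mentioned above), and it assumes a linear stack whose branches are passed bottom to top.

```shell
# restack BASE BRANCH... : replay a linear stack of branches (bottom to top)
# onto BASE. Hypothetical helper; stops at the first conflict, leaving the
# rebase in progress so it can be resolved by hand.
restack() {
    new_parent=$1
    shift
    # The bottom branch forked from BASE at their merge base.
    old_parent=$(git merge-base "$new_parent" "$1") || return 1
    for branch in "$@"; do
        # Remember the old tip: it is the old parent of the next branch up.
        old_tip=$(git rev-parse "$branch") || return 1
        # Replay only the commits in old_parent..branch onto the new parent.
        git rebase --onto "$new_parent" "$old_parent" "$branch" || return 1
        old_parent=$old_tip
        new_parent=$branch
    done
}
```

Usage would be something like `git fetch origin && restack origin/master f1 f2 f3`, followed by a force push of each branch.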
> Create a longer lived branch and make small PRs to that. Then merge longer lived branch into master
Large features don't necessarily imply long-running feature branches, though. I personally prefer feature flags (or some other mechanism for conditional execution) on the base branch for features that are "open" for more than a week or so, say.
"Large" is also subjective. You could have changes that are large in terms of lines of code added, the number of services being touched, or the number of different, unrelated concerns involved.
As an example, this week I worked on a semi-greenfield Kafka service. Stacking PRs helped me keep track of the review process a lot more easily than "one big PR" would allow, and the stack was merged in 4-ish days. I structured the stack like this:
- Add a `proto` file covering messages that this Kafka consumer reads off the incoming topic
- Pull in a gradle library for protobuf code generation
- Add in a simple Kafka consumer and producer
- Add a `Processor` that works on incoming messages
- Hook everything up; the service consumes off Kafka, "processes" the data, and produces back to Kafka
> Stacked PRs make all this more difficult.
I'm in agreement. Stacked PRs involve more overhead, but there are times when this is a reasonable tradeoff.
I haven't had a pile that big myself, but I have had a pile that was in the low 100s.
I follow this pattern a lot; we call it an epic branch and usually name it something like `epic/foobar`.
* Branch can be reduced to a single, coherent commit: squash + rebase
* Branch can be reduced to a single commit plus one or two unrelated fixup commits (like formatting): rebase
* Multiple commits: merge
But that requires value judgement and can lead to discussions, so it's not ideal for many teams.
Of course, then there are also a lot of companies that use Jira as their primary source of history, where most commits are just "Fix ABC-123". For those it really doesn't matter much what you do to the Git history anyway.
It does not. After PR1 is merged, PR2 gets retargeted onto PR1's target branch, but all the commits it was based on now show up in PR2, sometimes causing conflicts. GitHub won't automagically rebase your PR2, which makes sense.
This is because the earlier feature branches are usually more "done" and I'm still working on the later ones, so I don't want to wait until they are all done to submit PRs and merge them from the tail.
I also squash my feature branches before posting PRs, so there's only a single commit, which avoids the rebase issue when branches get new commits. If there's PR feedback for an earlier branch, I'll make the changes, squash (or amend) the commit, then force push that branch. I'll then rebase the later branches to propagate the change. If you have `rerere` enabled, the later rebases are simple.
This works well for me. In practice I usually don't go past F2 or F3, which is a small enough number of branches to manage all the rebasing by hand.
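A rough sketch of that "amend, then propagate" step, assuming you are on the lower branch with the review fixes already made to tracked files. `amend_and_restack` is a name I made up for illustration; the force push is left as a comment because it needs a remote.

```shell
# One-time setup per clone, so repeated conflict resolutions are replayed:
#   git config rerere.enabled true

# amend_and_restack UPPER : fold working-tree changes into the current
# (single-commit) branch, then move UPPER on top of the amended commit.
# Hypothetical helper; assumes UPPER was branched from the current tip.
amend_and_restack() {
    upper=$1
    old_tip=$(git rev-parse HEAD) || return 1
    lower=$(git symbolic-ref --short HEAD) || return 1
    git commit -q -a --amend --no-edit || return 1
    # git push --force-with-lease origin "$lower"
    # Replay only UPPER's own commits (old_tip..upper) onto the new tip.
    git rebase --onto "$lower" "$old_tip" "$upper"
}
```

With F1 checked out and edited, `amend_and_restack F2` leaves F2 checked out on top of the amended F1.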
Overall this is a great doc/explanation, and I love the visuals.
I've tried squashing into intermediate branches before, but in those cases I'm not a fan of the enormous commit message I need to write when squashing the combined feature into mainline, just to keep it clear to contributors what it actually adds.
Agreed. I keep each feature/significant change in its own branch/PR as a single commit, and just create more branches/PRs for each separate commit.
Sometimes my features require multiple commits over time to deliver, they each get their own branch/PR and those commits are preserved in mainline.
This very much reminds me of merge-based rebasing, aka psycho-rebasing (which itself became part of git-extras). Have you seen the concept before?
In the end we moved to Gerrit and never looked back. We are not super experienced yet but I'd say our commit quality went up by quite a bit.
This is a good introduction to the differences of stacked diffs vs PRs: https://jg.gg/2018/09/29/stacked-diffs-versus-pull-requests/
I feel something like this might be good as a first-class citizen in git (it feels like a common problem, and I don't see how it's solved by some sort of GitHub feature).
Instead of doing this:
git checkout f2
git rebase f1
git push -fu origin f2
git checkout f3
git rebase f2
git push -fu origin f3
Do this instead:
git checkout f2
git merge f1
git checkout f3
git merge f2
You would follow this same approach if someone merged a PR ahead of yours into master:
git checkout master
git pull
git checkout f1
git merge master
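The merge-based propagation above is regular enough to put in a loop. This is only a sketch: `cascade_merge` is a hypothetical helper, it assumes the branches are listed bottom to top, and it stops at the first conflict so you can resolve it and re-run.

```shell
# cascade_merge PARENT BRANCH... : merge PARENT into the first branch, then
# each branch into the one stacked on top of it. No history is rewritten,
# so a plain (non-force) push works afterwards.
cascade_merge() {
    parent=$1
    shift
    for branch in "$@"; do
        git checkout -q "$branch" || return 1
        git merge -q --no-edit "$parent" || return 1
        parent=$branch
    done
}
```

For example, after updating `master`: `cascade_merge master f1 f2 f3`.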
I stack PRs every day because of the points mentioned in the article, but haven't found a tool that will automatically roll up merges yet (haven't looked, either).
I find that the bottom PRs get unwieldy; you can end up merging a multi-thousand line behemoth f1 containing f2, f3, etc. which is hard to verify as correct.
Instead, merge f1 including any changes f1’ you may have force-pushed to that branch. Then do a `git rebase --onto master <sha> f2`, where <sha> is the old f1 commit that you originally targeted your f2 PR against.
You’ll need to resolve merge conflicts f1..f1’ into your f2. You can now force push that f2 and retarget your f2 PR onto master.
Repeat for f3, f4, ..., though I generally find more than 2-3 stacked PRs to be risky; the further away from master you get, the more likely something is going to change drastically in a PR that you’re branching off.
Anyway, this requires a bit more rebase-fu, but I think it leaves a better paper trail and is easier to work with if you do grok rebase fully.
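To make the `--onto` step concrete, here is a throwaway demo in a scratch repo. Everything in it (file names, commit messages, the local squash standing in for a squash-merged PR) is made up for illustration; only the final `git rebase --onto` line is the technique described above.

```shell
# Scratch-repo demo: f1 lands on master as a squash, then f2 is retargeted.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Helper: one commit that adds one file named after its message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b f1
commit f1
git checkout -q -b f2
commit f2

old_f1=$(git rev-parse f1)   # the <sha> that f2 was originally based on

# Stand-in for merging the f1 PR with "squash and merge".
git checkout -q master
git merge -q --squash f1
git commit -q -m "f1 (squashed)"

# Drop the old f1 commits from f2 and replay the rest onto master.
git rebase -q --onto master "$old_f1" f2

# Lists what f2 still carries on top of master.
git log --format=%s master..f2
```

The last command should list only f2's own commit, which is what makes the retargeted PR reviewable on its own.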
In practice, I feel it is much easier to assign work and organize your codebase so that logically-different business functions can be managed in isolation.
We rarely have merge conflicts these days, and we are also on 1 big monorepo. It mostly boils down to discipline and architecture.
You simply create a change and in the footer say
Depends-On: <pull request URL/gerrit URL/gitlab URL/etc>
Zuul will pull everything together, merge it all, and provide your tests with repos on disk that reflect the entire dependency chain. You write your test to install/run/etc. from these repos that have been prepared for you. The test might have only your change, or it might have 20 other changes; Zuul handles all this. This way you test everything together.
You don't need to spend resources speculatively running every possible combination; most changes probably don't affect each other (when you know they do, set up dependencies). But the trick is, you don't commit, Zuul does. It puts all approved changes in a queue, merges them and runs the "gate" tests. It does this in a smart way to try and batch as much as possible. When they pass, it commits the change. When something fails, because either it won't merge with HEAD or its tests fail against HEAD, it gets rejected and you fix it up and go through the cycle again. It's impossible to commit anything broken to HEAD, because everything that is committed has been tested.
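For the `Depends-On:` footer mentioned above, a small sketch of building such a commit message from the shell. `depends_on_message` and the example URL are placeholders of mine, not anything Zuul provides.

```shell
# depends_on_message SUBJECT URL : print a commit message whose footer
# declares a dependency on another change. Hypothetical helper.
depends_on_message() {
    subject=$1
    url=$2
    # Subject line, blank line, then the footer.
    printf '%s\n\nDepends-On: %s\n' "$subject" "$url"
}
```

It could be used as `git commit -m "$(depends_on_message "Add consumer" "https://review.example.com/c/1")"`.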
Our code structure needs to support both the functionality required by our customers, and our needs as developers.
> It can't pause when a conflict occurs, so you have to fix the conflict and (somehow) re-run it from the point it stopped at, which is fiddly at best.
There is a better way:
1. I use `git rebase -i` from the top of the stack, which opens vim with the list of changes it's going to make.
2. I have a script (https://github.com/mic47/git-tools/blob/master/GitStackTodo.... ) that processes this list and inserts commands into the TODO list to back up each branch and move it to its new location. At this point, I can also modify the TODO list to rip out commits I don't want, or squash commits I want. Or I can reorder commits (I usually make code review fixes at the top of the stack and then reorder commits, at least if I am reasonably sure there won't be conflicts for these fixes).
3. At this point, you can insert more things, like running tests after each branch or commit (and pausing the rebase in case of failure, so you can fix it).
4. When I close this file, the rebase starts. In case of conflict, the rebase pauses and lets you fix it, and when you continue the rebase, it finishes the TODO file.
5. Afterwards, I have a script that removes the backup branches.
6. I have a script that runs a command on each branch, so at the end I do this to push my changes: `git stack-foreach master git push --force-with-lease`
What if you can't resolve a conflict and you are in the middle of the rebase? You can run `git rebase --abort` and it will restore the branch at the top of your stack. The only drawback is that branches that were already rebased are not restored, which is why my script also creates backup branches, so I can fix that manually and move the branches back.
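Step 3 (run something after each commit and pause on failure) can also be had without editing the TODO list by hand, since `git rebase --exec` adds an `exec` line after every picked commit. A minimal sketch, not the script linked above; `rebase_with_check` is a made-up wrapper name.

```shell
# rebase_with_check UPSTREAM CMD... : rebase the current branch onto UPSTREAM
# and run CMD after each replayed commit. If CMD exits non-zero the rebase
# pauses there; fix the problem, then run `git rebase --continue`.
rebase_with_check() {
    upstream=$1
    shift
    git rebase --exec "$*" "$upstream"
}
```

For example `rebase_with_check master make test`, assuming the project has such a target.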
The solution is simply to never use force. Let the history be the history; don't overwrite it.
Additionally, technically speaking, `--force` does not overwrite history; it just moves the branch pointer. The old commits are still there.
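On the "old commits are still there" point, a small sketch of getting one back locally. It relies on the branch reflog, which only exists in the clone where the rewrite happened and only until the entry expires. `previous_tip` is a hypothetical helper name.

```shell
# previous_tip BRANCH : print the commit BRANCH pointed at before its most
# recent update (amend, rebase, reset, ...), taken from the local reflog.
previous_tip() {
    git rev-parse --verify -q "$1@{1}"
}
```

For example, `git branch f1-backup "$(previous_tip f1)"` puts a new branch on the pre-rewrite commit.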
It has important ties to how the Linux Kernel and Git dev teams work as well as breaks down the benefits in relation to CI as a methodology.
I've been using stgit for a very long time, and before that, I used quilt, and before that, Andrew Morton's patch scripts.
If I am not mistaken, Andrew Morton's patch scripts were the inspiration for quilt.
This allows you to use the local Patch Stack style workflow similar to the Linux Kernel team but while still using GitHub, Bitbucket, or GitLab to do the peer review process.
Everything else I have tried, including quilt and the various other tools that have attempted to do this, feels too complicated and too much work.
No offense to the authors of stgit and the work they have put in. But personally, the workflow with it has felt too complicated and unnatural to me.
git-ps is the first way I have found where managing the stack of patches feels easy and natural while still allowing me to use GitHub, Bitbucket, or GitLab for peer review.
Check out my article on it here to understand the workflow better: https://upte.ch/blog/how-we-should-be-using-git/
I am happy to answer any questions I can.
git ps pull
git ps rebase
git ps ls
git ps rr 0
git ps pub 0
> git-ps is the first way I have found where managing the stack of patches feels easy and natural while still allowing me to use GitHub, Bitbucket, or GitLab for peer review.
Maybe it would be good to describe in more detail what is happening behind the scenes when you run these "git ps" commands. At the end of the article you mention that you need to know Git. I know it, but I don't know what is happening behind the scenes. (I could probably test it myself by actually using git-ps, but I've only read this article.)
This approach is interesting but sounds more complicated than patch series, and therefore more error prone.
git push origin f1:f1
git checkout f2
git merge f1
git push origin f2:f2
git checkout f3
git merge f2
git push origin f3:f3
As others have noted already, I like the git-ps approach (https://upte.ch/blog/how-we-should-be-using-git/)
Merge in the stack from top to bottom - in this case we're going to merge Feature 3's PR, Feature 2's PR, and Feature 1's PR, in that order, using GitHub's UI.
Maybe a new image of the branches with the list order reversed?
For a feature branch feature-2 that's based on feature-1, why not create a PR from feature-2 onto feature-1 to get the review started? Once feature-1 is merged, simply rebase feature-2 and update the PR. Pipeline that shit, baby!