> This doesn't seem to be about forking, but rather contributing upstream using a GitHub-like model.
Probably useful to distinguish between the concept of "forking" where you basically take the existing project, duplicate it, then start working on your own independently, and the action of "forking" in the GitHub UI that just duplicates a repository.
The latter is commonly used for contributing upstream, while the former would be when you don't want to contribute and want something independent.
It depends a bit on the nature of a fork (not all forks are made equal!); here's my experience with forks:
* Feature forks - these exist because upstream doesn't want to merge a certain feature (or set of features) for whatever reason. These aren't necessarily hostile forks, it's just that upstream didn't use a forkable/modifiable license to accept third party patches. There's a lot of these sorts of "stacks" (a lot of them written in Ruby, which in my experience has very bad separation between the device it's running on and the source code), where portability outside of upstream's deployment just isn't a concern. For these, frequent rebases are fine - the feature fork only has a single developer, is unlikely to need many external contributions and can often just subsist by maintaining a couple of .patch/.diff files for each major release (depending on how much the forker cares). This type of fork isn't really seeking to be its own project. It's also the type that the post is talking about, where you can effectively maintain your patches as single rebases, which saves some mental overhead. (Icecat is an example of this sort of fork, although for rather silly reasons.)
* "Soft" hostile forks. Upstream isn't bad or dangerous, but is notoriously unresponsive on issues they should be responsive on. The fork has its own governance, but it's very much a skeleton and merging upstream code is a major priority to keep the project going. These projects tend to either fizzle out, fall back in line with upstream or become a true hostile fork.
* True Hostile Forks: Upstream has lost its marbles. They're unreliable and can't be built on any further. Worst case scenario: the fork is downstream only in the sense that the upstream shouldn't be used anymore. Here, merging upstream code is either to be avoided or should be checked the same as a third party contribution.
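For the feature-fork case in the first bullet, carrying a couple of patch files from one release to the next can be sketched roughly like this. It's a minimal sketch, not anyone's actual workflow: the repository layout, the tag names `v1.0`/`v2.0` and the file names are all made up for illustration.

```shell
set -eu
# Throwaway identity so the sketch runs on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)

# Hypothetical upstream with a tagged 1.0 release.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/main
echo "core" > core.txt
git add core.txt
git commit -q -m "upstream: release 1.0"
git tag v1.0

# The feature fork: one downstream commit on top of the release.
git checkout -q -b feature
echo "extra" > feature.txt
git add feature.txt
git commit -q -m "downstream: add feature"

# Export the downstream commits as .patch files.
git format-patch -q -o "$work/patches" v1.0..feature

# Upstream ships the next major release.
git checkout -q main
echo "core v2" > core.txt
git commit -q -a -m "upstream: release 2.0"
git tag v2.0

# Re-apply the saved patches on top of the new release.
git checkout -q -b feature-2.0 v2.0
git am -q "$work"/patches/*.patch

# Only the downstream commit should sit on top of v2.0.
git log --oneline v2.0..feature-2.0
```

If a patch stops applying, `git am` halts and you fix it up once per release, which is about as much overhead as this kind of fork needs.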
The post considers rebasing the fork. It seems easier to do only merges. You can merge specific upstream commits, or the entire upstream branch. That way you only need to merge in the changes, and not resolve the same conflicts for each downstream commit, every time the upstream is changed.
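A merge-only fork might look something like the sketch below. To keep it short, upstream and downstream are modelled as two branches in one throwaway repository (an assumption for the demo; in practice upstream would be a remote). Note that merging a specific upstream commit also brings in everything that commit is based on.

```shell
set -eu
# Throwaway identity so the sketch runs on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/upstream
echo base > base.txt
git add base.txt
git commit -q -m "upstream: base"

# Downstream diverges with its own change.
git checkout -q -b downstream
echo mine > mine.txt
git add mine.txt
git commit -q -m "downstream: local change"

# Upstream keeps moving.
git checkout -q upstream
echo one > one.txt
git add one.txt
git commit -q -m "upstream: change 1"
first=$(git rev-parse HEAD)
echo two > two.txt
git add two.txt
git commit -q -m "upstream: change 2"

git checkout -q downstream
# Merge upstream only up to a specific commit...
git merge -q -m "merge upstream up to change 1" "$first"
# ...and later the whole upstream branch.
git merge -q -m "merge upstream" upstream

git log --oneline --graph
```

Each conflict gets resolved once, in the merge commit, rather than once per downstream commit.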
Maintaining a fork by merging upstream into downstream branches would be a mess: upstream and downstream code would be totally entangled, there would be no clean way to edit/remove/reorder downstream commits. You can't easily know what you actually changed (not the full diff, but each "atomic" clean change).
> there would be no clean way to edit/remove/reorder downstream commits
Yes, but I consider commits immutable so reordering or editing commits is not a thing that would happen. New commits are appended (either upstream or downstream), and any conflicting changes then of course have to be merged from upstream into downstream.
Merging seems easier? With merging you can get a definite answer about which commits are where. With rebase you’re dealing with changes/patches so the answer is less clear.
But it might be cleaner (for some definition of clean) to have a fork which is just N commits on top of the upstream, which is then rebased periodically.
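The "N commits on top of upstream" shape is easy to see in a toy repository. This is only a sketch with made-up branch and file names, and again both sides are branches in one repo for brevity:

```shell
set -eu
# Throwaway identity so the sketch runs on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/upstream
echo base > base.txt
git add base.txt
git commit -q -m "upstream: base"

# The fork is N = 2 commits on top of upstream.
git checkout -q -b downstream
for n in 1 2; do
  echo "patch $n" > "patch$n.txt"
  git add "patch$n.txt"
  git commit -q -m "downstream: patch $n"
done

# Upstream moves on.
git checkout -q upstream
echo new > new.txt
git add new.txt
git commit -q -m "upstream: new work"

# The periodic rebase: replay the N commits on the new upstream tip.
git rebase -q upstream downstream

# The fork is still exactly the N downstream commits, nothing else.
git log --oneline upstream..downstream
```

The last command is the "what did I actually change" answer: each line is one downstream change, with no upstream history mixed in.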
It depends on whether the downstream only ever consumes upstream changes, or also often contributes changes to upstream. Rebasing in both directions can result in a messy history, too, especially if upstream isn't very careful in how it accepts pull requests.
I think both are done for long-term forks. Since both are done we can surmise that both are practical.
I suspect that rebase-forks are more common historically, because the only things I’ve heard of when it comes to managing changes on top of a version control system (which could be centralized) are “patch queues” or stacks. And those tend to be rebase systems with a different system.
But I have no practical experience with long-lived forks.
Nice overview. There is a lot of rebasing going on, and I wonder how that would work when more than one person works on the downstream fork. Force-pushing can easily break other users’ checkouts.
- at least not if someone else could be working on it
- and definitely not without talking to them about it
Personally I force push my branches quite often. It's super useful to rebase my branch onto main/master, or to do an interactive rebase to reorder my commits, merge them, or just change a commit message.
I've been on the other end of someone rebasing and force pushing. I've found that just removing the branch from my local git and then checking it out fresh from origin is the simplest way to go.
With a “branch” you’re often talking about the same repository (as in you can push to the same repository). With “forks” there might be N different “trees” where each tree is “mine” and the coordination could be patches via emails.
With such a loose relation you could easily end up in a situation where you are conceptually “force pushing”. Because you might use the same patches but not have the same graph topology.
What if you send three patches to the upstream and they incorporate it with git-am(1)? Then they have effectively made a different history with your patches: three commits which are not the same as your three but may be patch-identical. Then you need to keep track of that “force push” by the upstream so that you don’t maintain your now duplicate commits (as in distinct commits but the changes are duplicates).
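Here is a rough sketch of that situation and one way to spot the duplicates, using `git cherry`, which compares commits by patch content rather than by hash. The repository names, the `maintainer` identity and the `outbox` directory are made up for the demo.

```shell
set -eu
# Throwaway identity so the sketch runs on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)

# Hypothetical upstream repository.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "upstream: base"

# Your tree, with three commits you want to send upstream.
git clone -q "$work/upstream" "$work/fork"
cd "$work/fork"
git checkout -q -b feature
for n in 1 2 3; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "fork: change $n"
done
git format-patch -q -o "$work/outbox" origin/main..feature

# Upstream applies the patches with git-am: same changes, new commits.
cd "$work/upstream"
GIT_COMMITTER_NAME=maintainer git am -q "$work"/outbox/*.patch

# Back in the fork: hashes differ, but the patches are identical.
cd "$work/fork"
git fetch -q origin
git cherry -v origin/main feature        # "-" marks commits upstream already has
equivalent=$(git cherry origin/main feature | grep -c '^-')
echo "patch-identical commits: $equivalent"

# A plain rebase drops those duplicates by default.
git rebase -q origin/main feature
git log --oneline origin/main..feature   # nothing left to carry
```

So as long as upstream applied the patches unmodified, the bookkeeping can be left to git. If they edited the patches on the way in, the patch IDs no longer match and you're back to sorting it out by hand.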
You can also `git fetch` and `git reset --hard origin/force-pushed-branch` to get your local branch up to speed with the remote one, assuming you don't have any local changes.
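A small end-to-end sketch of that, with two made-up checkouts (`alice` rewrites and force pushes, `bob` catches up) against a throwaway bare repository:

```shell
set -eu
# Throwaway identity so the sketch runs on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/origin.git"

# Alice publishes the branch.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/force-pushed-branch
git remote add origin "$work/origin.git"
echo v1 > file.txt
git add file.txt
git commit -q -m "first version"
git push -q origin force-pushed-branch

# Bob checks it out.
git clone -q -b force-pushed-branch "$work/origin.git" "$work/bob"

# Alice rewrites history and force pushes.
cd "$work/alice"
echo v2 > file.txt
git commit -q -a --amend -m "first version, rewritten"
git push -q --force origin force-pushed-branch

# Bob has no local changes, so he simply moves his branch to the new tip.
cd "$work/bob"
git fetch -q origin
git reset -q --hard origin/force-pushed-branch
git log --oneline
```

Keep in mind that `reset --hard` throws away uncommitted work and any local commits on that branch, which is why the "no local changes" condition matters.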
I haven’t tried it myself, but since you know commits A and B have already been rebased and had their conflicts resolved, can’t you instead rebase -i on top of the partial-rebase branch and then drop A and B?
I think this way at least you still benefit from the rebase --edit-todo, which you do not when cherry-picking C^..F.
Or easier, do an interactive rebase and mark the last commit which is in the partial-rebase branch for editing. Then, do `git reset --hard partial-rebase` and continue the rebase.
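For what it's worth, a non-interactive way to get the same end result is `git rebase --onto`, which I've swapped in here so the sketch can run unattended (you lose the todo list, of course). The branch names `fork` and `partial-rebase` and the one-file-per-commit layout are assumptions for the demo, and the already-rebased A and B are simulated with a cherry-pick.

```shell
set -eu
# Throwaway identity so the sketch runs on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/upstream
echo base > base.txt
git add base.txt
git commit -q -m "upstream: base"

# Downstream commits A..F on the old upstream.
git checkout -q -b fork
for c in A B C D E F; do
  echo "$c" > "$c.txt"
  git add "$c.txt"
  git commit -q -m "$c"
done

# Upstream moves on.
git checkout -q upstream
echo new > new.txt
git add new.txt
git commit -q -m "upstream: new work"

# partial-rebase: A and B already replayed on the new upstream
# (stands in for the half-finished rebase with conflicts resolved).
git checkout -q -b partial-rebase upstream
git cherry-pick fork~5 fork~4 >/dev/null

# Replay only what comes after the old B (fork~4), i.e. C..F,
# on top of partial-rebase.
git rebase -q --onto partial-rebase fork~4 fork

git log --oneline upstream..fork
```

After this, `fork` is the new upstream plus A through F, with A and B being the copies from `partial-rebase`.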
If you're actually forked, you shouldn't be rebasing.