
Stop cherry-picking, start merging, Part 1: The merge conflict - kiyanwang
https://blogs.msdn.microsoft.com/oldnewthing/20180312-00/?p=98215
======
rjmunro
I would normally rebase the feature branch onto master if I need fixes from
master and no one else is working on it. Usually this is the case - one
developer works on each feature in a feature branch until it becomes a Pull /
merge request and gets commented on and eventually merged.

I will sometimes make a fix in a feature branch that should go onto master
before the feature is ready, and cherry pick it to a new bugfix branch. Once
it's merged, if you rebase the feature onto master git will normally sort it
all out, or you can use rebase -i to skip the cherry picked commit manually.
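
A rough sketch of why the rebase tends to "sort it all out": as far as I understand it, git recognises a commit whose change is already upstream by comparing patch content (see `git patch-id`) rather than commit hashes, and leaves it out of the replay. The toy model below is an assumption-laden simplification, not git's implementation; the `patch_id` and `rebase` helpers and the sample commits are hypothetical.

```python
import hashlib

def patch_id(diff: str) -> str:
    """Identify a change by its content only, ignoring where it sits in history."""
    return hashlib.sha1(diff.encode()).hexdigest()

def rebase(feature_commits, upstream_commits):
    """Replay feature commits onto upstream, skipping changes already there.

    Each commit is modeled as a (message, diff) tuple. This is a deliberate
    simplification: real commits also carry parents, authors and trees.
    """
    already_upstream = {patch_id(diff) for _, diff in upstream_commits}
    return [
        (msg, diff)
        for msg, diff in feature_commits
        if patch_id(diff) not in already_upstream
    ]

feature = [
    ("Add widget skeleton", "+class Widget: pass"),
    ("Fix off-by-one in parser", "-i <= n\n+i < n"),
    ("Wire widget into UI", "+ui.add(Widget())"),
]
# master received the parser fix via a cherry-pick (different commit, same diff).
master = [("Fix off-by-one in parser (cherry picked)", "-i <= n\n+i < n")]

replayed = rebase(feature, master)
print([msg for msg, _ in replayed])  # ['Add widget skeleton', 'Wire widget into UI']
```

If the cherry-picked commit was edited on its way to master, the diffs no longer match and this shortcut does not apply, which is where skipping it by hand with rebase -i comes in.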

~~~
jillesvangurp
Rebasing is great for branches you haven't pushed that nobody else is working
on. For that case, definitely rebase before you push.

If that is not the case and you have previously pushed your branch somewhere
remote already, rebasing would end up rewriting history and things get a bit
more tricky. You can still get away with this if it is just you.

Now here comes the crux: if you have a branch that you've been working on for
several days, you really should be pushing it regularly to remote so that you
don't lose your work. Also you should be creating lots of small commits rather
than one giant commit that touches everything. So if you then go and rebase,
you are potentially rewriting history for someone. If that is at all going to
be an issue, the better solution is to create a new branch and push that (or
for Github folk, create a new pull request).

Many bigger projects will be configured to flat out reject rebased histories
overwriting existing ones. Most of the public projects only allow you to
create pull requests that have to be accepted by someone with commit rights on
the main repo. And many of those people prefer clean rebased histories when
they are merging so that they can do a so-called fast forward merge where the
history stays nice and linear. If you have a branch with lots of back and
forward merges, that is likely not going to go down well. It makes reviewing a
lot harder and you may be better off creating a new branch and then rebasing
the old one on top. Typically you'll have a bit of work re-solving all the
conflicts you resolved earlier. This is a good argument against long lived
branches.

Cherry-picking is sometimes useful but it potentially creates issues for the
subsequent merge (fast forward or not). The nice thing with the pull request
model is that it is not the problem of whomever cherry-picked but only that of
whomever created the branch. If there is a conflict, it would be theirs to
sort out before they can get their PR merged. If you are cherry-picking small
fixes, this is usually not a big issue and a perfectly valid way of solving a
small issue without landing a lot of work in progress.

~~~
sornaensis
Rebasing is perfectly fine for shared branches; all you have to do is pull
--rebase and push --force-with-lease when changing branch history.

I do this at work for all the feature projects I work on with my team.

You mostly run into problems when people get confused about what is happening
when they see the branch history has changed.

Git histories are wonderful and immutable so messing around with commits is
quite easy if you know how to move branches around to undo changes if you mess
up.
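
One way to picture what --force-with-lease adds over a plain --force is as a compare-and-swap on the remote branch: the overwrite only goes through if the remote still points at the commit you last fetched. The sketch below is a toy model under that assumption; the `Remote` class and the commit names are made up for illustration and are not git's API.

```python
class Remote:
    """Toy remote holding a single branch ref (hypothetical model)."""

    def __init__(self, tip: str):
        self.tip = tip

    def push_force(self, new_tip: str) -> bool:
        # Plain force: overwrite whatever is there, no questions asked.
        self.tip = new_tip
        return True

    def push_force_with_lease(self, expected_tip: str, new_tip: str) -> bool:
        # Overwrite only if the remote still points where we last saw it.
        if self.tip != expected_tip:
            return False  # someone else pushed in the meantime, so refuse
        self.tip = new_tip
        return True

remote = Remote(tip="a1")
my_last_fetch = remote.tip   # what my remote-tracking ref remembers
remote.tip = "b2"            # a teammate pushes new work in the meantime

first_try = remote.push_force_with_lease(my_last_fetch, "c3")
print(first_try, remote.tip)   # False b2

my_last_fetch = remote.tip   # stands in for fetching and rebasing on top of b2
second_try = remote.push_force_with_lease(my_last_fetch, "c3-rebased")
print(second_try, remote.tip)  # True c3-rebased
```

The first push is refused instead of silently discarding the teammate's commit, which is the behaviour that makes rewriting a shared branch tolerable.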

~~~
jillesvangurp
Git histories are only immutable if you don't rewrite them with --force.
That's why it's called --force. It means you messed up and need to rewrite
history. Nice to have the option but probably not great to make it a habit to
mess up your history.

It's a feature that gets turned off server side in bigger projects and on OSS
projects you don't get to push your changes at all and instead you need to
work to get people to pull your changes. I'd argue that's a great way to
isolate yourself from people misbehaving on their own branches since it is
entirely their problem to get their branch into a shape where it is
straightforward to pull (as in fast forward merge). Rebasing can be a valid tool to
get there. Force push is generally not a thing in such projects except maybe on
your own private fork.

------
shagie
Stop merging if you need to cherry-pick
[https://blogs.msdn.microsoft.com/oldnewthing/20180709-00/?p=...](https://blogs.msdn.microsoft.com/oldnewthing/20180709-00/?p=99195)

~~~
zamalek
I agree.

If you need to move a fix into multiple branches, isolate that fix in its own
branch to begin with. The entire branch is the cherry. Each target branch
deserves its own merge because chances are, seeing as the thing is so
critical, the merge needs adequate attention each time to get done correctly
(causing more work and friction is sometimes healthy).

If you're reaching the end of your sprint and your branch isn't ready to
merge, chances are the entire branch could use another sprint in the oven.
You've completely run out of time and you want to isolate a few changes,
that's great if you want to move fast and _break things._

In TFVC (and SVN, CVS, Perforce, etc.), branches are expensive. It has taken
me months to "unteach" people at work antipatterns that arise due to that.
Management still doesn't get it - there is this holdout perspective that
branches have to be micromanaged. This was the only sensible way to manage the
mess caused by TFVC+co.

If you're finding unusual degrees of friction with Git (Mercurial, Bitkeeper,
etc.), chances are that you're artificially introducing issues caused by your
legacy version control and trying to solve them. Start forgetting and
unlearning.

~~~
sourcesmith
SVN branches are not expensive. They use copy-on-write.

[http://svnbook.red-bean.com/en/1.8/svn.branchmerge.using.htm...](http://svnbook.red-bean.com/en/1.8/svn.branchmerge.using.html#svn.branchmerge.using.create)

~~~
HelloNurse
Good luck using hard links on Windows without accidents.

Sarcasm aside, SVN branches use suffer-on-merge: creating a branch means
immediate technical debt of the most useless kind, which is more expensive
than any inefficient file copying.

From the same manual:

"To perform a sync merge, first make sure your working copy of the branch is
“clean”—that it has no local modifications reported by svn status."

"One special kind of flexibility is the ability to have a working copy
containing files and directories with a mix of different working revision
numbers. Subversion working copies do not always correspond to any single
revision in the repository; they may contain files from several different
revisions."

------
drewg123
This gets more complex when you have, for example, a volatile open source
codebase as an upstream.

At work, we regularly merge upstream FreeBSD into our codebase, and run a
suite of tests to make sure that we have not introduced a regression in
performance or functionality. These tests take days to run, and roughly
another day to analyze.

What causes a cherry pick is if we realize late in our release process that
we're suffering from a bug that was fixed upstream (or that we can fix
upstream). Since we're late in the release process, we don't want to take a
full sync to the tip of the upstream master and consume potentially unstable
changes, as that will mean re-running all our tests and resetting the clock on
the release. So we cherry pick the fix (sometimes after committing it upstream
ourselves).

~~~
danieldk
Out of curiosity, what are you working on? A derivative of FreeBSD?

~~~
drewg123
The Netflix CDN. We run a slightly modified version of FreeBSD.

~~~
peterwwillis
Is this just the kernel, or the OS as a whole? This is probably a redundant
comment, but maintaining a fork of a vendored product kind of defeats the
purpose of using the vendored product. If at all possible I would recommend
either to use the vanilla release, or fork it once and stop integrating. If
you were shipping your custom FreeBSD to customers, and they were relying on
it being FreeBSD plus your changes, I can see integration providing
significant value. But since it's for a CDN, it's probably only being used by
your team.

------
fouc
I only cherry-pick when it's the only commit of a feature branch that's going
to be removed anyways. If development continued in that branch, I would merge
master back into the feature branch again, or rebase the feature branch on
master.

Not sure why, but I always knew from the beginning that cherry-picking is
something not to be done from a live in-development feature branch without
taking steps afterwards.

------
jdlyga
How else do you get bugfixes onto a release branch when the main develop
branch has already moved on? Should you rebase your feature branch on the
release branch instead, then merge it?

~~~
joevandyk
Branch off the release branch for fixes to it. Then merge back into release
and develop branch.

~~~
jdlyga
But what happens if it's a fix that was made to master, and then later the boss
decides that they need that change as a bugfix to the current release? How do
you accomplish that without cherry-picking? Do you rebase the feature branch
it was a part of and merge it?

------
sourcesmith
Looking through the comments, seems like stacked diffs would serve better in a
lot of these cases...

~~~
alangpierce
Some background on stacked diffs in case people aren't aware:

[https://medium.com/@kurtisnusbaum/stacked-diffs-keeping-phab...](https://medium.com/@kurtisnusbaum/stacked-diffs-keeping-phabricator-diffs-small-d9964f4dcfa6)

I've been using stacked diffs for years now after trying various other git
workflows, and I love the freedom it gives me. I almost always structure my
work as a single branch (not a separate branch per feature), one commit per
code review, and I'm free to edit, reorder, split, and combine changes as I
please (mostly with interactive rebase). When a commit gets accepted, I just
move it to the bottom of my stack (if it's not there already) and push it. It
makes it easy to split changes into a "pipeline" of logical steps that can
each be independently reviewed and landed, with multiple steps in code review
at once.

I hate it when people treat dependent code reviews as an "advanced feature".
They're only advanced if you make your workflow so advanced that they're hard
to keep track of (many commits per review, many branches, merge commits). If
you keep it simple (one commit per review, always rebase), that simplicity
gives you much more power in other areas that IMO are more important.

I just wish it wasn't so much of a pain to get working in GitHub.

------
doombolt
Merges are the stupidest thing there is in the land of source control.
Contrary to the author, I struggle to find even a single use that would merit
a merge.

My main point is that merges are not easily computable. comm3 = merge(comm1,
comm2) is not a kind of function you can run and see for yourself. Instead it
is a kind of hand-wavy magic in which we declare that those two apples plus a
lemon equals three mangoes. You throw the beauty of the hash tree out of the
window like an overdue Christmas tree.

~~~
wodenokoto
What's the alternative? Only work on master and only on one document at a
time?

~~~
mkesper
Rebase everything so you get a clean timeline without meaningless "merge of
bla" commit messages. Can be made mandatory by turning your repo
fast-forward-only.
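
For anyone unsure what a fast-forward-only rule actually checks, here is a minimal sketch, assuming the history is just a child-to-parents map: a fast-forward is possible when the target branch's tip is already an ancestor of the incoming branch's tip. The commit names and helper functions are hypothetical.

```python
def ancestors(commit, parents):
    """Return the set containing `commit` and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def can_fast_forward(target_tip, source_tip, parents):
    """True when the target tip is already part of the source's history."""
    return target_tip in ancestors(source_tip, parents)

# Child -> parents map. master sits at B.
# "T1" is a topic commit forked from A; "R1" is the same work rebased onto B.
parents = {"A": [], "B": ["A"], "T1": ["A"], "R1": ["B"]}

print(can_fast_forward("B", "T1", parents))  # False: master has moved past the fork point
print(can_fast_forward("B", "R1", parents))  # True: master can simply move forward to R1
```

Under such a rule the un-rebased topic branch is rejected, and the rebased one lands without a merge commit.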

~~~
wodenokoto
I'm really bad at rebase, but doesn't rebasing a branch A onto branch B,
remove all the history from A?

~~~
aethr
Rebase has a lot of options, including squashing many commits into one or more
"good" commits. This is usually a good thing if you have commits that are
"work in progress" commits. You can use squash/skip to avoid having commits in
your history that break the project or feature, which is great if you use git
bisect.

The major downside of rebase is that even if you don't squash/skip it
_changes_ the hash of every commit. This is very problematic when others have
ever checked out your branch locally, or made commits that haven't been
pushed. It takes greater communication between the team in my experience.
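
To see why every hash changes even when the diffs do not, here is a small sketch that assumes a commit's identity is a hash over its parent's hash plus its own content. Real git hashes more than that (tree, author, dates, message), so treat `commit_hash` and `build_chain` as hypothetical stand-ins.

```python
import hashlib

def commit_hash(parent_hash: str, diff: str) -> str:
    """Hash a commit from its parent's hash and its own content (simplified)."""
    return hashlib.sha1(f"{parent_hash}\n{diff}".encode()).hexdigest()

def build_chain(base_hash, diffs):
    """Stack the diffs on top of base_hash and return the resulting commit hashes."""
    hashes, parent = [], base_hash
    for diff in diffs:
        parent = commit_hash(parent, diff)
        hashes.append(parent)
    return hashes

diffs = ["add parser", "add tests", "fix typo"]
original = build_chain("old-base", diffs)  # branch as first written
rebased = build_chain("new-base", diffs)   # same diffs replayed on a new base

for old, new in zip(original, rebased):
    print(old[:8], "->", new[:8])
```

Because each hash feeds into the next, changing only the base is enough to change every commit after it, which is why anyone holding the old commits ends up on a different history.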

~~~
yebyen
> It takes greater communication between the team in my experience.

You can also, as a substitute for greater communication, establish a protocol
around branch naming and follow it.

What works for us is using the words "release" and "wip" in our branch names.
If a branch is a "release" branch, then it is safe to base other work on it.
If a branch is a "wip" branch, then it is not safe for merging upstream.

These two are not exclusive. For example, you might have a branch "dev" and a
branch "release", and maybe another branch "release-dev-wip" – these all have
different properties. "release-dev-wip" is a short-lived merge target for
features that might not be completed yet. It is not safe to merge this
upstream, unless you've checked with all of your colleagues who merged feature
branches to it, and they all certified that it is no longer "Work In
Progress."
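
As a sketch of how mechanical this convention can be, the function below reads the two markers straight off a branch name. It assumes names are split on hyphens and that "release" and "wip" are the only markers; the function and key names are hypothetical, not part of any tool.

```python
def branch_properties(name: str) -> dict:
    """Derive expectations from a branch name under the convention above."""
    parts = name.lower().split("-")
    is_release = "release" in parts
    is_wip = "wip" in parts
    return {
        "marked_safe_to_base_work_on": is_release,
        # A wip branch needs sign-off from everyone who merged into it first.
        "may_merge_upstream_without_asking": not is_wip,
    }

for branch in ("dev", "release", "release-dev-wip"):
    print(branch, branch_properties(branch))
```

A check like this could sit in a hook or CI job, though that only helps if everyone has agreed on what the words mean.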

The key I think to make this protocol work is to distinguish between "feature
branches" and "environment branches" – dev is an environment branch, and a
permanent one, so it should not be rebased. Feature branches merge back to
environment branches, and environment branches are deployable.

You can break this rule, say if someone hotfixes master, which is upstream
from dev... but it should probably be an exception to do this, and not a
regular occurrence, as many people may have already based their work on "dev"
and they will all need to rebase on the new (rebased) dev, in order to get a
clean merge later. This is where the communication is not always optional. It
might be a better choice, if the hotfix is unavoidable, but the project is
large and this type of communication is logistically impossible, to merge in
reverse (checkout dev, then merge master – back to dev). It might be ugly, but
it's considerate. Either way, there should be a clear protocol and no
ambiguity on the matter of whether you have a feature branch or an environment
branch when you hold it in your hand. Feature branches represent work,
environment branches probably ought to just combine the work, and maybe keep a
record of how it was deployed.

The branch "dev-wip" is a temporary environment branch – it might be deployed
to a dev environment for example, but you should not expect it to remain
permanently in the git history. At some point, perhaps it will be renamed to
describe the features it contains, and then rebased and merged back to dev. If
you merged your feature to it, you might expect that you will need to keep the
feature branch around, so you can rebase it on "dev" or "master" later, and
finally merge it back.

The whole branch might not get merged upstream at once. (You can also call it
"release-dev-wip" and then, the person who looks at it will know that it may
contain some completed features that for some reason were not ready to merge
upstream, but perhaps should not be discarded entirely. I personally like to
rebase wip branches on their upstream before discarding them, just to be sure
I'm not throwing away someone's work that they may have thought they merged.)

Protocol is just a different form of communication that is done up-front. If
you decide on a protocol and forget to explain it to your team before you
implement it, you will obviously not have solved any problems. It's also
important to be clear and confirm understanding, so that you can be sure
nobody is imputing meanings that you didn't intend. Some teams might choose to
only do prod deploys from the "release" branch, and require that anything in the
"master" branch be safe to merge to release and send off to production.
You could easily get yourself into trouble if you didn't understand when your
team expects to work this way. Some teams might prefer to organize their
releases on a "release" branch, and then use Continuous Delivery to trigger
prod deploys when the release is merged to master. Other teams might prefer to
use a tag for that.

Mostly I think we can all agree that you should not rewrite a commit once it
has been tagged, but again, this is not something that is strongly enforced by
git, so it may vary from team to team. If a release that was tagged broke
prod, it might actually make sense to wipe that tag from history and reroute
the master branch around it. I've never seen that, but I think you're right,
the most important thing is to communicate with your team so there is no
ambiguity around these kinds of expectations.

------
Sir_Cmpwn
You should only use merging to merge two divergent or unrelated histories.
Merging should be a tool you use infrequently. GitHub has trained you to do it
wrong.

~~~
Blackthorn
Github is respecting the original intention of git. For a long time it didn't
even _have_ cherry-pick functionality.

~~~
Sir_Cmpwn
Citation needed. So far as I can tell from reviewing git's source code,
cherry-pick was there since at least September 2005, after which the trail
runs cold because of a big refactor. Git's first release was in April 2005.

------
LolNoGenerics
Stop excessive branching if you can use feature flags.

~~~
pjc50
This is only a good idea if the things you're working on are isolatable
"features", and you have an explicit process for garbage-collecting old
feature flags. Otherwise the combinational explosion of possible flags kills
your testing over time.
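
To put a number on that explosion: with n independent boolean flags there are 2^n possible configurations. A quick sketch, with made-up flag names:

```python
from itertools import product

def flag_combinations(flags):
    """Enumerate every on/off assignment for the given flag names."""
    return [
        dict(zip(flags, values))
        for values in product([False, True], repeat=len(flags))
    ]

flags = ["new_checkout", "dark_mode", "beta_search"]  # hypothetical flag names
print(len(flag_combinations(flags)))  # 8

for n in (5, 10, 20):
    print(n, "flags ->", 2 ** n, "configurations")
```

Twenty stale flags already mean over a million combinations, so nobody tests them all; retiring old flags is what keeps the count manageable.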

------
spenrose
Wonderful general Git resource:
[https://github.com/k88hudson/git-flight-rules](https://github.com/k88hudson/git-flight-rules)

------
neogodless
Does anyone have any references / recommendations in the vein of "guide to
branch strategies" for certain development methodologies? For example: working
in sprints / AGILE, branches for testing/QA/user-acceptance environments, etc?

(While I would hope the answers to this question would be widely useful, I
mostly work on C# applications, with less focus on JavaScript-based
applications. Maybe that's relevant for deciding on strategy - hopefully not!)

~~~
WorldMaker
The most commonly cited references can be found by searching for
GitFlow (heavier) and GitHubFlow (lighter).

Personally, I recommend the GitHubFlow and CI/CD for "agile", and never
branching per-environment but there are almost as many opinions as there are
developers.

------
gnufx
This sort of discussion makes me even more grateful for patch-based systems
(Darcs, and possibly Pijul) than does having to use git for apparently simple
tasks required to make contributions to things.

