
Avoid Most Rebase Conflicts - kantord
https://medium.com/@porteneuve/fix-conflicts-only-once-with-git-rerere-7d116b2cec67
======
xg15
> _So to relieve some of that tension and ease up the final merge you’re
> heading towards, you decide to perform a control merge now and then: a merge
> of master into your own branch, so that without polluting master you can see
> what conflicts are lurking, and figure out whether they’re hard to fix._
>
> _It is indeed useful, and just so you won’t have to fix these later, you would
> be tempted to leave that control merge in the tree once you’re done with it,
> instead of rolling it back with, say, a git reset --hard ORIG_HEAD and keep
> your graph pristine._

If you use git's "linear history" style and will do a rebase at the end anyway,
why not do "control rebases" immediately, instead of control merges?

At least in my experience, rebasing a branch that has substantially diverged
can get painful quickly, whereas doing many small rebases to keep up with
master usually works without many problems.

~~~
freetime2
Wouldn’t that require a force push after each rebase, which could potentially
cause headaches for team members working in the same branch?

~~~
notus
The simple solution is to not have multiple team members working on the same
branch. Branches should be pretty ephemeral.

------
derriz
Once I read "this is ugly and pollutes your history graph" as a justification
for any git feature, I'm immediately turned off.

What really is wrong with having a graph that actually reflects the real
history of changes as they were made at the time?

Why not keep things simple and improve git-log so that anything you find ugly
or that looks like pollution can be hidden while viewing the history of a
repo? In this case, for example, would it not be far more ergonomic to add a
"--hide-control-merges" flag to git-log?

This rerere feature described looks like a lot of the foot-guns git provides.

I know I'm in a minority here - I've given up arguing with colleagues that it's
possible to have a branch&merge work-flow without constant re-basing. Allowing
the history graph to be manipulated and changed seems to have become part of
the most popular git work-flows even though it's a recipe for serious pain in
a distributed VCS.

(This isn't a criticism of the article btw - more of the git
design/philosophy.)

~~~
dahart
Edit here at the top to clarify: I’m talking about rebasing your own commits
_before_ pushing. It sounded like the parent comment was talking about blanket
use of rebase. I don’t advocate rebasing other people’s pushed commits, other
than a few very exceptional circumstances.

> What is really wrong with having a graph that actually reflects the real
> history of changes as they were made at the time?

You’re inflicting irrelevant noise and cognitive load on yourself and other
people. Noise in the history can also cause all kinds of trouble with merges
and with bisect and other git features. A clean history is easier to work
with.

Given that Linus advocates rebasing to clean the repo history, I’m curious
where the whole “we must preserve the _real_ history” idea comes from, along
with the dogmatic belief that messy history is some sort of “correct” git
design/philosophy.

There is no “real” history. Commit order is arbitrary, especially between
orthogonal changes by different people. Anything you modify later using rebase
can be (and is) done before rebase. There is no sacred set of events to preserve,
and it has real and practical ramifications to let the repo get messy when
you’re on a large team.

Why not embrace the idea of preserving the semantic history, the intent, as
cleanly and clearly as possible, rather than focus on preserving arbitrary
noise that doesn’t mean anything to you, let alone others?

> I’ve given up arguing with colleagues that it’s possible to have a branch &
> merge work-flow without constant re-basing.

It is possible, but it’s not desirable.

> Allowing the history graph to be manipulated and changed seems to have
> become part of the most popular git work-flows even though it’s a recipe for
> serious pain in a distributed VCS.

Would you elaborate on what pain you’re talking about? Rebasing is something
that happens before push to master, if you’re using it properly, and that’s
what the article here is talking about. If anyone’s rebasing the remote’s
branches, that is bad, but that is not common practice.

Aside from that, you’re talking about git design and failing to acknowledge
that rebase and manipulating the graph history _is_ the design of git.
Updating local history is a good thing, and it was designed that way
intentionally.

~~~
derriz
I completely agree that rebase and manipulating the graph history is part of
the design of git. But also you can use git effectively without manipulating
the history (in a non-incremental way). And you don't lose anything in terms
of branching and merging - in fact you gain in terms of safety and simplicity
in a distributed system.

The argument I guess is about what you consider the cost of inflicting
"irrelevant noise and cognitive load" on other users. I believe it's not
significant and is mostly a function of the deficiencies of the history
browsing tooling. On the other hand, I believe that the cost of making
rewriting history part of the workflow is significant in terms of the load it
places on users to learn features like rerere, for example.

If I've understood you correctly then I think neither of us can be right or
wrong - it depends what each of us considers a greater "cognitive load" or
cost which is going to be based on our personal experiences and preferences
rather than something empirical.

~~~
dahart
It’s still not clear what you’re talking about exactly, will you clarify?
Perhaps some concrete examples of the safety and simplicity gains you’re
referring to?

Are you saying, and arguing with your team, that individuals should _always_
branch and merge, even when they want to check in single commits, or small
numbers of commits?

I don’t consider what happens before push to be “history” at all. Do you? I do
consider what happens after push to be history, and for pushed commits, I
agree, people should not rebase them.

Branching was designed for the isolation and safety of multi-person teams
working on features that take non-trivial amounts of time. Arguing over what
individuals should do when not working in teams in a branch is possibly a bit
of bike-shedding. I don’t know if that’s what we’re doing here, but on the
other hand there are some clues: rebasing is something people do to their own
private branch before push, but not normally to other people’s commits after
they’ve pushed.

I do think it’s noisy and unnecessary to have two commits for every single-
commit change in master. A lot of teams specifically disallow that practice
and require direct checkins to master for single commit changes, just to
prevent the noise.

If you’re advocating never using rebase, and never rewriting your own history
before you push, then I think that is a dogmatic approach that misunderstands
the goals and tools in git and fundamentally mistakes git’s philosophy.

If you’re saying that people are rebasing your changes after you pushed them,
then I agree with you completely.

I’m not calling anyone wrong. But this isn’t a disagreement over which way has
cognitive load. You asked what was wrong with noisy history, and I answered.
Your example of having to learn rerere only applies to the team lead, the
person doing the merges, and it only comes up once in a while. Not everyone
needs to learn rerere, and nobody needs to use it all the time. My example,
noisy history, applies to everyone on the team at all times. On my teams, I
prefer that people learn git well, _and_ keep their histories clean.

~~~
derriz
Bit late to reply but regarding the advantages of not allowing rebase
operations in a distributed system: if you can only change the version tree by
adding nodes to leaves and by adding attributes to nodes (e.g. for merge
arrows, tags, etc.), then combining distributed changed versions of a version
tree is relatively trivial and users need never fear pushing and pulling.

Or more formally, there is a partial order on version trees (inclusion) and if
all changes to version tree are monotonic with respect to this partial order,
then it will always be possible to combine distributed versions without
conflict.
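
A minimal sketch of that monotonicity argument, in Python rather than in git
itself. It assumes a version tree can be modelled as a set of (commit, parent)
edges; the function name combine, the replica names alice and bob, and the
commit labels are all made up for illustration. "Conflict" here means a
disagreement about the shape of the tree, not a textual conflict inside a file.

```python
# Toy model: a version tree is a frozenset of (commit, parent) edges.
# This illustrates the partial-order argument; it is not how git stores history.

def combine(a, b):
    """Combine two replicas of an append-only version tree by set union."""
    return a | b

base = frozenset({("c1", None), ("c2", "c1")})

# Two replicas that only add nodes on top of the shared base.
alice = base | {("a1", "c2")}
bob = base | {("b1", "c2")}

merged = combine(alice, bob)

# Both inputs are contained in the union, and the order of combining is irrelevant.
assert alice <= merged and bob <= merged
assert combine(alice, bob) == combine(bob, alice)

# A rewrite is modelled as replacing a1 with a new commit a1' on top of b1.
# The old edge disappears, so the rewritten replica no longer contains the old one.
alice_rewritten = (alice - {("a1", "c2")}) | {("b1", "c2"), ("a1'", "b1")}
assert not (alice <= alice_rewritten)

# Anyone who still holds the old edge reintroduces it on the next combine.
resurrected = combine(alice_rewritten, alice)
print(("a1", "c2") in resurrected)
```

In this model the append-only replicas always combine cleanly, while the
rewritten replica ends up carrying both the old and the new version of the
same change once it meets a copy that still has the old edge.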

An analogy might be that I prefer storing raw facts in databases, where
practical, and performing aggregations/filtering/etc. as queries during
retrieval rather than storing filtered and aggregated data.

Typical rebase operations like squashing commits or removing merge history
destroy information. Rather than decide up front what information is
interesting, why not keep it and provide better tooling for filtering and
aggregating commit information?

And what I'm suggesting here isn't radical - other VCSs offer branching and
merging without requiring operations similar to rebase. Git itself supports
such a workflow and many tutorials introduce git with a simple workflow that
does not involve rebasing. I've used such systems in the past and because they
had decent version browsing tooling, there was very little "noisy history"
overhead. If you wanted a linear view of a branch like "master", you could
view the history that way; if you wanted to drill into the branches, sub-
branches and merges involved in a particular "master" version, you could do
that also.

~~~
dahart
So yes, you are talking about individual branching & merging in order to
commit a single change. And you are talking about never using the rebase
command. Why? What are the tangible benefits to never using rebase on your own
private commits? You didn’t give me any concrete examples. This seems
exceedingly academic and it also seems like you’re either not understanding
normal git workflow, or are trying to work in an ad for Pijul or something?

If you use git like I’ve proposed - which is how most people actually use it -
by calling commits “history” only after they’ve been shared with someone else,
e.g. pushed, then you get all the same guarantees that you are proposing, and
the workflow is one of adding only leaves.

Rebase is something you usually do locally _before_ adding nodes to the shared
tree, and really has no bearing on published history, nor does it affect
published history or change the order or monotonicity of shared events,
because you never rebase already published commits. It is not only possible to
use rebase in a non-destructive way, it is the most common workflow, and
you’re arguing against that part in favor of something that offers zero
advantages.

It’s not possible to never have a merge conflict. It is possible to do better
than git, but if two people change the same file in the same location, you
have a problem, no matter how many monotonic partial orderings you have.

> other VCSs offer branching and merging without requiring operations similar
> to rebase

Git does not require rebase for branching and merging, and you already know
this because you’re advocating that single commits in master from individuals
get branched and then merged instead of rebased locally. Lots of tutorials don’t
mention rebase because it’s irrelevant in the context, and because rebase
isn’t required.

Be specific: what systems exactly are you talking about using in the past that
had no rebasing and better tooling? There’s no discussion here without
concrete examples.

------
aaronbrethorst
I would settle for more clear definitions of “mine” and “theirs.” Bonus points
if the definition doesn’t change based on the action I’m performing.

------
werpers
What is the problem with the graph in the first figure?

Apart from looking a little bit more complicated than one without the merges,
will a history like that actually cause any problems?

~~~
barrkel
There aren't any problems. If anything, it's a more honest reflection of
history, whereas rebase will erase past current state and remove your ability
to wind backwards to the way the world actually was.

If you view a neat ahistorical history as a primary artifact of development,
then you need rebase.

------
rurban
I've been doing that successfully for years. But I added more:
[https://github.com/perl11/cperl/blob/master/pod/perlcperl.po...](https://github.com/perl11/cperl/blob/master/pod/perlcperl.pod)

The .git/cache is a submodule, so others can easily rebase also. And there are
a lot of helper scripts to rebase and pull --rebase all work branches
automatically. All current branches are constantly rebased to master, others
default to pull --rebase (aliased to lb), and with the shared .git/cache there
have been no conflicts for years.

I'm also doing this for several forks automatically, like adding patches and
CI smokes to popular repos like clisp, libffi, openssl, pcre2, coreutils and
many more. These are rebased hourly, conflicts appear maybe once a year.

------
jayd16
Just leave the damn control merges in the history. It's what happened after
all...

~~~
joelthelion
Or rebase onto the head of Master instead?

~~~
jayd16
The whole article is about the problems with that and the pain of working
around it.

~~~
bsaul
Actually that's something I don't get (but I'm not a git pro by any means).
The article doesn't seem to be about the pain of rebasing, but rather the pain
of regular merging polluting the history. Regularly rebasing the branch on
master doesn't pollute anything, keeps the branch in sync with master, and
makes for a happy final commit.

~~~
dahart
This is a good point. The author wanted to talk about rerere, and control
merges are a great motivator for that. If he rebased the branch, then he
wouldn’t be able to write about git rerere. :)

That said, rebasing a branch several times before merging back to master is
something you can easily do if you’re the only person in the branch. When
there are multiple people pushing to the branch, and when the branch changes
are large, rebasing becomes impractical and even dangerous. You don’t want to
rebase a branch while others are working in it, because it would require a
force push.

I’d say rebase the branch when you can, which is usually if it’s a private
branch and you’re the only person who made changes. Merge into the branch as
you go if the branch is large and has multiple people. Commit the merges if
you want people in the branch to do integration testing along the way. Use
control merges and don’t commit them if you want to keep the branch stable and
buffer the people in the branch from changes in master.

------
dahart
This was a nice read; git rerere is one of the git features that has been a
tad mysterious for me, I thought it was on by default.

I’m pretty sure I’ve used it, probably because I read somewhere, just like
this article says, that it should be turned on by default. What I didn’t know
and wondered while reading is how to clear a rerere, and it looks like you can
“git rerere [clear|forget]”.
[https://git-scm.com/docs/git-rerere](https://git-scm.com/docs/git-rerere)
I have resolved a conflict and then discovered later that I screwed it up, so
needed to do it again.
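
To make the record / replay / forget cycle concrete, here is a toy sketch in
Python. It is only a mental model, not git's actual storage or matching logic:
the class ResolutionCache, its methods, and the way a conflict is keyed (a hash
of the two conflicting sides) are all assumptions made for illustration.

```python
import hashlib


class ResolutionCache:
    """Toy stand-in for a recorded-resolution cache; not git's on-disk format."""

    def __init__(self):
        self._resolutions = {}

    @staticmethod
    def _key(ours, theirs):
        # Identify a conflict by the text of its two sides (an assumption).
        return hashlib.sha1(f"{ours}\0{theirs}".encode()).hexdigest()

    def record(self, ours, theirs, resolved):
        """Remember how this conflict was resolved by hand."""
        self._resolutions[self._key(ours, theirs)] = resolved

    def replay(self, ours, theirs):
        """Return the remembered resolution, or None if the conflict is new."""
        return self._resolutions.get(self._key(ours, theirs))

    def forget(self, ours, theirs):
        """Drop a remembered resolution so the conflict must be resolved again."""
        self._resolutions.pop(self._key(ours, theirs), None)


cache = ResolutionCache()

# First attempt: the conflict is resolved badly and the result is recorded.
cache.record("timeout = 30", "timeout = 60", "timeout = 3060")
first_replay = cache.replay("timeout = 30", "timeout = 60")

# The mistake is noticed later: forget it, resolve again, record the fix.
cache.forget("timeout = 30", "timeout = 60")
after_forget = cache.replay("timeout = 30", "timeout = 60")
cache.record("timeout = 30", "timeout = 60", "timeout = 60")
second_replay = cache.replay("timeout = 30", "timeout = 60")

print(first_replay, after_forget, second_replay)
```

The point of the sketch is that a bad resolution keeps being replayed until it
is explicitly forgotten, which is the situation described above.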

------
jrochkind1
This seems like a very useful feature, and I don't want any more new git
commands/workflows to have to remember.

------
twic
I already know how to use push --force to do that.

~~~
mlthoughts2018
--force-with-lease is usually safer

~~~
twic
Situations where you might need to rebase against incoming changes are
precisely the situations where --force-with-lease is a no-op.

~~~
mlthoughts2018
You are incorrect. This is only the case when the remote copy of the branch
you intend to overwrite with the force push has already been modified and has
fallen out of sync with your local copy of that branch prior to the rebase.
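
A rough model of the difference, sketched in Python rather than with real git
commands. The remote is just a dict of branch name to tip commit id, the
function names force_push and force_push_with_lease are hypothetical, and
fast-forward checks are ignored; the sketch only contrasts an unconditional
overwrite with one that first compares the remote tip against the tip you last
saw.

```python
# Toy model of a remote as a dict of branch name -> tip commit id.
# It ignores fast-forward checks and only contrasts two kinds of forced update.

def force_push(remote, branch, new_tip):
    """Overwrite the remote branch unconditionally."""
    remote[branch] = new_tip
    return True


def force_push_with_lease(remote, branch, new_tip, expected_tip):
    """Overwrite only if the remote branch still points at the tip we last saw."""
    if remote.get(branch) != expected_tip:
        return False  # someone else pushed in the meantime; refuse to clobber
    remote[branch] = new_tip
    return True


# Case 1: nobody touched the branch since we last saw it at "A".
quiet_remote = {"feature": "A"}
quiet_result = force_push_with_lease(
    quiet_remote, "feature", "A-rebased", expected_tip="A"
)

# Case 2: a teammate pushed "B" after we last looked, so our expectation is stale.
busy_remote = {"feature": "B"}
busy_result = force_push_with_lease(
    busy_remote, "feature", "A-rebased", expected_tip="A"
)

# Case 3: the same situation with a plain force push silently drops "B".
clobbered_remote = {"feature": "B"}
force_push(clobbered_remote, "feature", "A-rebased")

print(quiet_result, quiet_remote["feature"])
print(busy_result, busy_remote["feature"])
print(clobbered_remote["feature"])
```

In this model the lease only gets in the way when the remote branch has moved
since you last saw it; after an ordinary rebase of a branch nobody else has
pushed to, the lease check passes and the push goes through.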

------
shlokjoshi
Interesting article

