
Git: Using Advanced Rebase Features for a Clean Repository - mtyurt
https://mtyurt.net/2017/08/08/git-using-advanced-rebase-features-for-a-clean-repository/
======
ekidd
Editing git history makes sense in several cases:

1\. Projects like the Linux kernel, which frequently use 'git bisect' to
perform a binary search on the history of the project (to find when a bug was
introduced), or where the patch series tell a story to code reviewers.

2\. Open source projects, where some contributors have terrible git habits
that the maintainers don't want to merge.

Editing git history makes _less_ sense in other circumstances:

1\. When dealing with less sophisticated git users who don't understand git's
data model. About 90% of git nightmares begin when a novice git user tries to
rebase something.

2\. When it adds a whole layer of unnecessary process for zero payoff. If your
developers all work in short-lived branches with clean histories, then just go
ahead and merge normally without forcing everybody to jump through a bunch of
hoops to beautify a history that nobody ever looks at anyway. Git was designed
to handle branches.

I always twitch a bit when I see a 10-page blog post describing a "git
workflow", with all sorts of complicated branching rules and heavy use of
rebasing. That can make sense in certain specialized situations, but it
shouldn't be considered a "best practice" that everybody needs to emulate.

~~~
maxxxxx
Do people actually look at the history a lot? After a pull request has been
merged I rarely look at the history and I am really not interested in it. This
is for a small team with around 5 people. In larger teams is it more important
to see the history?

~~~
branja
I work as lead dev on a 5 year old web app I took over from a previous dev and
his team of subcontractors about a year ago. It's very helpful to see into the
past when there's no way to just ask the previous dev.

------
EnderMB
I think a lot of people don't bother with rebasing because they either don't
know how to do it, or they are scared of the idea of a version control system
not explicitly saying what they've done to accomplish the latest version of
their code.

Once you get even a small understanding of it, there are plenty of places you
can use it. For example, in the past month I've used it to:

1\. Rewrite the history on a coding test I was given for an interview. The
recruiter was oddly specific about how long to spend on it, and when it had to
be in by, even though they said no one would look at it for another week. I
worked on it throughout the extra week, and modified the history to make it
look like I had done it in the allotted time-slot.

2\. Rewrite history in feature branches where it doesn't accurately reflect
what I had to do to get the code working. I work in an agency environment and
(whether I like it or not) requirements change, sometimes mid-way through
delivering something. If I've written something that won't ever see the light
of day, or if I've done something that won't help the next person to read that
code I'll rebase where needed.

3\. Saving time for pull requests. I tend to commit early and often, and when
I'm in "the zone" on a fairly chunky bit of work that can mean quite a few
commits! When it comes to peer review, some people like to do it by commit,
rather than the finished output, so to help these guys out I squash commits
where possible, since we use our pull requests to illustrate the problem we're
solving anyway. I think an untarnished history is important, but sometimes a
PR audit trail is more useful than what you'd get from pure commits.

~~~
nothrabannosir
Just a note on your nr1: I don't think rebase back-dates the CommitDate, does
it? There are two dates on any commit: AuthorDate and CommitDate. Try git log
--format=fuller. E.g.:

    
    
        Author:     ...
        AuthorDate: Wed Aug 2 11:02:03 2017 +0100
        Commit:     ...
        CommitDate: Wed Aug 9 12:08:52 2017 +0100
    

This is a commit I originally created last week, but rebased onto my current
branch just now. To change the CommitDate, you have to go filter-branch,
afaik. But maybe I'm ignorant of a germane rebase feature?

~~~
dmacedo
You're correct, it doesn't change the author date.

In order to do no.1; perform your rebase, and then change the dates with
filter-branch. I have a couple of helper functions for resetting to either
commit or author dates [1], use at your own risk! :)

[1]
[https://gist.github.com/dm/aad8e34a5ee6b542a0bc788375b548ed](https://gist.github.com/dm/aad8e34a5ee6b542a0bc788375b548ed)

~~~
EnderMB
Yep, it took me a while to figure out why my commit dates didn't match the
dates within GitHub. Luckily, it's fairly straightforward to figure out once
you know what the issue is!

------
radiospiel
I personally almost never rebase and/or squash, since I think that the
information that gets lost (like: when was it started, what were mistakes
along the lines) might be useful for future understanding of how the project
work evolved over time, and what adjustments should/could be made to the
development process.

But I understand that a screen as shown in the article is not immensely
useful. But if it comes down to how to present git history information to
users, maybe an additional approach would be to come up with a better
presentation layer (that, for example, could hide merge commits, "squash"
branches, etc.).

~~~
splike
I think there are two perfectly acceptable schools of thought.

One says that the history should be preserved exactly, because it's important
we keep a record of exactly what happened.

The second says that it's ok to rewrite history a little if that makes it more
understandable.

I think you would put yourself in the first, and that's ok. I would put myself
in the second because at the end of the day I value understanding over
precision in git histories. I'm of the opinion that no one cares about the
commit I made to correct a missing semicolon.

~~~
hdhzy
The usual rule of thumb is not to modify (e.g. rebase) published commits. So
it's perfectly fine to adjust local commits before they go into centralized
repo - from the point of view of an external observer the history is never
destructively modified. This rule can be extended to topic branches or user
owned branches with the caveat that others should not base their work upon the
topic branch.

Also remember git push --force-with-lease!

~~~
james-skemp
And `git commit --amend`

I missed a semi-colon or a file? I probably noticed right after I made the
commit (and before I pushed it to the remote branch).

------
wry_discontent
I vastly prefer rebasing and squashing. Just yesterday I had to review some
history, and reading the blames was incredibly frustrating.

> Clean up style

What was this code committed with? What other changes came along with it?

Git, for me, has 2 functions. When I'm developing, it's to save my work in an
incremental way that I can reverse. It's to make notes about what I'm doing.
After I'm done, it serves as a way to lay blame on me for what I changed and
to track those changes in Github. "Who wrote this? What was it committed
with?"

I want my PR to be encapsulated in 1 commit after it's been merged. At that
point, there is no reason to read the internals of what I changed.

------
mugsie
This is the type of workflow you get when using something like gerrit.

Because gerrit forces every change to be a single commit, you have to rebase,
which ensures that the history is nice and linear.

I know there are some people who prefer the "git-flow" style history graph (and
I was one of those people) - but there are a lot of advantages to a clean
history.

And, yes gerrit can be painful at first, but is a _lot_ better with something
like git-review[0], or repo[1]

0 - [https://docs.openstack.org/infra/git-review/](https://docs.openstack.org/infra/git-review/)

1 - [https://source.android.com/source/using-repo](https://source.android.com/source/using-repo)

------
coldcode
Git is a product where there are 5 opinions on how best to use it for every 4
people.

~~~
tome
AKA "Two git users, three opinions".

------
git-pull
No one's perfect with this stuff.

If you don't believe me, clone git itself ($ git clone
[https://github.com/git/git](https://github.com/git/git)) and open up the repo
in tig(1)

[https://github.com/jonas/tig](https://github.com/jonas/tig)

To be honest, I only in the past 2 years even bothered to view ($ git log
--graph). Regardless of --graph getting wide now and then, I always visualize
my git history as a straight line.

Also, sometimes having a non-linear history is inevitable. Especially in large
open source projects where you're pulling in patches to a branch, and patches
on top of that. You're not always going to be merging a branch with a single
author straight onto master.

Despite posts like this
([http://www.bitsnbites.eu/a-tidy-linear-git-history/](http://www.bitsnbites.eu/a-tidy-linear-git-history/)) encouraging
good git hygiene, I've had multiple open source projects merge in code via
GitHub and _never_ had negative consequence for it :P

Maybe there are corner cases where git bisect wouldn't work? Though I never
used git bisect even once. Most I do is scroll through tig and view diffs.
Also used to play with a cool git plugin in vim
([https://github.com/tpope/vim-fugitive](https://github.com/tpope/vim-fugitive)).

Also, GitHub has (since that linear git history post) introduced Rebase +
Merge
[https://github.com/blog/2243-rebase-and-merge-pull-requests](https://github.com/blog/2243-rebase-and-merge-pull-requests).
I think that'll get you what you want.

I do keep branches ("pull requests" if you're using GH lingo) up to date with
($ git pull --rebase). That does mean a force push ($ git push --force), but
it's ok if it's your personal branch. I also use interactive mode ($ git
rebase -i <sha>) to edit/blend multiple commits.

Also, when I do merge, if I go through CLI, I'll preserve the history of the
branch by not doing fast forwards ($ git merge <branch> --no-ff).
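
As a rough illustration of the `--no-ff` merge mentioned above, here is a sketch in a throwaway repository (branch names, file names, and identity are made up for the example). Even though `topic` could be fast-forwarded, `--no-ff` records a merge commit with two parents, so the branch stays visible in `git log --graph`.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits on a topic branch; master does not move in the meantime.
git checkout -q -b topic
echo one > topic.txt
git add topic.txt
git commit -q -m "topic: step 1"
echo two >> topic.txt
git commit -q -am "topic: step 2"

# A plain merge would fast-forward here; --no-ff forces a merge commit.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'topic'" topic

# %P lists the parent hashes of the commit.
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "parents of merge commit: $parent_count"
git log --graph --oneline
```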

~~~
CJefferson
Just one small thing, I have found a few times git rebase breaking git bisect.

Git rebase can create large sequences of commits which no-one has ever checked
out, and often don't build -- after a git rebase most people check their new
head builds and passes tests, but I've never seen anyone bother check their
new history works.

~~~
arianvanp
git rebase --exec <test command> runs your test suite on each commit during
the rebase. It is super useful
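
A small sketch of `git rebase --exec`, under the assumption that a trivial `test -s notes.txt` check stands in for a real test suite (the check, the file names, and the log file are hypothetical). The command runs after each replayed commit, and the rebase stops at the first commit where it fails.

```shell
set -eu
repo=$(mktemp -d)
log=$(mktemp)   # kept outside the work tree so the tree stays clean
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Three small commits on a topic branch.
git checkout -q -b topic
for n in 1 2 3; do
    echo "step $n" >> notes.txt
    git add notes.txt
    git commit -q -m "topic: step $n"
done

# master moves on, so the rebase has to replay all three topic commits.
git checkout -q master
echo more > more.txt
git add more.txt
git commit -q -m "more on master"

git checkout -q topic
# Hypothetical stand-in for a test suite: notes.txt must exist and be non-empty.
# Each successful run appends one line to the log.
git rebase -q --exec "test -s notes.txt && echo checked >> $log" master

checked=$(wc -l < "$log" | tr -d ' ')
echo "commits checked: $checked"
```

In a real project the `--exec` argument would be the build or test command, for example `make test`.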

------
vasilakisfil
I have been using that in every branch: I commit regularly and end up with 10+
commits. Then I rebase + squash them and at the same time write a summary
commit. Eventually I merge. This has multiple good effects. First, you get a
clean, feature-based history. Secondly, although your commit message is the
one you wrote when you rebased, you can keep the old commit messages and you
get a better summary of what happened during the development process of that
branch.

~~~
AstralStorm
Why have a history then at all? The idea is not to squash history but to make
it into reasonable chunks. Remove chaff so to speak while keeping the general
history.
A single feature is very rarely a reasonable chunk. (For example, see Linux
kernel patch series per feature.)

Otherwise you may lose the "why" unless code is extremely well commented and
that never happens.

~~~
vasilakisfil
ok depends on the size of a feature I guess but usually I sit in a branch for
2-3 days before I finish the feature.

Also what's the point of preserving the history of a branch that is used for
the development of a requested feature ? For each commit before the final, the
feature is incomplete, possibly not working at all.

~~~
AstralStorm
Not everyone is a believer in shippable increments, but it is a good practice
nonetheless. (not necessarily fully working)

Splitting per feature is indeed ok if it is small including effects on
dependencies. But in my practice there are rarely small enough features -
these map to something akin to user stories which have to be further broken
down.

------
Walkman
For God's sake, DON'T TEACH IF YOU DON'T UNDERSTAND YOURSELF what's happening.

"Commit C's revision number has changed." - NO IT DID NOT. The original
commit (f4ba6b) is still there, it still points to B, but a totally new commit
has been created with a different content. To avoid confusion, it's better to
name it C'. C is now a dangling commit (has no branch or tag pointing to it,
and will eventually be garbage collected).
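
The point can be checked in a throwaway repository. This is only a sketch with made-up file and branch names: it rebases a commit C, then shows that the old object still exists and still points to B, while the new commit C' has a different hash and a different parent. Until it expires, the reflog also still references the old commit.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "example@example.invalid"

echo A > a.txt; git add a.txt; git commit -q -m "A"
echo B > b.txt; git add b.txt; git commit -q -m "B"
b=$(git rev-parse HEAD)

# C lives on a topic branch that starts at B.
git checkout -q -b topic
echo C > c.txt; git add c.txt; git commit -q -m "C"
old_c=$(git rev-parse HEAD)

# master gains D, then topic is rebased onto it.
git checkout -q master
echo D > d.txt; git add d.txt; git commit -q -m "D"
d=$(git rev-parse HEAD)

git checkout -q topic
git rebase -q master
new_c=$(git rev-parse HEAD)

echo "C  = $old_c (parent $(git rev-parse "$old_c^"))"
echo "C' = $new_c (parent $(git rev-parse "$new_c^"))"

# The original commit object is still in the object database,
# but no branch contains it any more.
old_type=$(git cat-file -t "$old_c")
branches_with_old=$(git branch --contains "$old_c")
echo "old C is still a '$old_type' object"
```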

------
cdevs
I tell everyone to save/commit as much as possible, I don't want to hear I
lost 2 weeks of work. Afterwards I just squash the feature into master named
after whatever the short feature is "feature modal", "new admin view"
whatever... git history without squash or rebase is a nightmare when everyone
has commits like "typo 1" "typo2".

~~~
nothrabannosir
Try selling git commit --squash or --fixup. Your colleagues can use this
without actually rebasing, and you can help them out with the final step by
doing the rebase. Especially when you have:

    
    
        git config --global rebase.autoSquash true
    

These commits are generally much easier to deal with than typo123 ones. But
ymmv.
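
A sketch of that flow in a throwaway repository, with made-up commit subjects and file names. `GIT_SEQUENCE_EDITOR=true` is used here only so the interactive rebase accepts the generated todo list without opening an editor; `--autosquash` is passed explicitly instead of relying on the `rebase.autoSquash` setting.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo "modal" > modal.txt
git add modal.txt
git commit -q -m "Add feature modal"
target=$(git rev-parse HEAD)

echo "admin" > admin.txt
git add admin.txt
git commit -q -m "Add admin view"

# A later correction to the first commit, recorded as "fixup! Add feature modal".
echo "modal, fixed" > modal.txt
git commit -q -a --fixup="$target"
before=$(git rev-list --count master..HEAD)

# --autosquash moves the fixup commit next to its target and folds it in.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master
after=$(git rev-list --count master..HEAD)

echo "commits before: $before, after: $after"
git log --format=%s master..HEAD
```

With `rebase.autoSquash` set to true, a plain `git rebase -i master` arranges the todo list the same way.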

------
whipoodle
Something that somehow never ever makes it into these tutorials but is
important: after you do the rebase locally, you must force push, don't pull
then push. (After you rebase, the git client will suggest you pull.) This
simple thing caused me a lot of unnecessary confusion about rebasing for a
long time.
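
To make this concrete, here is a sketch with a local bare repository standing in for the remote (paths, branch names, and identity are assumptions for the example). After the rebase a plain push is rejected as non-fast-forward. The sketch then uses `--force-with-lease`, mentioned elsewhere in the thread, rather than a bare `--force`.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"   # stand-in for the real remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "example@example.invalid"
git remote add origin "$work/remote.git"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git push -q origin master

# Publish a topic branch.
git checkout -q -b topic
echo one > topic.txt
git add topic.txt
git commit -q -m "topic: first try"
git push -q origin topic

# master moves on and topic is rebased, which rewrites the pushed commit.
git checkout -q master
echo more > more.txt
git add more.txt
git commit -q -m "more on master"
git checkout -q topic
git rebase -q master

# A plain push is refused because the remote branch is not an ancestor.
if git push -q origin topic 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Force push, but only if the remote branch is still where we last saw it.
git push -q --force-with-lease origin topic

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$work/remote.git" rev-parse refs/heads/topic)
echo "remote now at: $remote_tip"
```

Pulling at the rejected step would instead merge the old commits back in, which is the confusion described above.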

~~~
mtyurt
The video in the blogpost is all about it :)

------
neebz
Does this work even if we keep pushing commits on a remote repository?

We extensively use Github Pull Requests for code reviews and most of the time
we squash our PRs into a single commit but I would love to have a way to merge
into multiple smaller squashed commits (as OP did) instead of one big.

~~~
branja
Try interactive rebase. [https://robots.thoughtbot.com/git-interactive-rebase-
squash-...](https://robots.thoughtbot.com/git-interactive-rebase-squash-amend-
rewriting-history)

------
talkingtab
git rebase -i on a master with no changes.

As a sole developer, I am pulled in two different directions when using git
feature-branch style. I want to make sure I never lose my work so I check in
frequently- any time I start a change that might ultimately fail, I commit my
current code so I can recover it. But I also want a clean, concise and useful
history.

After reading this article I tried 'git rebase -i master' on my feature branch
even though the master had no changes since starting the feature branch. This
seems to work and allows me to clean up my feature branch before I merge it to
master.

Is there a better way to do this or are there problems with this?

~~~
AstralStorm
The only potential problem is that you should instead run git fetch -u and
rebase onto origin/master instead. (By default your rebase will use the remote
anyway, but won't fetch for you.)

------
therealmarv
Keeping a clean commit log or even strong commit rules is something which I
only see as a barrier for improving existing code... just thinking of code
improvements by PRs which got rejected by bigger projects because the commit
history was not "clean enough". History is not always good or pretty in real
life... it should be the same on git commits.

~~~
joshschreuder
To some extent I agree, enforcing it at a project level seems like
bikeshedding and not necessarily that useful. On the other hand, like walking
people through your house, it's nice to tidy up first to give people a good
experience, and the same applies to walking through the history of your change
too IMO

~~~
therealmarv
Even software with good and beautiful git histories can be ugly as hell (seen
it all) and a good and beautiful house can have some really bad ugly history
;)

------
jwilk
Is this a fixed-width comic font? :-/

~~~
lomnakkus
Looks like Fantasque Sans Mono. It takes a bit of getting used to, but I've
found it amazing once you do get used to it. YMMV.

------
kesor
What is the tool/command used to show git graphs in the screenshots of this
article?

~~~
mtyurt
You can find it here:
[https://github.com/mtyurt/dotfiles/blob/b16c4811675dbf231453...](https://github.com/mtyurt/dotfiles/blob/b16c4811675dbf2314538ae7b9ac385fe4613fbf/bash/.bashrc#L10)

------
hakikosan
nice work Tarik!

