Please use git pull --rebase (coderwall.com)
178 points by bitsweet 1632 days ago | 114 comments

While the idea of rebasing history while pulling in new changes may appeal to some users or uses, I disagree with the absolute "one true way" sounding note of OP's article.

Git supports first class merging, and it is git's default behavior. It enables us to port over entire strings of commits related to a feature to other branches, or enables us to revert (un-merge) these strings of commits. Heck, --rebase doesn't even allow one to pull when there are uncommitted changes in the working directory.

On the other hand, --rebase is neater when it comes to the modern mantras of "commit frequently", "no feature branches", "deploy master to production". So in short this sounds more like an editor or a tabs vs. spaces flame war.

As someone who works with a fairly large number of committers with many branches, I prefer merges to rebases.

The reason is simple: you always know _why_ something happened, which is quite important when looking for clues in the history. When trying to figure out what old code was intended to do, it's much better to see it in its original context, not in a context rewritten to fit the merge. As a benefit it also preserves change dates.

The thing is, you don't know _why_ something happened any more with a merge strategy than a rebase strategy.

In fact, you know less.

It's as if someone was doing work against an old, old, old version of your code. Say they did a bunch of work, months out of date, and then wanted to merge it in. Almost assuredly there'd be tons of conflicts, the code wouldn't match up at all with the current state of the world, and what they did in the commits would be entirely useless knowledge without also knowing what resolutions to the conflicts happened as part of the merge commit.

So anyone who looks back at those commits that were made against the old codebase has to know that what they're looking at is horseshit which had to be patched up before it could be merged in.

If they had used a rebase strategy they'd have updated their commits to the current state of the codebase such that they could be cleanly applied. Now you still know what feature(s) were added, but in a way that actually took into account the current state of origin when the commits were rebased.

> they'd have updated their commits to the current state of the codebase such that they could be cleanly applied.


> If they had used a rebase strategy


Take your out-of-date branch, merge master into it, and clean it up _in the branch_. It's not 2011 anymore, there's lots of tooling that makes viewing changelogs on a branch compare easy (Github's pull requests being probably the most common).

Rebasing removes your changes from the _context_ in which they were originally made, which makes following along through the project's history harder than it needs to be. It also creates revisions of the codebase that never actually existed, which means it's difficult if not impossible to use git-bisect to track down when a really gnarly bug was introduced (I've actually needed to resort to that 3 or 4 times in the past few years, to help track down the cause of the bug).

Except this idea of the context in which you wrote the code doesn't actually matter to anyone.

More likely you're going to piss off everyone that has to deal with your merge strategy because your commits that couldn't be merged into master are broken, having been fixed as part of the merge commit(s).

Such commits:

* Aren't able to be bisected (they're broken, remember?)

* Aren't able to be blamed (the collective merge commits and the actual commits have to be considered at the same time)

* Aren't able to be rebased later when you decide, actually, this merging shit is ridiculous (the -p option to rebase is dangerous and doesn't always work the way you think it will)

* Aren't able to be traversed cleanly in a log because they're probably split up with merges of master into your branch all over, because hey, you wanted to stay up to date.

* Basically aren't understandable by anyone, including you.

Being that it's not 2011 anymore (sorry, what?), you should learn how to use Git properly and understand what effects your workflow has on you, on those who read your code (including you!), and on those who choose to use Git's tools.

I once had a situation on a client project while at Pivotal where a guy had been working on a branch for 2 months, continuously merging master into his branch. It was a shitshow. He had 45 commits of his own plus 10-15 merge commits from origin/master. Doing a git log of that nonsense was basically impossible to read, impossible to code review (the client's process at the time), and when merged back into master would have been the most crazy ass log --graph I can imagine (because everyone else was doing similar nonsense). We're talking a 10-pair dev team all merging origin/master into their local master. The log was unreadable, unusable, not to be understood by anyone.

Luckily I was able to remove all the merge commits via the 2-argument form of rebase --onto, (http://krishicks.com/blog/2012/05/28/git-rebase-onto/) which then gave him the freedom to squash, reorder, fixup, and split his commits as he felt necessary to maintain his and everyone else's sanity. In the end he filtered it down to something like 20 very clean, green commits that could be reviewed by another person and understood, and rebased on top of origin/master.

Git's commit log is a hash tree (like the bitcoin block chain). When you commit you get a SHA for that commit. Part of the SHA contains the SHA of the commit before it. When you rebase it changes the SHAs of all the commits you've made. That's a lot of why you end up with conflicts when you don't know what you're doing. Never rebase if you've pushed the commits to a central repo because git will detect conflicts of the changes. You end up with duplicate commits with different SHAs after you resolve the conflicts and continue.

> Never rebase if you've pushed the commits to a central repo because git will detect conflicts of the changes.

The manual already covers this issue [1]. In short, rebase when you pull if you prefer to, but always merge when you push.

[1] http://git-scm.com/book/en/Git-Branching-Rebasing#The-Perils...


After reading this thread how many people would you guess have done that? Fairly good advice but if you wait for everyone to finish the book you'll wait forever. Better educate constructively when needed.

What you've said here is at best disingenuous, at worst outright wrong and confusing.

The changing of the SHA per se has nothing to do with why a conflict might arise. The conflict may arise because a particular commit could not cleanly be applied to the new base. The contents of the commit need not even change, which you can verify by looking at the actual contents of the commit, meaning the blobs that it contains. The blobs may stay the same (and indeed, will unless there's a conflict during the rebase), while the SHA changes.

> Part of the SHA contains the SHA of the commit before it.

This is incorrect.

A commit has a tree SHA and a parent SHA associated with it. It's the parent SHA (the base of the commit) that changes when you rebase, which is a cascading change that continues along the chain in order as commits are applied. The commit's own SHA changes because it is a hash of the whole commit object: the tree SHA, the parent SHA(s), the author and committer lines with their timestamps, and the commit message.

You can see the commits along with their parent and tree hashes by doing `git log --format=raw`.
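To make the point about parent and tree hashes concrete, here is a minimal sketch in a throwaway repository. It uses the plumbing command `git commit-tree` to create two commits that share the same tree, message, identity, and timestamps but have different parents. The temporary directory, file name, identity, and pinned dates are all made up for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Pin the dates so that the parent is the only thing that differs below.
export GIT_AUTHOR_DATE="1112911993 +0000"
export GIT_COMMITTER_DATE="1112911993 +0000"

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo two > file.txt
git commit -q -am "second"
second=$(git rev-parse HEAD)

# Build two commit objects from the same tree, one per parent.
tree=$(git rev-parse "HEAD^{tree}")
a=$(git commit-tree "$tree" -p "$first" -m "same snapshot")
b=$(git commit-tree "$tree" -p "$second" -m "same snapshot")

# Print both raw commit objects; only the "parent" line should differ.
git cat-file -p "$a"
git cat-file -p "$b"
echo "commit a: $a"
echo "commit b: $b"
```

The two objects point at the same tree, yet their IDs differ, which is why rewriting one commit changes the ID of every commit that descends from it.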

> Never rebase if you've pushed the commits to a central repo because git will detect conflicts of the changes.

This is also incorrect (or at least badly worded and disingenuous).

Git only knows that the push is not a fast-forward, meaning it would discard commits that exist on the remote, and it says so explicitly when it rejects the push of a rebased commit that had already been pushed.

You should of course avoid rebasing commits that are on the remote unless you are the sole committer on the remote branch (meaning nobody else is pulling from that branch). You have to force push the branch when you do so, which is fine when you're the only person looking at it and disastrous when not.
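A rough sketch of that situation, using a local bare repository as a stand-in for the shared remote. The directory layout, branch names, and file names are assumptions for the demo, not anything from the thread.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # stand-in for the shared remote
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git push -q origin master

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"
git push -q origin topic                   # the topic commit is now public

git checkout -q master
echo more > master.txt
git add master.txt
git commit -q -m "master work"

git checkout -q topic
git rebase -q master                       # rewrites the already-pushed commit

# A plain push is refused: the remote tip is no longer an ancestor of ours.
if git push -q origin topic 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Forcing overwrites the remote branch; anyone who pulled the old commit now diverges.
git push -q --force origin topic
```

If you are the only one using the branch, the forced push is harmless. If not, everyone else is left holding the old commit under a different ID.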

I don't get why people like rebasing as an alternative to merging. It works kind of OK for small histories, but if you do something slightly complex or your history is not meticulous, rebasing becomes extremely tedious. The worst example is a conflict in a reverted commit. A merge will flatten the diff and skip the reverted commit entirely, while a rebase will require you to fix the conflict twice. And while a merge is tedious to revert, it is possible. Good luck saving a branch ravaged by a bad rebase.

> Good luck saving a branch ravaged by a bad rebase.

Git rebase --abort before you've finished screwing up, or if it's too late, use the reflog and reset back to before you started.
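For the "too late" case, a minimal sketch of the reflog route in a scratch repository (all names here are invented for the demo). It relies on the branch's own reflog, where `topic@{1}` is the value the branch had before its most recent update, which in this script is the rebase.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"
before=$(git rev-parse topic)

git checkout -q master
echo more > master.txt
git add master.txt
git commit -q -m "master work"

git checkout -q topic
git rebase -q master                 # pretend this is the rebase we regret
after=$(git rev-parse topic)

git reflog show topic                # the pre-rebase tip is still listed here
git reset -q --hard "topic@{1}"      # move the branch back to where it was
restored=$(git rev-parse topic)
echo "before=$before after=$after restored=$restored"
```

In a real repository it is worth reading the reflog output before resetting, since `@{1}` only means "one update ago" and other operations may have happened in between.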

If you're working on a large team, this becomes a more pernicious problem. I've seen rebases change histories to the point that whole branches become incompatible with master. When you have 15 people all merging branches into master, complete with binary files, conflicts are frequent and often involve files you personally didn't change. People resolve the conflicts to the best of their ability, but they have no idea why certain things conflict and bugging your co-workers every time you merge a branch is unrealistic. So we guess and whole features get wiped off the face of the repository.

My team now has a merge only policy and it's reduced the number of conflicts and borked merges.

what's the idea behind no feature branches?

There are "feature branches", and there are "patch sets with four patches and three merge commits that are there just because the submitter did a bunch of pulls while working on it."

If you're dealing with a large code base with many submitters, you have to demand that individual submissions apply cleanly. That is isomorphic to demanding that they be rebased.

I am personally a proponent of rebasing, but I'm reading through the recent Hacker News threads to get a sense of the arguments on both sides.

> If you're dealing with a large code base with many submitters, you have to demand that individual submissions apply cleanly. That is isomorphic to demanding that they be rebased.

To play devil's advocate for the merge perspective: wouldn't this also be achieved if the commit author had merged master into his branch first before submitting? It will apply cleanly because conflicts were fixed during the merge, rather than because it was rebased.

If I understand GP, it's the idea that when your work on a feature branch is complete, you use rebase to base that branch on the current tip of the branch you are integrating to, e.g., "master", so the result does not reveal the presence of a branch.

If I understand the poster you're replying to, he understood that, but doesn't understand why you would want that. I'm in the same boat - why are people willfully throwing away useful history?

There's a simple concept at work here: origin wins.

Whatever is on origin at the moment of you wanting to merge your code is what your code has to work against.

When you use a rebase strategy you're allowed the opportunity to make sure your commits actually work on top of what's on origin, and fix merge issues with those commits as they happened.

So you rebase, run into a conflict, fix the conflict, run your tests, make sure everything's green, and continue the rebase.

When you do the same thing with a merge strategy what happens is you end up fixing the conflicts as part of the merge, and your fixes are hidden in the merge commit.

This means your commits prior to the merge commit are broken. They didn't take into account the work that was on origin at the time, and thus they are useless without considering the conflicts that were resolved during the merge.

The history you describe is not useful unless it's green and could be applied to origin without conflict. The only way to ensure this, both before and after your commits end up on origin is to do so via a rebase strategy.

Thank you for this. I cannot believe how many people are glossing over the fact that commits which have to be fixed up in a merge are probably broken. Rebasing is not a way of hiding this, it is a way of _going back and fixing it_.

Thanks for the well reasoned response! Your points are all reasonable, and I understand the pragmatism in wanting only fully-green commits to be on origin. I just think merge commits for logical chunks of work are more important. I read history to understand developer intention and process more than I bisect history to find problems. Those conflicted commits aren't broken, they represent the best effort at the time the changes were made and merely need to be merged together with the world as it is now.

I'm not sure I understand.

If I'm making a change to some code and someone else makes a different change to it and pushes their change to origin before me, I do a rebase and see that they made the change, fix my commit (which is broken at that point in time), resolving the merge conflict, and continue on.

Instead of seeing some changes that have no basis in reality because they were fixed as part of resolving the conflict when doing the merge, you see only their changes applied on top of the correct state of the world, which gives you a clearer idea of what changes they made.

You can still get logical chunks of work with a rebase strategy: you simply rebase on top of the remote and then do a non-fast forward merge, via merge --no-ff.
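As a sketch of that rebase-then-`--no-ff` flow in a scratch repository (branch and file names are made up, and a local `master` stands in for the remote):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits of feature work on a topic branch.
git checkout -q -b topic
for n in 1 2; do
    echo "$n" > "feature$n.txt"
    git add "feature$n.txt"
    git commit -q -m "feature part $n"
done

# Meanwhile master moves on.
git checkout -q master
echo more > master.txt
git add master.txt
git commit -q -m "master work"
master_tip=$(git rev-parse master)

# Replay the topic on top of the new master, then merge it as one unit.
git checkout -q topic
git rebase -q master
git checkout -q master
git merge -q --no-ff -m "Merge branch 'topic'" topic

git log --graph --oneline
```

The graph shows the feature commits sitting directly on top of the previous master tip, grouped under a single merge commit that can later be reverted as a whole.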

It's simple: you're into working off of the newest state of the world as much as possible and I'm into working off of a snapshot of the world as much as possible and then merging in my snapshot plus modifications when I'm done. I think your way sounds like it deals with changes to the world more often than I want to, and also loses the context of what your commits looked like when you actually made them, but it's a fine way of doing things, and is especially useful if you're submitting changes to a project for somebody else to merge.

Define "useful history". ;)

There's history in the sense of "log of everything that happened" but also in the sense of a nice record of decisions that were made, documentation essentially. I'm guessing "no feature branches" aims more for the latter. Personally, I'm not sure why we shouldn't have both.

Both would be nice. I generally restructure commits before pushing them to master in order to make them easier to read, which is nice for others. However, I think about the code in the order I built it, especially when I need to remember why I did something. Unfortunately, that information gets lost if I forget.

On the whole, I think restructuring is good (I haven't always felt this way), but something is lost in the process.

Feature branches, merge commits, and small (possibly broken!) commits are all better documentation of the process of developing a feature than big "cleaned up" commits.

In this case I'd like to be able to revert the merge.

No feature branches as in NO feature branches and use feature flags / toggles instead - http://martinfowler.com/bliki/FeatureToggle.html

Feature branches and feature flags are orthogonal.

I have no idea how toggles could negate the need for isolated development branches on an 'important' codebase with multiple contributors working on the same areas of concern.

It is certainly possible to avoid feature branches with multiple contributors working on the same areas. We've even been pretty successful performing major refactorings to substantial codebases in this way.

The key is to do continuous integration properly - i.e. actually integrate your changes with the rest of the team several times a day. We actually deploy to production several times a day. This requires you to break down changes into very small steps that can be completed, tested, and deployed in an hour or so.

Each small change is far less likely to break something, or cause performance degradation. If it does, then you know exactly what caused it without painful debugging. Each small change is unlikely to cause significant merge pain for the rest of the team.

This requires you to have excellent automated testing and rapid <5min deploys.

There are also tools like branch by abstraction and feature toggles which can help but are only tools. The key is how you work as a team.

Feature branches are another way of working. They may be the only way of working in many cases (e.g. distributed team of sporadic contributors). Feature branches are almost the opposite of continuous integration. By definition changes on the branch are not integrated.

hi dustin!

I humbly submit that this is a holy war. I personally am on the side of "history is just another part of the project to be maintained," but tread carefully friends; there's dogma at work here.

On one hand it's good to preserve things exactly as they happened. How else will we blame Dumb Steve guilt-free when someone branches from an arbitrary hash to give a one-off release to an angry customer, thereby bifurcating the state handling logic that allows the machine to be certain that it's not emptying a crucible on a crate of orphaned puppies? Think of the puppies!

On the other hand, it's good to align commits to units of work. It's cool that you've forked the project to handle massively concurrent input caused by edge case X (sorry, I used up most of my creativity on the last paragraph), but really I just want your cool configuration parsing piece. Can I cherry-pick that, please?

Git was designed to give you the choice. Do you care about preserving the ability to perform an arbitrary audit, or do you care about treating development as a series of portable patches? You decide, and then do what's right for you.

While this makes sense when dealing with a central repo (which I agree is the vast majority of use cases), it's inaccurate. You're rewriting history so that it looks like your work happened on top of $REMOTE's work, when in reality it was the other way around. This becomes a problem if you're working with multiple remotes, where which remote you pulled from, and when, becomes an actual concern.

Concrete example: if I'm the maintainer of wonnage/foo and bar submits a pull request from bar/foo, using git pull bar --rebase doesn't make a whole lot of sense.

Of course, that's absolutely correct. But that's the minority use case, at least in my experience. Git's default is set up that way because of how the Linux kernel maintainers work, which isn't how most organizations who use git work.

Even though git is distributed, most orgs have a central repo that they use as the source of truth. When contributors pull from this central repo, IMO they should be rebasing, not merging. When the maintainer pulls from other people, yes, he should of course be merging.

The merge commit that a contributor implicitly creates when they resync with upstream (without rebasing) is ugly and confusing. It provides no useful information and only serves to clutter the project's history, not make it clearer.

A topic branch workflow is tremendously useful even without separate people in the "developer" and "integrator" roles. For example, we have about 20 people regularly developing code and about five of us "integrate", first to 'next', then to 'master' when a feature has stabilized. We don't explicitly pass an integrator token so there is a possible race to the integration branch, but since we only merge there, it's actually pretty rare in practice. If you lose the race, the polite thing to do is discard your old merge, fast-forward, and repeat the merge. This preserves first-parent.
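One way to see why a clean first-parent chain is worth preserving: with merge-only integration, `git log --first-parent` reads the integration branch as a list of merged topics. A small sketch, with invented topic names and a single `master` integration branch rather than the 'next'/'master' pair described above:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two topics, each with two commits, each integrated with a real merge commit.
for topic in a b; do
    git checkout -q -b "topic-$topic" master
    for n in 1 2; do
        echo "$n" > "$topic$n.txt"
        git add "$topic$n.txt"
        git commit -q -m "topic-$topic: step $n"
    done
    git checkout -q master
    git merge -q --no-ff -m "Merge topic-$topic" "topic-$topic"
done

echo "--- full history ---"
git log --format=%s master
echo "--- first-parent only ---"
git log --first-parent --format=%s master
```

The second listing contains only the base commit and the two merges, while the individual steps remain reachable through each merge's second parent.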


Your example is incorrect.

If you're the maintainer of wonnage/foo and bar submits a pull request from bar/foo, bar/foo should have been rebased on top of wonnage/foo, not the other way around.

(And if you are in a situation where rebasing on top of bar/foo would actually do anything, that means you have local commits that haven't been pushed to origin, in which case accepting the pull request is dangerous.)

Um.. No.. if you use rebase regularly, you're doing something very wrong.

In this case, the problem is that development isn't being done on a branch. One of the most important features of Git is how easy it is to fork & merge - there's no excuse for doing your work on master.

Work on a branch, merge your branch into master when it's finished. History is completely accurate, you only get one merge commit.

Don't mess up history just to work around bad workflow. Fix the problem at its source.

Indeed when you do work on a feature branch, you shouldn't rebase.

But when you're working on master, or a branch that's also remote, then you have to rebase. If you don't then you get completely meaningless merge commits. If 3 people merge when they just want to push a commit, then you're history becomes a tangled unusable mess.

Then those three should've worked on their own feature branch. In fact, one could consider a local copy of master (or, commits made on master on one user's machine) as a feature branch, all named 'master'.

Another thing, a developer doesn't need to obsessively pull every couple of minutes. If they're fixing a bug and deciding against using a branch for that (seeing that it could be just one commit), they only pull twice; once before they start working, and again when they're done and want to push to master.

If there's more than one person fixing the same bug simultaneously, then it's yet another process bug.

Maybe you are history becomes a tangled unusable mess, but it works for me just fine.

> In this case, the problem is that development isn't being done on a branch.

Huh? Do you mean it isn't being done on a topic (aka feature) branch? Master branch is still a branch.

> there's no excuse for doing your work on master.

Many people, including myself, aren't going to create a new topic branch for drive-by commits, such as fixing a typo.

> Work on a branch, merge your branch into master when it's finished. ... Don't mess up history just to work around bad workflow.

You're technically rewriting history, but it's not like you're rewriting history that's been pushed (which is typically a big no-no). Many people prefer a cleaner, more linear history from rebasing.

And if you're implying that using topic branches somehow fixes the "problem" of having to rebase, that's not true. If you've pushed your topic branch to a remote git repo then anyone else with commit access can make changes or you can even make changes on the remote repo if the web interface allows it (like how GitHub does).

> Huh? Do you mean it isn't being done on a topic (aka feature) branch? Master branch is still a branch.

Yes, I thought that was obvious from the context, but I admit I could have worded it better.

> Many people, including myself, aren't going to create a new topic branch for drive-by commits, such as fixing a typo.

Agreed, me neither, but this scenario is still no excuse for the advocated "Always use pull --rebase" - I would fix this by:

1. See typo

2. Immediately `git pull` to make sure it hasn't already been fixed

3. Fix typo

4. Commit, push

The likelihood that somebody will have pushed another commit in the time it took to do the above is so small that it's just not worth doing a precautionary --rebase. In the event that it did happen, I would use either rebase or a soft reset to tidy up the history before pushing.

> Many people prefer a cleaner, more linear history from rebasing.

IME this causes more problems than it solves: The history looks beautiful and linear, but it's a lie. The biggest problem with it is that the timestamps go out of chronological order on an apparently-linear history.

When you have a situation of "This bug started happening at 4pm" and you look at a linear branch that has commits for 5pm, 4pm, 3pm, 2pm, 1pm.. you'd be forgiven for not looking further, and so would miss that there was a SECOND commit at 4pm just before the 1pm one, courtesy of a rebase. If you'd kept history "true" by not rebasing, you'd see that there were two branches and therefore you'd investigate both for commits at the right time. This can lose you significant amounts of time when you're trying to track down a bug.
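The out-of-order effect is easy to reproduce, because a plain rebase keeps each commit's author date and only refreshes the committer date. A sketch with made-up timestamps one hour apart (the pinned dates, names, and files are assumptions for the demo):

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
GIT_AUTHOR_DATE="1112900000 +0000" GIT_COMMITTER_DATE="1112900000 +0000" \
    git commit -q -m "base"

# Topic commit, authored first.
git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
GIT_AUTHOR_DATE="1112911993 +0000" GIT_COMMITTER_DATE="1112911993 +0000" \
    git commit -q -m "topic work (authored earlier)"

# Master commit, authored one hour later.
git checkout -q master
echo more > master.txt
git add master.txt
GIT_AUTHOR_DATE="1112915593 +0000" GIT_COMMITTER_DATE="1112915593 +0000" \
    git commit -q -m "master work (authored an hour later)"

git checkout -q topic
git rebase -q master

# Columns: author date | committer date | subject
git log --format='%ad | %cd | %s' --date=iso
```

In the resulting linear log the newest commit carries the older author date, while its committer date is the time of the rebase, so the committer date column is the one that matches the order of the log.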

> If you've pushed your topic branch to a remote git repo then anyone else with commit access can make changes

That still falls into the trap of "minimal branches" - git is good at branching, do it often! If you've pushed a topic branch and are collaborating on it, your work should be on a feature branch of the topic branch. One feature, one branch. Not using git's awesome branch&fork power the way it should be used is what causes 99% of the problems people think they need rebasing for.

Branch early, branch often.

We have a multitude of git repos at $work and we all collaborate on new features. I haven't reached for rebase in months - it's an over-used tool that IMHO most people would be better off without. It gets too much use as a crutch that stops people from working out they're doing something wrong.

I agree with the mantras "Branch early, branch often" and "One feature, one branch" and follow them myself. And like you imply, the majority of git operations are cheap, so stuff like branching can (and should) be done more often.

> If you've pushed a topic branch and are collaborating on it, your work should be on a feature branch of the topic branch.

A feature branch of the topic branch seems like overkill in most cases. It seems odd to be writing a feature on top of an unfinished feature. Or if it's not a separate feature being added but the work is on the same feature, then creating a new feature branch of the topic branch breaks the branch per feature rule.

All this being said, people should follow the git workflow and guidelines set in place for their team. Don't use git differently than your teammates do, it'll cause issues eventually. If it's a solo project then go wild.

Perhaps better stated as "please, oh please, actually pay attention to history" - there are times when rebase is appropriate and times when merge is better.

Yes, but I used the title for people to actually start using rebase. Caring about history beyond avoiding useless merges and broken commits is another step I'd like at least my co-workers to take ;)

Rather than always using

  git pull --rebase

you can set up your git config to always rebase when pulling.

  git config branch.master.rebase true

sets it up for master of the current repo, and

  git config --global branch.autosetuprebase always

sets it up for all new branches.

I would argue that it's not a good idea to do this for all branches by default. For example, we do the opposite for 'master', we force merge commits (--no-ff) on master/integration branches but use pull --rebase on feature branches that we share.

autosetuprebase is annoying because it doesn't apply to any already existing branches, and because it's configuring a per-branch setting going forward, if you ever want to change the setting later after you have a lot of branches, you're going to be making a lot of config changes in a lot of places. Instead you can just do: git config --global pull.rebase true
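A small sketch of the difference, using a scratch upstream and clone. The settings are applied repo-locally here so the demo does not touch your real global config, and the paths and branch names are invented.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# Set after master already exists, so master is not affected.
git config branch.autosetuprebase always
git branch -q --track feature origin/master

echo "master:  $(git config --get branch.master.rebase || echo unset)"
echo "feature: $(git config --get branch.feature.rebase || echo unset)"

# A single setting that applies to every branch, old or new.
git config pull.rebase true
echo "pull.rebase: $(git config --get pull.rebase)"
```

The existing `master` reports no per-branch rebase setting, while the newly created `feature` gets one written into the config, which is the "lots of config changes in lots of places" problem described above.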

I would recommend that you do not do this. I rebase probably 80% of my pull requests, but sometimes you do want a merge commit. Instead, I set up an alias for git pull --rebase

This also helps when pairing with someone else. The process of handling a rebase conflict vs pull/merge (regular pull) is quite different.

if you want to do a merge commit sometimes, but pull --rebase most of the time, why not make pull --rebase the default, and do a fetch && merge when that's what you want?

Because what if you 'already' created a merge?

Example: I am working on a local branch feature/xyz. I decide I am done with my work and merge feature/xyz -> master. I get ready to push and realize I am 5 commits behind origin/master. How do I fetch them now?

If you use git pull --rebase it will clobber your feature branch merge commit. On the other hand if you do a fetch & merge this also probably is not what you want (you will have a merge-commit on top of your merge commit, dawg). git rebase -p is probably the best option in this scenario.

I would ignore everything base698 says below.

In this case, you have a merge commit from a branch on to master, which would look like:

    A-B(master)----F (merge commit)
        \         /
         C-D-----E (topicA)
Once you find out your master (B) is behind, because say there's G-H-I on origin, you'd want to rebase onto that. So you use the 3-argument form of git rebase --onto:

git rebase --onto origin/master C~1 E

Which would take C's old base, C~1 (B), and replace it with origin/master, but only up to E.

Or, if you still had the branch around that you merged into master (which you do, in many forms, including the reflog, even after you delete the branch).

git rebase --onto origin/master topicA

I wrote a blog post on the different uses of git rebase --onto: http://krishicks.com/blog/2012/05/28/git-rebase-onto/
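A scratch-repo sketch of the 3-argument form, loosely following the diagram above. A local branch named `newbase` stands in for origin/master, and `topicA~3` plays the role of `C~1`; both names and the one-file-per-commit layout are assumptions for the demo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Helper for the demo: one commit per name, each adding its own file.
commit_each() {
    for name in "$@"; do
        echo "$name" > "$name.txt"
        git add "$name.txt"
        git commit -q -m "$name"
    done
}

commit_each A B                        # master: A-B
git checkout -q -b topicA
commit_each C D E                      # topicA: C-D-E on top of B
git checkout -q -b newbase master
commit_each G H I                      # newbase: G-H-I on top of B

# Take the commits after topicA~3 (that is, B) up to topicA and replay them onto newbase.
git rebase -q --onto newbase topicA~3 topicA

git log --format=%s topicA
```

The log should now read E, D, C, I, H, G, B, A from newest to oldest: the three topic commits have been moved from B onto the tip of `newbase`.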

Ah awesome, I love there are so many things to learn about git.

I knew about git rebase -p and its nuances, but haven't known about the --onto until now. The 3-argument form of git rebase --onto in your blog post was great, thanks!

git checkout -b branch_fix; # for safety

git reset --hard HEAD~1

git fetch;

git merge origin/master;

git push origin HEAD:master;

git push origin HEAD:old_branch;

For me, my alias 'git pr' is shorter than 'git pull'. For the person working with me, not being in a rebase when they didn't expect to be is also a boon.
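The comment does not show how the alias is defined; one plausible definition, set repo-locally in a scratch repository for the sake of the demo, would be:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Repo-local for the demo; adding --global would define it for every repository.
git config alias.pr "pull --rebase"

# Show what the alias expands to.
git config --get alias.pr
```

With this in place `git pr` runs `git pull --rebase`, while a plain `git pull` keeps its default merge behavior for the cases where a merge commit is wanted.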

I like this philosophy for small, easily testable commits (e.g. minor bug-fixes). However, I never recommend others to use this workflow.

I feel like rebasing is the type of feature you really need to understand to use. If you understand the concept and how to apply it, you can do powerful things. If you don't, you'll munge your repo and end up begging the local git guru to bail you out of your mess.

While I don't enjoy looking at a commit history with a bunch of useless merges, it's better than encouraging users to perform an operation they don't understand.

No kidding. Nothing makes you miss svn more than an inappropriate rebase. I've done it twice now, & sorely regretted. Heck, once, I was bit so badly by it that I actually said something positive about Rational Clearcase.

Yes, exactly. And if your team consists of perhaps slightly less tech-oriented people (on our team it's the designers), trying to explain what rebase does and how to use it with its exceptions (the "Few notes though..." part of the article) is a recipe for confusion and broken repos. Which is why our policy is "rebase if you know what you're doing, otherwise merge".

As a quick illustration, this is what can happen to the commit tree when just one of your team members doesn't rebase by default:


So it definitely keeps history nice and pretty, and I use it for simple commits all the time.

However, sometimes I do 'git pull --no-ff' explicitly, because I want a more-complex multi-commit sequence of operations to stand out as a unit of work.

I really wish git had a primary concept of such a thing. I understand that hg might, but haven't had an opportunity to get that far into an hg project yet.

I've been burned a number of times by hg rebase, losing work in each case. Given that our team only has 3 committers, I'd rather deal with the messy tree than lost work.

The rebase extension for hg is still a bit experimental, at least as I understand it.

It will be much better once changeset evolution [1] is turned on by default in a stable release.

[1] http://mercurial.selenic.com/wiki/ChangesetEvolution

That looks nice; in particular `hg fold` solves the "messy local, clean remote" problem nicely.

What do you mean a primary concept?

The use case that comes to mind first is 'git bisect'. I love that feature, but I also love to leave a messy trail of commits in my history. In other words, I expect that there will be commits in my history where a full test run might not work, or maybe even things might not compile.

Of course, that sort of thing royally messes with 'git bisect'.

If git had a primary concept of "here's a synchronization point", then it'd be much easier to build tools like 'git bisect' (and even 'git log' etc.) around those sorts of paradigms. That'd offer the best of both worlds -- rich, detailed history showing how something got done, but also points at which it was believed that all was good in the world.

A push is (should be?) a natural synchronization point, for example.

Some form of nested commit would be useful for this I suppose - it would be arranged with something like an interactive rebase, that would squash a sequence of commits into one (for the purposes of log and bisect and the like), while leaving the original ones still accessible. This would give you your neat history for most purposes without losing the original commit comments and original sets of individual changes.

This is of course a bit like how branches are merged together ordinarily, at least if you don't go around deleting them the moment you're done with them. And sure enough it looks like you can sort of persuade git bisect to treat ordinary branch merges like this:


Probably the same thing as `first class' in `first class functions'. See https://en.wikipedia.org/wiki/First-class_function

This has been mentioned in another comment, but not a top level one, so here it is again:

Rebasing multiple commits means putting commits into the history that do not correspond to any actual historical version. If you manually test each commit to ensure that the merged version still compiles and runs, or you're sure that the changes don't conflict and don't get unlucky, then you have something akin to a patch set, a semantic rather than actual history that is easier to read and perform operations like git bisect on. But otherwise, you can end up with broken commits in the history that, when retrieved for bisection or other purposes, obscure whatever is actually supposed to be observed.

Regardless of whether merging or rebasing produces 'better' history, if you're lazy, merging lets you stay that way without creating broken history.

OP here,

I do prefer feature branches (especially git flow: https://coderwall.com/p/d1pkgg ), but for day to day 'agile' development in a small team we don't use them often. YMMV.

But for pulling changes while working on a local branch, git pull --rebase (or an auto-rebased branch) gives a far more readable history (and avoids ugly merge bubbles).

Branch when you need branches, merge when you mean merging, rebase when you're just updating your codebase from shared repository. Keep your local (not pushed/pull-requested yet) branches dirty (commit often!), rebase them into a clean, obvious history before you share them.

Remember - commit history is for other people to READ.

Rebase only if you treat it like a centralized version control system.

Using it all the time breaks a lot of good DVCS workflows.

If you're going to rebase every pull, you may as well just use SVN.

Git has benefits over Subversion besides the crazy commit graphs. I'm not sure there's really any reason to be using SVN when you could be using Git in the exact same style, and have the benefit of offline access to the repository.

Actually, I'd be interested in hearing about any sort of layers on top of git that make it work like SVN (ie: same commands, same workflow, etc.) It would probably help convince some people to move away from SVN.

Such a layer would be difficult to provide in a way that isn't misleading if the hypothetical Subversion expatriate currently uses any functionality involving multiple (SVN) branches.

No rebase please. Let's keep the history accurate.

That's just it. It does keep the history accurate, if you're literally just trying to grab updates from upstream. Unless you're actually working on, or merging, a separate branch then normally rebase is what you want.

It does NOT keep history accurate. You have absolutely no guarantee that all rebased commits work as intended or can even be built. The only thing that can be rebased safely is a single commit; for the rest you should learn to live with a messy history. It doesn't really matter anyway as long as you have procedures to prevent criss-cross merges.

If you require that each individual commit meets semantic criteria then you can still require that to occur during a git rebase by rewriting the commit. Which you should probably be doing anyways, if you're pulling breaking changes from upstream.

Run unit tests after rebasing & before pushing?
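
One way to do that per commit rather than only at the end, as a minimal sketch: `git rebase --exec` runs a command after each replayed commit and stops at the first one that fails. The throwaway repository, the file names, and the `sh check.sh` stand-in for a real test suite are assumptions made up for this example.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Base commit with a trivial stand-in for a test suite.
printf 'test -f app.txt\n' > check.sh
echo "v1" > app.txt
git add check.sh app.txt
git commit -q -m "base"
git branch -M main

# Local work: two commits on a topic branch.
git checkout -q -b topic
echo "feature A" > a.txt && git add a.txt && git commit -q -m "add A"
echo "feature B" > b.txt && git add b.txt && git commit -q -m "add B"

# Meanwhile, main moves on.
git checkout -q main
echo "v2" > app.txt && git commit -q -am "upstream change"

# Replay the topic commits onto main, running the check after each one.
# The rebase stops at the first commit for which the command fails.
git checkout -q topic
git rebase --exec "sh check.sh" main
```

If the command fails, the rebase pauses on that commit so it can be fixed and amended before continuing with `git rebase --continue`.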

Exactly. It makes sense to rebase for a single commit or two when you're syncing with a remote frequently, but in any other case merge is much more sane.

Having "git log -p" be readable is infinitely more important than so called "accurate" history.

Be careful though, rebase doesn't preserve merge commits. If you want to rebase after a merge you need

    git fetch && git rebase -p

I recommend using git-up[1], which solves everything in a single command.

[1] https://github.com/aanand/git-up
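
For readers on newer Git releases: `-p` (`--preserve-merges`) has since been replaced by `--rebase-merges`, available from Git 2.18 onward. Below is a small sketch of the same idea under that assumption; the repository, the `upstream` branch standing in for the remote-tracking branch, and the file names are all hypothetical.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git branch -M main
git branch upstream          # stand-in for the remote-tracking branch

# Local feature branch, closed off with an explicit merge commit.
git checkout -q -b feature
echo f > feature.txt && git add feature.txt && git commit -q -m "feature work"
git checkout -q main
git merge -q --no-ff -m "Merge feature" feature

# Upstream moves on in the meantime.
git checkout -q upstream
echo u > upstream.txt && git add upstream.txt && git commit -q -m "upstream work"

# Rebase local main onto upstream while recreating the merge commit.
git checkout -q main
git rebase --rebase-merges upstream

git log --oneline --graph
```

A plain `git rebase upstream` in the same situation would flatten the history and drop the merge commit.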

This is very good advice!

git pull --rebase will clobber local merge commits (e.g. when closing off a feature branch & getting ready to push the merged results).

I threw together a quick custom action you can use with Sourcetree to utilize git rebase -p. Interested folks can nab it here: https://gist.github.com/dgourlay/5465540 Install instructions here: http://www.derekgourlay.com/archives/478

I think I may need to give git-up a look over though.


> local merge commits (e.g. when closing off a feature branch & getting ready to push the merged results)

Why not rebase the feature branch onto the master branch? Then the merge from feature to master will be a fast-forward rather than a merge commit.

Developing this way one can mostly avoid actual merges. And the repository history is simple and linear afterward. I've been working this way for a while and I have found it effective.

  % git checkout feature
  % git rebase master
  % git checkout master
  % git merge --ff-only feature

(--ff-only isn't needed, but I like to include it to catch mistakes)

Any commits that you have made locally (and haven't pushed anywhere public) are safe to reorganize in this way, and it results in simpler history.

If your workflow is based on very small feature branches, that's fine, but when your branch is more heavy on changes it's good to preserve the merge commit (and be able to undo it).

Yes this is precisely what I had in mind, for larger features I personally prefer to see merge commits and not rebase the branch onto master. I believe it is one place where merge commits definitely add value and make sense.

Rebase is nice if you have very careful code review policies. Otherwise, if you don't have humans carefully reading every merge, it is a huge waste of time.

Rebasing also increases the number of conflicts you will have to resolve. I've seen this 'always rebase' advice over and over in the last few years; it's stupid advice that betrays a lack of understanding about how git works. There is no advantage whatsoever unless you also squash your commits, and even then there is no advantage unless you are submitting to a project with very careful code review practices (Linux kernel, a project that uses the gerrit code review tool, etc).

Or just use mercurial which tends to do the Right Thing by default more often.

My first thought when I read the title was "Please use mercurial", because git is just a complicated tool that most of the time provides a distraction from real work (admittedly, a tool that gives a wonderful false sense of superiority - "zomg I am 3l1t3 h4x0r because I could finally persuade git to do what I want").

As a hg user I really can't have any understanding when I get git messages like "![rejected] master -> master (non-fast forward) error: failed to push some refs to ...". What the hell am I supposed to do with that? And a few minutes later I find out that in git you can't pull all the branches at the same time. Seriously, just use hg and make the world a better place.

In what way do you think mercurial's "Right Thing" is different from what git does (by default)?

In the way whereby nobody ever complains about usability issues like this one.

Or use git-smart (https://github.com/geelen/git-smart). It adds three commands to git. One of them 'smart-pull' which "will detect the best way to grab the changes from the server and update your local branch, using a git rebase -p if there's no easier way. It'll also stash/pop local changes if need be."

While such wrappers around git always pique my interest and are created with noble intentions, I have yet to find a team of developers who use such tooling around git. There are dozens of wrappers and workflows around git that have been created in the last 3-5 years, and frankly they only complicate the shared vocabulary that developers use when communicating with one another. Also, such wrappers are usually distributed as rubygems, which, although a low-barrier distribution mode, are complicated to use with the rvm or rbenv setups of existing projects.

The only git-related tool I like is tig - it's an interactive git log, where you can scroll through commits and view diffs (and lots of other things) without having to copy+paste or use awk or whatnot. This tool (git-smart) seems like... I don't know. For me, I'd rather have my team suffer the learning pains of git than rely on a tool that, in some cases, will fail and leave them even more helpless than they were when they struggled to learn git.

I do recommend checking out tig though; that's the one that I have kept. It's also not a gem, it is available through yum or apt (and probably pacman, brew, etc).

Yes, I am actually guilty of using `tig` for a long time too. :-)

"to keep the repository clean"

Seems like not a worthy objective. Has anyone here ever had a project saved by a tidy commit history?

No, of course not - how often do you actually look at extended windows of history... But it still pains me to no end when my coworkers constantly commit history like:

  Commit d
  Parent a b
  Merge branch origin/mainline into mainline

  Commit c
  Parent a
  Something they did

  Commit b
  Parent a
  Something I did

  Commit a

Totally agree. At best this rebase talk is navel-gazing, and at worst it's a technique that turns Git into a complicated, scary place for those who just want to get work done.

I have introduced git and hg to many people, but I've always tried to tie it into what people do already. Without source control, practically everybody did the same thing: They'd work a little bit, they'd save, they'd work a little bit, they'd save... etc. Using branching for organizing work and a nice cheat-sheet of a few git commands, most non-technical people will be off-and-running.

It all works until the programmers make it complicated with the rebasing. The price of the tidy commit history is the loss of confidence of the rest of the team. I'd rather have the people.

Case in point: https://vimeo.com/60788996

It's only complicated because people don't realize how it works. Half the people in this thread advocating to use rebase aren't saying when it's ok and when it's not ok and why.

When it's ok: You are working on your own line and the commits you've made are not in a central repo.

Why: Git commits form a hash tree, so changing one commit changes the SHAs of all the commits that come after it. This makes git see the commits you already pushed to the remote as different from your local ones. Both sets of commits contain the same changes, so it puts duplicate conflict markers everywhere.
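
A small sketch of that effect, with a made-up throwaway repository and file names: the same patch gets a new SHA once its parent changes. `git patch-id` is used here only to show that the diff itself is unchanged.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo one > one.txt && git add one.txt && git commit -q -m "first"
git branch -M main

# One local commit on a topic branch.
git checkout -q -b topic
echo two > two.txt && git add two.txt && git commit -q -m "second"
before=$(git rev-parse HEAD)

# main gets a new commit, so topic's base is now out of date.
git checkout -q main
echo three > three.txt && git add three.txt && git commit -q -m "third"

# Rebasing replays "second" on top of the new base.
git checkout -q topic
git rebase -q main
after=$(git rev-parse HEAD)

# Same diff (identical patch-id), but a different commit SHA.
patch_before=$(git show "$before" | git patch-id | cut -d' ' -f1)
patch_after=$(git show "$after" | git patch-id | cut -d' ' -f1)

echo "commit before rebase: $before"
echo "commit after rebase:  $after"
echo "patch-id before:      $patch_before"
echo "patch-id after:       $patch_after"
```

If the old commit had already been pushed, the remote would hold one copy of the change and the local branch another, which is where the duplicate conflicts come from.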

Most people love it once they realize how it works.

A lot of people also use version control as a backup in case their hard drive fails. Because of this they are scared of staying on a topic branch for very long, so they push and then rebase often, leading to conflicts, which are scary.


Nor has any project been "saved" by not mixing tabs and spaces in the source code, but it's really a good idea to not do that.

Frankly if I "git log -p" and your diffs aren't clean and to the point, that's a code smell (maybe a developer smell?). If you are sloppy and inattentive in your commits, that probably carries over to your code as well. There's more to the code than just the latest version--history is important.

YES. Many times. It has allowed us not to mess up App Store builds.

I have worked on two UI-heavy iOS apps, and code reviews are very effective in comparison to the automated testing we have. github & bitbucket make it easy to see a linear stream of commits and discuss individual lines.

This all falls apart when a programmer takes a day to implement a feature, and then pushes it along with a merge commit that can only be reviewed by manual diffing on the command line (and I have found regressions that way).

Not saved, but it adds up when you have a torrent of "Merge branch master of http://myownrepo.com" and merge conflicts every day.

Yes, I did. If you're using continuous delivery or at least a continuous integration system, you can revert the build, easily bisect the history in a matter of minutes (for a use case missed by all the tests) and have a feature release saved. :)

I've been telling people lately to always fetch and merge --ff-only separately, which is more commands but less confusing behavior. If you've got merge.defaultToUpstream set to true, you can alias ff = merge --ff-only. That way you can be careful about rebases, and don't have them automatically happening behind your back. So the workflow is:

    % git fetch # can never fail with a conflict
    % git ff    # tries to fast-forward
    # if fast-forward fails, then
    % git rebase origin/upstreambranch
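
As a sketch of how the alias could be set up and exercised end to end: the two developers, the repository paths, and the `main` branch name below are invented for the example, and the config is set per repository rather than globally so nothing outside the temp directory is touched.

```shell
#!/bin/sh
set -e

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# A bare "shared" repository whose default branch is main.
git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/main

# First developer publishes a commit.
git init -q alice
cd alice
echo one > one.txt && git add one.txt && git commit -q -m "first"
git branch -M main
git remote add origin ../shared.git
git push -q -u origin main
cd ..

# Second developer clones and sets up the alias in that clone only.
git clone -q shared.git bob
git -C bob config merge.defaultToUpstream true
git -C bob config alias.ff "merge --ff-only"

# New work lands in the shared repository.
cd alice
echo two > two.txt && git add two.txt && git commit -q -m "second"
git push -q origin main
cd ..

# The two-step update: fetch only updates remote-tracking refs,
# and the alias refuses anything that is not a fast-forward.
cd bob
git fetch -q
git ff
```

If bob had local commits of his own, `git ff` would refuse to proceed, and that is the point where the explicit rebase from the workflow above comes in.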

There is history and there is cleanness, as Linus Torvalds explained in http://lwn.net/Articles/328438/.

The git way of doing what the OP wants to do is to develop all code on local branches. They're cheap and you can have as many of them as you want.

Just never commit directly to master and set it to track upstream/master. Synchronize with upstream by pulling into your local master, and then rebase, merge & diff your local branches at will.

Mislav has a great article on this topic: http://mislav.uniqpath.com/2013/02/merge-vs-rebase/

From the article:

"When working on a project you usually synchronize your code by pulling it several times a day."

Do you? Am I alone in thinking that seems like a really bizarre workflow?

In my experience it is common for a group of developers working together on some feature to share code very often.

Also in an actively developed project there are likely to be many new commits every single day, and you want to be up to date with remote codebase to ease the pain of merging your features into it.

If you rebase your work onto master often you'll greatly speed up the release of your feature when it is ready. You'll also avoid late surprises when something you depend on elsewhere in the codebase changes.

You can also notify your co-workers about inconsistencies between developed features and detect incompatibilities early.

It also works at macro scale - projects released often with small changes shorten the feedback loop and can be adjusted properly to their requirements.

Sometimes - why not? There are times when the client comes back with two dozen tiny feature requests. Fix one, push/pull, loop.

Can't agree more, I'll push it even further...

Use G2! which does pull --rebase magically on your behalf.



I try to avoid rebasing everything all the time because it changes the commit dates which can be confusing when you are tracking some particular changes.
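
To be a bit more precise about which date moves, here is a minimal sketch with a made-up repository and a fixed old timestamp: by default a rebase keeps the author date of each replayed commit and only resets the committer date. Which of the two a given log view or hosting UI displays is a separate question.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git branch -M main

# A topic commit with a fixed, old timestamp (raw "<epoch> <tz>" format).
git checkout -q -b topic
echo t > topic.txt && git add topic.txt
GIT_AUTHOR_DATE="1366977600 +0000" GIT_COMMITTER_DATE="1366977600 +0000" \
  git commit -q -m "topic work"

# main advances, then topic is rebased onto it.
git checkout -q main
echo m > main.txt && git add main.txt && git commit -q -m "main work"
git checkout -q topic
git rebase -q main

# %at = author timestamp, %ct = committer timestamp.
author_ts=$(git log -1 --format=%at)
committer_ts=$(git log -1 --format=%ct)
echo "author:    $author_ts"
echo "committer: $committer_ts"
```

`git log --format='%h %ad %cd %s'` shows both dates side by side when tracking down a particular change.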

Why not change your config for the non-feature branches, so that people can't forget the --rebase argument?

    git config branch.master.rebase true
    git config branch.develop.rebase true

This makes any pull on master/develop a pull --rebase.

This is a TERRIBLE rule: they have no clue how git works.

Read: http://goo.gl/ONw7Q

To quote the page you linked:

> People can (and probably should) rebase their _private_ trees (their own work).

Is coderwall some sort of troll site, like the Swedish Lemon Angels recipe in How to Play With Your Food?

I keep seeing bad advice posted there and popping up on HN.
