Git supports first-class merging, and it is git's default behavior. It lets us port entire strings of commits related to a feature over to other branches, or revert (un-merge) those strings of commits.
Heck, --rebase doesn't even allow one to pull when there are uncommitted changes in the working directory.
On the other hand, --rebase is neater when it comes to the modern mantras of "commit frequently", "no feature branches", "deploy master to production". So in short this sounds more like an editor war or a tabs vs. spaces flame war.
The reason is simple: you always know _why_ something happened, which is quite important when looking for clues in the history. When trying to figure out what old code was intended to do, it's much better to see it in its original context, not in a context rewritten to suit a merge. As a benefit it also preserves change dates.
In fact, you know less.
It's as if someone was doing work against an old, old, old version of your code. Say they did a bunch of work, months out of date, and then wanted to merge it in. Almost assuredly there'd be tons of conflicts, the code wouldn't match up at all with the current state of the world, and what they did in the commits would be entirely useless knowledge without also knowing what resolutions to the conflicts happened as part of the merge commit.
So anyone who looks back at those commits that were made against the old codebase has to know that what they're looking at is horseshit which had to be patched up before it could be merged in.
If they had used a rebase strategy they'd have updated their commits to the current state of the codebase such that they could be cleanly applied. Now you still know what feature(s) were added, but in a way that actually took into account the current state of origin when the commits were rebased.
> If they had used a rebase strategy
Take your out-of-date branch, merge master into it, and clean it up _in the branch_. It's not 2011 anymore, there's lots of tooling that makes viewing changelogs on a branch compare easy (Github's pull requests being probably the most common).
Rebasing removes your changes from the _context_ they were originally made, which makes following along through the project's history harder than it needs to be. It also creates revisions of the codebase that never actually existed, which means it's difficult if not impossible to use git-bisect to track down when a really gnarly bug was introduced (I've actually needed to resort to that 3 or 4 times in the past few years, to help track down the cause of the bug).
More likely you're going to piss off everyone that has to deal with your merge strategy because your commits that couldn't be merged into master are broken, having been fixed as part of the merge commit(s).
* Aren't able to be bisected (they're broken, remember?)
* Aren't able to be blamed (the collective merge commits and the actual commits have to be considered at the same time)
* Aren't able to be rebased later when you decide, actually, this merging shit is ridiculous (the -p option to rebase is dangerous and doesn't always work the way you think it will)
* Aren't able to be traversed cleanly in a log because they're probably split up with merges of master into your branch all over, because hey, you wanted to stay up to date.
* Basically aren't understandable by anyone, including you.
Being that it's not 2011 anymore (sorry, what?), you should learn how to use Git properly and understand what effects your workflow has, both on you and on those who read your code (including you!) and those who choose to use Git's tools.
I once had a situation on a client project while at Pivotal where a guy had been working on a branch for 2 months, continuously merging master into his branch. It was a shitshow. He had 45 commits of his own plus 10-15 merge commits from origin/master. Doing a git log of that nonsense was basically impossible to read, impossible to code review (the client's process at the time), and when merged back into master would have been the most crazy ass log --graph I can imagine (because everyone else was doing similar nonsense). We're talking a 10-pair dev team all merging origin/master into their local master. The log was unreadable, unusable, not to be understood by anyone.
Luckily I was able to remove all the merge commits via the 2-argument form of rebase --onto (http://krishicks.com/blog/2012/05/28/git-rebase-onto/), which then gave him the freedom to squash, reorder, fixup, and split his commits as he felt necessary to maintain his and everyone else's sanity. In the end he filtered it down to something like 20 very clean, green commits that could be reviewed by another person and understood, and rebased on top of origin/master.
The manual already covers this issue. In short, rebase when you pull if you prefer to, but always merge when you push.
After reading this thread how many people would you guess have done that? Fairly good advice but if you wait for everyone to finish the book you'll wait forever. Better educate constructively when needed.
The changing of the SHA per se has nothing to do with why a conflict might arise. The conflict may arise because a particular commit could not cleanly be applied to the new base. The contents of the commit need not even change, which you can verify by looking at the actual contents of the commit, meaning the blobs that it contains. The blobs may stay the same (and indeed, will unless there's a conflict during the rebase), while the SHA changes.
> Part of the SHA contains the SHA of the commit before it.
This is incorrect.
A commit has a tree SHA and a parent SHA associated with it. It's the parent SHA (the base of the commit) that changes when you rebase, which is a cascading change that continues through every later commit in order as the commits are reapplied. The commit's own SHA changes because it is a hash of the whole commit object: the tree, the parent(s), the author and committer lines (including their timestamps), and the message.
You can see the commits along with their parent and tree hashes by doing `git log --format=raw`.
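To make that concrete, here is a minimal sketch in a throwaway repository. The file names, branch names, and the "Demo User" identity are all made up for illustration. It rebases a one-commit topic branch and compares what changed (the commit SHA and its parent) with what did not (the blob for the file the commit added, and the patch itself).

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "A: base"

git checkout -q -b topic
echo feature > feature.txt
git add feature.txt
git commit -q -m "C: feature"

git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "B: unrelated work on master"

git checkout -q topic
old_sha=$(git rev-parse HEAD)
old_parent=$(git rev-parse HEAD^)
old_blob=$(git rev-parse HEAD:feature.txt)
old_patch=$(git show HEAD | git patch-id | cut -d' ' -f1)

git rebase -q master                      # replay C on top of B

new_sha=$(git rev-parse HEAD)
new_parent=$(git rev-parse HEAD^)
new_blob=$(git rev-parse HEAD:feature.txt)
new_patch=$(git show HEAD | git patch-id | cut -d' ' -f1)

git cat-file -p HEAD                      # raw commit object: tree, parent, author, committer, message
echo "commit:   $old_sha -> $new_sha"
echo "parent:   $old_parent -> $new_parent"
echo "blob:     $old_blob -> $new_blob"
echo "patch-id: $old_patch -> $new_patch"
```

The commit and parent lines should differ while the blob and patch-id lines match, which is the point being argued: the SHA moves because the commit object changed, not because the change itself did.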
> Never rebase if you've pushed the commits to a central repo because git will detect conflicts of the changes.
This is also incorrect (or at least badly worded and disingenuous).
Git only knows that the changes you're attempting to push will cause you to lose data, which it says explicitly when you try to push a rebased commit that's already on the remote.
You should of course avoid rebasing commits that are on the remote unless you are the sole committer on the remote branch (meaning nobody else is pulling from that branch). You have to force push the branch when you do so, which is fine when you're the only person looking at it and disastrous when not.
Run `git rebase --abort` before you've finished screwing up, or if it's too late, use the reflog and reset back to where the branch was before you started.
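A rough sketch of both escape hatches, in a scratch repository with made-up branch and file names: first a rebase that stops on a conflict and is aborted, then a rebase that already finished and is undone through the branch's reflog.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "A"

git checkout -q -b topic
echo topic > file.txt
git commit -q -am "topic change"

git checkout -q master
echo master > file.txt
git commit -q -am "conflicting change on master"

# Case 1: the rebase stops on a conflict, so back out with --abort.
git checkout -q topic
before=$(git rev-parse topic)
if ! git rebase master >/dev/null 2>&1; then
    git rebase --abort
fi
after_abort=$(git rev-parse topic)

# Case 2: the rebase already finished; the branch reflog still has the old tip.
git checkout -q -b topic2 master~1
echo notes > notes.txt
git add notes.txt
git commit -q -m "non-conflicting change"
before2=$(git rev-parse topic2)
git rebase -q master
git --no-pager reflog show topic2         # newest entry is the rebase, the one below it is the old tip
git reset -q --hard "topic2@{1}"          # back to where the branch pointed before the rebase
after_reset=$(git rev-parse topic2)
```

In both cases the branch ends up on exactly the commit it started from. Right after a rebase, `ORIG_HEAD` usually points at the same pre-rebase commit, but the reflog keeps working after later operations have overwritten it.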
My team now has a merge only policy and it's reduced the number of conflicts and borked merges.
If you're dealing with a large code base with many submitters, you have to demand that individual submissions apply cleanly. That is isomorphic to demanding that they be rebased.
> If you're dealing with a large code base with many submitters, you have to demand that individual submissions apply cleanly. That is isomorphic to demanding that they be rebased.
To play devil's advocate for the merge perspective: wouldn't this also be achieved if the commit author had merged master into his branch first before submitting? It will apply cleanly because conflicts were fixed during the merge, rather than because it was rebased.
Whatever is on origin at the moment of you wanting to merge your code is what your code has to work against.
When you use a rebase strategy you're allowed the opportunity to make sure your commits actually work on top of what's on origin, and fix merge issues with those commits as they happened.
So you rebase, run into a conflict, fix the conflict, run your tests, make sure everything's green, and continue the rebase.
When you do the same thing with a merge strategy what happens is you end up fixing the conflicts as part of the merge, and your fixes are hidden in the merge commit.
This means your commits prior to the merge commit are broken. They didn't take into account the work that was on origin at the time, and thus they are useless without considering the conflicts that were resolved during the merge.
The history you describe is not useful unless it's green and could be applied to origin without conflict. The only way to ensure this, both before and after your commits end up on origin is to do so via a rebase strategy.
If I'm making a change to some code and someone else makes a different change to it and pushes their change to origin before me, I do a rebase and see that they made the change, fix my commit (which is broken at that point in time), resolving the merge conflict, and continue on.
Instead of seeing some changes that have no basis in reality because they were fixed as part of resolving the conflict when doing the merge, you see only their changes applied on top of the correct state of the world, which gives you a clearer idea of what changes they made.
You can still get logical chunks of work with a rebase strategy: you simply rebase on top of the remote and then do a non-fast forward merge, via merge --no-ff.
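Sketched out in a scratch repository, that looks something like the following. The branch and file names are invented, and the local master branch stands in for what would normally be origin/master.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "A"

git checkout -q -b feature
echo f1 > f1.txt
git add f1.txt
git commit -q -m "feature: step 1"
echo f2 > f2.txt
git add f2.txt
git commit -q -m "feature: step 2"

git checkout -q master
echo b > b.txt
git add b.txt
git commit -q -m "B: someone else's work"
master_tip=$(git rev-parse master)

git checkout -q feature
git rebase -q master                      # feature commits now sit on top of B

git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature
git --no-pager log --graph --oneline

# One word for the commit itself plus one per parent
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
```

The graph should show a single merge bubble whose side branch starts at the current tip of master, so the feature reads as one unit without any back-merges inside it.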
There's history in the sense of "log of everything that happened" but also in the sense of a nice record of decisions that were made, documentation essentially. I'm guessing "no feature branches" aims more for the latter. Personally, I'm not sure why we shouldn't have both.
On the whole, I think restructuring is good (I haven't always felt this way), but something is lost in the process.
I have no idea how toggles could negate the need for isolated development branches on an 'important' codebase with multiple contributors working on the same areas of concern.
The key is to do continuous integration properly - i.e. actually integrate your changes with the rest of the team several times a day. We actually deploy to production several times a day. This requires you to break down changes into very small steps that can be completed, tested, and deployed in an hour or so.
Each small change is far less likely to break something, or cause performance degradation. If it does, then you know exactly what caused it without painful debugging. Each small change is unlikely to cause significant merge pain for the rest of the team.
This requires you to have excellent automated testing and rapid <5min deploys.
There are also tools like branch by abstraction and feature toggles which can help but are only tools. The key is how you work as a team.
Feature branches are another way of working. They may be the only way of working in many cases (e.g. distributed team of sporadic contributors). Feature branches are almost the opposite of continuous integration. By definition changes on the branch are not integrated.
On one hand it's good to preserve things exactly as they happened. How else will we blame Dumb Steve guilt-free when someone branches from an arbitrary hash to give a one-off release to an angry customer, thereby bifurcating the state handling logic that allows the machine to be certain that it's not emptying a crucible on a crate of orphaned puppies? Think of the puppies!
On the other hand, it's good to align commits to units of work. It's cool that you've forked the project to handle massively concurrent input caused by edge case X (sorry, I used up most of my creativity on the last paragraph), but really I just want your cool configuration parsing piece. Can I cherry-pick that, please?
Git was designed to give you the choice. Do you care about preserving the ability to perform an arbitrary audit, or do you care about treating development as a series of portable patches? You decide, and then do what's right for you.
Concrete example: if I'm the maintainer of wonnage/foo and bar submits a pull request from bar/foo, using git pull bar --rebase doesn't make a whole lot of sense.
Even though git is distributed, most orgs have a central repo that they use as the source of truth. When contributors pull from this central repo, IMO they should be rebasing, not merging. When the maintainer pulls from other people, yes, he should of course be merging.
The merge commit that a contributor implicitly creates when they resync with upstream (without rebasing) is ugly and confusing. It provides no useful information and only serves to clutter the project's history, not make it clearer.
If you're the maintainer of wonnage/foo and bar submits a pull request from bar/foo, bar/foo should have been rebased on top of wonnage/foo, not the other way around.
(And if you are in a situation where rebasing on top of bar/foo would actually do anything, that means you have local commits that haven't been pushed to origin, in which case accepting the pull request is dangerous.)
In this case, the problem is that development isn't being done on a branch. One of the most important features of Git is how easy it is to fork & merge - there's no excuse for doing your work on master.
Work on a branch, merge your branch into master when it's finished. History is completely accurate, you only get one merge commit.
Don't mess up history just to work around bad workflow. Fix the problem at its source.
But when you're working on master, or a branch that's also remote, then you have to rebase. If you don't, you get completely meaningless merge commits. If 3 people merge when they just want to push a commit, then your history becomes a tangled, unusable mess.
Another thing, a developer doesn't need to obsessively pull every couple of minutes. If they're fixing a bug and deciding against using a branch for that (seeing that it could be just one commit), they only pull twice; once before they start working, and again when they're done and want to push to master.
If there's more than one person fixing the same bug simultaneously, then it's yet another process bug.
Huh? Do you mean it isn't being done on a topic (aka feature) branch? Master branch is still a branch.
> there's no excuse for doing your work on master.
Many people, including myself, aren't going to create a new topic branch for drive-by commits, such as fixing a typo.
> Work on a branch, merge your branch into master when it's finished. ... Don't mess up history just to work around bad workflow.
You're technically rewriting history, but it's not like you're rewriting history that's been pushed (which is typically a big no-no). Many people prefer a cleaner, more linear history from rebasing.
And if you're implying that using topic branches somehow fixes the "problem" of having to rebase, that's not true. If you've pushed your topic branch to a remote git repo then anyone else with commit access can make changes or you can even make changes on the remote repo if the web interface allows it (like how GitHub does).
Yes, I thought that was obvious from the context, but I admit I could have worded it better.
> Many people, including myself, aren't going to create a new topic branch for drive-by commits, such as fixing a typo.
Agreed, me neither, but this scenario is still no excuse for the advocated "Always use pull --rebase" - I would fix this by:
# See typo
# Immediately `git pull` to make sure it hasn't already been fixed
# Fix typo
# Commit, push
The likelihood that somebody will have pushed another commit in the time it took to do the above is so small that it's just not worth doing a precautionary --rebase. In the event that it did happen, I would use either rebase or a soft reset to tidy up the history before pushing.
> Many people prefer a cleaner, more linear history from rebasing.
IME this causes more problems than it solves: The history looks beautiful and linear, but it's a lie. The biggest problem with it is that the timestamps go out of chronological order on an apparently-linear history.
When you have a situation of "This bug started happening at 4pm" and you look at a linear branch that has commits for 5pm, 4pm, 3pm, 2pm, 1pm.. you'd be forgiven for not looking further, and so would miss that there was a SECOND commit at 4pm just before the 1pm one, courtesy of a rebase. If you'd kept history "true" by not rebasing, you'd see that there were two branches and therefore you'd investigate both for commits at the right time. This can lose you significant amounts of time when you're trying to track down a bug.
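A small sketch of the effect, with invented dates and file names: a commit authored at 1pm is rebased on top of one authored at 3pm. The rebase keeps the author date and only refreshes the committer date, so a log that shows author dates ends up out of order.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

# Hypothetical helper: commit with a fixed author and committer date.
commit_at() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" git commit -q -m "$2"
}

echo a > a.txt
git add a.txt
commit_at "2013-04-01 12:00:00 +0000" "noon: base"

git checkout -q -b topic
echo t > t.txt
git add t.txt
commit_at "2013-04-01 13:00:00 +0000" "1pm: topic fix"

git checkout -q master
echo m > m.txt
git add m.txt
commit_at "2013-04-01 15:00:00 +0000" "3pm: master work"

git checkout -q topic
author_before=$(git log -1 --format=%at)
committer_before=$(git log -1 --format=%ct)

git rebase -q master

author_after=$(git log -1 --format=%at)
committer_after=$(git log -1 --format=%ct)
parent_author=$(git log -1 --format=%at HEAD^)

git --no-pager log --date=iso --format='%h  authored %ad  committed %cd  %s'
```

Printing the committer date (`%cd`) next to the author date at least shows when each commit landed in its current position, which is one way to spot the hidden 4pm commit described above.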
> If you've pushed your topic branch to a remote git repo then anyone else with commit access can make changes
That still falls into the trap of "minimal branches" - git is good at branching, do it often! If you've pushed a topic branch and are collaborating on it, your work should be on a feature branch of the topic branch. One feature, one branch. Not using git's awesome branch&fork power the way it should be used is what causes 99% of the problems people think they need rebasing for.
Branch early, branch often.
We have a multitude of git repos at $work and we all collaborate on new features. I haven't reached for rebase in months - it's an over-used tool that IMHO most people would be better off without. It gets too much use as a crutch that stops people from working out they're doing something wrong.
> If you've pushed a topic branch and are collaborating on it, your work should be on a feature branch of the topic branch.
A feature branch of the topic branch seems like overkill in most cases. It seems odd to be writing a feature on top of an unfinished feature. Or if it's not a separate feature being added but the work is on the same feature, then creating a new feature branch of the topic branch breaks the branch per feature rule.
All this being said, people should follow the git workflow and guidelines set in place for their team. Don't use git differently than your teammates do, it'll cause issues eventually. If it's a solo project then go wild.
git pull --rebase
git config branch.master.rebase true
git config --global branch.autosetuprebase always
This also helps when pairing with someone else. The process of handling a rebase conflict vs pull/merge (regular pull) is quite different.
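A quick way to check what those settings do, sketched in a scratch repository. The develop branch is just an example name, and the settings are written to the repository's own config here rather than with --global so the sketch doesn't touch ~/.gitconfig.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "A"

# Per-branch: a plain "git pull" on master behaves like "git pull --rebase".
git config branch.master.rebase true

# Applied when a tracking branch is created: mark it as rebase-on-pull.
git config branch.autosetuprebase always
git branch -q --track develop master      # a new tracking branch picks the setting up

master_setting=$(git config --get branch.master.rebase)
develop_setting=$(git config --get branch.develop.rebase)
echo "master:  $master_setting"
echo "develop: $develop_setting"
```

Note that autosetuprebase only affects branches created after it is set; existing branches still need the per-branch setting. Git also has a `pull.rebase` option that applies to every branch at once, if that's closer to what you want.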
Example: I am working on a local branch feature/xyz. I decide I am done with my work and merge feature/xyz -> master. I get ready to push and realize I am 5 commits behind origin/master. How do I fetch them now?
If you use git pull --rebase it will clobber your feature branch merge commit. On the other hand, a fetch & merge is probably not what you want either (you will have a merge commit on top of your merge commit, dawg). git rebase -p is probably the best option in this scenario.
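Here is a sketch of that scenario. Two things are assumptions rather than part of the comment above: a local branch called upstream stands in for origin/master, and the sketch uses --rebase-merges, the option that newer Git versions (2.18 and later) provide in place of -p / --preserve-merges, which has since been deprecated and removed.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "A"
git branch upstream                       # stand-in for origin/master

git checkout -q -b feature
echo c > c.txt
git add c.txt
git commit -q -m "C"
echo d > d.txt
git add d.txt
git commit -q -m "D"

git checkout -q master
git merge -q --no-ff -m "F: merge feature" feature

git checkout -q upstream
echo x > x.txt
git add x.txt
git commit -q -m "X: work that landed upstream meanwhile"

# Plain rebase (what pull --rebase does): the merge commit is flattened away.
git branch flat master
git rebase -q upstream flat
flat_merges=$(git rev-list --merges --count upstream..flat)

# Rebase that recreates the merge on top of the new base.
git checkout -q master
git rebase -q --rebase-merges upstream
kept_merges=$(git rev-list --merges --count upstream..master)

git --no-pager log --graph --oneline master
```

The flat branch ends up linear, while master keeps a single merge commit whose first parent is the upstream tip, ready to push as a fast-forward.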
In this case, you have a merge commit from a branch on to master, which would look like:
A-B(master)----F (merge commit)
   \          /
    C---D----E (topicA)
git rebase --onto origin/master C~1 E
Which would take C's old base, C~1 (B), and replace it with origin/master, but only up to E.
Or, if you still had the branch around that you merged into master (which you do, in many forms, including the reflog, even after you delete the branch):
git rebase --onto origin/master topicA
I wrote a blog post on the different uses of git rebase --onto: http://krishicks.com/blog/2012/05/28/git-rebase-onto/
I knew about git rebase -p and its nuances, but hadn't known about --onto until now. The 3-argument form of git rebase --onto in your blog post was great, thanks!
git reset --hard HEAD~1
git merge origin/master;
git push origin HEAD:master;
git push origin HEAD:old_branch;
I feel like rebasing is the type of feature you really need to understand to use. If you understand the concept and how to apply it, you can do powerful things. If you don't, you'll munge your repo and end up begging the local git guru to bail you out of your mess.
While I don't enjoy looking at a commit history with a bunch of useless merges, it's better than encouraging users to perform an operation they don't understand.
However, sometimes I do 'git pull --no-ff' explicitly, because I want a more-complex multi-commit sequence of operations to stand out as a unit of work.
I really wish git had a primary concept of such a thing. I understand that hg might, but haven't had an opportunity to get that far into an hg project yet.
It will be much better once changeset evolution is turned on by default in a stable release.
Of course, that sort of thing royally messes with 'git bisect'.
If git had a primary concept of "here's a synchronization point", then it'd be much easier to build tools like 'git bisect' (and even 'git log' etc.) around those sorts of paradigms. That'd offer the best of both worlds -- rich, detailed history showing how something got done, but also points at which it was believed that all was good in the world.
A push is (should be?) a natural synchronization point, for example.
This is of course a bit like how branches are merged together ordinarily, at least if you don't go around deleting them the moment you're done with them. And sure enough it looks like you can sort of persuade git bisect to treat ordinary branch merges like this:
Rebasing multiple commits means putting commits into the history that do not correspond to any actual historical version. If you manually test each commit to ensure that the merged version still compiles and runs, or you're sure that the changes don't conflict and don't get unlucky, then you have something akin to a patch set, a semantic rather than actual history that is easier to read and perform operations like git bisect on. But otherwise, you can end up with broken commits in the history that, when retrieved for bisection or other purposes, obscure whatever is actually supposed to be observed.
Regardless of whether merging or rebasing produces 'better' history, if you're lazy, merging lets you stay that way without creating broken history.
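If you do want each rebased commit checked rather than trusted, git rebase has an --exec option that runs a command after every replayed commit and stops at the first failure. A minimal sketch follows. The run-tests.sh script and the CHECK_LOG variable are hypothetical stand-ins for a real test suite, and GIT_SEQUENCE_EDITOR is set only so that no editor can open on older Git versions.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
CHECK_LOG="$work/check.log"               # kept outside the repository so the worktree stays clean
export CHECK_LOG

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # placeholder identity for the sketch
git config user.email "demo@example.com"

cat > run-tests.sh <<'EOF'
#!/bin/sh
# Hypothetical stand-in for a real test suite: record the commit, then check one invariant.
echo "tested: $(git log -1 --format=%s)" >> "$CHECK_LOG"
test -f base.txt
EOF
echo base > base.txt
git add base.txt run-tests.sh
git commit -q -m "A: base and test script"

git checkout -q -b topic
echo t1 > t1.txt
git add t1.txt
git commit -q -m "topic: step 1"
echo t2 > t2.txt
git add t2.txt
git commit -q -m "topic: step 2"

git checkout -q master
echo m > m.txt
git add m.txt
git commit -q -m "B: master moved on"

git checkout -q topic
# Replay each topic commit onto master and run the checks after each one.
GIT_SEQUENCE_EDITOR=: git rebase --exec "sh ./run-tests.sh" master

runs=$(wc -l < "$CHECK_LOG" | tr -d ' ')
cat "$CHECK_LOG"
```

The log should contain one line per rebased commit. With a real test command in place of the stub, a red commit halts the rebase at that point so it can be fixed before continuing.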
I do prefer feature branches (especially git flow: https://coderwall.com/p/d1pkgg ), but for day to day 'agile' development in a small team we don't use them often. YMMV.
But for pulling changes while working on a local branch, git pull --rebase (or an auto-rebased branch) gives a much more readable history (and avoids ugly merge bubbles).
Branch when you need branches, merge when you mean merging, rebase when you're just updating your codebase from shared repository. Keep your local (not pushed/pull-requested yet) branches dirty (commit often!), rebase them into a clean, obvious history before you share them.
Remember - commit history is for other people to READ.
Using it all the time breaks a lot of good DVCS workflows.
Actually, I'd be interested in hearing about any sort of layers on top of git that make it work like SVN (ie: same commands, same workflow, etc.) It would probably help convince some people to move away from SVN.
git fetch && git rebase -p
git pull --rebase will clobber local merge commits (e.g. when closing off a feature branch & getting ready to push the merged results).
I threw together a quick custom action you can use with Sourcetree to utilize git rebase -p. Interested folks can nab it here: https://gist.github.com/dgourlay/5465540 Install instructions here: http://www.derekgourlay.com/archives/478
I think I may need to give git-up a look over though.
Why not rebase the feature branch onto the master branch? Then the merge from feature to master will be a fast-forward rather than a merge commit.
Developing this way one can mostly avoid actual merges. And the repository history is simple and linear afterward. I've been working this way for a while and I have found it effective.
% git checkout feature
% git rebase master
% git checkout master
% git merge --ff-only feature
Any commits that you have made locally (and haven't pushed anywhere public) are safe to reorganize in this way, and it results in simpler history.
Rebasing also increases the number of conflicts you will have to resolve. I've seen this 'always rebase' advice over and over in the last few years, it's stupid advice that betrays a lack of understanding about how git works. There is no advantage whatsoever unless you also squash your commits, and even then there is no advantage unless you are submitting to a project with very careful code review practices (Linux kernel, a project that uses the gerrit code review tool, etc).
As a hg user I really can't have any understanding when I get git messages like "![rejected] master -> master (non-fast forward) error: failed to push some refs to ...". What the hell am I supposed to do with that? And a few minutes later I find out that in git you can't pull all the branches at the same time. Seriously, just use hg and make the world a better place.
I do recommend checking out tig though; that's the one that I have kept. It's also not a gem, it is available through yum or apt (and probably pacman, brew, etc).
Seems like not a worthy objective. Has anyone here ever had a project saved by a tidy commit history?
Parent a b
Merge branch origin/mainline into mainline
Something they did
Something I did
I have introduced git and hg to many people, but I've always tried to tie it into what people do already. Without source control, practically everybody did the same thing: They'd work a little bit, they'd save, they'd work a little bit, they'd save... etc. Using branching for organizing work and a nice cheat-sheet of a few git commands, most non-technical people will be off-and-running.
It all works until the programmers make it complicated with the rebasing. The price of the tidy commit history is the loss of confidence of the rest of the team. I'd rather have the people.
Case in point: https://vimeo.com/60788996
When it's ok: You are working on your own line and the commits you've made are not in a central repo.
Why: Git commits are a hash tree and when you change one commit it changes all the SHAs that come after it. This makes git see your commits pushed to the remote as different than the ones that are local. The commits have the same changes so it puts duplicate conflict markers everywhere.
Most people love it once they realize how it works.
A lot of people also use version control as a backup in case their hard drive fails. Because of this they are scared of staying on a topic branch for very long, so they push and then rebase often, leading to conflicts, which are scary.
Frankly if I "git log -p" and your diffs aren't clean and to the point, that's a code smell (maybe a developer smell?). If you are sloppy and inattentive in your commits, that probably carries over to your code as well. There's more to the code than just the latest version--history is important.
I have worked on two UI-heavy iOS apps, and code reviews are very effective in comparison to the automated testing we have. GitHub & Bitbucket make it easy to see a linear stream of commits and discuss individual lines.
This all falls apart when a programmer takes a day to implement a feature, and then pushes it along with a merge commit that can only be reviewed by manual diffing on the command line (and I have found regressions that way).
% git fetch # can never fail with a conflict
% git merge --ff-only origin/upstreambranch # tries to fast-forward (what a "git ff" alias would run)
# if fast-forward fails, then
% git rebase origin/upstreambranch
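Those steps can be wrapped in a small function. The sketch below is only one way to do it: sync_branch is a made-up name, and the two clones (alice and bob) and the bare origin.git exist only to produce a diverged branch to try it on.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: fetch, fast-forward if possible, otherwise rebase.
sync_branch() {
    git fetch -q origin
    if ! git merge -q --ff-only "origin/$1" 2>/dev/null; then
        git rebase -q "origin/$1"
    fi
}

git clone -q origin.git alice 2>/dev/null
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"              # placeholder identities for the sketch
git config user.email "alice@example.com"
echo a > a.txt
git add a.txt
git commit -q -m "A"
git push -q -u origin master

cd "$work"
git clone -q origin.git bob
cd "$work/bob"
git config user.name "Bob"
git config user.email "bob@example.com"

# Alice pushes more work while Bob commits locally, so the branches diverge.
cd "$work/alice"
echo b > b.txt
git add b.txt
git commit -q -m "B: alice's change"
git push -q origin master

cd "$work/bob"
echo l > l.txt
git add l.txt
git commit -q -m "L: bob's local change"

sync_branch master
git --no-pager log --graph --oneline
```

When there are no local commits the fast-forward succeeds and nothing is rewritten. The rebase only kicks in when the branch has actually diverged, as it has for Bob here.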
The git way of doing what the OP wants to do is to develop all code on local branches. They're cheap and you can have as many of them as you want.
Just never commit directly to master and set it to track upstream/master. Synchronize with upstream by pulling into your local master, and then rebase, merge & diff your local branches at will.
"When working on a project you usually synchronize your code by pulling it several times a day."
Do you? Am I alone in thinking that seems like a really bizarre workflow?
Also, in an actively developed project there are likely to be many new commits every single day, and you want to be up to date with the remote codebase to ease the pain of merging your features into it.
If you rebase your work onto master often you'll greatly speed up the release of your feature when it is ready. You'll also avoid late surprises when something you depend on elsewhere in the codebase changes.
You can also notify your co-workers about inconsistencies between developed features and detect incompatibilities early.
It also works at macro scale - projects released often with small changes shorten the feedback loop and can be adjusted properly to their requirements.
Use G2! which does pull --rebase magically on your behalf.
git config branch.master.rebase true
git config branch.develop.rebase true
This will make any pull on master/develop a pull --rebase.
> People can (and probably should) rebase their _private_ trees (their own work).
I keep seeing bad advice posted there and popping up on HN.