Git is a powerful, general tool that lends itself to just about any workflow imaginable and, because of its adaptability, different teams will naturally converge on different Git workflows as "right." These teams, however, have a hard time imagining and thus accounting for the conditions and preferences of other teams, and end up advocating for their workflows as the best workflow.
Thus we end up with strongly held and yet contradictory beliefs. Some people claim that "rebasing is a bad habit: it destroys history" while others claim that "rebasing is a good habit: it prevents repo cruft." Who is right?
It depends on your preferences, which ought to reflect your team, your project, and your company culture.
For my projects, I prefer to edit my work into tight, clean commits before merging them into the mainline branch. That's because I value the logical story of my software's evolution and not the physical story. I want my commits to tell the story that Feature X was built from three sequentially self-supporting changes XA, XB, and XC, not that, while developing Feature X, (1) I was sick for two days, (2) Bob committed an important unrelated hotfix to the mainline tree that (3) I had to work around, and (4) Sally had to later revert Bob's commit. That I was sick, that Bob committed a hotfix, that I had to work around it, and that Sally rolled it back are all real. They happened. But none of those things is fundamental to the nature of Feature X and how I rendered that nature into code. I consider that stuff noise and make sure it's gone before my commits land in production.
But that's me. Maybe you care about that stuff. If so, your Git workflow and mine are going to be very different.
And that's okay.
Last summer I interned for a company that had recently switched to Git. The workflow was: you are assigned a feature and begin developing it on the shared "development" branch. As you make commits you note the hashes on the company issue tracker. When the feature is finished and signed off, the DevOps people manually parse through all the comments on the issue, manually cherry-pick (!) every commit (that you remembered to write down) to the "production" branch, push, and hope they didn't miss any.
There are, however, workflows that are so bad that they rank poorly under your < operator, my < operator, and just about every other reasonable person's < operator. Your example makes this painfully clear.
Still, someone at the company you interned at had an < operator that pushed that horrible workflow to the top of the list. You can argue that that person was crazy, and maybe you would be right, but maybe that person wasn't. Maybe that person's < operator accounted for things you couldn't see. Maybe the company needed the Git workflow to contort itself into integrating with legacy processes that would have been hugely expensive to change. Whatever the reason, that person's < operator is not one that you or I would want to have forced upon us.
Why then should we assume our personal < operators are good enough to force upon anybody else, let alone everybody else?
I work at a place that has some code in production that's over ten years old, and which no one now working there wrote. We have many more discrete systems than developers. The repo history, including its comments (actually currently in SVN, where rebasing isn't a ready option), is a major tool for interpreting the intent of authors now gone (or who now don't remember writing it, something I'm guilty of, certainly). The clean narrative you are asking for is there, in tickets and internal wiki pages. The actual history of commits helps tell a different story, about whether it was intentional that line 493 was omitted because of the effect on another part of the code, or whether it being missing is a merge artifact or other tool screwup.
We're about to switch to git, and one of the things I worry about is that newer developers steeped in the ways of rebasing will come on board and "clean up" their commits to hide informative early missteps and paper over the cracks and seams of their work with single commit messages per feature which afford no understanding of the difficulties involved in developing it.
Nothing in the post explicitly says this, nor is it implied. You just took 7 grafs to say "the best tool for the job".
To clarify, my comment was about "discussions of Git workflows." For example, the very discussion occurring here on HN in the wake of the original post. In this discussion, you will find numerous comments predicated on the implicit belief that there is some global ordering on Git workflows and, therefore, that one workflow can be strictly better than others. Some examples from the top level:
> I think http://nvie.com/posts/a-successful-git-branching-model/ is simpler and more robust. 
> Just use Gitflow peeps. It's simple, not mentally taxing, and having a feature branch for each feature (JIRA ticket, or whatever unit you're using) makes things very atomic and simple. 
> How is this in any way better or simpler than Git Flow? 
> Rebase is a bad habit to get into (because it means other people can't pull your branches), and a pain to fix when it conflicts. Merge master into your working branch instead.
Also, I didn't write 7 grafs to say "the best tool for the job." I wrote 7 grafs to say that there is no "the job." Rather, there are many different jobs. Your job might be to record the physical story. Mine might be to record the logical story. Different jobs.
Like all tools (especially software ones), it's not how hard you swing the hammer, but rather that the nail ultimately gets secured. The two are often misinterpreted against each other.
This is so important I felt it necessary to quote it and restate it as a comment.
Not knowing this ahead of time caused me to have a serious problem on ship day that most source control solutions make trivial to fix - it got solved in the end, but that hiccup was unwelcome. The result was that I threw git out as a viable option for source control (until hg screws me I have no reason to switch back beyond its popularity).
Bad defaults are bugs imo...
EDIT: to be clear i'm referring to the advice to use --no-ff for a merge. i am of the opinion that --no-ff should be the default, because using it causes no damage, but not using it can cause problems.
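A minimal sketch of what --no-ff buys you, assuming a throwaway repository in a temp directory and a hypothetical branch called feature-x (none of these names come from the original post). Because the merge is recorded as its own commit, the whole feature can later be undone with a single revert:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "base"

# Two commits on a hypothetical feature branch
git checkout -q -b feature-x
echo one > a.txt; git add a.txt; git commit -q -m "XA"
echo two > b.txt; git add b.txt; git commit -q -m "XB"

# master has not moved, so a plain merge would fast-forward;
# --no-ff forces a merge commit instead
git checkout -q master
git merge -q --no-ff -m "Merge feature-x" feature-x

# The merge commit has a second parent (this fails on a fast-forward)
git rev-parse -q --verify HEAD^2 >/dev/null

# Undo the entire feature in one step, keeping master (parent 1) as the mainline
git revert --no-edit -m 1 HEAD
```

Without the merge commit there is no single commit that stands for "the feature", so each of its commits would have to be found and reverted individually.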
Mercurial can do the same sort of history rewriting, to a point, but it is not the default workflow and requires that you explicitly enable that functionality. Even then, when you push a modified history to the central repository, you have the old history around still. Whether or not this is desired is up to your team's workflow, but it does make it harder to lose something.
i believe this is precisely why they suggest using it - because on private branches you can undo merges in much more destructive ways which are not viable on a shared repo - they suggest using --no-ff for when you merge in a feature branch
(edit: it is precisely why and there is an article linked from the original discussing the pros/cons of merge and rebase)
Basically, I agree with you. What I'd personally like to see more of is an easy way for a company to define a workflow for version control, using git as the underlying mechanism, that fits exactly how they want it to work (think, if the company wants Git Flow, or some other workflow, they define it in my imaginary tool), but for the most part the developers never have to drop down to git to use it. They have an abstraction layer over the top of it.
I saw a paper, gitless, I think it was, that is trying to do something similar but less ambitious. Perhaps in the future we'll see stuff like it. GitHub Desktop and Atlassian Stash also do something similar, but not quite what I'm getting at.
I believe that version control should really get out of your way, and right now, that definitely isn't the case. To me, all the arguments about git's power just sound like "why use Dropbox when you can just sync your files with FTP?"
Maybe, but it might instead be better to ask people to invest a small amount of time in learning to use tools that make them more effective at doing their job. "I'm working on a new feature" and "I'm done with the new feature" are, sadly, not actually expressive enough concepts to deal with the problem.
Syncing files with FTP is pretty much at the same conceptual level as "Use Dropbox". Git solves problems significantly more complicated than "I'm working on a new feature / I'm done working on a new feature".
EDIT: when I say "conceptual level", I mean, "what you do with it", not "how easy it is to use". Obviously, Dropbox is more usable than FTP. My claim is that Git provides a much different level of power than "I'm working on a feature/I'm done with a feature", which is not a sufficiently expressive concept to deal with software development.
I disagree. My mom loves Dropbox -- she just sticks a file in a folder like she always does and it's magically synced. If she were to use FTP, she'd have to set up a server, remember login credentials or find her public key, and grab an FTP client. And not every client supports automatic syncing either, so she might have to manually trigger a sync.
Of course, FTP has way more power and flexibility than Dropbox, just like git vs <proposed VCS>. If your workflow is especially complicated, I'm sure all the git functionality comes in handy and an abstraction would be a hindrance. But most workflows, I'd argue, don't need all the plumbing.
So as long as you want to use the Git Flow model, which is probably the most widely understood/used, your wish has already been granted!
"At this point solve any conflicts that come out of the rebase"
hides (in my experience) a huge amount of complication. This can get very hairy, and the article doesn't discuss how to do it at all, even though it's more complex than any of the rest of the stuff shown here.
Maybe I just don't use it enough. Right now I'm in a job where my git workflow is: pull, (do some work), commit, push.
Could somebody please explain this more? My understanding of rebase which comes from  is that it's used to bring a feature branch onto a master branch, or similar, and 'erase evidence' of there ever having been a branch, and squash the intermediate commits on that branch, to make things cleaner.
For bringing changes on the master branch into your feature branch, what's the benefit of using rebase instead of just normally merging the changes in? I'm clearly missing something here. They say 'Resolving conflicts during the rebase allows you to have always clean merges at the end of the feature development.', but I don't see what merge vs rebase has to do with resolving conflicts -- you have to resolve conflicts when you merge master into your feature branch just the same.
I can understand using rebasing to keep the master branch's history 'clean', but what's the reason with a feature branch?
I use it to keep things organized, and also (as you mentioned) for squashing consecutive commits.
In case this helps anybody else:
I had always understood rebase as a mechanism that replaces merging a feature branch onto master, that allows you to squash all the 'messy' development commits on the feature branch, onto a single nice clean commit on master. In this model, you can be as messy as you want on the feature branch (with 'whoops bugfix' commits and lots of merging in), and the feature branch is generally short-lived -- once you rebase+squash it onto master, the feature branch gets deleted, and you start a new one if necessary for further work.
The proposed model is the opposite -- you try to keep a clean feature branch. Every time you want to fold in changes from master, you do it as rebase, which has nothing to do with squashing anything -- it's about bringing forward the starting point of your feature branch to match master's current state. Then, when it's time to bring feature into master, you merge, not rebase, so the branch history is preserved, it's just nice and clean. And I suppose you can keep the feature branch around for more work on it, so it's almost more of a long-lived 'topic' branch with a bunch of individual features stringing along it.
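A rough sketch of that second model, assuming a throwaway repository and hypothetical branch and file names: the feature branch is rebased to move its starting point up to master's tip, and the final integration is a real merge commit so the branch stays visible in history.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"

# Work starts on a hypothetical feature branch
git checkout -q -b feature
echo f1 > feature.txt; git add feature.txt; git commit -q -m "feature: step 1"

# Meanwhile master moves on
git checkout -q master
echo hotfix > hotfix.txt; git add hotfix.txt; git commit -q -m "hotfix on master"

# Fold in master's changes: replay the feature commits on top of master's tip.
# Nothing is squashed; the commits just get a new base (and new hashes).
git checkout -q feature
git rebase -q master

# Integrate with a merge commit rather than a fast-forward
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature

git log --graph --oneline
```

The graph printed at the end should show a single side branch that starts at the hotfix commit and joins master at the merge.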
The funny thing is, I've never seen a blog post that talks about both kinds of rebasing at once. Every one I've seen seems to look at it from one viewpoint or the other, which is possibly why rebasing can get so confusing (maybe I've missed one that does explain it well, both ways). The "git-scm" site describes the first way, but this blog post helped me understand this other way of rebasing:
You ARE right, that before such a merge, it's a common practice to squash your commits on the feature branch to clean it up and then merge a single commit over to master. (To squash you use git rebase -i but you're squashing not rebasing).
If you were to actually rebase this branch into Master instead of merging it, it would change the commit hashes for commits already in master, leading to awful merge conflicts the next time a teammate tried to pull down your changes.
I wanted to mention this to hopefully connect the last few dots. The reason most articles don't talk about "both" these ways, is because really there is only one way, with a few different choices.
Maintaining a branch: If you need to bring changes from Master into a long-running branch, you can either do so with merge or rebase. Rebase is cleaner IMO.
Preparing to integrate your work into master: You can either squash some or all of your commits, or you can keep the history as-is.
Integrating into master: You always do this with a merge.
So I guess I'm starting to see how these are somewhat the same thing in the end, just from different perspectives. Thanks again.
For squashing into a single commit, you can even use
git merge branch_name --squash
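For what it's worth, here is a small sketch of that command in a throwaway repository (the file names and commit messages are made up for illustration). Note that --squash only stages the combined changes; you still have to commit them yourself.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"

# Three messy commits on the branch
git checkout -q -b branch_name
echo draft > feature.txt;  git add feature.txt; git commit -q -m "wip"
echo better > feature.txt; git commit -q -am "whoops bugfix"
echo final > feature.txt;  git commit -q -am "more fixes"

# Stage the branch's combined changes on master without committing
git checkout -q master
git merge -q --squash branch_name

# Record them as one ordinary (single-parent) commit
git commit -q -m "Add feature as one clean commit"

git log --oneline
```

One thing to keep in mind: the resulting commit has no link back to branch_name, so Git does not treat the branch as merged, and git branch -d will refuse to delete it without -D.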
As far as I can tell the only real difference is the existence of a develop branch in git-flow that allows integration testing before a batch release.
For example, when I'm on branch Feat-A, but I want to make a change related to Feat-B (or I already have but haven't committed), I'll stash my changes, merge Feat-A into dev, merge dev into Feat-B, pop my changes off the stash, and commit.
The whole switch goes:
- "Oops, these changes don't go on this branch"
- git stash
- git checkout dev
- git merge Feat-A
- git checkout Feat-B
- git merge dev
- git stash pop
It takes half a minute, but it keeps my branches clean, and lets me move from feature to feature at will. I may add a 'git oops' alias just for this process.
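A possible shape for that 'git oops' helper, written as a plain shell function rather than a real alias. The function name, its two arguments, and the demo repository around it are all assumptions for illustration; the six git steps are the ones listed above.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/dev      # start on a branch named dev
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > shared.txt; git add shared.txt; git commit -q -m "base"
git branch Feat-A
git branch Feat-B

# Hypothetical helper: move uncommitted work from branch $1 to branch $2 via dev
oops() {
  from=$1
  to=$2
  git stash -q                       # set the misplaced changes aside
  git checkout -q dev
  git merge -q --no-edit "$from"     # dev picks up the source branch's commits
  git checkout -q "$to"
  git merge -q --no-edit dev         # the target branch catches up with dev
  git stash pop -q                   # bring the misplaced changes back
}

# Demo: commit something on Feat-A, then edit a file that belongs to Feat-B
git checkout -q Feat-A
echo a > a.txt; git add a.txt; git commit -q -m "Feat-A work"
echo "change that belongs to Feat-B" >> shared.txt

oops Feat-A Feat-B
git commit -q -am "Feat-B work"
```

Keep in mind that the two merges carry Feat-A's commits into dev and Feat-B. If Feat-B does not need them, stash, checkout, and stash pop on their own would move the uncommitted changes.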
For example, I just started learning Rails, building a new app from scratch, and you can imagine it becoming messy.
Before I get confident in a certain architecture, I would just push to master all the time with "some more blah" messages.
I think rebasing can be valuable in this sense, since it will destroy crappy history.
I'm more fond of the git flow workflow in a team development effort, especially if your thingy is already in a production pipeline.
I've found rebasing to be often very confusing and destructive, while merges are clear and easy to manage.
Nevertheless, IME finding the right branching/release paradigm in group development situations has not been the biggest problem. The bigger hurdle has been reaching basic understanding of git and SCM tools in general -- even among groups of very good/experienced programmers.
Edit: Never mind, I wasn't remembering git flow correctly.
I don't think this use case is uncommon, and unfortunately when multiple people are working on a shared remote feature branch you can't rebase the branch from master and then push it either.
What work flow do people who have shared remote branches use? Or do you think that way of working is broken?
I understood everything except this step. Could someone clarify this for me? Especially this part: "In some cases ... you can rebase also during development, but I strongly advise against it." Isn't the entire article about rebasing during development? Why did this become an "in some cases" thing now in step 6, when step 4 (rebasing during development) seems like the critical step in the whole process?
Otherwise, this is a great step-by-step (especially seeing the commands that get run for each step -- up until now I only understood the rebase process conceptually, but was always scared to try it due to not knowing the exact commands to run in the exact order, or which branch to run them on). Thanks!
"Rebase also during development", I'm assuming they actually mean "squash" commits, since step 4 was all about rebasing. But if the whole point of rebasing instead of merging is to keep the feature branch clean as you go along, why wouldn't squashing "less tidy" commits also be strongly encouraged?
Man, you think you know git well enough, you've got git-flow down, then you read an article like this, and realize there are people who use it in a totally different way, and you're confused all over again.
One difference is that instead of fetching and rebasing changes from origin straight into your topic branch, we recommend checking out master, pulling in changes, checking out your topic branch and rebasing against master. This way master is also up to date.
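Roughly, that sequence looks like the sketch below. The setup is an assumption made so the example runs on its own: a local bare repository stands in for origin, and a second clone plays the teammate who pushes to master while you work on a hypothetical branch called topic.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d) && cd "$work"

# Setup only: a seed repo, a bare "origin", and two clones of it
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m "base"
cd ..
git clone -q --bare seed origin.git
git clone -q origin.git mine
git clone -q origin.git teammate

# You start a topic branch
cd mine
git checkout -q -b topic
echo work > topic.txt; git add topic.txt; git commit -q -m "topic: work"

# Meanwhile a teammate pushes to master
cd ../teammate
echo fix > fix.txt; git add fix.txt; git commit -q -m "teammate: fix"
git push -q origin master

# The recommended sequence: update master first, then rebase the topic onto it
cd ../mine
git checkout -q master
git pull -q --ff-only
git checkout -q topic
git rebase -q master
```

Afterwards both the local master and the topic branch contain the teammate's fix, and the topic commit sits directly on top of origin/master.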
As for the .gitconfig tip near the end, I don't think changing the behaviour of such a common command like pull is a good idea. Better to be explicit.
We don't have that problem because we keep topic branches personal (but still public so they can be reviewed). In other words, our topic branches branch off from master, we don't branch off of someone's topic branch to build a new feature on top of it while the current feature is still ongoing.
We try to make branches short-lived and break larger tasks into small parts that can ship individually.
We use this more often than not here, and never have problems. It's a great way to work on a dedicated feature while still getting features from others, yet doesn't suffer from sometimes hard to read history. Does this mean rebase is king and merge isn't? No. Is it the other way around then? Also no. Both are fine if you know how to use them.
True, but the big benefit of git comes from those branches being public, IMO.
> and even then some communication with the other people makes it pullable: tell them to first git reset --hard xxxxx
If other people are actually using the changes on your branch (and if not, why did they pull it?), you end up having to do staircase rebases, with everyone fixing the same conflicts again every time they rebase. It's not the end of the world, but it's noticeably worse than using merge.
> yet doesn't suffer from sometimes hard to read history
IME the only difficulty comes with tools that try and display a linear view of history, and with rebase you sacrifice an accurate time-ordering of commits, making it very hard to find a commit if you were working on several branches at the same time. As long as you configure your tool to show a tree/graph of commits, merged history is easy to follow, and keeps the time-order correct.
I don't know if you've looked at the graph of a Git repo where people merge instead of rebase, but even in that view it's nearly impossible to track even the history of master. It's a mess. Rebase builds such cleaner graphs that I would only advocate merge when you have no reason to ever look at the graph view.
For me, when I'm ready to merge to master I can almost always squash all my work into one or two clean commits. At that point a fast-forward is the simplest solution. Merge-only workflows are great for long-lived public branches (if you have an integration and release branch for instance) but they're insane for feature branches.
We use merge instead of rebase. Here's a snapshot of the graph:
| | | | | | | | | | | | | | |
* | | | | | | | | | | | | | |
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| |_|/ / / / / / / / / / / / /
|/| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
as a real world example bitbucket added a cool feature recently where they grey out merge commits.
personally i love rich and complete history, including merge commits - any practice that damages that is something i'll be highly reticent about adopting.
irl i generally don't need to time travel to achieve my goals...
Interesting. I went a different route by making it very easy for users to dissect a merge commit. For example, if you click on the "Find included commits" link in the first commit at:
you'll be able to see all the commits that it included.
The only reason I can think of for others to pull your personal topic branches is to review them locally or help you debug something. Others shouldn't be branching off of your personal topic branches to work on them; use feature flags instead.
We do continuous delivery and our branches typically live for about an hour before they are pushed to production.
Continuous delivery is about working in small deployable steps and using techniques that allow you to release incomplete features into a production environment (Feature toggles, Branch by Abstraction).
Feature branches are pretty much the opposite of continuous integration. By definition your changes are not integrated into production or with other people using other branches.
In fact, the TeamCity CI system now supports automatically building feature branches -- so the two concepts are definitely not at odds.
git help workflows
So I'm resolving now to commit... when?
Any time a new test succeeds (without breaking old tests)
Any time I run a test
Something else I haven't thought of?
It needs to be something more concrete than "when you've done something significant". The mental energy drain of deciding "significant" is a killer.
For example, I'm working on a simple landing page and form. Some logical points for doing a commit may be something like:
1 basic responsive layout complete for mobile
2 progress on desktop responsive layout (going home)
3 completed desktop responsive layout
4 wired up forms
5 added validation to forms
6 cleaned up duplicate css
7 adjusted css for other, crappy, browsers
8 bug fixes
9 fixed a typo
10 bug fixes & final layout issues
Now I'm done and I want to merge that to master. But what I want the commit history to look like is a group of logical steps, excluding things that will be meaningless/useless to other people when they look back at the project history. So I would squash 2 and 3, 4 and 5, and probably 6 thru 10 so that I ended up with 4 total commits 1, 2/3, 4/5, 6/7/8/9/10 that get put onto master.
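Normally you would do that grouping by hand in the editor that git rebase -i opens, changing "pick" to "squash" on the rows to fold into the row above. The sketch below scripts the same edit so it can run unattended; the throwaway repository, the placeholder commit messages, and the sed-based GIT_SEQUENCE_EDITOR trick are all assumptions for illustration, not part of the workflow described above.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo start > page.txt; git add page.txt; git commit -q -m "initial"

# Ten small commits standing in for steps 1-10
for n in 1 2 3 4 5 6 7 8 9 10; do
  echo "change $n" >> page.txt
  git commit -q -am "commit $n"
done

# The todo list has one "pick" row per commit, oldest first.
# Turning rows 3, 5 and 7-10 into "squash" folds each into the row above it,
# leaving the groups 1, 2/3, 4/5 and 6/7/8/9/10.
# GIT_EDITOR=true accepts the combined commit messages without prompting.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '3s/^pick/squash/' -e '5s/^pick/squash/' -e '7,10s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -q -i HEAD~10

git log --oneline
```

The file contents are unchanged by the rebase; only the number of commits that tell the story goes down from ten to four.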
So, you definitely want to try to remember to commit over the course of a feature, but the actual times you commit are a little bit less important because you should clean up your history before you move it to master. In general I try to commit discrete pieces of functionality (like a working form, or a working layout, or a tested class/method).
Then rebase it to a consistent history with consecutive small changes later when it works. The reason many commits are good to have is that it tends to be easier to extract small independent changes that way.
git rebase origin/PRJ-123-awesome-feature
At this point solve any conflicts that come out of the rebase.
I don't know about this, we've used this before and what winds up happening is a lot of
git push --force
You can only really rebase onto one remote branch. If two people are working on feature branch PRJ-123-awesome-feature, then you both rebase onto origin/PRJ-123-awesome-feature. You have to merge with master if you wish to keep in sync.
Prior to one of you pushing to master you can opt to do a final rebase onto master, but I've always kept these long-running branches in history. I too try to avoid cluttering history with too many merge commits, but it's a very different thing if these long-running feature branches wind up as merges in history.
Very similar to this, but we add a couple of steps to allow a business owner to examine a feature before it gets accepted and also we use an integration branch to deal with merge conflicts before creating a releasable master branch.
> (At this point if you have rewritten the history of a published branch and provided that no one else will commit to it or use it, you might need to push your changes using the --force flag).
I don't think it's a great idea to institutionalize what most would agree is a Bad Git Practice, especially in a multi-user environment.
Here is the cached version: http://webcache.googleusercontent.com/search?q=cache:https:/...