I was researching some of this a few weeks ago, and there are many posts about tags being a bad idea. The arguments are that they have to be maintained separately and they lack context.
Speaking of bad ideas, anyone want to weigh in on merge commits? I've seen arguments in favor of using `git rebase` everywhere, arguments both ways, and arguments for never squashing commits. Never squashing keeps `git bisect` usable, since you never run into a situation where it points to a massive commit as the problem.
On the other hand, that seems pretty terrible from a `git log` perspective, since many commits are WIP. But maybe it's not a big deal. My bigger concern is that merge commits provide real context: whenever you merge a topic branch into master, it seems to make sense to have a merge commit for that entire operation. But wouldn't that cause `git bisect` to always point to that merge commit rather than one of the smaller commits?
I always advocate workflows where merges never happen.
To me, the 'git log' of a master branch is like a history book about the repository. When you read through it, it should give you answers to the questions "what was changed, why, and when" in as clear a format as possible.
Now, I've read most of the arguments trying to show that merges are the way to do just this, instead of rewriting history with interactive rebasing. I won't repeat them all here, but just want to ask this: when one reads a real book about real world history, does it look more like git log history of a repo which has been using rebase, or merge?
I'm not sold on the idea of the master log as a readable narrative.
If you really want to maintain such a narrative, it would be possible to do it separately, in something like a changelog.
You might even hire someone to do that, a kind of technical historian. It's real work, because software development is pretty messy.
If you reaaally care about the history of the software's development, I would seriously consider aggressive rebasing of the repository even much later -- you could refactor the commit history as much as you want to clarify the logical progression of the software.
As I see it, a source code repository is not like a history book, because a history book is written after the fact by a trained historian who spends a lot of energy on tidying up the narrative and making it actually comprehensible.
A source code repository looks to me more like an archaeological artifact with some terse notes sprinkled in there as clues by the various workers.
Basically I think the git log structure is kind of overblown and workflow arguments that hinge on the legibility of the repository's graph structure don't really matter that much to me.
I still sometimes write pretty involved commit messages, but that's a kind of separate issue from these "workflow" discussions that are mostly about how you should formally arrange the DAG. And I also know that my commit messages are mostly lost in time like tears in rain, so I try to communicate important changes in other ways.
It looks more like merge. Chinese history and European history merged around 1200, and then again a few centuries later. American history and European history merged around 900 and then again in 1492.
But, why not both? Using interactive rebase lets you keep a clean and bisectable history made of small commits. Every now and then the developer can also rebase onto master and ensure that all commits pass the tests (and perhaps write more tests based on what happened in the meantime on the master branch).
However, when CI runs, features are included in master with a merge commit, so that the occasional semantic merge conflict will bisect exactly to the merge commit and the developer of the feature isn't blamed incorrectly.
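To make the "semantic merge conflict" case concrete, here is a minimal sketch in a scratch repository. The invariant (every `use*.txt` must match `name.txt`), the file names, and the commit messages are all invented for illustration. Both branches pass the check on their own and merge without a textual conflict, so the merge commit is the only place the breakage can show up.

```shell
#!/bin/sh
set -e

# Scratch repository; the "test suite" is a hypothetical file-consistency check.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Dev"

# Invariant: every use*.txt must have the same content as name.txt.
echo foo > name.txt
echo foo > use1.txt
git add name.txt use1.txt
git commit -q -m "initial"
git tag good-base

# The feature adds a new user of the old name. Fine on its own.
git checkout -q -b feature
echo foo > use2.txt
git add use2.txt
git commit -q -m "feature: add use2"

# Meanwhile master renames foo to bar everywhere it knows about.
git checkout -q master
echo bar > name.txt
echo bar > use1.txt
git commit -q -am "master: rename foo to bar"

# No textual conflict, but the merged tree violates the invariant.
git merge -q --no-ff -m "Merge feature" feature
merge_commit=$(git rev-parse HEAD)

git bisect start HEAD good-base >/dev/null
git bisect run sh -c 'for f in use*.txt; do cmp -s "$f" name.txt || exit 1; done' >/dev/null
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

git log -1 --format='first bad commit: %s' "$first_bad"
```

In this setup `first_bad` should be the merge commit itself, so neither the feature author's commit nor the rename on master gets the blame.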
> when one reads a real book about real world history, does it look more like git log history of a repo which has been using rebase, or merge?
This is a bizarre analogy. History books are a record of things that happened in the world's timeline, which is in fact linear. Source control is a record of things which happened in the timeline of development, which is typically branched.
Imagine a history book from some terrifying PKD-esque sci-fi universe where timelines branch and re-converge. Does that look more like git log history of a repo which has been using rebase, or merge?
Sure, I agree. But a history book tends to be more substantive than "WIP building a nation", which is what most of our commits look like in practice.
Squashing would seem to be the answer, but do you feel that's a bad idea? It certainly has tradeoffs. You can easily end up with a massive squash commit.
I really want to keep the WIP commits. They provide context even if their log messages don't, and they make git bisect easier. But I don't think anybody does that, and I'm curious why.
My bad, I was going to edit my comment to be less coy and more constructive, but I'll just continue here.
Even though I always advocate rebasing, I also think that WIP commits should not reach master as-is. We are actually using Phabricator with my current team, and they have a really nice opinionated way of doing development. All dev usually happens in (really short-lived) task branches, but once they have gone through review etc. they are landed on master as one single commit, with the commit message holding all the relevant information about the change(s) made in the original task branch.
The reason for not having the WIP commits on master is that "commit early, commit often" is good practice, so the sheer number of WIP commits will completely drown out the actual, finalised changes (i.e. the "actually interesting history as in a history book") in master. So no, I don't think squashing is a bad idea; I think it is absolutely essential if you rebase onto master. If you don't squash, rebasing might lead to more of a mess than using merge.
Now, as to keeping WIP commits. I think that their value is usually much overestimated. I can count on one hand the times I have needed to go back to the actual, raw WIP commits instead of the properly rewritten one in master. But if you feel that it's the only thing keeping you from switching from a merge-intensive flow to an always-interactive-rebase one, I'd encourage you just to retain the original, short-lived development branches in origin as separate branches. IMHO that gives the best of both worlds, if you think throwing out the WIP commits could hurt too much.
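For anyone who wants to try the "land as one commit" idea without Phabricator, a rough plain-git approximation is `git merge --squash`. This is only a sketch: the branch name `task-123`, the file, and the commit messages are made up, and it is not a description of what Phabricator's own tooling does internally.

```shell
#!/bin/sh
set -e

# Scratch repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > app.txt
git add app.txt
git commit -q -m "initial"

# Short-lived task branch with "commit early, commit often" WIP commits.
git checkout -q -b task-123
echo step1 >> app.txt
git commit -q -am "WIP"
echo step2 >> app.txt
git commit -q -am "WIP more"

# Land the whole branch on master as one commit with a full message.
git checkout -q master
git merge --squash task-123 >/dev/null
git commit -q -m "Add step1 and step2 to app

Longer description of what changed and why, written after review."

master_commits=$(git rev-list --count master)
git log --format='%s' master
```

Master ends up with two commits (the initial one and the squashed one), while the WIP commits remain reachable from `task-123` for as long as that branch is kept around.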
---
Edit: oh, and having massive squashed commits should not usually be a problem, because mostly they should not happen. Individual tasks should be so small that implementing them cannot result in a massive number of changed lines.
So, all dev happens in feature branches, and they're integrated into master as squash commits. That just leaves two questions:
- Use release branches? Or just take the shotgun approach of "everything in master has to be working all the time"?
- When a problem inevitably pops up and you have to roll back, how do you kill just one commit? It's already been pushed to the repo, so a hard reset wouldn't be a good idea, right? So I guess that points to using release branches.
It would be optimal if the team can have the discipline to never commit broken stuff, and we could live with "everything in master must be working all the time".
If that does not work in reality, or breaking it would simply cost too much (because mistakes will happen), I'd use a release branch or a git tag to mark versions with "no really, this one really truly actually works in every way".
For your second point, see 'git revert'! It does exactly that, i.e. picks a commit and effectively removes its changes from a branch by adding a new "mirror commit" that undoes them, without rewriting the history that has already been pushed. The other way is, of course, just deploying an actual fix asap; git revert is for when fixing the problem is for some reason or other slow and master must be unbroken immediately.
But yes, master branch master race.
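A small sketch of the `git revert` route, again in a scratch repository with invented file names and messages: the broken commit is not the latest one, and reverting it leaves the later, unrelated commit in place.

```shell
#!/bin/sh
set -e

# Scratch repository; everything here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Dev"

echo v1 > app.txt
git add app.txt
git commit -q -m "initial"

echo v2 > app.txt
git commit -q -am "feature A (turns out to be broken)"
bad=$(git rev-parse HEAD)

echo extra > other.txt
git add other.txt
git commit -q -m "feature B"

# Undo only feature A by adding a new commit; nothing already pushed is rewritten.
git revert --no-edit "$bad" >/dev/null

git log --format='%s' master
```

After this, `app.txt` is back to its pre-feature-A content, `other.txt` from feature B is still there, and the history has grown by one commit instead of being reset.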
Also related: "What are the problems with 'a successful Git branching model'?" https://barro.github.io/2016/02/a-succesful-git-branching-mo...