Every time I have been in a development team that proposes anything more complicated than a basic branch, I start to break out in hot sweats.
I think this has come from experiences where we have had multiple live branches and then got into a mess with merging, testing, incompatibilities, not having the right thing on the right branch etc.
When it does all work, it feels more luck than judgement.
I agree with a poster above that some of the teams that I have been on seem to spend massive amounts of time moving code between branches and accounting for it. It feels like such a drag and a distraction that it's often not worth branching even for fairly substantial changes.
I think I need to set aside a few days to really deepen my understanding of Git or similar.
In other words, arguing for a model (mainline or trunk) without first setting a context is pointless.
For instance, consider the needs of a very large team versus a very small team. Consider a team that releases often versus a team that releases infrequently. Consider a team releasing multiple projects according to a roadmap versus a team releasing a single project on an ad hoc basis.
The variations are many, and so too are the strategies for managing source code.
"Trunk based development with tests and code-review pre-commit"
is tangibly different from
"Branch based development with tests and code-review pre-merge"
The article talks a lot about trunk-based development, but if you're doing any sort of checking before "committing", then don't you essentially have a short-lived branch?
If everyone is on a branch, a team is tempted to use their branch to communicate their changes to each other just on the branch. Next thing you know, you've drifted far from main, and another team is trying to touch the same files, and there's no way to merge, and someone who did a refactoring has broken your APIs, and you're tempted to release from your branch because you don't have time to reconcile the merge, and you have code that solves this problem in ANOTHER branch and you manually copy it in to this project in the same place, but it needs to be slightly different so they drift apart from each other, and then your project gets deferred and none of your code gets merged...
If you're on trunk, you HAVE TO commit in order to share your code with your team. And that makes all the difference!
It's possible to use branches the same way he's proposing to use trunk, but it's awful tempting to do bad things.
I'm reminded of the Indian proverb I saw on Reddit: "If you want to go fast, go alone. If you want to go far, go together." Meaning, if you quickly want to make a demo, a branch is your friend. If you want to make sure your code lives on, make sure it lives on in trunk as quickly as you can.
I don't like the idea of merging from a release branch back into the trunk.
I see branches as things that are cut, potentially hardened, and then discarded.
My guess is that trunk based development is the idea that all commits pushed to the canonical repository are pushed to trunk (rather than a remote branch) with incomplete code being hidden using feature toggles?
In contrast mainline would push incomplete code to long lived remote feature branches and those branches would only be reintegrated into trunk once the code was complete.
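To make the feature-toggle half of that guess concrete, here is a minimal sketch of incomplete code living on trunk but staying dark. Everything in it is hypothetical and for illustration only: the toggle name `new_checkout`, the `FEATURE_TOGGLES` dict, and both checkout functions are assumptions, not anything from the article or a real library.

```python
# Hypothetical toggle registry; in practice this might come from config or a service.
FEATURE_TOGGLES = {"new_checkout": False}


def is_enabled(name, toggles=FEATURE_TOGGLES):
    """Unknown toggles default to off, so half-finished code stays hidden."""
    return toggles.get(name, False)


def legacy_checkout(cart_cents):
    """Existing behaviour: plain total in integer cents."""
    return sum(cart_cents)


def new_checkout(cart_cents):
    """In-progress behaviour, already committed to trunk: 10% discount."""
    total = sum(cart_cents)
    return total - total // 10


def checkout(cart_cents, toggles=FEATURE_TOGGLES):
    """Single call site; the toggle decides which code path runs."""
    if is_enabled("new_checkout", toggles):
        return new_checkout(cart_cents)
    return legacy_checkout(cart_cents)
```

With the toggle off, `checkout([1000, 500])` takes the legacy path; passing `{"new_checkout": True}` exercises the new one without any branch in the repository.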
However, I don't really see how this relates to a lot of the rest of the article which seems to be more to do with versioning, dependency management, and testing.
There's also some more specific points I'd like to pick up on:
Is it really easier to rebase my local branch than it is to merge from one remote branch to another? Seems like six of one and half a dozen of the other to me.
The article contrasts Google's and Facebook's model with the pull-request model of Etsy and GitHub, but again I don't really see much of a difference. Facebook sends a patch to Phabricator for review, someone looks over it, and then it gets committed to trunk.
Perhaps I'm misunderstanding things, but the impression I got from the original post is that most (many? all?) developers have local repositories where they manage their features. So, instead of using branches in the central repository, those branches are employed in local repos.
I agree. I suspect it may be just workflow/jargon differences?
Developers, if they are local branching, are not marshaling long running 'in progress' changes there. By habit they're working on something that's going to hit the trunk after a matter of hours or a day or three. They might flip to a new branch for a defect fix (and push that), before coming back to the thing they were working on.
These workflows which avoid branching, avoid merging, and avoid multiple services with separation of concerns are a product of the limitations of the SCM system. Too many times, the designers have praised their workflows for so perfectly utilizing the features of the chosen SCM system. They put the cart before the horse, and don't see that their workflow is an attempt to work around the places where the SCM falls short.
Every SCM that exists will fail spectacularly at merging two refactorings that touched the same code. You simply can't solve this with software today.
Enter... merge pain.
The only real solution that doesn't involve developers avoiding refactoring unless they really need to (either because it's painful or because they don't know they should be doing it), is trunk-based development.
As a footnote, for what it's worth... multiple services with SoC is definitely a good thing, but I don't think trunk-based development precludes that.
Same code is not supposed to be refactored twice. If it happens, a human with knowledge of the two refactors must resolve the conflict.
If there's a line of code which has been touched twice in two different refactoring efforts, I don't think it's a good idea to let a machine decide between them.
Unless the machine knows the exact purpose of the code. If we get to that point, I think machines will be able to write programs themselves :)
Now the problem with long branches and painful merges sounds like a communication problem between the master and the branch, just at the level of the code. We know that at the level of people timely and terse communication is key to a successful project. So the same would kinda make sense at the code level too.
1. merge conflicts are almost unavoidable in a non-trivial codebase
2. resolving merge conflicts in some SCM systems is unreasonably difficult
As far as I know, there's no SCM system which can understand the /intent/ of a change, and without being able to reconcile the intents of two conflicting changes there's no way of reliably merging the code.
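A toy illustration of why intent matters: the sketch below is a deliberately simplified line-by-line three-way merge, not how any real SCM implements it. It assumes all three versions have the same number of lines (no insertions or deletions), and the function name `three_way_merge` is made up for this example.

```python
def three_way_merge(base, ours, theirs):
    """Line-wise three-way merge.

    Simplifying assumption: base, ours and theirs have the same number of lines.
    Returns (merged_lines, conflict_indices); conflicting lines are None.
    """
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only their side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both changed it differently: text alone cannot decide
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["def total(xs):", "    return sum(xs)"]
ours = ["def compute_total(xs):", "    return sum(xs)"]    # refactoring 1: rename function
theirs = ["def total(items):", "    return sum(items)"]    # refactoring 2: rename parameter

merged, conflicts = three_way_merge(base, ours, theirs)
```

Here line 0 is a conflict because both refactorings touched it, while line 1 merges "cleanly" to `return sum(items)`. That clean line is only correct if whoever resolves line 0 also keeps the `items` parameter name, which is exactly the kind of intent the tool cannot see.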
The build system offers fine-grained dependency management, and the repository is organized so that different teams have responsibility for their components, just as if they were separate repositories for all practical purposes.
The advantages of having them in a single repository are:
* atomic operations, i.e. you can refactor components and avoid code rot, or apply an API upgrade to all the clients, thus reducing the amount of time you have to maintain backward-compatible APIs
* having a single, monotonically increasing number that describes exactly which bugs or bug fixes your code has. This greatly simplifies the management of rollouts in case of complex component dependencies.
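The second point can be sketched in a few lines. This assumes a strictly linear trunk where each commit gets the next integer revision and fixes are never reverted; the bug IDs and revision numbers below are invented purely for illustration.

```python
def build_contains_fix(build_revision: int, fix_revision: int) -> bool:
    """On a linear trunk, a build cut at revision N includes every commit <= N."""
    return fix_revision <= build_revision


def missing_fixes(build_revision: int, fixes: dict) -> list:
    """Return the (hypothetical) bug IDs whose fixes landed after this build."""
    return sorted(
        bug for bug, rev in fixes.items()
        if not build_contains_fix(build_revision, rev)
    )


# Invented data: bug ID -> trunk revision at which its fix was committed.
FIXES = {"BUG-101": 4200, "BUG-102": 4315, "BUG-103": 4390}

not_in_build = missing_fixes(4320, FIXES)
```

A rollout check then reduces to one integer comparison per fix, regardless of how many components the build depends on. With per-component repositories you would need one such number per repository and a way to relate them to each other.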
One more reason to use trunk based development. Or SVN.
People do want some minimum of commit hygiene so that the repo avoids being a mystery vortex. But because distributed VCS lets you hide everything about how you actually work, people are able to get unnecessarily competitive about it.
I see this a lot in public timelines:
Commit A - splendid new feature
Commit B - oops, missed a semicolon
Commit C - typo
Commit D - addressing code review
Commit E - typo
Rebasing that to a single commit takes little effort or time, and makes the timeline clear. On top of that, if you are using a code review tool which persists changes you may already have history of commit C which is probably the only other relevant commit you'd care to be able to find in the future.
Like, instead of collapsing A-E into a single commit, A should be marked as the start of project/feature/milestone/whatever Foo, and E as the completion of it, and A-E shown as a single object when viewing the history but allowing the user to drill-down to expose the internal commits.
Destroying information is never good.
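As a rough sketch of what such a history view could look like, assuming commits are held as a simple ordered list: the `Commit` and `FeatureGroup` types and the `collapse` function are hypothetical names for this example, not features of Git, Mercurial, or any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    sha: str
    message: str


@dataclass
class FeatureGroup:
    """A run of commits displayed as one history entry, with nothing thrown away."""
    name: str
    commits: list = field(default_factory=list)

    def summary(self) -> str:
        return f"{self.name} ({len(self.commits)} commits)"


def collapse(history, name, start_sha, end_sha):
    """Replace the commits from start_sha to end_sha (inclusive) with one group."""
    shas = [c.sha for c in history]
    start, end = shas.index(start_sha), shas.index(end_sha)
    group = FeatureGroup(name, history[start:end + 1])
    return history[:start] + [group] + history[end + 1:]


history = [
    Commit("Z", "unrelated earlier work"),
    Commit("A", "splendid new feature"),
    Commit("B", "oops, missed a semicolon"),
    Commit("C", "typo"),
    Commit("D", "addressing code review"),
    Commit("E", "typo"),
]

view = collapse(history, "Foo", "A", "E")
```

The top-level `view` has two entries, `Z` and the `Foo` group, while `view[1].commits` still holds A through E for anyone who wants to drill down.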
You can get some of the way there now by making use of tags or branch labels or something.
At one point, I wrote a Mercurial hook that would sit in the central repository and add a tag every time someone pushed. Then, only tagged commits would be considered first-class parts of history. There are numerous problems with this, not least that none of the existing tooling is aware of this convention. The fact that Mercurial tags live in commits, which pushers then have to immediately pull, was also very awkward.
You can use branches in a similar way: do all the intermediate, historical-footnote, commits in a development branch, and merge into a master branch to publish them. Then only consider commits in the master branch to be first-class. This is roughly Git Flow, isn't it? Again, the tooling doesn't quite do everything you'd want it to around this.
At least when I was there, the first sentence was true for Facebook. There was/is a separate repo for Thrift services (largely C++, but also Java, Python, and more); each Thrift service is a deployable. The deployables could be pushed at separate times and were usually pushed by teams themselves on their own schedules. The process was still frictionless and trunk-based: the only difference was having to type a few well-documented commands to push the service yourself. This may have changed since, however.