I think your post is very practical in what is the norms for everyday development. I have a bit more of an idealistic pattern I wish I saw more people follow.
As for squashing, I believe that everything should be rebased instead to maintain a proper, linear git history. This maintains the ability to bisect and allows a descriptive git history. Temporary branches pass on their history to the main branch.
Even for temporary branches, labeling commits properly makes reviews easier, especially if commits are atomic. Ideally, all PRs are small and easy, but many times more changes than what's easy to glance at are needed. Understanding what and why these changes were made helps to walk through step by step—they're meant to decay since they are quite literally history. Reviewers should review the commit messages themselves and ask for revisions on them when needed. This is hard to enforce and requires thorough buy-in from people to invest in it. There's some proof in the pudding, at least, if you look at the conventional commits usage in the community.
(very, very idealistically—not very practical) To keep commits atomic, each commit should be tested incrementally, at least with static checks and unit-y (read faster) tests. Each commit must past checks prior to PRs getting merged.
> As for squashing, I believe that everything should be rebased instead to maintain a proper, linear git history
Rebasing on your PR branches is up to you, it's subjective, but setting GitHub to rebase your PR commits onto your main branch as it's merge strategy is not proper at all. It might 'work' for a tiny hobby project, but it does NOT work for any kind of team activity / professional environment - it would be an immense amount of bloat onto your main branch, and git will eventually start to get slow as mud on every operation because of it. Fun fact, git performance does not scale linearly, and big monorepos hit this problem. Beyond performance, when looking through git history on the main branch:
- Every commit on the main branch should map 1:1 with a reviewed PR that passed CI tests/checks
- The 'mapping' can just be done by having the PR number in the commit message title. GitHub does that for you with the settings in this blog post
- It's not uncommon that someone has 30+ commits on their PR branch, many of which are in a totally broken state, have blatant typos, etc. People just don't (and it's not even desire-able for them to) structure their local commits while working on a PR to be some kind of iterative working state that is useful for others to look at. It's beyond clutter to have these 30 broken commits with crummy commit messages like "update" rebased onto the main branch for all to see for all time.
Not necessarily real world, considering performance especially. Let's say that a project is smaller, git improves performance, or computers get more efficient:
Regarding commits relation to PRs: GitHub still maintains this with rebase and merge. It's linked in the UI when viewing any individual commit on a branch.
Broken commits are mitigated by testing each commit incrementally.
Commit messages must be reviewed and edited to make sense. Having 30 commits would be something flagged in review. It's okay and encouraged to make a bunch of commits during review and then require those to be rebased into logical, atomic commits prior to merging.
If each of your commits are tested and make sense independently, they should be different PRs - sounds to me like you're just smashing 5 stacked PRs into 1 with highly fussy curated commits, which aren't guaranteed by github settings to each be passing CI
PRs are just logical groupings of commits. It makes it efficient to group related commits together. You are correct that each of the commits could be separate PRs which is part of the advantage to this pattern—code contribution is always atomic and easily organized. How you organize these contributions is done whichever way makes it easiest to get merged in efficiently.
I think the parent's point is that if you only just choose to think of the PR as the "unit of change" on your software product, rather than commits, all of this extra discipline you're imposing on yourself just goes away.
And what do you lose by making this decision?
The main branch, being the only place where the changes really matter, can still capture all of the necessary details of the changes (based on the PR) in the commit message when it's merged. This is the point where the curation is most useful, and where it's easiest to apply and apply standards, run tests, and actually enforce that they're followed.
> all of this extra discipline you're imposing on yourself just goes away.
I think the extra discipline is just shifted to the PR in that case instead of the commits. What's lost is the ability to make code contributions of related changes together in a PR which can be more efficient than making many small PRs.
BurntSushi (the project maintainer) didn't care about the story the commits told, he cared about the entirety of the changes being made to the repo (at the PR level).
> What's lost is the ability to make code contributions of related changes together in a PR
You're considering each minor commit to be an independent code contribution. BurntSushi instead considered the PR to be the code contribution. The code changes are the same either way you look at it, and there is a single PR in both cases, so it's a matter of choosing your perspective.
I'm finding it difficult to follow the thread of discussion here, so I'll just blurt out some things:
* Most PRs that come in do indeed get squashed merged, but because of a few reasons. Sometimes it's because the PR is really just one logical change and really should just be one commit. Sometimes it's because the commit messages are not written in a way that I like, and so I do a squash-merge so I can either edit or re-write the commit messages provided by the author.
* Sometimes I do a "rebase and merge" if the commit breakdown is good and so are the messages. It's somewhat rare, but some contributors get it right. I'm fine with a single PR containing multiple commits, but the common case is indeed "one PR = one commit."
* I don't usually levy this criticism because I think it's low-brow, but the OP here---"git commit messages are useless"---is pretty egregious clickbait. Halfway into the article, the OP acknowledges that they aren't useless, but just useless in the context of a PR workflow. Which... I don't also 100% agree with, but is usually true for very small changes. What an annoying submission.
I agree the title is clickbait but I didn't mind the article because it may challenge the reader to reconsider why they're putting so much effort into individual commits, even if they're not convinced to abandon (PR branch) commit messages entirely.
As for squashing, I believe that everything should be rebased instead to maintain a proper, linear git history. This maintains the ability to bisect and allows a descriptive git history. Temporary branches pass on their history to the main branch.
Even for temporary branches, labeling commits properly makes reviews easier, especially if commits are atomic. Ideally, all PRs are small and easy, but many times more changes than what's easy to glance at are needed. Understanding what and why these changes were made helps to walk through step by step—they're meant to decay since they are quite literally history. Reviewers should review the commit messages themselves and ask for revisions on them when needed. This is hard to enforce and requires thorough buy-in from people to invest in it. There's some proof in the pudding, at least, if you look at the conventional commits usage in the community.
(very, very idealistically—not very practical) To keep commits atomic, each commit should be tested incrementally, at least with static checks and unit-y (read faster) tests. Each commit must past checks prior to PRs getting merged.