It can be hard to review or blame small disordered commits. But you can rebase the commits to group and squash them for review. Then you may choose to squash the entire PR when you rebase main onto it.
This has pros and cons. Let's explore them after an example.
Scenario: Big & Noisy PR
We have a PR with 13 commits :
3 fix commits, 5 refactors, 2 doc updates, a CI modification, 1 feature commit, 1 test update
The commits are in random order, and some of the commits are revisions to earlier commits. It's hard to understand the narrative of the commits and review things accurately.
Solution Part 1: Tidy the PR
1. Reorder the commits by type and relevance:
3 fix -> CI ->5 refactor -> 2 doc -> test -> feature.
2. Squash logically similar commits:
fix -> CI -> 2 refactor -> doc -> test -> feature.
A squashed commit should have a bullet list detailing each changed module and scope of change:
"""
Fix(package a): Fix a, b, c
This patch fixes:
* (module b): fix [...]
* (module c): fix [...]
* (module c): fix [...]
"""
3. Review and CI the PR
4. Add commits needed to complete the PR
The tidied PR was easy to understand: We fixed some pre-existing issues, beefed up the ci, then set the stage for the feature. The feature itself was clear and simple.
If you do just this, your history will be much cleaner.
Now I'm going to recommend something controversial:
Solution Part 2: Squash-rebase the PR onto main
Yup. Take all that work and mash it together. The final commit message should look clean and detailed:
"""
Big Shiny Feature
This patch implements [...]
Feat:
* (module d): Implement big shiny feature
Doc:
* (readme): update feature list
* (userguide): add tutorial for feature
CI:
* (workflow a): modify [...]
Fixes:
* (module b): fix [...]
* (module c): fix [...]
* (module c): fix [...]
Refactor:
* (module a): rename [...]
* (module b): delete unused [...]
[...]
"""
Benefits:
- We drop 7x fewer commits onto Main
- Project history is more legible
- Commit messages are detailed and useful
- Bisecting takes log 7 = 2.8 fewer steps
Risks:
- File diffs can be illegible if feature work intersects with refactor or fix work
- There are 7x more defects per commit
- It is harder to uncover root cause if we bisect
Conclussion
Tidying PRs before review is a no brainer - it greatly improves our review and history.
Squashing PRs onto main loses some information, but can make history easier to navigate. Since we're disciplined and detailed in our commit messages, this is often much less of a footgun than it might seem. Each commit is now a logical and self-contained unit.
This has pros and cons. Let's explore them after an example.
Scenario: Big & Noisy PR
We have a PR with 13 commits : 3 fix commits, 5 refactors, 2 doc updates, a CI modification, 1 feature commit, 1 test update
The commits are in random order, and some of the commits are revisions to earlier commits. It's hard to understand the narrative of the commits and review things accurately.
Solution Part 1: Tidy the PR
1. Reorder the commits by type and relevance: 3 fix -> CI ->5 refactor -> 2 doc -> test -> feature.
2. Squash logically similar commits: fix -> CI -> 2 refactor -> doc -> test -> feature.
A squashed commit should have a bullet list detailing each changed module and scope of change:
"""
Fix(package a): Fix a, b, c
This patch fixes:
* (module b): fix [...]
* (module c): fix [...]
* (module c): fix [...]
"""
3. Review and CI the PR
4. Add commits needed to complete the PR
The tidied PR was easy to understand: We fixed some pre-existing issues, beefed up the ci, then set the stage for the feature. The feature itself was clear and simple.
If you do just this, your history will be much cleaner.
Now I'm going to recommend something controversial:
Solution Part 2: Squash-rebase the PR onto main
Yup. Take all that work and mash it together. The final commit message should look clean and detailed:
"""
Big Shiny Feature
This patch implements [...]
Feat:
* (module d): Implement big shiny feature
Doc:
* (readme): update feature list
* (userguide): add tutorial for feature
CI:
* (workflow a): modify [...]
Fixes:
* (module b): fix [...]
* (module c): fix [...]
* (module c): fix [...]
Refactor:
* (module a): rename [...]
* (module b): delete unused [...]
[...]
"""
Benefits: - We drop 7x fewer commits onto Main - Project history is more legible - Commit messages are detailed and useful - Bisecting takes log 7 = 2.8 fewer steps
Risks: - File diffs can be illegible if feature work intersects with refactor or fix work - There are 7x more defects per commit - It is harder to uncover root cause if we bisect
Conclussion
Tidying PRs before review is a no brainer - it greatly improves our review and history.
Squashing PRs onto main loses some information, but can make history easier to navigate. Since we're disciplined and detailed in our commit messages, this is often much less of a footgun than it might seem. Each commit is now a logical and self-contained unit.