Git rebase, what can go wrong

jawns · on Nov 6, 2023

I like how Atlassian puts it:

> The golden rule of rebasing

> Once you understand what rebasing is, the most important thing to learn is when not to do it. The golden rule of git rebase is to never use it on public branches.

https://www.atlassian.com/git/tutorials/merging-vs-rebasing#...

For me, even though rebasing comes with some trappings, I still greatly prefer it to the alternative, which is to have merge commits cluttering up the commit history.

ohwellhere · on Nov 6, 2023

The way I phrase and teach what I consider to be the important rule of git is:

> Don't rewrite history on shared branches with proper communication.

I don't teach "never", I don't teach that `main` is special, I don't teach that force pushing is forbidden, because I don't believe in those things.

I highly prefer a rebase-heavy workflow. In addition to not "cluttering" the history, it's an invaluable tool to keep commits focused on "the right level" of atomic changes.

joshribakoff · on Nov 6, 2023

You can simply pass flags to “git log” to hide merge commits, without needing to rewrite history to “destroy” that information. While they are often noisy, sometimes they can be useful. I usually prefer to hide information rather than destroy it.
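A minimal sketch of what that can look like, assuming the flag in question is `--no-merges` (the comment does not name one). The scratch repository, branch name, and commit messages are made up for illustration.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

git commit -q --allow-empty -m "initial commit"
git checkout -q -b feature
git commit -q --allow-empty -m "add feature"
git checkout -q main
git merge -q --no-ff -m "Merge branch 'feature'" feature

# Full history: the merge commit is listed.
all=$(git log --format=%s)
# Same history with merge commits hidden; nothing has been rewritten.
no_merges=$(git log --no-merges --format=%s)

printf 'all:\n%s\n\nwithout merges:\n%s\n' "$all" "$no_merges"
```

The merge commit is still in the repository afterwards; only the view changes.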

unlikelytomato · on Nov 7, 2023

I read this justification in nearly every thread that pops up about git rebase. I feel like a fool because I cannot think of a real world example when this information crosses from signal to noise. Generally, branches that are not ready to merge tend to have enormous amounts of noise commits. Is there a blog post or some concrete examples I could work through that illustrate these benefits? I feel like workflows dramatically different from mine are likely the source of my struggle.

fulafel · on Nov 7, 2023

It's a log of what happened in dev and supports reconstructing history to understand why something worked or didn't in retrospect. "It worked when we tried it" "oh this dependency was updated in this merge commit that could have changed the behaviour"

unlikelytomato · on Nov 7, 2023

I am not sure how this is unique to a merge commit. The commit with the dependency change still exists in the main branch. The commit should never have gotten into the main branch if it failed tests. If I take a positive action to rebase, I am accepting my fate from master anyway. If I merge into my working branch instead of rebasing, that historical context is only useful for that moment in time of reconstructing history and is not useful anymore. Once a branch goes into master, I want commits to main to have a 1:1 ratio of committed code for a task to positive action taken by a human.

fulafel · on Nov 8, 2023

It's not unique to a merge commit of course, but a point in favour of preserving history.

jakelazaroff · on Nov 6, 2023

I assume that “with” is meant to be “without”?

thebigspacefuck · on Nov 7, 2023

It’s annoying when someone force pushes to a branch that you just reviewed, but you can no longer see the history so you have to scan through the whole PR you already reviewed looking for the change. Please just commit the fix, let me see it, then squash it.

seba_dos1 · on Nov 9, 2023

You can just diff the previous head with the new one. In GitLab, it's simply a matter of clicking "Compare with previous version". Locally, it's `git diff branch@{1}..branch`.

It's only becoming tricky if the MR has been rebased onto a different base in the process, but it's not very hard to deal with that too if needed (just annoying).

seba_dos1 · on Nov 10, 2023

Actually, it's not that annoying at all - TIL about `git range-diff`.
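A rough sketch of both commands on a scratch repository, where an amended commit stands in for the force push. The branch and file names are hypothetical, and `feature@{1}` relies on the local reflog, so it only works in the clone where the branch was actually updated.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

git commit -q --allow-empty -m "initial commit"
git checkout -q -b feature
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"

# The author addresses a review comment by rewriting the commit.
echo "hello, world" > greeting.txt
git commit -q -a --amend --no-edit

# What changed between the previous branch tip and the current one?
delta=$(git diff "feature@{1}" feature)
printf '%s\n' "$delta"

# Commit-by-commit comparison of the old and new versions of the branch,
# which keeps working when the base has moved as well.
git range-diff main "feature@{1}" feature
```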

sam_bristow · on Nov 7, 2023

Unfortunately I haven't seen a git forge that will let you do "autosquash on merge" so I could just push up fixup commits as part of a merge request.

PhilippGille · on Nov 8, 2023

GitHub: https://docs.github.com/en/repositories/configuring-branches...

seba_dos1 · on Nov 9, 2023

That always squashes the whole PR into a single commit, making it not very useful in practice. Git's autosquashing is much more powerful than that.

recursive · on Nov 6, 2023

Squash merges cut down the noise considerably.

dkarl · on Nov 6, 2023

I think squash merges are a last resort heavy-handed tool for dealing with developers who refuse to clean up their commit history before merging. Most developers can do better by hand.

Git history should tell a simple, understandable story of each change. For example: 1) refactor existing code, 2) add feature. Or 1) add missing tests, 2) refactor existing code, 3) add feature.

But since you're working on the fly with imperfect knowledge, it doesn't happen in such neat steps. Refactorings and behavior changes end up interleaved in your raw git history, so you need to do a little bit of cleanup by hand in order to present a simple story in the commit log.

Of course if you have developers that don't do that and instead merge dozens of commits that just say wip, wip, wip, lol, fml, wip, wip, lol, yolo and you can't fire them or get them to change, then squash merges ftw.

bogeholm · on Nov 6, 2023

I see it the other way around - why spend time on a ‘nice’ commit history in a (smallish) feature branch when you can squash merge later.

I prefer one commit to main per feature, along with a good description on the GitHub PR.

Sometimes I’ll branch out from a feature branch for the occasional and infamous ‘get CI working’ round of 10 one-line commits though, to not make it too muddy.

gwright · on Nov 6, 2023

> why spend time on a ‘nice’ commit history in a (smallish) feature branch when you can squash merge later.

Several reasons:

    * facilitates much better code review discussions
    * enables use of git bisect to locate bugs
    * allows for informative commit messages associated with the changes
    * communicates clearly to future self about why changes were made

toast0 · on Nov 6, 2023

> * enables use of git bisect to locate bugs

This is really only viable if each intermediate commit on a development branch is intended to be bug free. If that's the standard you and your team work with, that's fine, but it's not usually my standard; in a development branch, I may commit things that don't even compile, let alone work, if it's a good point to commit.

bonzini · on Nov 6, 2023

The point of the parent comment is exactly that you should clean up the history before merging to a public branch, so that you can use bisect, even if so far you had wip wip doh wip as the commit messages. The way to get there is to have a mix of proper and wip commits.

watwut · on Nov 6, 2023

Frankly, people lately spend more time managing commit history than using it. Like, commit history is a little bit useful maybe once a year, but we spend an absurd amount of time trying to make it look nice.

nilptr · on Nov 6, 2023

I use commit history a little bit more than that, but mostly agree. I had another dev recently give me crap about the mess of "WIP" commits on a feature branch because they review by clicking through the commits and my commits don't tell much of a useful story other than I apparently did some shit and eventually it all worked.

That said, I've also come to the conclusion there's basically two classes of Git users: people who really understand Git and use it fully, and those of us who basically use it as a place to shove source code before quitting for the night.

Terr_ · on Nov 6, 2023

> Frankly, people lately spend more time managing commit history than using it.

At one company with a Giant Custom Enterprise App, I ended up occasionally acting as a historian for pieces of the company with bad communication/institutional-memory, ex: "Oh, the +5% Foo charge was because of a request 3 years ago by vice-president X, here's the ticket number, before that it used to be +3%."

In those circumstances--where the implementation is the source of truth for business process--a well-maintained stream of commit-messages become quite useful.

mixmastamyk · on Nov 6, 2023

Comment with a ticket-id would be more efficient.

Izkata · on Nov 7, 2023

On a long-lived codebase you're going to end up with nearly as many such comments as there are lines of code. Now that's a cluttered mess.

mixmastamyk · on Nov 7, 2023

When/how to comment is an art in itself, in no conflict to what I wrote.

Either a short quip, doc string, or link to the full story is an accessible combo. Nothing is the correct choice for unsurprising code.

Terr_ · on Nov 7, 2023

While it's often said that comments should capture the "why" of code, I don't usually think that ought to extend to "cuz ticket#" except when that ticket number is a significant bug/limitation that explains a nasty hack.

Noting each feature ticket that ever affected a line--or even just at the function level--is sort of like maintaining a few thousand incomplete micro-changelogs. Doing it "acceptably well" takes much more effort than grooming the commit history so that someone can click "show change history for selected lines" in their IDE.

Plus consider all the unnecessary noise it makes for people reading the code, or reviewing a PR.

mixmastamyk · on Nov 7, 2023

No, it's not difficult (cut/paste), nor a burden for reading in docstrings or even comments. The point is a short link that tells a long story, which should be accessible to non-developers.

Izkata · on Nov 8, 2023

And the commit message is an amazing place to put that.

mixmastamyk · on Nov 8, 2023

Not if you want them read by non-developers.

Izkata · on Nov 9, 2023

Why would you have non-developers reading code?

m000 · on Nov 6, 2023

Curating the commit history takes like 10 minutes per PR and can easily repay in hours of work when some bug hits. Or when you want to tell the junior that wants to implement X for A, why don't you take a look at this one commit where we implement X for B?

mixmastamyk · on Nov 6, 2023

Is there something wrong with the latest version of the method? Instead of one from ~18 months ago which may not work any longer?

m000 · on Nov 6, 2023

I'm not sure I get you. What do you mean by the "latest version of the method"? And why wouldn't code from 18 months ago work? Some minimal regression testing should be in place for production code, and it is probably also used regularly.

So, yes seeing e.g. how "CSV export for class A" is implemented is a great guide for implementing "CSV export for class B".

mixmastamyk · on Nov 7, 2023

Most recent. Interfaces change over time.

Everything you need is in the most recent copy. Showing an old one invites errors for no benefit.

deredede · on Nov 7, 2023

A commit is not a method, it is a change set potentially affecting many files. Pointing people to the commit used to implement feature A lets them understand the whole story of which components need to change (and how) to implement similar feature B in a way that pointing them to a single method or file can't necessarily do.

You would then typically supplement reading the commit with reading the current version of the affected code, but looking at the commit points you in the direction of the files and methods you need to look at.

watwut · on Nov 7, 2023

The idea is for junior to learn from massive commit that affected many files?

mixmastamyk · on Nov 7, 2023

This reply is overstating a bit, but it does sound like a lot of work simply to avoid saying, "here look at these methods in this file" and the unstated, and trace the imports yourself.

Not to mention just finding the right commit months later sounds like more work than that already.

Personally, even if I were to make the history absolutely perfect, I never get the code right (interfaces etc), the first time. It might be hours or days before I'm 98% happy with the final implementation. Sometimes big refactor opportunities come to me months later, e.g. where I move code needed multiple times into a more central mixin location.

gwright · on Nov 6, 2023

Not my experience, nor my team's experience over almost 10 years of using this approach.

Chris_Newton · on Nov 6, 2023

I’m firmly in your camp on this one, but I’ve noticed that advocating a tidy history gets a lot of push-back online. I think there is an element of self-fulfilling prophecy here. If a team habitually leaves a messy history behind, that history is rarely going to be useful, so naturally the team has low expectations and sees little value in doing anything to curate it. And if a team isn’t used to making an effort to curate its history, they may assume that doing so is expensive because `git rebase -i` is scary and not something they use on auto-pilot for a few seconds at a time.

In other news, our developers also create several small PRs every day but each is for an incomplete change that doesn’t stand alone so we’re never quite sure which features are finished in any given build, everyone keeps complaining about being interrupted to do code reviews all the time when the code reviews have no value anyway because they always just say LGTM :+1:, and we have targets that no more than 15% of commits should break production when CI/CD deploys them and that we recover fully within an hour each time that happens. If only there were something we could do to improve all this…

seba_dos1 · on Nov 7, 2023

> I think there is an element of self-fulfilling prophecy here.

This too, but there's another thing at play as well: many developers don't know git at all. They just memorized enough commands to let them do their work. They don't understand what they're doing, so they can't reap the benefits of the tool they use. You won't get much use out of RAW photos if all you can do in a graphics editor is click the "auto enhance" button.

watwut · on Nov 6, 2023

I worked in a team where tech lead insisted on nice history. It was a lot of effort all the time and very little to no benefit.

He was lead and could influence salaries, his opinion mattered. So, in real life, people rarely pushed back. That is not the same as us sharing the same opinions tho. I became more verbal about history not being useful online.

mixmastamyk · on Nov 6, 2023

If a strategy requires humans to be virtuous AND vigilant it is doomed to failure.

I rarely use history and prefer merge/squash, with automated CI tools, and tests. "Why" is kept in doc strings, comments, specs, and story tickets. Everything viewable in gitlab with automatic links. All this gets out of the critical path, every day.

I submit that, if your code is so complex that diagnosing a bug is a major research project rather than moving forward with a few extra/modified lines of obvious fix, then that is the problem to focus on.

Chris_Newton · on Nov 6, 2023

> If a strategy requires humans to be virtuous AND vigilant it is doomed to failure.

Sorry, but I don’t buy that. By the same principle, there’s also no point in writing unit tests or defining static types or having code reviews, all of which require thought and extra work, yet can yield considerable dividends when done even moderately well.

> I rarely use history and prefer merge/squash, with automated CI tools, and tests. "Why" is kept in doc strings, comments, specs, and story tickets.

The argument for a tidy history isn’t just about a different place to explain a change. It’s about presenting work in clearly defined, meaningful steps to other readers like code reviewers, or perhaps someone who found these commits later through `git blame` on a problematic line of code or `git bisect` after a regression. It’s about each commit representing a complete, self-contained change that could later be reverted, or cherry-picked or merged to another branch.

> I submit that, if your code is so complex that diagnosing a bug is a major research project rather than moving forward with a few extra/modified lines of obvious fix, then that is the problem to focus on.

Some problems have a lot of essential complexity. The code to solve them necessarily has at least the same degree of complexity. Sooner or later, there will probably be a change to that code with an unintended consequence for something else. Keeping the code and its history tidy and systematic is, IMHO, how you avoid those investigations becoming major research projects.

mixmastamyk · on Nov 7, 2023

One of these things is not like the other. (Journey vs. final destination.)

As an industry we get paid primarily for 1) working software and 2) communicating with stakeholders.

Tidy yet inaccessible (to non-dev) construction stories are not on that path. I would argue unit tests et al are, to ensure #1.

No stakeholders? Put why into a readme, where it can be seen at a glance. Comments can reference docs.

Complexity must be broken down into bite-sized chunks for a solution to be feasible in the first place, reliable in the second. i.e. skull-size limits. If there’s any code I don’t understand I rewrite it until I can. With tests of course.

Chris_Newton · on Nov 7, 2023

Sorry again, but I’m still not seeing the distinction I think you’re trying to make here.

I see version history as an asset, just like the code itself, tests, developer documentation, the bug tracker database… None of these things are directly visible to end users under normal circumstances, but they are useful sources of information and organisation and collaboration that help developers to create the software that users do see.

To me, a repo with a messy version history is like code full of superficial comments, a test suite with high coverage metrics that still doesn’t exercise the most important functionality, a dev team where the only documentation is some auto-generated static site that reproduces what any decent IDE would show in real time anyway, or a tracker where all the tickets are vague one-liners. You can produce useful software despite those things, but why would you?

mixmastamyk · on Nov 7, 2023

It's not a black and white distinction, I agree.

Also, I said/meant stakeholders not end-users. Ours definitely do write bugs, look at docs and generate db reports etc.

The main distinction is that things on that list have a high benefit-to-cost ratio for goals 1 & 2, where history maintenance does not. The cost is high and utilization isn't. Additionally it can't be used to communicate with anyone but developers.

watwut · on Nov 7, 2023

> there’s also no point in writing unit tests or defining static types or having code reviews

Not true. I do not do those to have nice clean process. I do unit tests, because without them the code is unstable and it is hard to fix bugs without causing unrelated ones. If the code is super simple and unlikely to break, I don't do test. I like to use static types, because I am much faster when writing them. The code is more readable and I have less bugs. Now, I have seen both useless and useful code reviews.

But, in all of those cases, things are done because they have a beneficial impact on the final code and speed of delivery. Beautiful git history does not have such a tangible, measurable benefit. Git blame and bisect work without it, you just need one more step once in a while.

Chris_Newton · on Nov 7, 2023

> But, in all of those cases, things are done because they have a beneficial impact on the final code and speed of delivery. Beautiful git history does not have such a tangible, measurable benefit.

I respectfully disagree. In my experience, a tidy history directly benefits both efficiency and outcomes of code reviews, speeds up investigations of both bug reports and sometimes general background before starting new development, makes development much easier in situations where changes may need to be isolated and deployed to specific environments (not all software is a web app using CI/CD…), makes it much easier to back out a problematic change without causing unnecessary collateral damage, and helps to verify which development has actually been completed and deployed to which stages/environments, which can be useful for general awareness around the team but is particularly important if you’re operating in any kind of regulated field. All of that in exchange for usually spending less time in `git rebase -i` than it’s taken me to write this comment seems like a bargain to me, but YMMV.

seba_dos1 · on Nov 8, 2023

Fully agreed. Seems to me that most people who don't care about curating their commits (and either leave a mess or squash everything at merge regardless of context) simply don't work on projects that are either complex, distributed or long-living enough, so they can easily afford such carelessness.

mixmastamyk · on Nov 8, 2023

Nope, ~15 year project here. No one is reading obsolete commit messages when there are hundreds of files to get familiar with today.

Not to mention quality is significantly higher now so you wouldn't want to refer to a granular history of crap anyway. Any time spent on that would have been completely wasted as I wipe out a thousand line file for a new one with a hundred lines because requests hadn't been invented yet and the original implementer didn't understand network protocols or how to use argparse and implemented it from scratch poorly.

seba_dos1 · on Nov 9, 2023

Yeah, as I suspected.

If you can afford your first instinct to be reimplementing things from scratch, your understanding of the value provided by proper version control will be limited. Some of us work with constantly changing code developed by thousands of people from all around the world in projects that 15 years ago were migrating to git and that have tons of downstreams, and are thankful for maintainers and processes that keep their commit graphs useful.

Though that said, once you're comfortable enough with git you'll be thanking yourself for commit hygiene even when coming back to your few years old single-person codebases.

In my experience, developer's work consists mostly of gaining understanding of codebases. It's like being a detective. Writing new code happens too, but not as often and it's not as impactful (and usually can and should be handled by less experienced devs wherever possible). Among the most impactful things are single line changes that took a week to write, or a few dozen lines that took months. Rewriting existing code from scratch is something that happens only as a last resort and after very careful consideration. Maintaining some basic version control hygiene makes a whole world of difference in such work. Sure, you can live without it, but you can also live without docs, comments or tests (and sometimes have to - which makes you appreciate them when they're there).

erik_seaberg · on Nov 6, 2023

If I can invest business hours and get back minutes during an outage at 3 AM, I should do that.

krferriter · on Nov 8, 2023

Yeah I just don't care really. I have noisy dev branches, force pushing rebases from an upstream branch, and usually just squash merge and throw the history away when merging back to master. I've never come across a situation where I needed a preserved, fine-grained commit history for every dev branch after they've already been put in working order and merged to master. I guess I just don't use commit histories very much after the fact, like months later trying to find what original commit changeset a line of code was changed in. It's never mattered.

funcDropShadow · on Nov 10, 2023

I use the commit history every time I open a file I haven't worked on for a while. Just seeing the output of git annotate in the border of IntelliJ gives me a sense of how old code is, what changed together with what, and whom to ask in case of questions.

WorldMaker · on Nov 6, 2023

If merge points are your "known good" points anyway you can just use the powers of the git dag and `git bisect --first-parent` in your main branch to just bisect the merge points. There's no need for rebase/squash and you still get useful git bisect results.

bonzini · on Nov 6, 2023

All commits are good points and potentially useful points. Was the bug in the refactoring? In the feature itself? In the resolution of merge conflicts? You can only answer if you don't squash, and it becomes easier to fix the bug if you know the answer.

WorldMaker · on Nov 6, 2023

Sure, but also no one particularly wants to CI every commit inside a PR, so there is a usefulness in `git bisect --first-parent` as the "first pass" of known CI points (merge commits presumably from PRs) to find the "PR that introduced the problem" and then drill down into every smaller commit to see if you can get additional bisect information (from commits that may or may not have passed CI in the first place in development work-in-progress).

dkubb · on Nov 7, 2023

I think the point the GP message is making is that, prior to review/merge you extract atomic commits from your WIP that tell a clear, concise story of how the change was made. The reviewer has less built up context so by chunking it like this they can step through each commit one at a time.

IMHO the expectation is that each commit would 100% pass CI, so if you decided to extract some commits and merge that early you can. This is especially useful when a 6 commit PR is reviewed, and the first 3 commits are fine but there is more feedback on the last three. The reviewer can split the first 3 good ones out, get them merged and whittle down the PR to the remaining three. The subsequent follow up will be less.

IME team velocity goes up with this too, and it encourages small and easy to review commits like a Remove to be extracted and merged early.

Since PRs are always as large or larger than commits, I would much rather have a specific commit flagged than have to wade through the whole PR diff. If the PR is not familiar to me, I want to increase my effectiveness narrowing down the cause, so I can fix it faster.

bonzini · on Nov 7, 2023

I don't do full CI for every commit but I do run the relevant unit tests (or all of them depending on the change and the project) and ensure that they pass.

Terr_ · on Nov 6, 2023

> even if so far you had wip wip doh wip as the commit messages

Aside, `git commit --fixup HEAD` is often better than `git commit -m "oops, one more thing"`, since it means you can easily `git rebase -i --autosquash`.
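For anyone who has not tried it, a minimal sketch on a scratch repository. The file name and the `base` tag are hypothetical, and `GIT_SEQUENCE_EDITOR=true` is only there so the example runs unattended; normally the todo list opens in your editor with the fixup already moved into place.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

git commit -q --allow-empty -m "initial commit"
git tag base
echo "parse()" > parser.txt
git add parser.txt
git commit -q -m "add parser"

# "Oops, one more thing": record the follow-up as a fixup of the last commit.
echo "parse(strict)" > parser.txt
git commit -q -a --fixup HEAD

before=$(git log --format=%s base..HEAD)

# Accept the generated todo list unchanged; the "fixup! ..." commit is
# folded into its target.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash base

after=$(git log --format=%s base..HEAD)
printf 'before:\n%s\n\nafter:\n%s\n' "$before" "$after"
```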

seba_dos1 · on Nov 7, 2023

Of course I commit a lot of garbage commits that don't work, it's super useful to do so. Those never get pushed out into branches that I share to others though - why would I waste their time having them look at those?

What I push out are atomic commits that make sense logically, not an external undo log of my text editor; squashing those on merge provides no benefit and only loses useful information. Squashing should happen before push, not on merge, and there's no reason to have buggy "intermediate" commits recorded in your central remote branch at all.

u801e · on Nov 7, 2023

> > * enables use of git bisect to locate bugs

> This is really only viable if each intermediate commit on a development branch is intended to be bug free.

git rebase has an --exec option that allows you to run a command or set of commands for each commit in the branch. You could rebase your development branch before pushing it up for review and ensure each commit passes code linting and tests.
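A rough sketch of the mechanics on a scratch repository. The command passed to `--exec` here only records which commit is being checked; it is a stand-in for whatever lint or test command a project actually uses, and the `base` tag and file names are made up.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

git commit -q --allow-empty -m "initial commit"
git tag base
for n in 1 2 3; do
    echo "step $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "step $n"
done

# Log file outside the repository so it does not dirty the work tree.
checked=$(mktemp)

# The command runs once after each replayed commit. If it exits non-zero,
# the rebase stops at that commit so it can be fixed.
git rebase -q --exec "git log -1 --format=%s >> '$checked'" base

cat "$checked"
```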

u801e · on Nov 6, 2023

Another good reason is that having a small commit that changes just one thing is a lot easier to revert without encountering conflicts, even after other features have been committed to the main/master branch.

sangnoir · on Nov 6, 2023

There are a multitude of Git workflows, and opinions on what the basic unit of change is: for some, a feature is atomic, so squash-merging feature branches is perfectly natural.

> facilitates much better code review discussions

This can be done while adding code to the feature branch

> allows for informative commit messages associated with the changes

I'm assuming you consider individual commits to be the basic unit of change? This isn't always the case. Some products are not amenable to adding features fractionally

> communicates clearly to future self about why changes were made

You can do that with a squash-merge too!

I've noticed people who work on an evergreen deployment can afford to work on a very granular, commit-level. However, if you have to support multiple production branches concurrently and often have to cherry-pick features and fixes across them, features will naturally become the basic unit of change you will find yourself gravitating towards, and will liberally use squash-merging just to keep your sanity.

bogeholm · on Nov 6, 2023

> facilitates much better code review discussions

Hmm, I usually mark PR’s as draft until ready for review, and then I expect the discussion to be about the current state, not a previous intermediate state. Easiest with small PR’s.

> enables use of git bisect to locate bugs

Interesting. I know _of_ git bisect, but haven’t used it as part of my workflow. Have you found it useful to bisect commits on a feature branch (which, presumably, represents unfinished work)?

> allows for informative commit messages associated with the changes

I find using the PR title and accompanying info in GitHub or similar to be quite informative - that should convey the purpose of the change.

> communicates clearly to future self about why changes were made

See above. Perhaps we work differently, but I find it clearer to read a git history where each commit represents a single, complete feature/fix/refactor instead of intermediate steps.

marcandre · on Nov 6, 2023

The classic example where this fails is when needing to revert something. An atomic commit for the migrations + some atomic commits for the implementation mean you can easily revert the implementation, and leave the migration intact (as should be) and add a reverse migration.
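To make that concrete, a small sketch with made-up file names: the migration and the implementation are separate commits, so reverting the implementation leaves the migration alone. The reverse migration mentioned above is not shown.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

# Commit 1: the migration only.
echo "CREATE TABLE widgets (id INTEGER);" > 001_create_widgets.sql
git add 001_create_widgets.sql
git commit -q -m "add widgets migration"

# Commit 2: the implementation only.
echo "widgets feature" > widgets.txt
git add widgets.txt
git commit -q -m "implement widgets feature"

# Back out just the implementation; the migration commit is untouched.
git revert --no-edit HEAD >/dev/null

git log --format=%s
ls
```

With a single squashed commit, the same revert would have taken the migration file with it.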

throw555chip · on Nov 6, 2023

> why spend time on a ‘nice’ commit history in a (smallish) feature branch when you can squash merge later.

Agreed, it has been standard at most shops I've worked in the past 8-10 years.

ufo · on Nov 6, 2023

Precisely, you want to keep it about one commit per feature. I think the parent comment was worried about monster merges that squash many features together.

folmar · on Nov 6, 2023

If things are heading towards a single commit an amend commit works fine for me. If I need a previous state I just get it from reflog.
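A quick sketch of that workflow, with a hypothetical file name. Keep in mind that reflog entries are local to your clone and eventually expire, so this is a safety net rather than an archive.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

echo "draft 1" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Keep folding further work into the same commit.
echo "draft 2" > notes.txt
git commit -q -a --amend --no-edit

# Still a single commit on the branch...
count=$(git rev-list --count HEAD)
# ...but the pre-amend state is reachable through the reflog.
previous=$(git show "HEAD@{1}:notes.txt")

git reflog
echo "commits on branch: $count, previous content: $previous"
```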

m000 · on Nov 6, 2023

> I see it the other way around - why spend time on a ‘nice’ commit history in a (smallish) feature branch when you can squash merge later.

Squash-merge is a scourge. I've seen squash merged commits 30 lines long ("try 15", empty line, "try 14", empty line...). I'm not even sure if you can do anything about such commits because squash-merge is a github/gitlab thing. So, I'm not sure if there are hooks to block it via a commit message linter.

And I've seen people going through some intense mental gymnastics to justify avoiding squashing locally, writing a proper commit message and then merging.

joshribakoff · on Nov 6, 2023

You can simply ask “git log” to show you one coarse entry per PR rather than “destroying” the more granular history
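One way to get that view, assuming every PR lands on main as a merge commit, is `--first-parent` (the comment does not say which option is meant). The branch names and commit messages below are invented for the example.

```shell
#!/bin/sh
set -eu
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

git commit -q --allow-empty -m "initial commit"
for pr in 1 2; do
    git checkout -q -b "pr-$pr" main
    git commit -q --allow-empty -m "wip"
    git commit -q --allow-empty -m "fix typo"
    git checkout -q main
    git merge -q --no-ff -m "Merge PR $pr" "pr-$pr"
done

# One entry per merge into main; the granular commits are hidden, not gone.
coarse=$(git log --first-parent --format=%s)
# The full history is still there when you want it.
full_count=$(git rev-list --count HEAD)

printf '%s\n' "$coarse"
echo "total commits: $full_count"
```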

erik_seaberg · on Nov 6, 2023

This. It bugs me that people permanently throw away details of changes rather than show just the log of merge commits.

hhjinks · on Nov 6, 2023

Why would I want to keep the details? Umpteen "tmp", "fix" and "fixed typo" provide negative value. When I check the blame of a line, I need to see the context of the change, meaning a description of all the work that was done as part of that change, and perhaps a ticket number. Anything else is noise that actively detracts from the value of the log.

It's like 4K porn. It's less appealing when you see everything.

mablopoule · on Nov 6, 2023

No yeah, absolutely do squash those commits, they are actually polluting the git history.

The issue for me is when commits that are about adding a new value in the env file become mixed with template and responsive handling, mixed potentially with a bug fix.

zaptheimpaler · on Nov 6, 2023

I get why you would prefer it clean but it's just too much overhead for me. I naturally make lots of changes together - especially on a complex feature, you need to build things in a "full-stack" way horizontally so you can test as you go. Then pulling things apart into "clean" atomic commits later just takes too much time and I don't really know how to do it efficiently.

yxhuvud · on Nov 6, 2023

The thing is that when you get good at it the overhead will go down drastically. And the practice will make it easier and faster to extract small pull requests out of your main work that can be reviewed separately.

leptons · on Nov 6, 2023

It's like if someone said they wanted to invite you over for dinner. Then they started texting you. "at the grocery store". "bought a pound of beef". "bought some carrots". "Checking out now". "Arrived at home". "Turned oven on". "Turned oven to 450 degrees". "Turned oven up to 460 based on different recipe". "Starting to prep the beef now". and so on, and so on. I mean, just cook the damn dinner - I don't need to be needled about all the steps. I'll come over and bring the wine, and we'll eat a meal. I don't need to know every minor implementation detail in a commit log, to review and merge the branch. Arghhhh I have one dev on my team like this right now. I'll have to have a talk with them.

silenced_trope · on Nov 6, 2023

I disagreed with you up to this point:

> Of course if you have developers that don't do that and instead merge dozens of commits that just say wip, wip, wip, lol, fml, wip, wip, lol, yolo and you can't fire them or get them to change, then squash merges ftw.

Yes, any large organization has plenty of devs who all have their own style and preferences, for better or worse.

Whoever demands they all bend to the one true way is a fascist (lol not really but you know).

Just set up your CI/CD in such a way that PRs with weird git logs get squashed into one pretty message, preferably the PR description since other devs have to review the PR it's often given more effort. Set it up so that if things weren't formatted the "right way" they get auto-formatted or a test fails and the dev says "ah, I have to run that one task and then update the PR".

I don't think a big organization is going to scale with developer "evangelists" demanding people write their commits a certain way either.

conventionalcommits.org was the worst. I worked at one "big co" that tried to get devs to do this. Even after we had been doing it for a while, nobody ever went back to look at the history in such a way that it was worth it. We ended up throwing in the towel rather than the company trying to get all other teams to do it.

smcleod · on Nov 7, 2023

Eh, commit history is handy to have but if you spend all your time crafting the perfect commit messages and history you’ve likely lost sight of what matters. Great is the enemy of good and make the tools do the work for you. Commit as many WIPs or whatever then just hit squash and merge - it saves a lot of time and keeps momentum up.

Rapzid · on Nov 6, 2023

There is a reason so many open source projects require squashing.

Ain't nobody got time for that shit.

jimbob45 · on Nov 6, 2023

I think squash merges are a last resort heavy-handed tool for dealing with developers who refuse to clean up their commit history before merging. Most developers can do better by hand.

This is too much thought put into a VCS. I don’t want to have to think about my VCS at all beyond the commit message. For all of Git’s popularity, I’ve never seen benefits that justify the absurd amount of work and knowledge it takes to perform simple actions. It’s the VCS equivalent of Scheme or emacs.

dkarl · on Nov 6, 2023

It really pays the effort back, though, when you can figure out why something was done, beyond knowing the feature it was related to, which is all you get with a squash merge.

scme0 · on Nov 7, 2023

But if you're using pull requests, you can just look up the PR to get the reasoning and details of the squash commit. I would argue that if you need it to be separate commits after merging, you should create separate PRs most of the time.

mixmastamyk · on Nov 6, 2023

Why not document it then? In a place where everyone can read it, instead of only developers.

Izkata · on Nov 7, 2023

...are you suggesting spending hours searching through documentation, hoping to possibly find something relevant, instead of just being able to run "git blame" to see why a specific line was changed?

mixmastamyk · on Nov 7, 2023

I use blame often to blame.

Code tells how, not why—the domain of specs and comments. Commit messages effectively don’t exist for non-developers.

I put spec links in doc strings whenever possible. They are accessible to everyone—devs, PMs, SMEs, stakeholders that pay bills, and myself when at a web browser.

Searching is not required but even if it was it would be a tiny fraction of “hours.”

mablopoule · on Nov 6, 2023

It's not about VCS, it's about code, both now and later, and context.

If you need to do a workaround, or a complicated feature, sometimes it's nice to explain it as a comment in the code, but sometimes it's better to put it as a comment inside the commit message. But if it's all merged in the end, along with lots of template changes, README changes, refactors irrelevant to the current changes, then you're losing an important way of navigating a codebase.

Dylan16807 · on Nov 6, 2023

> I don’t want to have to think about my VCS at all beyond the commit message.

Fine as an opinion.

> For all of Git’s popularity, I’ve never seen benefits that justify the absurd amount of work and knowledge it takes to perform simple actions. It’s the VCS equivalent of Scheme or emacs.

This is just wrong. When people talk about this, it's not about git at all.

You write some code, and you make it into commits. Part of that is choosing if/how to organize it with multiple commits, and how much effort you want to put into that. This is fundamental to using a VCS, any VCS.

Or by analogy, if a lot of emacs users complain about your spelling, that's not because emacs is overly demanding.

jdhzzz · on Nov 6, 2023

That was me.

Got fired. Kinda. I was laid off.

mablopoule · on Nov 6, 2023

I actually hate squash merge because of all the noise it adds. Sure, the commit graph looks nicer, but it comes with a terrible loss of information when doing git blame.

I'm a big proponent of rebase and squash if it helps to make a commit more coherent, but we use squash merges by default in the current project I'm working on, and I die a little bit each time I try to understand what changes were related to a line when tracking down a bug.

nerdponx · on Nov 6, 2023

This is the big one for me. Destroying commit information just to keep the graph tidy is a bad idea in my opinion. It would be better if Git provided better tools for filtering the log, e.g. providing some mechanism to elide commits from parents of any merge commit other than the 1st.

diek · on Nov 6, 2023

> Destroying commit information just to keep the graph tidy is a bad idea in my opinion

The commit information I see when telling teams to squash their branches on merge is not valuable.

* "fixing whitespace"
* "incorporate review comments"
* "fix broken test"
* "fix other broken test"

(note, the broken tests were broken by the changes in the PR)

As soon as that PR is merged those commits are worthless. And there are branches with dozens of those "fixing X" commits that would otherwise pollute the commit graph.

gray_-_wolf · on Nov 6, 2023

> * "fixing whitespace" * "incorporate review comments" * "fix broken test" * "fix other broken test"

Things like this should not be standalone commits though, they should be incorporated into the previous branch by amending the original work. It takes some effort to have a useful git history, it does not just happen on its own.

lesuorac · on Nov 6, 2023

Sounds like six vs half-dozen. Why does it matter if somebody amends vs squashes?

gray_-_wolf · on Nov 8, 2023

It does not matter if you have one commit. If your change is split into a few commits for increased readability, in that case it does matter.

Do you really believe that if, for example, this change to btrfs filesystem https://lore.kernel.org/linux-btrfs/cover.1699470345.git.jos... would be squashed, nothing of value would be lost?

noahtallen · on Nov 6, 2023

You can very easily rewrite your commit message on GitHub when squash merging. Since the organizations I work for exclusively use squash merge, I often just update the commit to be more valuable, listing the important changes it contains. (And of course the PR in GitHub will contain the commit history of the branch that was squashed, as well as any discussion.)

IMO, this is a lot simpler and easier to do than rebasing your branch to have a flawless history.

sodapopcan · on Nov 6, 2023

I rather strongly disagree here.

Having whitespaces mucks up commit, causing you to lose focus of what's actually important.

I have `git blame` aliased to `git blame -w` which ignores whitespace-only changes.

You can also reblame when you come across these formatting commits.

trevor-e · on Nov 6, 2023

Yep, intermediate commits on a branch tend to be completely worthless. I'd much rather have "git blame" point to the commit that contains the entire change together.

dieselgate · on Nov 6, 2023

Agree strongly, it's nice in theory to view the intermediate commits but in practice have never needed to look at them

lelandfe · on Nov 6, 2023

Those commits would be the bathwater one casts out alongside the useful commits in using squash merges.

diek · on Nov 6, 2023

If the useful commits are the "baby" in your bathwater analogy, all the useful information in those commits is in the squashed commit.

This assumes a branch being merged in represents one logical change (a feature/bugfix/etc) that is "right sized" to be represented by one commit.

mablopoule · on Nov 6, 2023

Yes, but now it's mixed with the bathwater, and now it morphs into another metaphor as it becomes the needle in the haystack.

It's okay to have 'low information' commits one can easily ignore in your history, as long as the 'high information' ones stay readable and coherent.

26fingies · on Nov 6, 2023

You can usually see that in whatever tool you're using anyway. Blame -> find the PR -> see commit history.

gray_-_wolf · on Nov 6, 2023

You mean like for example `git log --first-parent`?
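
For anyone who wants to see the difference, here is a minimal sketch in a throwaway repository. The branch name, file names and commit messages are all made up for illustration; only `git log --first-parent` comes from the comment above.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

# A feature branch with the usual noisy intermediate commits.
git checkout -q -b feature
echo one > feature.txt
git add feature.txt
git commit -q -m "wip"
echo two >> feature.txt
git commit -q -am "fix typo"

# Merge with a real merge commit and a descriptive message.
git checkout -q main
git merge -q --no-ff -m "feat: add feature" feature

# Every commit, noise included.
full=$(git log --format=%s)
# Only the first-parent chain: one entry per merge into main.
coarse=$(git log --first-parent --format=%s)

printf 'full history:\n%s\n\nfirst-parent only:\n%s\n' "$full" "$coarse"
```

The "wip" and "fix typo" commits are still in the repository; they are just not listed in the first-parent view.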

nerdponx · on Nov 6, 2023

TIL, thank you! Now I know for a fact that squash-mergers have no excuse and can brandish the man page at them.

Izkata · on Nov 7, 2023

People in general just have no idea how much version control is able to do. For one example try running "git help log" and just tap page-down a few (dozen) times to get an idea what's in there.

For another example, you know how people hate aligning code vertically so much that linters don't allow it nowadays, the primary reason being that if you have to change the spacing then the diffs will identify far too many lines as having changed? Both git and svn have options to ignore whitespace changes:

  git diff -w
  svn diff -x -w

WorldMaker · on Nov 6, 2023

`--first-parent` also today works for blame and bisect.

diek · on Nov 6, 2023

> I die a little bit each time I try to understand what changes were related to a line when tracking down a bug

A change/feature/bug is a branch, which is squashed into a commit on your main branch, right? So your main branch should be a linear history of changes, one change per commit.

How does that impact the ability to git blame?

js2 · on Nov 6, 2023

Because unless it's the most trivial of features, you'll break it up into smaller commits which each explain what they are doing and make reviewing the change easier.

As a simple example, I recently needed to update a json document that was a list of objects. I needed to add a new key/value to each object. The document had been hand edited over the years and had never been auto-formatted. My PR ended up being three commits:

1. Reformat the document with jq. Commit title explains it's a simple reformat of the document and that the next commit will add `.git-blame-ignore-revs` so that the history of the document isn't lost in `git blame` view.

2. Add `.git-blame-ignore-revs` with the commit ID of (1).

3. Finally, add the new key/value to each object.

The PR then explains that a new key/value has been added, mentions that the document was reformatted through `jq` as part of the work, and recommends that the reviewer step through the commits to ignore the mechanical change made by (1).

A followup PR added a pre-commit CI step to keep the document properly linted in the future.
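
A rough sketch of the first two steps above, in a throwaway repository. The file `data.txt` and the commit messages are hypothetical, and `sed` stands in for `jq` to keep the example free of dependencies. It assumes Git 2.23 or newer, where `blame.ignoreRevsFile` is available. Git does not read `.git-blame-ignore-revs` on its own, so each clone has to opt in.

```shell
#!/bin/sh
set -e

# Throwaway repository; file names and messages are invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\ngamma\n' > data.txt
git add data.txt
git commit -q -m "add data"

# (1) Mechanical reformat: indent every line, change nothing else.
sed 's/^/  /' data.txt > data.tmp
mv data.tmp data.txt
git commit -q -am "reformat data.txt"
reformat=$(git rev-parse HEAD)

# (2) Record the reformat commit so blame can skip it.
echo "$reformat" > .git-blame-ignore-revs
git add .git-blame-ignore-revs
git commit -q -m "ignore reformat commit in blame"

# Without the ignore list, blame points at the reformat commit.
plain=$(git blame --line-porcelain data.txt | grep '^summary ' | sort -u)

# Opt in for this clone, then blame again.
git config blame.ignoreRevsFile .git-blame-ignore-revs
blamed=$(git blame --line-porcelain data.txt | grep '^summary ' | sort -u)

printf 'plain:   %s\nignored: %s\n' "$plain" "$blamed"
```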

diek · on Nov 6, 2023

In general I agree with you, there are absolutely times where you want to retain commit history on a particular branch (although I try to keep the source tree from knowing about things like commit IDs).

I would argue that those are by far the minority of PRs that I see. As I mentioned in another comment, _most_ PRs that I see have a ton of intermediary commits that are only useful for that branch/PR/review process (fixing tests, whitespace, etc). Generally the advice I give teams is, "squash by default" and then figure out where the exceptions to that rule are. That's mainly because, in my opinion, the downsides of a noisy commit graph filled with "addressing review comments" (or whatever) commits are a much bigger/frequent issue than the benefits you talk about. It really depends on the team.

js2 · on Nov 6, 2023

> As I mentioned in another comment, _most_ PRs that I see have a ton of intermediary commits that are only useful for that branch/PR/review process (fixing tests, whitespace, etc).

Right, but that's only because developers don't amend and force push their commits to the PR branch as they receive feedback. Which is largely encouraged by GitHub being a terrible code review tool.

To me, git is part of the development process, it's not an extra layer of friction on top. So I compose my commits as I go. I find it helpful for recording what I'm thinking as I write the code. If I wait till the very end, I'll have forgotten some important bit of context I wanted to include. So during the day I may use the commits like save points. But before I push anything I'll often check out a new branch and create an incremental set of commits that have the change broken down into digestible pieces. And if I receive feedback, I'll usually amend those changes into the PR and force push it.

I'd like to add that I spend a lot of time cleaning up tech debt. And I deal with a ton of commits and PRs that don't explain themselves. So I'm really biased toward a clean development workflow because I hope to make the lives of those who come after me easier.

I was also trained on this workflow by being an early git contributor and it had extremely high standards for documenting its work. There's a commit from Jeff King that's a one line change with about six paragraphs of explanation.

There's no right answer here. I value the "meta" part of writing code. Not everyone does and that's okay.

throw555chip · on Nov 6, 2023

When the word "force" is involved, it's time to take a step back and re-evaluate things.

js2 · on Nov 7, 2023

It's due to GitHub lacking change set support. With Gerrit, force pushing isn't required.

Izkata · on Nov 7, 2023

> only useful for that branch/PR/review process (fixing tests, whitespace, etc).

I have had bugfix cases where, digging through the repo history, both of those examples accidentally introduced the bug (the first because the person who made the original change didn't completely understand a business rule so it changed both the code and the test, the second because of a typo in python that only affected a small subset of the data). Keeping the commit separate let me see very quickly what happened and what the intent actually was.

mablopoule · on Nov 6, 2023

Because now instead of having a line changed within a granular level of changes, it's lost with the other changes from the same feature branch, which is a more macro level. So if a change in config is needed for the feature, the part where this config change actually needs to be handled, or would impact the data flow, is harder to evaluate now that you mix it with template changes, style changes, new interactions needed for the users, etc...

EDIT: On top of that, there's usually a bit of 'related' work you need for a task, for example when you find an edge case related to your feature, and now you also needed to fix a bug, or you did a bit of refactoring on a related service, or needed to change the data on a badly formatted JSON file.

Unbeknownst to you, you added a bug when refactoring the related service, a bug that is spotted a few months after, only on a very specific edge case. If the cause is not obvious, you might want to reach for git bisect, but that won't be very useful now that everything I've talked about is squashed into a single commit.

diek · on Nov 6, 2023

> EDIT: On top of that, there's usually a bit of 'related' work you need for a task, for example when you find an edge case related to your feature, and now you also needed to fix a bug, or you did a bit of refactoring on a related service, or needed to change the data on a badly formatted JSON file.

I agree that's related work, but I'd argue that work doesn't belong in that branch. If you find a bug in the process of implementing a feature, create a bugfix branch that is merged separately. If you need to refactor a service, that's also a separate branch/PR.

That's actually the most common pushback I get from people when I talk about squashing. They say "but then a bunch of unrelated changes will be lumped together in the same commit", to which I respond, "why are a bunch of unrelated changes in the same branch/PR?"

mablopoule · on Nov 6, 2023

I agree with you in principle, but it's usually because of process and friction. In the place I'm working right now, that would result in days lost as I need to create a new Jira ticket, which obviously requires a team meeting for grooming (because Agile!), and then go after colleagues so that the PR is accepted, which in the best case still needs the CI/CD pipeline to finally deploy, and then merge it to the dev branch, and finally rebase the current feature branch... and all this multiple times.

joshribakoff · on Nov 6, 2023

Because sometimes a PR touches more code than a single commit, and you lose the more granular context surrounding the more granular changes. You can always ask git to make the log more coarse, but once you “destroy” the granular history it is for all intents and purposes gone.

WirelessGigabit · on Nov 6, 2023

Me too. I care a LOT about provenance. And squash merge completely breaks that.

When my branch is up to date with `main` I can build an artifact, fast forward merge that branch into `main` and RETAIN the artifact, and merely update its tags to mark it as `merged` in.

With a squash I lose that information.

Now, GitHub does not allow me to do a fast-forward merge but I can still trace the 2 commits that are the parent of the resultant merge, and find the artifact based on that, and retag.

johnsbrayton · on Nov 6, 2023

I do squash merges but keep the feature branches. So after determining that I made a change as part of a big pull request, I can then look at the commit/blame history for the pull request source branch if necessary.

mvdtnz · on Nov 6, 2023

I'm guessing you don't work on large projects. This would create an outrageous amount of noise in a busy repository.

matijsvzuijlen · on Nov 6, 2023

This means having to keep these branches around cluttering up everything, and makes git bisect a lot more complicated.

Lio · on Nov 6, 2023

Git blame confuses people even without squash merges.

I've seen people forget to go back more than one commit and then blame the person who last indented a file instead of going back to the commit that actually wrote the code many times.

sodapopcan · on Nov 6, 2023

> I've seen people forget to go back more than one commit and then blame the person who last indented a file instead of going back to the commit that actually wrote the code many times.

I default my `git blame` to `git blame -w` which ignores whitespace commits. Though knowing how to jump back commits should be required knowledge.
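
A small sketch of the indentation case described above, in a throwaway repository. The file `job.txt` and both commit messages are made up for the demo; `-w` is the only part taken from the comment.

```shell
#!/bin/sh
set -e

# Throwaway repository; file name and messages are invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# The commit that actually wrote the code.
echo 'value = 1' > job.txt
git add job.txt
git commit -q -m "write the code"

# A later commit that only changes indentation.
echo '    value = 1' > job.txt
git commit -q -am "reindent"

# Plain blame stops at the most recent commit to touch the line.
plain=$(git blame --line-porcelain job.txt | grep '^summary ')
# With -w, whitespace-only differences are ignored when comparing versions.
ignoring_ws=$(git blame -w --line-porcelain job.txt | grep '^summary ')

printf 'git blame:    %s\ngit blame -w: %s\n' "$plain" "$ignoring_ws"
```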

erik_seaberg · on Nov 6, 2023

We shouldn’t tamper with code we don’t actually need to fix, it’s not a good use of time and it makes history less useful. Just because it doesn’t look like I wrote it doesn’t make it wrong.

Lio · on Nov 6, 2023

I’m thinking of situations where the surrounding structure of the code has been changed to correct a problem.

That’s done by an automated tool. Correction of indentation is just a byproduct.

I don’t consider that “tampering”.

WorldMaker · on Nov 6, 2023

You can deal with that through tooling.

In a lot of my work I call those types of automated tool commits "wrench" commits personally and even have a simple shell script to help automate committing them. In my case I prefix the command line with a wrench emoji. At that point it's very obvious in git blame that if a line starts with a wrench it was last touched by an automated tool of some sort.

You can also very easily at that point grep your git log for wrenches to dump commit hashes into a git-ignore-revs file and automate that part too so that those commits don't even show up in git blame at all.

skybrian · on Nov 6, 2023

This all depends on the project. Sometimes you don’t look at history all that much. Sometimes the loss of information is acceptable.

Izkata · on Nov 6, 2023

If you don't look at history much, why would you care about keeping it "clean"? Just keep the truth of the changes in the history for those of us who do use it, and you can continue ignoring it.

sodapopcan · on Nov 6, 2023

No one is mentioning:

    $ git log --merges

Now you can see your features in a nice history and also have added benefit of seeing intermediary commits. Pro tip: merge commits aren't required to use the canned "Merge branch into..." message, you can give it any message you want, such as "feat: ..." or whatever your convention is.

I hate that branch squashing has become something of a de facto standard. I actually do rewrite my history and often add context to my commits. `git blame` can be an incredibly useful tool to get context about a given small change. Getting a massive diff for a whole feature is much less so, especially since you can just look at the diff of the merge commit.

dahart · on Nov 6, 2023

What I think I see these days is squash merges being used lazily to avoid having to do anything to build a clean history with clearly semantically delineated commits. Squash merges are good compared to an alternative where people check in super messy noisy branches, but they unfortunately have a big downside because squash merges can make bisecting and history spelunking more difficult, when the branches that are squash merged were big.

sidlls · on Nov 6, 2023

What is a "semantically delineated commit"? What is a "clean history"? Why are these two things important?

FeepingCreature · on Nov 6, 2023

Not parent: there are technical commits, such as "fix review", "fix jenkins", "fix typo" etc. Those don't delineate a particular feature but a fix for a problem that arose from the workflow. This ends up with a history of "big feature commit that is wrong in three trivial ways" + "fix 1" + "fix 2" + "fix 3". Of those, "big feature commit" is the important one, but "fix 3" is the only working one. This is clearly silly; you should pretend you were perfect from the start and squash "fix 1" through "fix 3" into "big feature commit". Your typos and brainfarts are not of historical relevance.
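
One way to fold "fix 1" and "fix 2" back into the feature commit before pushing is `git commit --fixup` followed by an autosquash rebase. This is a sketch under assumptions, not the only way to do it: the branch, files and messages are hypothetical, and `GIT_SEQUENCE_EDITOR=true` is used only so the interactive rebase accepts the generated todo list without opening an editor.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

# The feature commit, wrong in a trivial way.
git checkout -q -b feature
echo "helo world" > feature.txt
git add feature.txt
git commit -q -m "big feature commit"
target=$(git rev-parse HEAD)

# "fix 1": correct the typo, recorded as a fixup of the feature commit.
echo "hello world" > feature.txt
git commit -q -a --fixup="$target"

# "fix 2": something a reviewer asked for, also recorded as a fixup.
echo "reviewed" > notes.txt
git add notes.txt
git commit -q --fixup="$target"

# Fold the fixups into their target; the original message is kept.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

git log --format=%s main..feature
```

This rewrites the local branch, so it belongs before the push or on a branch nobody else has based work on.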

sidlls · on Nov 6, 2023

Perhaps I'm missing something, but I don't see how your comment answers my questions. Do you mean that a "clean history" is one without "fix 1", "fix 2" and "fix 3"? Or is that a "semantically delineated commit"?

ruds · on Nov 6, 2023

A clean history is one where there is a single commit, "big feature commit", that produces a worktree that is the same as the one produced by "fix 3" in the "unclean" history.

nomel · on Nov 6, 2023

How is this possible, while sharing code? Doesn't this require that pushed code is perfect? What about everyone else working on the same code? Do they wait until you've reached perfection? Or, do you squash the branch once it's complete, with the assumption that there's no other development on/from that temporary branch (I envy you if so)?

(I ask these questions fully assuming I'm doing it wrong.)

dahart · on Nov 6, 2023

> Doesn't this require that pushed code is perfect?

We aren’t talking about pushed code. We are talking about cleaning up the local commit history before pushing it into a shared branch.

sidlls · on Nov 6, 2023

And that's the one--and only--reasonable use of rebasing, to squash commits from a branch before merging into main. If engineers find themselves using rebase in any other context than squashing a merge, it's time to re-evaluate the processes/culture around workflow.

nomel · on Nov 6, 2023

What about the context where one works with other people, while sharing code?

dahart · on Nov 6, 2023

When working with published/shared branches with other people, the advice with git has always been that history is history and not to be changed after publishing, unless there is an emergency like a security incident.

Aside from that, we might need to clarify what the question is. With shared code & git, it’s nice to use a branch & merge workflow, and it’s nice to make incoming merges as clean / nice as you can so the resulting history is as smooth as it can be while capturing what happened at a reasonable granularity. These are today’s conventions though, and it’s really up to the team to decide how to balance shared work, and what people feel are the most important workflows and tools.

hhjinks · on Nov 6, 2023

You fix your local tree before sharing it. Alternatively, you can communicate with your team and tell them they'll need to run git fetch && git rebase -i origin/main to drop your erroneously merged commits.

Izkata · on Nov 6, 2023

They are. Or at least can be. Typos I probably agree with, but I've seen plenty of logic bugs introduced in those "fix" commits and keeping them separate from the big one is useful when figuring out what was supposed to happen.

smw · on Nov 6, 2023

A bunch of wip wip2 wip3 commits don't add any value, and make the log harder to read. But if you break a bigger PR down into "added feature x", "tests for feature x", "refactored y to support x" -- the commits are easier to read and provide valuable "why" history when you're trying to figure out what happened two years later.

sidlls · on Nov 6, 2023

That's more about the contents of the merged commits than anything else. Modifying the commit message(s) fixes that, as long as that's what the commits actually did.

Aside from that, how are "a fix for a bug" style commits not "clean"? If merge 123 into master contains a bug that is fixed in a future merge 1234, it doesn't seem "dirty" to me; quite the opposite actually, as it tracks what actually happened.

Now, "wip" style commits shouldn't be on whatever main branch everyone is working on: that's what branches are for. And if everyone is just working off the main branch and committing directly to it, that's an organizational deficiency; not one that VCS can solve.

dahart · on Nov 6, 2023

Modifying commit messages is rewriting history, right?

> “wip” style commits shouldn’t be on whatever branch everyone is working on

Agreed! We aren’t talking about rewriting shared branch history, we are talking about removing the “wip” commits made hastily and locally before pushing them. Sounds like we agree!

sethammons · on Nov 6, 2023

> they unfortunately have a big downside because squash merges can make bisecting and history spelunking more difficult, when the branches that are squash merged were big

can you help me understand this? It is the exact opposite of my experience. The flow I see is: bug reported, write a git bisect test, identify the feature that introduced it, reach out to that developer/team.

This is allowed by squash merges. When I've seen these more "clean" histories, they have commit points that won't even compile or have runnable tests, causing git bisect to fail.

> branches that are squash merged were big

it must be this - how big are your merges? All the projects I've worked on strive for smaller PRs. Large PRs are usually broken up into smaller pieces. Large PRs are an anti-pattern.
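
The bisect flow described above (write a test, let git find the commit) can be sketched like this. The repository, the file `app.txt` and the "test" (a `grep` for the word "bug") are all hypothetical stand-ins for a real project and a real test script.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Five commits on main; the third one introduces the bug.
for i in 1 2 3 4 5; do
    if [ "$i" -eq 3 ]; then
        echo "bug" >> app.txt
    else
        echo "change $i" >> app.txt
    fi
    git add app.txt
    git commit -q -m "change $i"
done
good=$(git rev-list --max-parents=0 HEAD)   # the first commit is known good

# Mark HEAD as bad and the first commit as good, then let the test drive.
git bisect start HEAD "$good" >/dev/null
# Exit status 0 means "good", 1 means "bad".
git bisect run sh -c '! grep -q bug app.txt' >/dev/null

culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $culprit"
```

The test command has to be runnable at every commit bisect visits, which is the point made above about histories that do not compile at each step.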

dahart · on Nov 7, 2023

I maybe don’t know what you mean about “clean histories”. Speaking for myself, I always expect a history that’s called “clean” to compile error-free at every commit, unless otherwise noted; one of my personal criteria for calling history ‘clean’ is that efforts are made to keep the main branch up and running for every commit.

> how big are your merges? […] Large PRs are an anti-pattern.

Depends, but they can occasionally get pretty big, if there’s a big refactor and/or multiple people in the branch. Small enough PRs are a nice goal - it’s a goal that might agree with and exist in part because squash merges on large PRs lose too much. It’s just that the real world routinely gets in the way. It’s very easy for someone who needs to do an ‘atomic’ refactor to touch a ton of files. It’s very easy for a planned feature to end up way bigger than intended. You can’t always keep PRs small or enforce it on other people. Sometimes stuff happens, and when it does, sometimes squash merging feels less good than merging a branch with multiple commits. The good news is that it’s always optional. The bad news is that I can’t necessarily babysit or dictate what others do, and some people prefer squash-merging to spending any time doing cleanup on a messy branch.

paulddraper · on Nov 6, 2023

Squash merges are rebases.

ksenzee · on Nov 6, 2023

Only metaphorically, maybe. You can squash merge in lots of cases where a rebase will fail.

Izkata · on Nov 6, 2023

They are essentially rebase+squash, despite the name. There is no actual merge taking place.

And for that matter, you'd manually do a squash with the interactive rebase tool anyway ("git rebase -i").

ksenzee · on Nov 6, 2023

Imagine a feature branch where someone has been keeping it up to date by merging main into it regularly. Now the feature is ready to go into main. You can easily `git merge --squash` that branch into main. You can likely do the same thing manually (as you point out) by running `git rebase -i` if you squash all the commits in the branch. But you’ll never manage to do a genuine rebase, where every commit in the branch gets turned into a clean non-merge commit onto main.
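
A minimal sketch of that scenario, assuming a throwaway repository with hypothetical branch, file and commit names. The feature branch merges `main` into itself partway through, and `git merge --squash` still collapses it into one ordinary commit on `main`.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

git checkout -q -b feature
echo one > feature.txt
git add feature.txt
git commit -q -m "feature part 1"

# Meanwhile, unrelated work lands on main.
git checkout -q main
echo other > other.txt
git add other.txt
git commit -q -m "unrelated work on main"

# The feature branch is kept up to date by merging main into it.
git checkout -q feature
git merge -q -m "merge main into feature" main
echo two >> feature.txt
git commit -q -am "feature part 2"

# Squash merge: stage the combined change, then commit it once.
git checkout -q main
git merge -q --squash feature
git commit -q -m "feat: add feature"

parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
on_main=$(git rev-list --count HEAD)
echo "parents of new commit: $parents, commits on main: $on_main"
```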

paulddraper · on Nov 6, 2023

FWIW I consider `git rebase -i` to be a "genuine rebase"

ksenzee · on Nov 7, 2023

I do too, except in cases where it’s being used simply as a more complicated UI for `git merge --squash` and there’s no actual “generate a diff and apply it to a different base commit” going on.

paulddraper · on Nov 7, 2023

I think we have a rose by any other name situation.

I call that a rebase.

Dylan16807 · on Nov 7, 2023

That's a badly fitting analogy because there's only one type of flower involved. In this situation, they're saying that most things you might do with "rebase -i" are rebases, except for one.

I'll make a math analogy. Technically a rectangle is a trapezoid, but if someone tries to draw a distinction between rectangles and proper trapezoids, it's not hard to figure out what they mean.

When rebase -i outputs a single commit, that's a degenerate case. There are statements about rebases that are generally true but not true for that specific kind.

recursive · on Nov 6, 2023

Just when I thought I was starting to understand git...

Izkata · on Nov 6, 2023

Rebase essentially means "create new commits out of old commits", the original use being to "move" a set of commits from one branch to another (think of the name as meaning, to change the base these commits started from).

There's a few special cases that have their own names, a common one is when you amend a commit - to do that manually you'd make a new commit, then use interactive rebase to squash the two commits together into a new one (or, use the "fixup" command available in that tool, which is a squash that automatically picks the first commit message instead of asking for a new one).

Squash merges will squash a whole branch into a single commit, rebasing it onto the target in the process, and then fast-forward the target to the new commit. It's a tightly controlled use of rebase, and can be thought of a bit like how "for", "foreach", and "while" loops are a tightly controlled use of "goto", an abstraction built on top of a far more flexible tool.
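
The amend and fixup cases described above can be tried out with a small sketch like the following. It assumes a scratch repository with invented file names, and it sets `GIT_SEQUENCE_EDITOR=true` so the interactive rebase accepts its generated todo list without opening an editor:

```shell
# Scratch repo; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name demo
git config user.email demo@example.com

echo base > notes.txt && git add notes.txt && git commit -qm "base"
echo "helo" > greeting.txt && git add greeting.txt && git commit -qm "Add greeting"

# The manual route to an amend: a follow-up commit marked as a fixup of HEAD...
echo "hello" > greeting.txt
git commit -qa --fixup=HEAD

# ...then an interactive rebase folds it into its target, keeping the
# target's commit message. The todo list is accepted unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline
```

Afterwards the history holds "base" and a single "Add greeting" commit containing the corrected file, which is the same end state `git commit --amend` would have produced.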

jawns · on Nov 6, 2023

My rule of thumb for commits is that they should be of a size and scope suitable for cherry-picking. So, maybe I'm working on a small feature that entails three changes, and each of those three changes is useful in and of itself and could conceivably be cherry-picked by others. I would create three separate commits, generate a PR with all three, and merge in the work. Sure, I could squash merge and end up with one squashed commit encompassing all three changes, but now none of those three changes is cherry-pickable.

mcv · on Nov 6, 2023

They also cut down the signal.

Lio · on Nov 6, 2023

> Squash merges cut down the noise considerably.

They do, but they have their own issues, e.g. having to delete local branches using `git branch -D` instead of `git branch -d`, and so losing the protection against deleting unmerged work.

I still agree that on balance annoyances like that might still be worth putting up with for larger teams with mixed skill levels.
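
A small sketch of that annoyance, in a scratch repository with made-up branch and file names. After a squash merge the branch's own commit is not reachable from main, so the safe delete refuses even though the changes have landed:

```shell
# Scratch repo; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name demo
git config user.email demo@example.com

echo base > app.txt && git add app.txt && git commit -qm "base"
git checkout -q -b topic
echo work > topic.txt && git add topic.txt && git commit -qm "topic work"

git checkout -q main
git merge -q --squash topic && git commit -qm "topic (squashed)"

# -d checks commit reachability, not content, so it refuses here.
if git branch -d topic 2>/dev/null; then
  safe_delete=allowed
else
  safe_delete=refused
fi
echo "git branch -d: $safe_delete"

# -D deletes regardless, so it would not catch genuinely unmerged work either.
git branch -q -D topic
```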

leptons · on Nov 6, 2023

I don't mind merge commits, it's the 100 tiny individual commits some developers seem to like to make that really clutter things up. Yes, I know, squashing is a thing, but not committing until the feature is working and ready to commit is also a thing.

lolinder · on Nov 6, 2023

> not committing until the feature is working and ready to commit is also a thing

That leaves you prone to losing work if you have a false start that you need to back out of. I prefer to commit early and often on my private branches, then before submitting a pull request I clean up the history to where there are a few good commits that form useful, standalone chunks (ideally the test suite fully passes on each commit).
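
One way to sketch the "commit often, tidy up before the PR" workflow. Note that this uses a soft reset rather than an interactive rebase, and it collapses everything into one commit rather than a few; it is a simplification for the demo, and all names in it are invented:

```shell
# Scratch repo; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name demo
git config user.email demo@example.com

echo base > app.txt && git add app.txt && git commit -qm "base"

# Private branch with throwaway checkpoint commits.
git checkout -q -b wip
for step in 1 2 3; do
  echo "step $step" >> app.txt
  git commit -qam "draft"
done

# Collapse the drafts: move the branch pointer back to where it left main,
# keeping every change staged, then write one real commit.
git reset -q --soft "$(git merge-base main HEAD)"
git commit -qm "Add steps 1-3"

git log --oneline
```

Splitting into several standalone commits instead would mean unstaging and re-committing in pieces, or using `git rebase -i` to squash and reorder selectively.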

leptons · on Nov 7, 2023

>That leaves you prone to losing work if you have a false start that you need to back out of.

Hasn't happened to me in over 20 years of using version control. I always keep moving forward, there's really never been a need to go back to a previous commit that hitting Ctrl-Z wouldn't accomplish just the same. If I wanted to try a new direction I'd just clone the repo again and do the work there. Littering the git history with dozens of superfluous commits just seems pointless. Having to stop and think about writing a commit comment is also just a waste of time - in aggregate it wastes a lot of time. It adds a lot of churn to a workflow for something that may never really be of any value.

lolinder · on Nov 7, 2023

> Littering the git history with dozens of superfluous commits just seems pointless.

This is where the final rebase comes in—you should be combining all the small commits into one.

> Having to stop and think about writing a commit comment is also just a waste of time

Most of my commits when I'm working like this are named "draft". The names don't matter when you're going to redo the history later.

> I always keep moving forward, there's really never been a need to go back to a previous commit that hitting crtl-Z wouldn't accomplish just the same.

You've never started down one path for solving a subproblem only to realize 30 minutes in that it's not going to work?

leptons · on Nov 7, 2023

>This is where the final rebase comes in—you should be combining all the small commits into one.

Sorry, but I'm a software engineer, not a git engineer, and the less I have to do with git, the better. KISS applies to git, too. A simple thing like not creating a commit for every stupid thing keeps the history clean, doesn't bog down the developer by requiring them to think about writing a commit message every 2 minutes, and keeps git simple.

>Most of my commits when I'm working like this are named "draft". The names don't matter when you're going to redo the history later.

But then what value have you added by naming everything "draft" and creating a commit? There is no value in doing this.

>You've never started down one path for solving a subproblem only to realize 30 minutes in that it's not going to work?

Sure I have, but I don't need to enter it into the git logs. I'll either start over in a clone of the repo if I want to save the bad work for whatever reason (which is very unlikely), or I'll just stash the work, or whatever. The thing I don't need to do is commit the bad work.

throw555chip · on Nov 6, 2023

> For me, even though rebasing comes with some trappings, I still greatly prefer it to the alternative, which is to have merge commits cluttering up the commit history.

The purpose of history is to remember. Rewriting history, whether in git or in life, is bad, even outside the context of "don't use it on public repos". Such advice is similar to saying, only point the shotgun away from you when firing. If you have to remember such a rule, it's best to avoid it.

0x6c6f6c · on Nov 6, 2023

But in unmerged branches, you aren't rewriting history, you're starting your work on a more recent commit in history.

ozim · on Nov 6, 2023

Shush, don’t say those things, because people discussing merge vs rebase might realize they aren’t having a discussion but just talking past each other.

Neither side cares what the context is or what the other is actually discussing, but apparently each one just knows better.

I also don’t mean those specific users, but in general any git discussion I’ve seen over the last 10+ years.

oehpr · on Nov 6, 2023

A history you can't understand is a history you can't remember.

sakarisson · on Nov 7, 2023

> I still greatly prefer it to the alternative, which is to have merge commits cluttering up the commit history.

I've heard this many times before, but haven't been able to figure out why this is a problem. In your workflow is it a problem to have a cluttered commit history? If so, could you explain how?

richbell · on Nov 6, 2023

> I still greatly prefer it to the alternative, which is to have merge commits cluttering up the commit history.

GitHub recently added a feature that prompts people to update their branches via merge. It's frustrating because every PR now has dozens of merge commits polluting the history.

spankalee · on Nov 6, 2023

A PR with merges is fine by me, it lets me see how the PR has evolved.

What I want is for GitHub to track changes between sets of commits in a PR so that you can do most of the review with merges and "address review comments" commits, and then rebase into well organized, logical commits and review that those have the same diff as the messy history after a force push.

richbell · on Nov 6, 2023

The problem is PRs that have <5 lines of changes followed by dozens of pointless merges, because users are prompted to merge every time another PR is merged.

It wouldn't be a problem if people took the time to organize the history prior to merging as you said, but most people don't do this.

adhesive_wombat · on Nov 6, 2023

At least GitLab does that when you push a new commit (force or not) to the branch: it'll show a list of the values of the branch head commit and you can diff between them.

smw · on Nov 6, 2023

So does GitHub, but it breaks if you fix via rebase and push -f. Gerrit and some other competitors manage this better.

pjc50 · on Nov 6, 2023

Some people have pull set to merge, which is a "I don't know how you can live like that" feature.

Nullabillity · on Nov 6, 2023

Some of us like our history tracking tools to.. track history.

AndrewDucker · on Nov 6, 2023

I find it fascinating that people talk about "Having a history of what people did" in such emotive terms - "Cluttering", "Polluting".

What matters is that you end up with working systems. That a lot of change happened is just, well, what happened. It doesn't need to be prettied up and made to look like your development occurred in a clockwork march of cleanliness. It literally does not matter unless you spend a lot of time doing git-bisect.

Let it go. Accept that coding is not a smooth, robotic, endeavour, where everything is always tidy. And that's just fine.

ZeWaren · on Nov 6, 2023

I accepted this a decade ago. I put my ego to the side, and now I don't care if my git history doesn't look "beautiful" when looking at the commit graph.

I've been working on dozens of projects since, and probably made thousands of commits. Some of the teams of those projects included dozens of developers working concurrently on the same codebases. We always merged the upstream branches into our development branches and never did any rebases.

I have NEVER ended up in a situation where I thought rebases would have been better. The git tools and IDE integrations of our current age allow me to find any information I need from the history without pain.

ecnahc515 · on Nov 6, 2023

Have you ever had to use git bisect? That's really where a 'clean' git history is important. Plenty of people never use git bisect, and that's fine too. That said it's a very useful tool when you do need it, and can drastically simplify finding when and where a regression was introduced.

WorldMaker · on Nov 6, 2023

You can `git bisect start --first-parent` and only bisect top-level merge commits. In most cases that gets you to the ballpark of "PR that introduced the bug" no matter how dirty the commit history inside that PR had been, and from there you can try to git bisect further in that branch. In my experience that is most of what you want anyway, "PR that introduced the bug" gives more than enough context.
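
To see what restricting history to first parents does, here is a small sketch in a scratch repository (branch and file names are made up). It counts commits with and without `--first-parent`, which is the same restriction a first-parent bisect applies. As far as I know the bisect flag needs Git 2.29 or newer, so the sketch sticks to `git rev-list` and `git log`:

```shell
# Scratch repo; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name demo
git config user.email demo@example.com

echo base > app.txt && git add app.txt && git commit -qm "base"

# A messy PR branch: three small commits, landed with a real merge commit.
git checkout -q -b pr-1
for i in 1 2 3; do
  echo "wip $i" >> app.txt && git commit -qam "wip $i"
done
git checkout -q main
git merge -q --no-ff -m "Merge PR 1" pr-1

# Full history versus the top-level view that follows only first parents.
all_commits=$(git rev-list --count main)
top_level=$(git rev-list --count --first-parent main)
echo "all: $all_commits, first-parent only: $top_level"

git log --oneline --first-parent main
```

The three "wip" commits are still in the repository and still reachable; the first-parent view just steps over them, so nothing has to be rewritten to get the coarse picture.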

joshribakoff · on Nov 6, 2023

You can bisect across the more coarse merge commits, without “destroying” history and losing the ability to bisect across more granular constituent commits. Bisect is more robust when more information is preserved.

suzzer99 · on Nov 6, 2023

This exactly. I'd rather pinpoint the issue to a small commit with only a few changes vs. "well I know which feature caused the issue, now to wade through 65 changed files."

krferriter · on Nov 8, 2023

I have never used git bisect, which is maybe why I'm wondering why people care so much about curating and cleaning up git history.

duped · on Nov 6, 2023

The point of a clean git history is not to have a clean git history. The point is to make it possible to debug later, via bisect, or show, or even just a diff. The point is to make the workspace clean for the next guy.

Instead of letting it go, maybe we should have more discipline and organization in our lives and not less.

BeetleB · on Nov 6, 2023

It's hard to tell what side you're on, because both sides refer to their stance as "clean history".

The pro-revisionists (squash, rebase) say they do what they do so the history looks clean (no intermediate commits breaking stuff, a "straight line" graph, etc)

The anti-revisionists say they do what they do so the history looks clean (can see the actual development, can safely diff different commits to see what changed in between, see the log in chronological order, etc).

> Instead of letting it go, maybe we should have more discipline and organization in our lives and not less.

Again, both sides could argue that they're the ones with more discipline.

> The point is to make it possible to debug later, via bisect, or show, or even just a diff.

This sounds anti-revisionist.

> The point is to make the workspace clean for the next guy.

This is one of the most common pro-revisionist arguments.

spider-mario · on Nov 11, 2023

> > The point is to make it possible to debug later, via bisect, or show, or even just a diff.

> This sounds anti-revisionist.

That’s not how I see it. What makes debugging via bisecting easier is self-contained changes, not exactly chronological changes where you temporarily broke stuff and then fixed it before submitting your PR.

corytheboyd · on Nov 6, 2023

100% agree, but nobody gives a shit, and I’ve learned to just let it go. I’ve been in so many meetings, seen so many PSAs, and you know what happens every single time? Nothing. Maybe a couple people learn what interactive rebase is for the first time, try it once, say “it lost all my code” and never try it again. Good luck explaining the reflog in these cases.
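
For the "it lost all my code" moment specifically, a sketch of how the reflog gets the old commits back. It assumes a scratch repository with invented names, and it relies on the branch's reflog recording the pre-rebase tip as `main@{1}` once the rebase has finished:

```shell
# Scratch repo; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name demo
git config user.email demo@example.com

echo base > app.txt && git add app.txt && git commit -qm "base"
echo v1 > parser.txt && git add parser.txt && git commit -qm "Add parser"
echo v2 > parser.txt && git commit -qa --fixup=HEAD
before=$(git rev-parse HEAD)

# Rewrite history: the fixup commit is folded away and the tip gets a new hash.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2
after=$(git rev-parse HEAD)

# The old tip is still recorded in the branch's reflog.
git reflog show main
old_tip=$(git rev-parse "main@{1}")

# Put the branch back exactly where it was before the rebase.
git reset -q --hard "$old_tip"
```

The rewritten commit is not gone after the reset either; it simply becomes another reflog entry until git eventually garbage-collects it.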

b450 · on Nov 6, 2023

Did you notice, though, that rebase advocates use very "emotive" terminology when talking about git history? Like it's a subject they care about? Seems awfully touchy feely.

mostlylurks · on Nov 6, 2023

You say that like it's a bad thing. If there are two groups of people, and one of them is indicating (via words or behavior) that they don't care about something all that much and the other is indicating that they do care about that something quite a bit, why would I ever listen to the ones that don't care? It is almost tautological that the group that actually cares is going to have the more persuasive arguments and is thus far more likely to be right than the apathetic group.

duped · on Nov 6, 2023

I don't know if "emotive" is the right word, because to me this whole discussion is like trying to tell someone to be less sloppy because they make a mess when eating at their desk, knowing that the custodians will clean up after them.

FeepingCreature · on Nov 6, 2023

> What matters is that you end up with working systems. That a lot of change happened is just, well, what happened. It doesn't need to be prettied up and made to look like your development occurred in a clockwork march of cleanliness. It literally does not matter unless you spend a lot of time doing git-bisect.

And git blame. And git checkout to a past state. It "doesn't matter" only if ease of understanding your project history doesn't matter.

jjeaff · on Nov 6, 2023

how often is "understanding your project history" something that actually comes up for you? In all my years of working with projects in git, I will occasionally look at my history to help me find a change that may have led to a bug, but it really only comes up for me once or twice a year and even then, it is rarely an extensive deep dive and never very far back in time.

Groxx · on Nov 6, 2023

>how often is "understanding your project history" something that actually comes up for you?

Frequently, for any long and complex project. Large amounts were written by people no longer working on it, and the history of how things came to be can help fill in documentation gaps and make intent clear.

By "frequently" I mean something like "I check history for about 2/3rds of bug fixes, and 1/4 of adding features" to understand the surroundings better, when writing or reviewing. Anything that makes that better saves me hours per week.

It catches and prevents more than enough subtle issues to be worth the effort.

mixmastamyk · on Nov 6, 2023

I'm on a long and complex project. However, most of the previous folks were not very good, which is one reason I'm here to fix it. Their history is not particularly useful except to giggle at.

duped · on Nov 6, 2023

Do you work with other people or on large codebases at all? It comes up pretty much weekly for me.

nomel · on Nov 6, 2023

> I will occasionally look at my history

It's others' history that I'm usually interested in. I can easily follow the small diffs of individual commits, but have a much harder time grokking a wall of red and green.

erik_seaberg · on Nov 6, 2023

When I’m on call and discover at 3 AM that we’re doing something weird, I need to know whether we meant to do that and especially why. In theory you could write all that down, but the people who aren’t doing that in git also won’t do it outside of git. The more you write down, the less likely it is that I need to page you to ask WTF.

tomjakubowski · on Nov 6, 2023

I read git commits in either the repo I am working on or a dependency repo almost every day