Hacker News new | past | comments | ask | show | jobs | submit login

Commits should be a contained change that can be understood as a logical piece of history, and reverted if necessary. If you have to look at multiple commits for 1 logical change to the code, it's much more difficult to figured out what the intention was, if it was correct, and how it can be reverted.



I'd also point out that it's bad to have commits in the main branch where compilation or unit tests would fail, even when each bad-commit always arrives with an immediate "oops, fixed" commit trailing it.

It messes up the team's ability to use tools like git-bisect to pinpoint behavior changes, and is generally more noise than signal.

At a bare minimum, those spans should get squashed before they leave the PR or feature branch.


Do you always make all of your logical code changes in single, atomic commits? What if you have to modify a feature later on? I'm sorry, but this is ridiculous. There's nothing wrong with having multiple commits in a row modifying something. That's how git works. GitHub's PR merge workflow really messed up the meta game, I tell you what...


Yes, always. Of course I make mistakes, but the idea is that any patch I put up for review, if it cleanly applies, should not break any user and any tests. It's much more liberating knowing that bisects actually work and each commit does exactly what it says on the tin (no followup "fix"/"minor" commits that actually do something completely different). Incremental improvements are compatible with this approach. Also see https://gist.github.com/thoughtpolice/9c45287550a56b2047c631... .


I don't think you're arguing against the point being made.

Logical change 1, implement Hello World:

    puts "Hello, World!"
Logical change 2, make it a method:

    def hello_world
      puts "Hello, World"
    end
Logical change 3, take an argument for the greeting:

    def hello_world(greeting)
      puts "#{greeting}, World"
    end
Logical change 4, take an argument for the recipient:

    def hello_world(greeting, recipient)
      puts "#{greeting}, #{recipient}"
    end
Logical change 5, improve the method name by showing the intent:

    def greet_recipient(greeting, recipient)
      puts "#{greeting}, #{recipient}"
    end

Logical change 6, defaults…:

    def greet_recipient(greeting="Hello", recipient)
      puts "#{greeting}, #{recipient}"
    end
And so on. These could each be their own feature or part of a feature etc. What they are, though, is logically atomic. If you want to add a default recipient later, or next, that is its own logical commit. If you roll it back, each rollback will lead you to a logical difference, not some typo fix, e.g.

    def greet_recipient(greeting="Hell", recipient)
      puts "#{greeting}, #{recipient}"
    end
plus the missing "o"

    def greet_recipient(greeting="Hello", recipient)
      puts "#{greeting}, #{recipient}"
    end
Are one logical commit, for which I'd use a fixup or `git commit --amend` or whatever. What one considers logically atomic may differ from person to person, language to language, feature to feature… but they can be differentiated from things like typos quite easily.

Personally, I make numerous commits in a feature branch, (transparently, too, my typos included - I'm not proud:) and before requesting review I'll clean up using rebase into as few logical commits as possible, and upon acceptance either squash it all to one or leave it as is, depending on the team's culture/needs.


If those are all different commits, now imagine the fun of interactively rebasing them on top of a main branch where some other thing changed "Hello, World" to "Greetings dear Globe."


Why is someone else writing my feature? Do we not have a ticket system? ;-)

More importantly, when using a feature branch one should really rebase master into the feature branch regularly, and any other branches that might touch the same places, it takes care of little surprises like this.


To clarify, I meant a scenario where that very simplest version is already upstream, and your feature-branch contains 5-6 commits which are all different refactoring-steps or making the string literals override-able or abstracted away.

When you need to rebase that onto main/master, there would be 5-6 stops to fix conflicts in a 3-way diff, and I don't think the kind of work in that example is compelling enough to justify that as opposed to having one (squashed) commit that simply states that you improved the method in several ways.

P.S.: Sometimes more commits can make refactoring easier instead, often when it comes to splitting file rename/move operations from changing their content.


I'm not against squashing, especially for merging into master (although, again, that should also constitute a logical whole), but doesn't rebasing a feature branch onto master first take care of this?


> Do we not have a ticket system?

Maybe we do, maybe we don't. Maybe there are two tickets with a mutual dependence.

Even without that, if you add a parameter to a method, and later someone uses it in a totally different PR, then you effectively can't revert the creation of the parameter. How are you going to track that dependency. I've heard the arguments about logical commits and the ability to cleanly revert them. I just haven't seen any code where that would actually work, even if the trouble was taking artisanally curate the commit history.


> Maybe we do, maybe we don't.

If I'm in a workplace like that then I wouldn't be surprised to find a poor git flow.

> How are you going to track that dependency.

The commit graph (which is linear due to rebasing). The extra parameter commit comes before the commits using it. If you want to remove that parameter then you have to revert the features that rely on it or refactor them, there's no magical way to keep dependent code if you want to revert the code it depends on. Having logically atomic commits makes this easier, not more difficult.

> Maybe there are two tickets with a mutual dependence.

Then rebase on each other's work until its ready for review (or, even better, don't work on dependent features in the same sprint). Again, logical, atomic commits would make that easier.

The real point is, what do you do when someone introduces a parameter in two functions, one is related to your work but the other isn't, and they bundled both changes in one commit, and the changes aren't even related. Now that's a mess.


If the second parameter isn't supposed to be there I'd just make another PR removing it.

If we use commit ancestry to determine logical dependence then this "revert the bug commit" thing won't work unless the bug just so happens to be in the very last commit.


You've created a straw man. No one can make code magically not depend on other code if it does - it's a tautology. No one can argue for better organisation if it's not wanted - feel free to not use a ticket system, don't communicate work, don't rebase work that is interconnected, and don't use logical, atomic commits. I'm not sure that Git is going to be a help given those requirements.

As I've already pointed out, I'm not sure what you're arguing against but it's none of the points made in this thread.


`git rerere` is your friend.


GitHub's PR flow is what brought this mess of fixup commits. In the previously common mailing list flow the normal practice is to amend your patches based on feedback. Translated to a PR flow this means rewriting the branch.


I'd say pull request should be reverted if necessary, but an idea that each commit should be revertable and expectation that project should work after it is unneeded complexity for me.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: