Stay away from rebase (oneandoneis2.org)
295 points by oneandoneis2 1461 days ago | 111 comments



The whole misunderstanding ``git-rebase mangles history'' revolves around the concept of history.

For me, history in a DVCS should be about logical changes that introduce given feature(s), fix bug(s), etc., not about raw, unprocessed changes or edits of `physical' bytes. Whenever I make a mistake in a commit (be it a typo, a small bug, or perhaps I went in the wrong direction and backtracked later on) I either --amend it or, if it's an earlier commit, use git rebase --interactive to fix it.
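A minimal sketch of fixing an earlier commit, assuming a throwaway repository; the file names, commit messages and identity are made up for the demo. It swaps the hand-edited todo list for `git commit --fixup` plus `git rebase -i --autosquash`, and sets `GIT_SEQUENCE_EDITOR=true` so the interactive rebase runs without opening an editor.

```shell
set -e
# Hypothetical scratch repo; every name below is invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of init.defaultBranch
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

echo "helo world" > greeting.txt          # typo on purpose
git add greeting.txt
git commit -q -m "Add greeting"
typo_commit=$(git rev-parse HEAD)

echo "notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Fix the typo and record the fix as a fixup of the earlier commit.
echo "hello world" > greeting.txt
git add greeting.txt
git commit -q --fixup "$typo_commit"

# --autosquash moves the fixup next to its target and folds it in;
# GIT_SEQUENCE_EDITOR=true accepts the generated todo list unedited.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

git log --oneline
```

For a mistake in the most recent commit, `git commit --amend` alone is enough; the rebase is only needed when the commit to fix is further back.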

In other words, ask yourself, ``Would this commit stand for a good LKML submission?'' Until it does, it's --amend or git rebase --interactive time. Refactor history just as you refactor code ;-)

Of course, once the code is pushed to a shared repo, the genie is out of the bottle and there's no changing it. But that's a different matter.

What you want to do is:

  * 1) clean up history with git-pull --rebase, git-rebase --interactive and git commit --amend, 
  * 2) then optionally perform git merge --no-ff --log --edit  $your_feature_branch while on $upstream_tracking_branch, to create a new, merge-ish commit that covers the whole new feature you've been working on.
This way you have:

  * clean history
  * logical, standalone commits
  * complex features introduced by a separate merge-ish commit (which you can git revert as a whole, if need be).
My rule of thumb for step 2): do it when there are three or more commits introducing a particular feature.
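Step 2) can be sketched as follows, assuming a scratch repository with a hypothetical `feature` branch standing in for $your_feature_branch and `master` standing in for $upstream_tracking_branch. The sketch uses `--no-edit` where the comment uses `--edit`, purely so that it runs non-interactively.

```shell
set -e
# Hypothetical scratch repo; branch and file names are invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Three logical commits on the feature branch.
git checkout -q -b feature
for n in 1 2 3; do
  echo "part $n" > "part$n.txt"
  git add "part$n.txt"
  git commit -q -m "Feature part $n"
done

git checkout -q master
# --no-ff forces a merge commit even though a fast-forward is possible;
# --log lists the merged commits' subjects in the merge message.
git merge -q --no-ff --log --no-edit feature

git log --graph --oneline
```

If the whole feature has to come out later, `git revert -m 1 <merge-commit>` reverts the merge relative to its first parent in one step.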

I perform step 2) often enough to have created a short wrapper script for it [1].

[1] https://gist.github.com/dexen/f36cb1668b5c04eb5abd

[ lotsa small edits for cleanups -- sorry ^^; ]


Even the "don't rebase once code is pushed to a shared repo" thing is more of a guideline than a hard and fast rule. For example you might reasonably want to clean up the history on branches that you have previously:

* Pushed to a distributed test system
* Pushed to a branch-based code review tool (e.g. Opera Critic or Gerrit).

Using a heavily rebase-centric workflow for several years, with rebases on semi-public branches like those described above an expected part of the process, I don't think that history rewriting has ever caused me an actual problem. So my gut feeling is that people spend a lot of time trying to avoid something that sounds theoretically bad but isn't much of a problem in practice as long as some common sense is employed.


The intent is "don't rebase if you might have a downstream". You're welcome to rebase branches that you push as long as your workflow specifies that they are still volatile. For example, we push unfinished topic branches to our team repo as backup and to allow "passive review" (a pull request is "active review"). Comments at this stage are communicated by bitbucket line comments (or by email/chat) and are handled by amending the topic branch. This unmerged topic branch is rebased any time the author prefers to start from a later 'master' in exchange for re-testing the series.

When the topic is considered complete by the author, she either makes a pull request or merges to 'next' herself. Merging to 'next' signifies that another topic may depend on that branch.

In this workflow, an unmerged topic branch is roughly equivalent to a patch series on a mailing list, but doesn't require email client integration and degrades more gracefully with less disciplined commit factoring and less sophisticated tool use.


There's a loose convention of using the suffix, "-wip", to name published branches that you plan to rebase.


Every complaint I've read about rebase is this same one, over and over, and I think it's based on a false premise.

> Rebasing destroys your history, and therefore destroys the point of using a VCS.

The answer, of course, is don't do that. Don't rebase public branches, and that includes master and whatever else you're deploying from.

In any case, you can't rebase master unless you're willing to force push, and if that hasn't tipped you off to the fact that you've gone off piste, what will?

> If you've got an entire history that's been heavily-rebased and your problem is "This bug started happening on Tuesday last week", you have a big problem: You can't just track back through your simple, linear branch until you get to last Tuesday's commits.

This hypothetical scenario of last Tuesday's bug therefore makes no sense. You can do exactly that, because you don't rebase history: you rebase work in progress onto the canonical record.

In summary, rewriting history would be bad, but that's not what rebase is used for.


Alas, "Stay away from rebase" will get more clicks than "Git-rebase caused me some problems and I should be more careful". I've found rebase essential in maintaining nice clean feature branches. Occasionally I run into issues, but I make sure I can back out of any changes even if it means copying the whole damn repo to somewhere else on disk.


git reflog will save you from any amount of rebase-induced chaos. Pick the last good point, reset --hard, and carry on.
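A small sketch of that recovery, assuming a scratch repository; a `git reset --hard HEAD~2` stands in for whatever the rebase broke, and all names are made up.

```shell
set -e
# Hypothetical scratch repo for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for n in 1 2 3; do
  echo "$n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "Commit $n"
done
good=$(git rev-parse HEAD)

# Stand-in for a rebase gone wrong: throw away the last two commits.
git reset -q --hard HEAD~2

# The reflog still records every position HEAD has been at.
git reflog

# HEAD@{1} is where HEAD pointed one move ago, i.e. the last good point here.
git reset -q --hard "HEAD@{1}"
git log --oneline
```

In a real mess the last good point is rarely `HEAD@{1}`; read the `git reflog` output and pick the entry from just before the rebase started.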


Any time you rebase you rewrite history, not only when you rebase public branches. It is significantly worse to re-write the history of public branches as you've pointed out. Your work in progress also has a history that you re-write when you rebase.


Stay away from rebase? What, are you nuts?

What the hell is everyone doing working on master anyway?

Make your own working branch and merge it when it's ready. Don't mess with master if you're working with others.

If you are working on master (or merging to master, or doing something with master), merge and push right away, and if someone does something else upstream in-between, then rebase yours on top of it.

If you have local changes, and other people pushed changes to your upstream remote, first of all, you should have pulled before making changes (you know that's usually the reason) which would have done a nice fast-forward merge with no merge commit. If your commits are just local anyway, no one cares if you just rebase so it's cleaner! It's equivalent to having pulled first and then made changes. Exactly the same. No one wants to see an unnecessary merge commit anyway, that's just annoying.

All of this makes sense. Merge works, rebase works, sometimes they do the same thing, sometimes one's better, sometimes you should just not be dumb.

What's the problem? Geez, it's just git.


I very strongly disagree with this post. But I should be clear that I also strongly disagree with the statement 'always use rebase!'.

Merge and rebase are two similar but distinct ways to update a branch in git. They are both appropriate at different times. A very good guide to sensible merge/rebase policy can be found here

http://lwn.net/Articles/328436/

(see also http://lwn.net/Articles/328438/)

I use rebase regularly. I always work on branches other than master, i.e. task and feature branches. I never rebase branches that are shared with other developers - development to these shared public branches is always just adding new commits to a stable history. Merging is important but rebasing is a wonderful tool.

With respect to the issue of timestamps: I did not know this and am very glad to learn it; you do have a problem here. My position would be that I would rather throw out the timestamp chronology than abandon rebase. Typically, in my experience, bugs can be, and usually are, tracked against a particular commit, which solves this problem. However, I appreciate that this won't always be the case, but throwing out rebase for this is a high price for a moderate reward IMHO. :)


The timestamp thing is a red herring. Git tracks CommitDate and AuthorDate separately. CommitDate gets updated by default in a rebase. Use 'git rebase --ignore-date' to also update AuthorDate, though I like having both pieces of information.


Thanks Jed, that's great stuff.

For anyone else who wants to see this directly you can compare the two dates side by side with

git log --pretty=format:"%cd - %ad"

Where %cd is committer date and %ad is author date.

%cd is the value that resolves the op's bisect problem.
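To see the two dates diverge, here is a sketch under the assumption of a scratch repository whose commits are all stamped with an arbitrary fixed date (1 January 2020) and then rebased today. The branch name `topic` and the file names are hypothetical.

```shell
set -e
# Hypothetical scratch repo for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Arbitrary fixed date, used for both author and committer on the original commits.
old="2020-01-01 12:00:00 +0000"
export GIT_AUTHOR_DATE="$old" GIT_COMMITTER_DATE="$old"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "Topic work"
git checkout -q master
echo more > more.txt
git add more.txt
git commit -q -m "Upstream work"
unset GIT_AUTHOR_DATE GIT_COMMITTER_DATE

# Rebase the topic onto the moved master: the author date is kept,
# the committer date becomes the time of the rebase.
git checkout -q topic
git rebase -q master

git log -1 --pretty=format:"%cd - %ad%n"
```

The first date printed is the rebase time, the second is still the original authoring date.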


From the second link:

  > since you must not rebase other peoples work, that means that you must 
  > never pull into a branch that isn't already in good shape. Because after 
  > you've done a merge, you can no longer rebase you[r] commits.

  > Notice? Doing a "git pull" ends up being a synchronization point.


The issue that I can't figure out is how to deal with refactors or, more broadly, a codebase that changes. If my feature takes two weeks to develop, then when I'm ready to merge, I'm applying my changes to a codebase that's two weeks newer than when I branched. For a merge to be not-painful, the master branch should stay reasonably steady. This, I think, creates a conservative, change-averse culture, rather than an IMO superior "if it's wrong, fix it, fix it right and fix it now" culture.

If I continuously rebase (and especially after a colleague emails saying he's pushed a big refactor), I can fix these things as and when they happen, and I can feed back to my colleague if something they've done works poorly with what I'm working on, and they can then address that while they still have the context of the rebase in working memory.

Finally, I think the history criticism is based on a too narrow view of history, as a single, linear reality. My code might have been weeks in the making, and while it's true I wrote that particular function on Monday, if I didn't push it until Thursday, it simply didn't exist in the world of my peers until then. Merging history so something I wrote on Monday appears as having existed on Monday, even though it didn't and couldn't affect the world until Thursday is also dishonest.


Try to continually merge in upstream changes when you're working on a long-term branch, to minimize the individual deltas.


This is exactly the best practice for feature branches.


You really should always rebase before you merge. The workflow I use is to just pull and not worry about rebasing until you want your branch merged.


Hm... I think this post is rather myopic and very specific for the author's workflow.

> "This bug started happening on Tuesday last week" [...] If you aren't aware of this and you start running your "git bisect" using your "good" base as the last commit made on Monday last week [...]

Well, or you could find the first commit on Tuesday last week and start with its parent. Or, better yet, have "releases" tagged or at least have a log of deploys (timestamps, commits); what if I developed something on Sunday and pushed it to production on Tuesday?

> You only get the plethora of merges if you're using git wrong.

I've seen it in real-world codebases. Some people just don't know how to use git very well (either can't understand it, or don't try to), and produce ugly history. I agree that for "noobs" it's better to merge than to rebase (fewer opportunities for failure), but good developers should know there's a place for both `rebase` and `merge`.

> And you do have a problem, because not only are you writing crap code, but you're committing it as well!

Well, one idea of git is to "commit early, commit often". It's much better to commit bugs and then commit fixes as well than to not commit and lose a bunch of code (due to a mistake or hardware failure).

I use rebase often. I should probably commit more often. Everyone has their own way, and we can all learn other ways and improve our work and life. But getting all angry and upset because someone uses rebase more often than you is an ineffective way to learn.


>If you're consistently writing buggy code and rebasing to fix it, then you're coding badly. Don't fix the symptom by rebasing endlessly, figure out your problem. And you do have a problem, because not only are you writing crap code, but you're committing it as well!

>Look closer at your diffs. Write more unit tests. Run them more often. Whatever, figure out what you need to do to avoid routinely making bad commits.

I disagree with this. Git is fast enough that you can commit early and often - more often than you can afford to run unit tests, even. I find it very useful to be able to commit each minor change of direction; half my commits don't even compile (I also believe in using strong type systems and the "compile error tells me what I need to change next" approach). If I need a "clean" history for review or similar (though honestly I don't see why you would - just review the differences between the branch head and branch base) I can always squash those commits thanks to git's nice history-rewriting features.
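One non-interactive way to squash a run of checkpoint commits is `git reset --soft` back to the fork point followed by a single commit. This is a sketch under that assumption, not necessarily what the commenter does (an interactive rebase gets the same result); the `topic` branch and file names are hypothetical.

```shell
set -e
# Hypothetical scratch repo for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Messy local checkpoints on a topic branch.
git checkout -q -b topic
for n in 1 2 3; do
  echo "step $n" >> work.txt
  git add work.txt
  git commit -q -m "wip $n"
done

# Move the branch pointer back to where it forked from master while keeping
# the final tree staged, then record that tree as one commit.
git reset -q --soft "$(git merge-base master HEAD)"
git commit -q -m "Add work.txt"

git log --oneline master..topic
```

The individual checkpoints stay reachable through the reflog for a while, so the squash can still be undone locally.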


Exactly. Keep remote history clean, keep local history dirty. Commit broken code, commit experimental features, do whatever it takes for you to write your code.

Squash, rebase interactively, rearrange, amend, merge --no-ff before you push, so others read only good, maintainable code served in meaningful and ordered commits.

Also rebase onto origin master often so you won't get lost in million-page-long diffs and conflicts.


> half my commits don't even compile

This sounds awful to me. This does not work especially if you are working in a continuous integration environment like Hudson or Jenkins.


Considering my project uses Jenkins and I commit on a sometimes minute basis (including code that won't build) I'd have to say committing broken code works just fine with Jenkins. Now if you're talking about pushing broken code to something like gerrit, that's another story, and that's just one of many reasons why the article is flat out wrong, because you have to rebase to squash commits into a single patch set before you can push them to something like gerrit, which then runs them by Jenkins.


I suspect I push working commits as often as anyone else - I just make a lot more local commits in between.


It'll still mess up CI, sadly.


No, it does not. CI never builds those broken commits because they never go outside of the author's personal computer. Commit often, push when it's done and only push commits that work.


The only way they never go outside of the author's personal computer is if the author squashes (or discards!) them before pushing. Otherwise, they will go outside of the author's personal computer.

However, they will never be a head outside of the author's personal computer, because they are pushed along with later commits that fix the brokenness. CI only ever builds heads, so the existence of the broken commits doesn't matter.


You're right. What I had on my mind writing the previous post - don't push broken commits. :)


Not all CI only builds heads. Some build and test all commits.


Well, then it will mark them fixed in HEAD.


Why would CI ever run a build of more than one commit sent in the same push? Isn't the push atomic? (Certainly it's worked that way for me in practice).


Suppose you want to know, not just the fact that dev is failing, but why it's failing. The particular commit that caused it to fail.


Merge commits are more than just a cosmetic issue. If you have merge commits from pulling master into your working branch you have broken the relationship of the commits which you just merged into your branch from master. Once you merge your branch into master, these changes you pulled in will now appear to be changes that were part of your branch.

So say you have your branch, Sally's branch, and master. You create a branch from master, and then Sally merges her changes into master, and then you pull those changes in, when you merge your branch into master it will look like Sally's changes came from your branch. If something went wrong and you revert your merge all of Sally's changes get taken out of master. Yup.

Not being able to do a clean revert of all the changes is just one problem with pulling master into your branch, another problem is being able to tell just what in the hell you are about to push to production. If you have merge commits your change diff may include all kinds of stuff already deployed, in which case you have to do a manual diff of the code deployed and what's in master instead of just relying on what the merge commit tells you.

So please rebase.


Keep a clear distinction between branches, and never do any clean commits to your develop branch, instead merge any feature (or hotfix) branches you might have with merge --no-ff.

Basically, git flow is your buddy.


What? No. If Sally has already merged her commits into master, then those commits will never look like they've come from your branch.

If I'm misunderstanding this, then a worked example would be a really great way of clarifying it.


I believe you are wrong. The merge commit would make it look like her changes were in your branch. Try it out with a new repository. Create a new repo. Add some commits. Create Sally's branch, add commits to her branch. Create your branch from master, add some commits to your branch. Merge Sally's branch into master. Merge the updated master into your branch. Add some commits to your branch. Merge your branch into master. Revert your merge, and see, Sally's changes are no longer in master.
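One way to run the experiment described above, as a sketch with two loud assumptions: the final merge into master is made with `--no-ff`, and "revert your merge" means `git revert -m 1` on that final merge commit. Branch and file names are hypothetical. Under these assumptions Sally's file is still on master after the revert; a fast-forward final merge, or reverting the intermediate master-into-branch merge instead, behaves differently.

```shell
set -e
# Hypothetical scratch repo for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b sally master
echo sally > sally.txt
git add sally.txt
git commit -q -m "Sally's change"

git checkout -q -b mine master
echo one > mine1.txt
git add mine1.txt
git commit -q -m "My first change"

git checkout -q master
git merge -q --no-ff --no-edit sally      # Sally lands on master

git checkout -q mine
git merge -q --no-edit master             # pull the updated master into my branch
echo two > mine2.txt
git add mine2.txt
git commit -q -m "My second change"

git checkout -q master
git merge -q --no-ff --no-edit mine       # assumption: not a fast-forward

# Revert the final merge relative to its first parent (master before the merge).
git revert --no-edit -m 1 HEAD >/dev/null

ls
```

Changing the last merge to a plain `git merge mine` is a quick way to explore the other variant: master then fast-forwards and there is no merge commit at its tip to revert.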


What is with these preachy headlines? "Why you should..." "Please use ..." "Please stay away from..."

It's very presumptuous to assume that you know the intricacies of thousands of developers' workflows well enough to tell them how to do their jobs.


It developed originally as a fairly cynical way to make blog posts stand out, but now everyone seems to be doing it. I hate it.

In fact, I might write a post on it: Why Your Post Titles Are Annoying and What You Can Do About It


I think a better title might be: "Your Titles Suck."


7 Reasons Why Your Titles Suck


How I wrote a title that didn't suck, and saw my productivity increase 30%


With "Please, oh please use git pull --rebase" I didn't mean to preach but to convince some of the people I code with to maintain at least basics of a clean git history, thus the 'please' in title. I wrote it some time ago and only yesterday I was surprised at the sudden popularity of that post :)


Don't take it personally, it's more of a general observation than a problem with your post in particular.


> Rebasing destroys your history, and therefore destroys the point of using a VCS.

I don't think this is true. Rebase, as the name suggests, simply rebases the starting point of your local history. It preserves your history better than a merge commit that leaves your series of commits intertwined with commits from other people implementing features not related to yours.

Rebasing also makes merging even easier, because now merging is done on a commit by commit granularity rather than branches. This also preserves the history better.

It's funny that the main rationale for most people to use rebase is that it provides better history, and this article is arguing that it does the exact opposite without carefully comparing the history in both scenarios. His only argument is that it's not "Real", but what we really care about is "clear" history, not necessarily whose commits come first.


I was with the author until he/she got to the point of saying "bug started last Tuesday". I don't know any serious development/testing teams that think like that, nor any developer that would search by day to find a bug.

It's much more likely to hear "the bug was in build 'x.y.z'", to which the next question is which build was green, and then bisect from there.

Searching for commits on Tuesday because a tester happens to notice a bug on Tuesday is a recipe for disaster... Or at least a completely wasted work day.


Rebase is for cleaning up history. Merge is for introducing new features. Use the best tool for the job.

Always "git pull --rebase"; it is fast, easy, and meaningful. You can change the default configuration and probably should; same goes with other tools like emacs and vim.

Worried about date rearrangements? For those few situations where it is important, git log --since="$DATE_OF_LAST_TUESDAY".


Better yet, git log --since='last tuesday'. Try it.


I wanted to suggest that, but I couldn't get it to work on my repo. Might be my version.


"If you've got an entire history that's been heavily-rebased and your problem is "This bug started happening on Tuesday last week", you have a big problem: You can't just track back through your simple, linear branch until you get to last Tuesday's commits. You have to keep going back until you're absolutely certain there are no other commits further back in the history that are newer chronologically."

I think the author is mistaken. As mentioned by jedbrown, git has a separate AuthorDate and CommitDate, and rebase updates the CommitDate. You can see them both using:

    git log --format=fuller
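Furthermore, when you filter logs by date with --since or --until, the CommitDate is used. The @{...} syntax goes one step further and asks your local reflog where HEAD pointed at that time, so if you have a bug that started last Thursday, you can show all the commits that have landed on your HEAD since then, whether freshly written or rebased, with: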

    git log HEAD@{last.thursday}..
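As a companion to the date discussion, here is a sketch of how `--since` filters on the CommitDate rather than the AuthorDate. It assumes a scratch repository and a single commit whose two dates are set explicitly to arbitrary values (authored in January 2020, committed in March 2020).

```shell
set -e
# Hypothetical scratch repo for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Written in January, committed (for example, rebased) in March.
echo a > a.txt
git add a.txt
GIT_AUTHOR_DATE="2020-01-01 12:00:00 +0000" \
GIT_COMMITTER_DATE="2020-03-01 12:00:00 +0000" \
  git commit -q -m "Written in January, committed in March"

# --since compares against the committer date, so a February cutoff still
# lists the commit even though its author date is earlier.
git log --since="2020-02-01 00:00:00 +0000" --format=fuller
```

Moving the cutoff past March makes the commit drop out of the listing, which is what you would expect if only the CommitDate is consulted.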


I think this is the most important bit because it makes the whole article fall apart (for my use case - code reviews).

Both my UI (Tower) and bitbucket sort by commit date. There is no way to sneak a broken commit into last week; for that I'd have to rebase master on top of the buggy commit.


Why would someone take one of the most important features of git (local history) and throw most of it away by abandoning ability to edit it?

If you pushed a broken commit, push another fixing it, just like in a merge-based workflow.

Having dates switched is not important; having four different branches on said Tuesday gives you a lot of headache searching for who introduced a bug. And git bisect won't help you. ;)


As an experienced developer and a git noob, this article makes much more sense to me than the plea to use rebase all the time.


Ditto. Well explained and all.

Even though I've used Git for quite some time (GUIs are for the lazy, and I'm a lazy bastard), I still don't understand a damn thing. Well, I've grasped the broad concept, but everything smaller eludes me.

This post is very well written.


Yeah I agree. I've been using git for a few years and I think his logic is more sound than the original plea.


Rebase is a powerful tool, and with great power comes great responsibility. It's easy to get stuff wrong with rebase, and it's also hard for most developers to get their heads around because it offers something that no other VCS does. I'm also very much reminded of http://tomayko.com/writings/the-thing-about-git, except to extend it even further, most VCS will tell you "you should have not committed broken code or written a bad message" or "you should have cleaned up your patch before committing it." Git, on the other hand, is extremely forgiving. I've lost count of the number of times I've gone off the rails and deleted a branch name or some other pointer to a commit in git, yet I could always pull it out (even if I had to resort to reflog). But enough about my shortcomings; why should you rebase?

I can think of two very good reasons to rebase: to clean up a patchset before submitting it to a "central" repo, and to move your patch forward on top of another patch set (avoiding a merge). Oftentimes, you might be working on something and get distracted, so you stash or commit and come back to it later. For whatever reason, you may have multiple commits that are conceptually related, but are separate. With git, it doesn't matter if these commits are one right after the other, or separated by other commits. You can rebase them into one commit that makes conceptual sense. If one commit reverts part (or all) of a previous commit, once you rebase and squash them together, the change and reversion cancel out so you don't get unnecessary twiddling of code in your commits.

The other time to rebase (that the article argues against doing all the time, which may have some merit), is when another repo you are pulling from has newer changes. So you've got your nifty new cleaned up commit, now you want to push it. But what's this? You can't push it because someone else beat you to it! No worries, you'll just merge. Hold on to that thought though: what's really happening here? Well, you want to put your changes on top of the changes from the repo you pulled from (and hopefully double check that they still build and the tests still finish successfully). Should this operation necessarily be called a "merge"? What if you just pretended that your commit had been based on the new stuff in the remote repo all along? That "pretending" can be accomplished with a rebase, and it also has the nice benefit of keeping your history clear of content-free "merging" messages.
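The rejected-push scenario can be sketched like this, assuming a local bare repository standing in for the shared repo and two hypothetical clones, `theirs` and `mine`. All names and paths are made up for the demo.

```shell
set -e
top=$(mktemp -d)
cd "$top"
# Hypothetical shared repo.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# First clone seeds the shared repo.
git init -q theirs
cd theirs
git symbolic-ref HEAD refs/heads/master
git config user.name "Them"
git config user.email "them@example.com"
git remote add origin "$top/origin.git"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q -u origin master
cd "$top"

git clone -q origin.git mine
git -C mine config user.name "Me"
git -C mine config user.email "me@example.com"

# Someone else pushes first...
cd "$top/theirs"
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "Their change"
git push -q origin master

# ...so my push is rejected as a non-fast-forward.
cd "$top/mine"
echo mine > mine.txt
git add mine.txt
git commit -q -m "My change"
if git push -q origin master 2>/dev/null; then
  echo "unexpected: push was accepted"
fi

# Replay my commit on top of the new upstream instead of merging.
git pull -q --rebase origin master
git push -q origin master

git log --graph --oneline
```

The resulting graph is a straight line: "My change" sits on top of "Their change" with no merge commit in between.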

So, when should you not use rebase? Never on a public repo that others have pulled from. Even if you are exposing your own repo as a master and warn people that things could change radically at any time, it's still probably best if you clone a personal/private repo off that, do your rebases there, then push clean commits to the public repo.


This is a well-written article and it makes a compelling case, but it hand-waves away the benefits of a linear history by blaming the user for "doing it wrong" in some ill-defined way. The fact is even if you are doing it "right" with strict topic branches you still can get very hairy unbisectable conflicts that would be easier to reason about with a rebased history. Let's look at specific statements:

> In our simple one-off examples above, this is no big deal. If you've got an entire history that's been heavily-rebased and your problem is "This bug started happening on Tuesday last week", you have a big problem: You can't just track back through your simple, linear branch until you get to last Tuesday's commits. You have to keep going back until you're absolutely certain there are no other commits further back in the history that are newer chronologically.

This has literally never come up for me in 5 years of using git because if you are looking at bugs that were introduced at a certain point of time you aren't looking at commit timestamps anyway. The important thing was when was the commit deployed (in the case of production bugs) or pulled (in the case of development bugs).

> The whole reason to use a VCS is to have it record your history. Rebasing destroys your history, and therefore destroys the point of using a VCS.

That is a complete non-sequitur. Rebasing doesn't destroy history, it rewrites it. It's no different from accepting a patch via email. Or writing down a ticket that you're going to make x change then y change then z change. The fact is VCS is a tool. Git gives you incredible power to curate history, and an understanding of how to use this power in the here and now can make for a more understandable future. Rebasing is potentially just an extension of not breaking the build by committing half a feature.

> The "simple linear history" is a lie. The branched history might not be as pretty, but it's an accurate representation of what happened.

A linear history has real mathematical benefits that I wrote about at http://darwinweb.net/articles/the-case-for-git-rebase. In practice I've done things both with a topic-branch orientation and a rebase-to-master orientation, and I understand both intimately, and a rebased linear history does not destroy nearly as much information as this article would have you believe. You can still see when a commit was written in addition to when it was rebased which provides most of the value of seeing the whole branch structure (which incidentally is not a magic oracle into the developer's mindset either—there is always out-of-band information).

The idea that the git tree be immediately frozen and never-changing after every single commit is an unnecessarily rigid perspective. It works perfectly well in practice to imagine rebasing as the developer having implemented their topic branch instantaneously based on the current state, and resolving conflicts on a commit-by-commit basis rather than accumulating them into one opaque merge commit.

> It's so tempting to stay on master, to think "It's just a quick fix, it's not worth branching for!"

You can have your cake and eat it too. Just do `git branch new_topic; git reset --hard origin/master`. There's no reason to branch prematurely because there's nothing to stop you from branching any time. This is the thing about distributed version control, and git in particular, you do whatever makes sense locally and you don't need to care at all what anyone else is doing until you fetch.
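A sketch of branching after the fact, assuming a scratch clone with a local bare repository as `origin` and one "quick fix" committed straight onto master. Apart from `new_topic`, which comes from the comment, the names are hypothetical.

```shell
set -e
top=$(mktemp -d)
cd "$top"
# Hypothetical shared repo and working clone.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$top/origin.git"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q -u origin master

# A "quick fix" committed directly on master...
echo fix > fix.txt
git add fix.txt
git commit -q -m "Quick fix"
fix=$(git rev-parse HEAD)

# ...turned into a topic branch afterwards.
git branch new_topic                  # new_topic keeps the fix
git reset -q --hard origin/master     # master goes back to the published state

git log --oneline --all --decorate
```

Note that `git reset --hard` also discards uncommitted changes in the working tree, so commit or stash anything you want to keep first.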

> Because if you keep your work on the main branch and you frequently commit bad code, then the day will come when you hit the absolute no-no of rebasing: You'll push a bad commit to a remote, and then you'll be stuck because you absolutely must not rebase published history.

In that case you simply push your fix. Why does a rebase-based workflow lead this to be a catastrophe? It doesn't matter what workflow you are using, you will push a bug eventually, then you will fix it.

> The best thing that could happen to rebase is that it gets relegated to "power tool that you don't find out about until you're a git wizard" because far too many people use it as a crutch to support their ability to use git without understanding it.

I'll agree here, you shouldn't use rebase if you don't know what you're doing with git. New users should definitely be forced to work entirely with merge and possibly --no-ff to get the basics of how git works. Rebasing is a power user feature, but it's not difficult to understand if you understand git fundamentals. That is if you understand commits and trees and branches, if you don't then it's way too sharp a tool.

> If you use rebase more than once a week, I maintain that you have a problem. It might be hard to spot, it might be rough on your ego, but that's my opinion.

Rebasing is just a tool. Maybe someone likes to commit every file and then curate with rebase -i; who are you to tell them they're doing it wrong? You certainly haven't demonstrated that with a bunch of strawman arguments about committing buggy code or that direct collaboration on a feature should axiomatically happen via pairing vs any other method. All of this is just a distraction from the core questions: what are the pros and cons of rebasing? What history is destroyed by rebasing? What are the advantages of branched history vs linear history?


> This has literally never come up for me in 5 years of using git

I get the impression that, unlike the article's intended audience, you know git well enough to not need to follow its advice. When giving advice, you have to aim at the expected level of understanding. That can easily lead to examples that frustrate people with a higher level of understanding, because they seem to over-simplify.

In this case, you're taking me too literally - I was trying to give quick & simple examples for a complex issue. It's just an illustration of how "simple linear history" is a non-sequitur - you now have to factor in that a specific commit could very easily have been made after a commit that comes after it in the history. You're moving complexity around, not reducing it.

> Rebasing doesn't destroy history, it rewrites it.

Rewriting is destroying.

Worse, it's lying - it's saying "These commits were made in this order" when they weren't.

> Why does a rebase-based workflow lead this to be a catastrophe?

It doesn't. You've ignored the context - which was of someone who wants his commits to be de-bugged and habitually uses rebase to do so.

> you shouldn't use rebase if you don't know what you're doing with git

That was kind of my point - people are using rebase instead of learning to use git. And should therefore be advised to stay away from it until they have got the fundamentals down.

> All of this is just a distraction to the core questions:

Sorry, those might be your core questions but they weren't mine: I was writing specifically to counter the suggestion that rebase should be heavily-used by people who don't really understand its ramifications.


Rebase is not a lie, it is the developer explicitly putting commits in this order.

Best practice certainly depends on your environment. In small projects, let's say two/three devs working with a couple dozen tightly-coupled files (like HTML/CSS/JS), you have to keep up with changes frequently; that means pulling all the time, otherwise you'll always have monster conflicts to solve. That 'tiny window of opportunity' when someone else has pushed and you have changes is all the time. In a codebase where you can happily work on an isolated feature for a week, things are different.


And you'll ultimately have to decide whether you want your codebase to represent a series of physical commits, bits that people happened to type out... or whether it should represent a series of logical changes to the codebase.

But git and git-rebase were explicitly designed to make the "series of logical changes" easier, because it was designed for Linux kernel development, and they like series of logical, meaningful changes over there because it assists in understanding the changes which are being integrated.

Now, I'm sure there's some business case where the series-of-physical-changes sequence is more important to someone. I don't personally agree that's the best way to program in general, but not everyone's compelled to agree with me. And in that case, don't use rebase.


> And you'll ultimately have to decide whether you want your codebase to represent a series of physical commits, bits that people happened to type out... or whether it should represent a series of logical changes to the codebase

We're on the same page here, but there's a bit of a deeper point being hinted at, which is that being able to be "physical-series-of-commits" is useful when you're in experimenting-dev/refactor mode, and isn't useful at all for people trying to understand your history. Well, mine's not at least -- it's a lot of shitty diffs and "fuck shit checkpoint" commits that don't live long.

Logical history is basically always useful, but you often want temporary checkpoints for yourself before your changes are going to coalesce into something that represents a logical patch-set.

Rebase, in its most commonly useful ... use case, lets you tack new incoming history that other people have shared onto history you haven't shared yet. For the people other than you, your changes exist in the future.

History rewriting in general lets you fuck around locally with stuff you haven't shared, making trashy checkpoint commits or whatever you want to do, and clean that all up to represent a logical series of changes for sharing.

In short:

Don't work on tracking branches. Branches are cheap.

Periodically bring in others' changes into your work with rebase. Your work isn't public and theirs is, theirs is part of public history, yours is part of future history. Always rebase before bringing your changes back into the tracking branch, and then merge your changes into the tracking branch. Often, the history cleanup I want to do is handled by merge --squash, bundling up the change I'm pushing as one commit.
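
A rough sketch of that loop in a throwaway repo, in case it helps. The branch names (`master`, `topic`), file names and commit messages are made up for illustration, and a real setup would rebase onto a freshly fetched tracking branch rather than a local one:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > base.txt && git add base.txt && git commit -q -m "Initial commit"

# Do the work on a cheap topic branch, checkpointing as sloppily as you like.
git checkout -q -b topic
echo one > feature.txt && git add feature.txt && git commit -q -m "wip checkpoint"
echo two >> feature.txt && git commit -q -am "fix typo"

# Meanwhile someone else's change lands on the tracking branch.
git checkout -q master
echo other > other.txt && git add other.txt && git commit -q -m "Someone else's change"

# Replay the unshared work on top of the public history...
git checkout -q topic
git rebase -q master

# ...then bring it back as one logical commit.
git checkout -q master
git merge -q --squash topic
git commit -q -m "Add feature BAR"

git log --format='%h %s'
```

The checkpoint commits stay on `topic` until you delete it; only the squashed commit ends up on `master`.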

You, whoever is looking at my incoming change, care about the fix for bug FOO or feature BAR being completed. You don't care that I made a bunch of stupid typos. If you do, I don't want to work for you unless you care because you're doing some sort of awesome stat analysis on typos made by people like me.


Exactly. An author of a book may have written chapter 23 before chapter 4, but the reader doesn't care. They want to see a logical progression.


> Rebase is not a lie, it is the developer explicitly putting commits in this order.

See my comment on the post - rebasing absolutely is a lie. Sometimes it's a good idea to lie; sometimes the lie is only the tiniest of fibs. But it's always a lie.


I guess the word "lie" is too negative a word to be used here. It's as much or less a lie as not including all the different revisions of the code written before the final version.

As you said, sometimes it's a good idea to refactor the history, or not comment on algorithms that were not chosen for the problem at hand. But there are times when that information might have some value.


Rebasing isn't saying "these commits were made in this order" when they weren't, it's explicitly choosing to make those commits in that order.


> It's just an illustration of how "simple linear history" is a non-sequitur - you now have to factor in that a specific commit could very easily have been made after a commit that comes after it in the history. You're moving complexity around, not reducing it.

You are removing several sources of complexity when you rebase.

1) You have fewer logical branches to keep track of. Before you rebase, it's unclear what code in the main branch your feature branch depends on -- clearly it depends on all the things in the common ancestor, but if there were any merge conflicts, it depends on the resolution of all of those as well. Rebasing lets you easily see: "This commit depends on the commit immediately before it, dependencies on older code have already been resolved, no need to look elsewhere."

2) You remove a noisy merge commit. Merge commits, unless they are actually resolving a conflict between two branches of code that are both in use somewhere, serve no functional purpose. They only record that a particular physical operation was performed. If the merge is resolving some conflict that only one developer ever saw and only on his local machine, then why not rebase that content-less piece of information from the historical record?

3) This is the softest point, but rebasing encourages manual curation. On average, I would expect that this increases the quality of commit messages, and encourages the removal of commits that are just useless noise. The developer wrote one commit message in the middle of work, or maybe just because it was 5pm, and it reflected his thinking at the time. Later, after he knows the context the commit falls in, he looks at and evaluates it. I think of it as revising a draft -- sometimes you got it right the first time, sometimes you didn't. In any case the end product is unlikely to be worse off unless someone is throwing away EVERYTHING by squashing down to one commit and giving it a message "squashed".

In any case, the end result really isn't any more complicated. Take your example: you want to know what caused the bug on Monday at 8pm on the production server. You know that a particular commit was deployed, say "deadbeef". Is there really a meaningful difference between commits with timestamps before 8pm Monday that come after "deadbeef" on the master branch, and ones with timestamps before 8pm Monday that weren't yet merged?

If your problem is really, "I don't know what commit was on the production server at 8pm Monday, and I need a branch with linear timestamps in order to figure that out" then you have a totally different problem.


> I get the impression that, unlike the article's intended audience, you know git well enough to not need to follow its advice. When giving advice, you have to aim at the expected level of understanding. That can easily lead to examples that frustrate people with a higher level of understanding, because they seem to over-simplify.

Fair enough. I think I would have been more understanding without the unilateral declaration that rebasing more than once a week is a "process smell" as it were.


Gotta agree with you on that last paragraph. Rebasing is just a tool. It can be used properly, or dangerously. Even so, a simple peek in your reflog and a `git reset` undoes the changes of a rebase just as easily as when you started. It seems the author of the posted article doesn't know too much about the reflog...

Just today, in fact, I had to use the reflog to go back after accidentally rebasing when I should have merged in changes from master into my pull request branch. So all I had to do was `git reset` to the SHA before I did that rebase, and I could redo my merge as if the rebase never happened. The way I found that SHA was `git reflog`, and I urge you to take a peek sometime just to see what it's all about. It's saved me from many an embarrassing situation.
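
For anyone who hasn't tried it, here's roughly what that recovery looks like, sketched in a scratch repo. Branch and file names are invented, and the script saves the pre-rebase SHA in a variable only so it can run unattended; in real life you'd read it off the `git reflog` output:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > base.txt && git add base.txt && git commit -q -m "Initial commit"
git checkout -q -b pr-branch
echo work > work.txt && git add work.txt && git commit -q -m "PR work"
git checkout -q master
echo more > more.txt && git add more.txt && git commit -q -m "Upstream change"
git checkout -q pr-branch

before=$(git rev-parse HEAD)   # where the branch pointed before the mistake
git rebase -q master           # the accidental rebase

git reflog -n 5                # the pre-rebase commit is still listed here

git reset -q --hard "$before"  # put the branch back where it was
git merge -q --no-edit master  # and do the merge that was intended

git log --format='%h %s'
```

Nothing was lost by the rebase: the old commit stays reachable through the reflog until it expires.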


No no no. Rebase is an important tool that should be used for cleaning up pre-pushed history, and should be avoided on commits pushed to branches that other people pull from.


This article is pretty much the correct response to the pull --rebase discussions. No need for it -- you should be using a branch.

But I'm not sold on no rebase at all: a common workflow for me is commit often while getting something working, then rebase 4 or 5 commits into a presentable unit (using magit). I have to admit I hadn't noticed that the timestamps were preserved by rebase. And there doesn't seem to be a rebase option that tells git to create new timestamps with chronological order matching the new commit order? I could use `filter-branch`, but it seems heavyweight.

One other criticism of the article:

> A soft reset back to origin's HEAD, and then re-commit your work

That leads to errors: if you've added new files, you may very well forget to add the same set again after the soft reset.


I think you want `--ignore-date`. The man pages are very confusing about such things (in fact there is a minor documentation error in the current documentation on Debian Wheezy).
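
A small experiment that shows the difference, as far as I understand the flag (newer git releases also document it under the name `--reset-author-date`). The repo layout, names and the fixed 2013 date are all made up for the demo:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > base.txt && git add base.txt && git commit -q -m "Initial commit"

# A topic commit with a deliberately old author date.
git checkout -q -b topic
echo work > work.txt && git add work.txt
GIT_AUTHOR_DATE="2013-04-30 12:00:00 +0000" \
GIT_COMMITTER_DATE="2013-04-30 12:00:00 +0000" \
  git commit -q -m "Old work"

git checkout -q master
echo more > more.txt && git add more.txt && git commit -q -m "Newer upstream change"
git checkout -q topic
old=$(git rev-parse HEAD)

# Plain rebase: the author date travels with the commit.
git rebase -q master
plain=$(git log -1 --format=%ad --date=short)

# Same rebase again, this time asking git to stamp the current time.
git reset -q --hard "$old"
git rebase -q --ignore-date master
fresh=$(git log -1 --format=%ad --date=short)

echo "plain rebase author date:  $plain"
echo "--ignore-date author date: $fresh"
```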

My concern with using a soft reset mirrors yours, but I would also like to add that it is effectively doing a manual rebase (albeit with different semantics around date) so it seemed rather odd in an article arguing against rebase.


Thanks. However `--ignore-date` is incompatible with `--interactive`. I should have said in my description above "rebase --interactive 4 or 5 commits into a presentable unit".


If people are still struggling with this, I can recommend looking into the git flow model. You can read more here: http://nvie.com/posts/a-successful-git-branching-model/


Please, stay away from not using caching with static websites.


Yeah, that's on my todo list


Better put it on your 'done' list.


"please understand wtf you're doing" "please read the docs" sums both articles up ;-)

if you don't get what it's doing underneath, you'll never get it right. yes, effort is needed.


I've found that when teaching people Git, the most important thing is that they understand the state of their DAG, since that is at the core of everything we do in Git. A confusing DAG full of unnecessary merge commits is much worse for a beginner than exposing them to rebase.

Git lends itself as a development tool as well as a version tracker, and rebase is an important part of that for both beginners and experts.


To rebase or not is a religious war. Also on HN today/yesterday:

  * git pull --rebase until it hurts: http://jrb.tumblr.com/post/49248876242/git-pull-rebase-until...
  * Please, oh please, use git pull --rebase: https://coderwall.com/p/7aymfa

I've seen a ton of posts like this on both sides.

The non-chronological history is unfortunate. In practice, when `git pull --rebase`ing several times per day the way we do, commits will only end up a few minutes out of order, which won't affect queries like `git log --since=1.week.ago`.

One of the things we optimize for is early integration. I conceptually like his recommendation, but I think it'd slow us down since we're always depending on each others' commits.

> Don't fix the symptom by rebasing endlessly, figure out your problem. And you do have a problem, because not only are you writing crap code, but you're committing it as well!

Translation: make sure your code is perfect before you commit. That's ridiculous.

> You really, really need to collaborate in real-time with another dev. and so must share all your code? At this point you're pair-programing - maybe look up GNU screen & its 'acladd' command to allow you to share a terminal with your collaborator. Or just tell each other when you're about to commit.

False and ridiculous.

I am a fan of heavily rebasing non-public history. Non-public obviously meaning local branches, but also, a loose definition for us is a public branch that you own and are confident no one else is working on. (If GitHub let us fork private repos without counting against our private repo limit, we could have true private remote branches). Sometimes I commit things very cleanly, creating separate commits for each concern, and sometimes I don't. When I don't, I'll often do a quick interactive rebase to re-group changes into logical commits–the purpose being a clean history that makes it easy to read/revert/cherry-pick/bisect specific, individual changes.
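
One way to do that kind of re-grouping without hand-editing the todo list is `git commit --fixup` plus `git rebase -i --autosquash`. This is only a sketch in a scratch repo, with invented file names and messages; `GIT_SEQUENCE_EDITOR=true` just accepts the generated todo list so the script runs unattended:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > base.txt && git add base.txt && git commit -q -m "Initial commit"
echo "parser v1" > parser.txt && git add parser.txt && git commit -q -m "Add parser"
echo "docs v1" > docs.txt && git add docs.txt && git commit -q -m "Add docs"

# A late fix that logically belongs to "Add parser" (HEAD~1 at this point).
echo "parser v2" > parser.txt
git commit -q -a --fixup=HEAD~1

# Fold the fixup into its target; HEAD~3 is the initial commit here.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --format='%h %s'
```

After the rebase there are three commits again, and "Add parser" already contains the fix, which is what makes a later revert or cherry-pick of that change self-contained.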

So there are pros and cons to both philosophies - it just depends on what you're optimizing for. This guy is writing in support of his particular needs, but makes the mistake of asserting they are the best or only.

Also, I think a future version of git could address the non-chronological history issue, which was the only con I'm aware of.


For me the posts arguing in favour of using rebase explain why much more clearly than the posts favouring avoiding using it.


Rebuttal: why not use tags to identify your major revisions and feature work? This way your bisect is rooted on feature releases or versions, rather than relying on git log --since.

It appears the time aspect is the primary argument presented, and I wouldn't consider using a workflow that didn't tag revisions, so I don't consider that an issue.


Please use rebase... but only if you understand it!

Do not commit to your commits...

I experiment on branches... I use commits as stashes... I end up with 100s of local branches...

-

Only rebase non-shared commits.

Normally people say "don't rebase pushed branches" but what if you merged your local branch into a different branch which is pushed? those commits are already shared...

Then again.. I would rebase branch X but because branch Y was branched off X I also rebase branch Y onto the new X.
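
In case the mechanics aren't obvious, this is roughly how I'd sketch it with `git rebase --onto`. Everything here (branch names `X` and `Y`, files, messages) is illustrative; the one thing that matters is remembering where X pointed before it was rebased:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > base.txt && git add base.txt && git commit -q -m "A"

git checkout -q -b X
echo x1 > x.txt && git add x.txt && git commit -q -m "X1"

git checkout -q -b Y            # Y is branched off X
echo y1 > y.txt && git add y.txt && git commit -q -m "Y1"

git checkout -q master
echo b > b.txt && git add b.txt && git commit -q -m "B"

old_x=$(git rev-parse X)        # remember the old tip of X

git checkout -q X
git rebase -q master            # X is now A-B-X1'

# Replay only the commits in old_x..Y on top of the new X.
git rebase -q --onto X "$old_x" Y

git log --format='%h %s' Y
```

Without `--onto`, a plain `git rebase X` on Y would also try to replay the old copy of X1; git usually notices it is already applied and skips it, but stating the range explicitly avoids relying on that.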

-

I even rebase pushed branches, because they are _my_ branches.

-

To those who say rebase is a "lie".. I say don't commit to your commits... do you commit each time you hit the keyboard? well that is history which _could_ be captured for the sake of useless history...

-

Before merging, clean up your branch so it is clear what the branch is doing and add in more useful commit messages (use the body of the message, not just the subject).

-

Do you end up with a load of local changes which are hard to give clear commit messages to? so you end up doing "git add ." instead? then you should be committing more often.


> Error establishing a database connection!
> (mysql_connect(): User oneandon_oneone2 already has more than 'max_user_connections' active connections)

I wonder why you have a website that needs mysql to render, and why you appear to be leaking the connections...

This problem was solved ages ago by the likes of CityDesk.


it's probably a wordpress blog


Nope, it's b2evolution[1]. There is a meta comment at the top of the page:

    <meta name="generator" content="b2evolution 4.1.6" />
I looked for all the common wordpress signs and didn't see them in the html. EDIT: also, there is a "Credits: blogging software" at the bottom of the page that links to [1].

[1] http://b2evolution.net/


Mirror: https://gist.github.com/anonymous/5488616/raw/019c5fbfad1c16...

Anyone know how to make gist do word wrap with textfiles?


Thanks for the mirror. Not sure of the answer to your question, but pastebin wraps http://pastebin.com/tBXZ7DDU


Giving tech recommendations while using the mysql extension, which has been deprecated for several years, is funny, if nothing else.


Ad hominems generally are nothing else..


I learned it as rebase on branch, merge to master. Is it too much to assume, this being git with its focus on branches, that what the original pro-rebase article was espousing was to rebase on branches so the merge back to the parent was cleaner? What am I missing here?


As someone that has used git for years but never in a team setting, it amazes me what I don't know about using git to its fullest. Where would one start to learn about proper use of git in a team setting... just in case it ever happens.


the use of rebasing is simple: You have contributors and one committer. The committer is also a developer whose branch can have changes anytime. The contributors must rebase on committer's main branch so that the committer can do a fast forward, clean merge when the time to merge comes.

Clean upstream merge is important, since you have multiple committers who are contributors to a bigger branch. The hierarchy goes on and on until the top (e.g. Linus Torvalds). So rebases are a way of achieving clean upstream merges, and it becomes more important when you have many levels of hierarchy.


gosh now I know even less about rebase than I did before, which was nothing...


I avoid frequent merging this way:

1. Make my changes.
2. Try git pull; if it succeeds, no merge is needed and there's no merge in history.
3. Commit changes.
4. git push.


"Do whatever the hell you want with your git repo until it's public, at which point continue, but don't rewrite public history".


This is Hacker News at its finest. One post says use rebase, the next post says don't.


Do people not know about git merge --no-ff?


Please, stay away from MySQL


I totally agree rebase is not needed, and used incorrectly a lot of the time.


Can't agree more. I never use rebase; the author's contention that using rebase is a sign that you've f@cked something up is spot on.


There's an elephant in the room that he missed.

Rebase breaks bisect.

If you have a sequence of patches, wherein each patch compiles cleanly, and you merge it with another sequence of patches, you still have a sequence of patches that compiles cleanly.

If however you rebase your sequence of patches onto another sequence, and this introduces build errors (not uncommon), bisect no longer works on this segment of history, even after you fix the build issues with a final "cleanup" patch.

Rebase destroys semantic history. Merge preserves it.

EDIT: whoever downvoted me, care to explain what's wrong with my reasoning?


"If however you rebase your sequence of patches onto another sequence, and this introduces build errors (not uncommon), bisect no longer works on this segment of history, even after you fix the build issues with a final "cleanup" patch."

Well, not if the cleanup patch goes at the end of that sequence rather than the beginning. Guess what tool you can use to change that.

I don't really see how introducing a breaking change in a merge commit rather than a rebase fixes anything. In any case, you should squash, build locally, and fix before merging to master anyway.


> Well, not if the cleanup patch goes at the end of that sequence rather than the beginning.

That's what I meant by a "final cleanup patch". However this is not an issue that any sequencing of such patches can fix:

Without rebase, every patch has an implicit implication that if it builds cleanly when applied to its parent, it builds cleanly no matter what other patches are merged into the tree in the future.

Once you start rebasing, you lose that guarantee unless you edit the patches themselves: there's no guarantee that just because sequences A-B-C, A-D and A-D-E each build, that A-B-C-D will build. Hence bisect (which may need to build A-B-C-D) breaks.


The thing is, when you rebase, you don't have A-B-C-D at all. If you have A-B-C and A-D and you rebase A-D onto A-B-C, you actually have A-B-C and A-B-C-D', where D' is the same diff as D, but it's a different revision. And when you build, you may find that D' doesn't work when D did.

Merging doesn't even help, because if you merge A-D to A-B-C, you end up with:

  A-B-C-M
   \_D_/
at which point M would be broken in the same cases where D' would be broken.


The point is M contains the fixes required to make A-B-C plus A-D build. Then you have no histories in that tree that don't compile.

Like you said, A-B-C-D' DOESN'T build, PRECISELY because D' is the same diff as D. You need to make A-B-C-D'-M to get a working build, but the history A-B-C-D' is STILL BROKEN.


So you run a build on D', make the necessary fixes, and amend it. Which is exactly what you would have had to do with M anyway, or else M wouldn't build either. I don't see any difference at all.

Merge commits don't magically fix broken builds. All they do is ensure that you've resolved merge conflicts, but rebases do the same thing!
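
If you want git to do the "run a build on D'" part for you, `git rebase --exec` runs a command after each replayed commit and stops when it fails. A sketch under made-up assumptions: `build.sh` is a stand-in for a real build (it fails when `caller.txt` names something `api.txt` lacks), and the commit names mirror the A/B/D letters used above:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.email "dev@example.com"
git config user.name "Dev"

# Stand-in "build": succeeds unless caller.txt refers to a name missing from api.txt.
cat > build.sh <<'EOF'
[ ! -f caller.txt ] || grep -q -x -F "$(cat caller.txt)" api.txt
EOF
echo old_name > api.txt
git add . && git commit -q -m "A"

git checkout -q -b topic
echo old_name > caller.txt && git add caller.txt && git commit -q -m "D: add caller"

git checkout -q master
echo new_name > api.txt && git commit -q -am "B: rename to new_name"

git checkout -q topic
# D' applies with no textual conflict, but the exec step catches the broken build.
if ! GIT_SEQUENCE_EDITOR=true git rebase -q -i --exec "sh build.sh" master; then
  echo new_name > caller.txt               # fix D' itself...
  git commit -q -a --amend --no-edit       # ...rather than adding a cleanup commit
  GIT_EDITOR=true git rebase --continue
fi

sh build.sh && echo "tip builds"
git log --format='%h %s'
```

The result is A-B-D' with D' already fixed, so there is no commit in that line of history that fails the check.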


I did not downvote you, nor do I think you deserve a downvote. I think your scenario highlights something to be careful about when rebasing, but I wouldn't say to avoid rebasing because of it. You could, for example, do an interactive rebase to fix or squash those commits.


Yes, commit squashing is one way to work around this issue.


I downvoted you because rebase doesn't break bisect, full stop.


That's not an explanation. Please address my argument.


Neither is claiming that rebase breaks bisect; I can tell you what does break bisect: commits that are noise. Commits that weren't properly cleaned up. Commits that revert changes made in the previous commit. Commits that are conceptually related but not squashed together, or worse, separated by unrelated commits, etc, etc. You know what these have in common? They can be fixed with rebase.

Sure, rebase can be abused, but committing in general can also be abused. Rebase offers the opportunity to make history legible, instead of just some ASCII journaling of a codebase.

And you're also wrong about rebasing onto the tip of a branch, as others pointed out: in my experience, it very rarely introduces build errors, and on the rare occasion it does, it's easy enough to fix them, commit, rebase to squash into a fixed commit, and push. As someone who rebases on a regular basis, and has done more than a few bisects to the same codebase, I can tell you rebase works just fine with bisect.

EDIT: I'm not the only one who thinks this way: http://darwinweb.net/articles/the-case-for-git-rebase


> Neither is claiming that rebase breaks bisect

But I gave an explanation up front and took the karma hit.

You hid behind your keyboard for two posts before explaining why you disagreed.


In that case you wouldn't do a cleanup patch - you would fix the original commits you made that broke things before pushing the changes.



