... "and I was getting ready to commit a series of important changes" ... Before doing so, I want to merge in the recent changes from the remote master, so I do the familiar git pull. ... "maybe I’m going slightly crazy after 3 days straight hacking" ...
Do I interpret this correctly: the author has not committed any changes for 3 days?
With SVN there may be an excuse for this, but with Git the right way is to commit as often as possible, and then squash your commits before pushing them. With such a workflow the problem would have been a non-problem - just use git reflog and check out your previous version.
Of course you wouldn't use a git pull then, but just rebase your local commits on top of master.
Learn how to use your tools, instead of complaining about them!
I agree that the author should've been committing far more regularly than he apparently does. In addition, "git pull" is a bad idea (you never know what nastiness someone else might have committed — "git fetch" + "git merge" is a far saner way to stay up-to-date).
That said, hats off to the author for tracking down the problem. Regardless of workflow flaws, the behavior he observed is a bug, and I'm glad it's fixed.
By doing a fetch first, and a merge in a separate operation, you're at least presenting yourself with an opportunity to check the code that you are merging.
With `git pull`, it is all done in one operation.
Of course, if you've already reviewed the code or are pulling from your own remote repository then `git pull` is likely fine.
Agreed. After a fetch, I always review the changes on the remote branch. Depending on the circumstances, I can then make an informed decision about my own code, with the following possible outcomes:
1. I might choose not to merge. The upstream code may be bad, or it may not be ready for a merge. I may need to do some more work to prepare my own code to merge.
2. I might wish to perform a fast-forward merge (if one is possible).
3. I might wish to force a non-fast-forward merge (if a fast-forward is possible). This is often a good idea, as it helps keep groups of commits related to a particular feature isolated.
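In practice, that review step looks something like this (origin/master is just the example ref I track; pick whichever merge matches case 2 or 3 above, or do nothing for case 1):
git fetch origin
git log --oneline HEAD..origin/master   # see which commits came in
git diff HEAD...origin/master           # review the actual incoming changes
git merge --ff-only origin/master       # case 2: fast-forward if possible
git merge --no-ff origin/master         # case 3: force a merge commit instead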
When calling fetch and then merge (or using rebase), you have more control over how the other changes impact your commits. Just pulling will result in a merge commit that contains all of the changes since the point at which you last pulled. This becomes confusing because it will look like you are adding files or making changes when you are actually not.
The reason this happens is that git history is a directed graph, and there is no link between the point at which you diverged from master and the point where the other changes came in; the merge commit creates that link. Unfortunately, it then looks like you just committed everything that changed on both branches, and the history can be difficult to track and/or confusing.
Fetch then merge will make the history clearer, but you still create a merge commit that will include all of the file changes (correct me if I am wrong here). git pull --rebase takes your changes out of the equation, pulls in the new history, and then replays your commits on top of the updated master branch. This is nice because, at that point, the only thing your commits pick up is the resolution of any merge conflicts caused by your own changes.
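For what it's worth, the rebase flavour is roughly equivalent to doing it by hand (remote and branch names are just examples):
git fetch origin
git rebase origin/master        # replay your local commits on top of what was fetched
# ...which is more or less what this does in one step:
git pull --rebase origin master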
> Unfortunately, it looks like you just committed everything that changed on both branches and the history can be difficult to track and/or confusing.
I don't think this is really true. You'll just introduce a single merge commit on top of both your commits and the other person's. Yes, if you diff that merge against your own work, you'll see all the work that other people committed, but I think it's fairly well understood that a merge commit is just that - you merging your changes with other changes.
It can be a bit tricky to read the log when merge commits are involved, but try the --graph option or a graphical log tool.
fetch is a fairly straightforward operation with a predictable result (i.e. get me all the objects and refs from the remote repo), but 'merge' actually moves the branch(es) in your local repo around.
After you do a fetch you can go on to do a merge just as though you did a git pull, but if you break down the steps then you now have the option to do a rebase or other operation as you see fit.
Sure, I would agree with that, but in this case I already knew what the one commit I was pulling did. And the fetch, diff, merge wouldn't have helped. The file that was destroyed wasn't changed in the commit that was fetched.
Be careful with this though: if you have a merge ready to push, and you pull with rebase, your merge will be 'flattened' and all commits in it duplicated on your current branch. I've been bitten by this a couple of times.
It rewinds your work, fast forwards to the fetched branch and applies your work on top of it. It doesn't create a merge commit, which can pollute your tree unnecessarily.
But really, the preferred way is to use topic branches. So if you're on branch "foo", this is how you integrate into master.
git checkout master # because it always matches upstream
git pull # this will always fast-forward
git merge foo
git push
That does create a merge commit but it does it in the right direction (master 1st parent, topic branch 2nd). You get to see the parallel development which is good to preserve in the history. Otherwise, rebasing is nice because it keeps the history linear.
Just to point out that fetch + merge would have caused the same problem. In this case it was a small team, with advance knowledge of what the commit was, so no need to inspect.
But, in any case doing the fetch would have shown that the file was not modified in the fetched tree, and it would have gone ahead and overwritten my changes (without even listing in the merge log that the file was changed).
FWIW, I agree that fetch + (merge | rebase) is in general the best way to go, but I think there is a case, when you are pulling in a simple fix from head, where doing a pull is legitimate.
After all the documentation says this is a safe operation:
"If any of the remote changes overlap with local uncommitted changes, the merge will be automatically cancelled and the work tree untouched" (from git pull --help).
If this isn't meant to be considered a safe operation git pull should abort if there are any changes to the working directory.
"Safe" or otherwise, it's (IMO) never a good idea to merge uncommitted files.
You're losing history that way: If the merge doesn't actually work, then you've got a screwed up file and no way to roll it back.
It's one of the great strengths of git that you can commit files even if someone else has changed them. It's a bad idea to merge when you have anything significant checked out (I'll leave temporary debugging changes checked out, or very small changes, but that's it). Heck, it's a good idea to check in every few hours, to track the changes you're making.
Agree 100%. I've seen this same type of issue occur in Mercurial when people were doing what I call "all or nothing merging". You should not be merging unless you can get back to the precise pre-merge state.
the real "mistake" here was not doing 'pull' it was doing 'pull' while having uncommitted changes in the working directory. I'd commit or stash before the pull/
Came here to post this. The key thing to know about git is to always, always, always commit your changes before doing anything that affects history. This is not for data loss reasons, it's for being able to document what is happening; when you have a bunch of unsaved work, you don't know what's there and git doesn't know what's there. Therefore, it's very easy to get yourself into a state where you don't "care" about your working copy, and that's when you can do something where you lose work. If you commit before you do any merging or pulling every single operation that you perform afterwards can be reverted cleanly. And, you'll have a human readable note that documents what you were thinking at the time.
If you don't put your work into git, it can't track it for you. You can always uncommit if you don't like what you committed. In fact, you should consider your local history to be a work in progress; just like you're editing your source code to keep it clean, you should be editing your history to keep it clean. Git never deletes history, so even if you edit it, you can always get back to where you were. What you call "master" may have changed, but what was master before your rebase still exists, in its entirety, inside of git. (It's simply called something like master@{1} instead of master. See "git reflog".)
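For example, if a rebase turned out badly, the reflog gets you back (the exact entry depends on what you've done since):
git reflog master            # shows where master pointed before and after the rebase
git reset --hard master@{1}  # while on master, move it back to its pre-rebase position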
You are correct, though I'd also point out "git stash" which does a quick no-fuss commit and is meant for situations just like this. "git stash" "git pull" "git stash pop" may not be exactly what you said but it should be enough to prevent what happened to this guy and I'd even guess it is what most new git users expect to happen when you do a plain "git pull".
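In other words, roughly:
git stash        # set the uncommitted changes aside as a stash commit
git pull         # bring in the remote changes
git stash pop    # reapply your changes on top; conflicts are reported, not silently lost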
For the life of me, I could not figure out why he would not save his changes before pulling stuff down on top of them. Some people still go days without saving, which is foolish in my opinion. Change a function (or similar unit), and save. Rule for dummies: save before leaving for the day (but that's not granular enough).
OK, thanks for the reminder. Even back when I did use SVN regularly at work, we usually worked on feature branches, and seldom ran into collisions requiring that process. I'd forgotten that SVN will force an update (merge) before commit in some cases. It had not occurred to me that somebody would internalize that as SOP.
That's not my svn workflow; it's svn diff to a patch file, svn revert, svn update, then patch from the patch file (and resolve conflicts if necessary with a decent third-party tool), then finally svn commit.
I run svn diff multiple times a day. I frequently have many different patches related to different functional areas and bug fixes. Effectively I work with local branches implemented on top of svn. I have a whole suite of scripts that automate this. Seemed like the only workable way to use svn in a disconnected fashion to me.
> Effectively I work with local branches implemented on top of svn. I have a whole suite of scripts that automate this. Seemed like the only workable way to use svn in a disconnected fashion to me.
Why don't you just use git-svn or hgsubversion and stop doing that informally?
Or at the very least use something like `quilt`, so that your changes are represented as a stack of patches on your svn "upstream".
Interesting that apologising for not replying in more detail, yet hinting at where to find previous discussion of that detail, is treated with such hostility! I'll simply not reply at all in future, in such cases.
They were obviously saved. I was simply pulling before committing, which is a supported work flow of the tool. I don't think I ever said I went days without committing.
How is it relevant how long it was since he last committed? It sounds from the description like this could happen if he'd committed five minutes before.
Of course it's not cool. It's a bug, and it was fixed. But the relevancy is to the title and summary. When you say "git destroyed my data" it sounds like you're saying that git lost committed data in the repo.
What actually happened here is that the working tree got clobbered. That is a vastly less problematic situation. Working trees get clobbered all the time: rogue "make clean" rules, system crashes, someone-stole-my-laptop, errant rm -rf, forgetting which tree your changes are in... I've lost working data to every one of these, and I never felt the need to blame my tools in a blog post.
So yeah. It's a bug (and a pretty embarrassing one). It was fixed. Is there anything more to say? Move along.
To add, it's commendable that he tracked down the bug himself, at least I think so.
Many people would throw their hands up in the air and walk away with a sour disposition, but props to this guy for working it out to the end.
The point was more that if you've ever written a "make clean" rule, you've probably blown away your source tree a few times trying to do it. The software development working directory is the wild west. Bugs that destroy data here, frankly, don't rise anywhere near "devastating" in my book, sorry.
Normally in git you never pull on a dirty working dir, you either stash or commit your work and then pull. That's why this bug is unlikely to trigger normally.
All this cp business struck me as a bit of not understanding how to properly work with git. I've never had my work clobbered by following the strategy above. But I also don't go very long between commits on my working branch.
He is using "git stash" to create a commit object for his uncommitted changes, giving him more documentation in the event that Something Goes Wrong. Git will never throw away your data, but you might get into a weird mental state and type a command that throws away your data. By creating a commit object, you can always get back to that state, and you have a little one-line note-to-self that helps you not get into a weird mental state.
I personally do a "real" commit before rebase, but "git stash" is really the exact same thing. I guess "git stash pop" is less scary than "git rebase HEAD^". Either way, you can always undo that operation via the reflog.
No, you don't interpret it correctly. It was not 3 days of uncommitted changes; it was committing some changes after hacking for three days.
Completely different thing. The 3 days was relevant because I was tired and my first assumption is that I had done something wrong. It turns out there was a nasty bug.
Also, I don't think my workflow is that broken:
"Linus often performs patch applications and merges in a dirty work tree with a clean index."
If I am pulling down some unrelated changes, it is not unreasonable to do that. That is one of the features that git provides.
I know perfectly well how to use git, in a number of different work flows, and I use whichever is most appropriate for the given project at the given time.
Also, I don't think my post was complaining. It wasn't 'OMG git is the worst, I'm never using it again', it was an analysis of a particular nasty corner case in git.
You referenced that article to claim that Linus does what you did. But that article is quite clear that Linus does not do what you do: He either reverts, commits or stashes first. Had you done that you would not have had your problem.
It's one thing to fuck up and get panned on HN. It's another to go looking for justification that you are right. It's yet another to select a single quote from an article to justify your position when the entirety of the article refutes your claim.
You did a great public service by calling attention to this problem (the problem being not using git properly - nobody is going to hold that against you). Don't ruin it now by getting all defensive.
Sorry, that did come across as a bit defensive. However, I think if you read the whole article carefully, it explains that after doing a git pull, and git refusing to merge because of outstanding changes, you then have a chance to either 'revert, commit, or stash'.
That is exactly the behaviour I expect from git, and exactly what broke down in this case.
Definitely. However, this highlights a problem with git, it should naturally guide you to best practices. Unfortunately, with git it's quite easy to fall into a natural and comfortable seeming pattern of use that has some very negative downsides, such as the author of the article experienced.
Exactly, in fact it normally does. git pull is actually really useful for enforcing this most of the time. If you inadvertently do a 'git pull' before doing a commit, and you have any overlapping changes, then git aborts the merge. This is why 'git pull' is generally a safe operation.
Of course it's the victim's fault! The grandma who's been writing a letter in Word loses it after an unexpected blue screen of death; of course it's her fault. A guy gets stabbed at 2am at a bus stop; clearly the guy's at fault. Shoulda known better than hanging out in that part of town at that hour. A car blows up when a woman fills it up with regular gas instead of premium; you had it coming, lady, learn to use your car rather than sitting in the hospital now and complaining about it.
We are cool like that: we blame the user when our favorite tool screws the pooch big time.
The article isn't as anti-git as the title might lead you to believe. I enjoyed the article for the research and up-voted it. I use and like git, but the idea of silent data loss is scary (as it spreads).
Long story below.
However, if you want real fun, try out what one centralized repository did for me once. I was (against my will) using Visual Source Safe in the 1990s (ick ick ick). Visual Source Safe at the time represented its data on the server with two RCS-style history files (called a and b). When you committed, both of these were updated (no idea why there were two), and then, as a matter of policy, Visual Source Safe re-wrote your local content from the repository. That is, on a check-in it wrote back over your stuff. Fast forward to the day the disk filled up on the server: a single check-in attempt corrupted a and b (so even if redundancy was the reason for the two files, it didn't work), and the server stayed up just enough to force-overwrite my local content. Everything was lost for the files in question (no history, no latest version, no version left on my system, forget about even worrying about recent changes). Off to tape backups and polling colleagues to see if we could even approximate the lost source code.
My ears perked up when I heard that it involved renames on OSX. I don't know about the exact issue he had, but I recently found out the hard way that OSX's HFS+ is a case-insensitive* filesystem. You can get subtle issues by importing multiple files whose names differ only in case (such as "README.txt" and "ReadMe.txt") into the same repository; this isn't specific to git.
I had a similar issue with Perforce on Windows - Perforce was case sensitive, Windows wasn't, and thanks to CamelCase, there were two files that had the same letters but different casing. (I don't remember the names.)
* Technically, "case-insensitive but case-preserving", which in practice seems to mean, "case-sensitive, except when you need it to be".
I work in git on OS X all the time. When you rename a file (regardless of your OS), it's just plain good practice to git add . and (at least) git status to ensure you've got the file you want committed.
I still think it's a terrible default, though. I'm coming to OSX from BSD and using it as a Unix. Retrofitting case-insensitivity onto a Unix is bound to lead to ugly corner cases.
> Retrofitting case-insensitivity onto a Unix is bound to lead to ugly corner cases.
It generally works well, but the other way around does not: a lot of OSX software (most prominently — as usual — Adobe's) plays fast and loose with casing, and will break down on case-sensitive HFS+.
I'm pretty sure it hoses Norton's "security" software, too. Of course, if you can't handle documented file system options, I don't trust you to detect exploits….
Yeah, you don't even want to know what I had to do to get around Steam not handling case sensitive filesystems. Steam is the only thing I've noticed that doesn't work right. There was also one driver that installed itself into "/LIBRARY/EXTENSIONS/". That was annoying but easy to fix.
I have been bitten by that a couple times now. I develop on a Mac but deploy on Linux, and had a couple cases of trying to import or open a file that couldn't be found due to a case issue. And it is next to impossible to rename the file on a Mac from SomeFile.txt to somefile.txt. I think I had to do SomeFile.txt -> someotherfile.txt, commit -> somefile.txt, commit.
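Something like this, if I remember right (file names are made up):
git mv SomeFile.txt someotherfile.txt
git commit -m "Rename, step 1"
git mv someotherfile.txt somefile.txt
git commit -m "Rename, step 2"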
The vast majority of their users aren't using it "as a Unix" and most of those who are don't mind case-insensitivity or know to use case-sensitivity when it's needed.
Unix's case sensitivity has long been a source of minor but obvious annoyance to non-technical users, to whom a name is a name -- the case of the letters is irrelevant.
On top of which, it would have been an added complication for developers porting from Classic, and an unexpected change for users, in exchange for no substantive benefit.
In truth, the only time HFS+'s case-insensitivity has bitten me was extracting an SDK from Broadcom, and even then, the SDK wasn't about to run on OS X anyway (it's entirely Linux-based), I was just pulling out some documentation.
On the other hand, I recently wiped and rebuilt my laptop with Lion and case-sensitive HFS+, only to find that Adobe Creative Suite explicitly refuses to install on case-sensitive volumes. Incredibly annoying.
It also breaks Steam. I'm sure it breaks a lot of other (badly-written, I should add) apps.
I definitely recommend against using case-sensitive HFS+. It's a pain to fix because you have to reformat, and you can't restore from a Time Machine backup because Time Machine requires that the source and target file systems are case-compatible. :-/
Yeah, HFS+ is a nasty beast. In this case it was not the culprit: firstly because the first thing I do when getting a new MacBook is reformat it as case-sensitive, and secondly because it was just a directory rename (e.g. ./foo/ to ./bar/).
Yeah, I wouldn't blame a tool for not dealing with the brain-dead almost case-sensitive HFS+ filesystem.
Aside from having to import data from case-sensitive environments, could you give an example of a case where you would actually /want/ your filesystem to be case-sensitive? When would it ever be a good idea to have two distinct files called readme.txt and README.txt?
I really wish I could remember the specific example. We used camelCase for filenames on that project, and there were two distinct names (like "bookcase" and "bookCase") that did not sound alike, yet were the same letters. The sort of thing that meant something very different if you moved the space.
I prefer case-sensitive filesystems because the separation between words is semantically meaningful in English.
Example: checking out a repository where someone named (say) a file 'foo' and a directory 'Foo'. It happens. Most of the time that checkout will simply fail. Having this not work is less than enjoyable.
"OK, so the bug never trigged in 16,000+ Linux kernel merges."
If your team is avoiding a tool with a 1 in 16,000 chance of failure then they'd probably also want to avoid flying (1 in 20,000 chance of death by failure), large bodies of water (1 in 8,942) and run terrified from cars (1 in 100) (source: http://www.livescience.com/3780-odds-dying.html).
The car stat seems rather high, and git won't kill you, but the general point is that a 1 in 16,000+ chance of losing a few hours of work is "s--t happens, find a workaround and get over it" odds.
You read those stats wrong. Those are lifetime odds. Not odds per event. Otherwise, over the course of a year, everyone who commuted would likely be dead.
> If your team is avoiding a tool with a 1 in 16,000 chance of failure then they'd probably also want to avoid flying (1 in 20,000 chance of death by failure), large bodies of water (1 in 8,942) and run terrified from cars (1 in 100) (source: http://www.livescience.com/3780-odds-dying.html).
That's not how probability works. That assumes a uniform distribution, but it's not uniform at all: if you run the affected versions of git and run a particular sequence of commands, you will hit this problem every time. Assuming that the Linux community's workflow doesn't change that often, the Linux devs would likely never have hit this because they don't run that sequence of commands. Because it's not actually random, we do have control over it, and your suggestion that we all just "get over" the fact that an important tool lost user data seems pretty cavalier.
Actually though, there was a 0% chance of this happening to anyone who uses git properly and commits before merging, and a 100% chance of it happening to this guy.
If this is how you properly use Git then why do they allow the improper way to do so? This seems like broken UX if you allow people to do something that will 100% cause catastrophe -- unless there is virtually no way to design it otherwise (but it's obvious that this is not the case with Git).
... which git failed to do properly. So the git-pull man page is clear that it won't clobber your uncommitted changes. Except in the (buggy) case when it will.
To be completely honest it's because git is made by and for hackers. It gives you a lot of rope. If you choose to "git reset --hard" after doing some very important work and not committing, hey, guess what, git won't stop you. Neither will "rm -rf".
This reminds me so much of the UNIX Hater's Handbook, which argues that this attitude ("it's just a rite of passage", "it is better now anyway") is a result of growing up in a world where the design errors of UNIX are seen as features. I don't know yet, but it is an inspiring read nonetheless: http://www.simson.net/ref/ugh.pdf
I've read the UNIX Hater's Handbook a long time ago. I have a hard copy stored in a box somewhere.
I don't quite understand what point you are trying to make though. Ultimately the code loss was my fault, however it ended up much better for the source code I was working on because I had time to think about what had to be done and how I should do it, ultimately leading to less code that ran faster and was much better organised.
When you rebase in a dirty tree, it won't let you. I think merge should be the same. I manage a large number of products, and from time to time I rebase and find unexpectedly that there are changes in my working tree. If I merged, I'd not have seen it.
I really wish merge, by default, was disallowed on a dirty tree. Yeah, it'd be fine if there was a command line argument to override this behavior.
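Until then, a rough guard along these lines works for me (sketch only; it only notices changes to tracked files, and the merge target is just an example):
if git diff-index --quiet HEAD --; then
    git merge origin/master     # or whatever you were about to merge
else
    echo "working tree is dirty; commit or stash first"
fi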
It was a bug. It was broken UX. Everyone agrees with that. And it has been fixed.
But the reason it went unnoticed for (horror of horrors) 2 minor releases, was that it only results from an unorthodox (although supported) usage practice.
> Before doing so, I want to merge in the recent changes from the remote master, so I do the familiar git pull. It complained about some files that would be overwritten by the merge, so I saved a backup of my changes, then reverted my changes in those specific files, and proceeded.
Yikes. git stash / git stash apply. Pulling into a dirty working tree is asking for trouble.
git losing data is Very Very Bad (and massive kudos to the author for tracking down the bug rather than just bitching about it), but if you're following a proper git workflow (pull to clean working trees, save often), you shouldn't ever be in a position to trigger this bug. That's not an excuse for git to break like that, but the reason that it was likely never seen in the 16k Linux commits is that it's not the "right" way to do things.
To quote: OK, so the bug never trigged in 16,000+ Linux kernel merges—kernel developers are probably sane people quite proficient in git so that's quite unlikely to happen. That's probably the reason the bug was out there for a year, nobody ever bumped into it. I would bet some money on none of them kernel developers ever having git-pulled into a dirty working tree. (Most of the newbies around the world who probably bumped into it didn't understand git was in error there—excluding the author.)
I can't explain why the opposite happens. Most of the people I know intuitively commit or stash their local changes before merging. They have this intuition even though git is a relatively young piece of software. But then there are always a handful of people who, I imagine, could do something like that. And I'm not quite sure why.
One possibility is that it could come down to the level of trust in computers. I don't think I could issue git-merge without git-stash/git-commit first—probably because I don't instinctively trust programs to handle complex operations too well in the first place. Operations such as handling unsaved data, or letting random commits from a different place three-way merge themselves into a single branch. Or both.
This mechanism of distrust might be similar to how drivers who think they're bad drivers are, in fact, the best drivers. They underestimate their capabilities enough to assume everything won't always go right, and then they're a few steps ahead when something goes wrong.
I'm really late to this party but I want to stress a point that doesn't get mentioned often enough:
Do not use "cp". Please.
Copying changes to save them and reapply later is nearly guaranteed to quietly lose changes, reintroduce removed code, or otherwise screw up your work.
If you want to move changes stash them or commit them and then apply them elsewhere. Using cp throws out all of git's ability to help you do what you mean and not what you say.
Also, the data corruption was caused by a bug, yes, but the cp-based workflow being used will result in a nasty surprise sometime in the future.
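For example, to carry uncommitted work over to another branch without cp (the branch name is just a placeholder):
git stash                   # record the uncommitted changes as a stash commit
git checkout other-branch   # wherever the work should actually live
git stash pop               # reapply; git will tell you about conflicts instead of quietly dropping hunks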
That's kind of an odd response to a bug - not adding a new test, but just noting that it didn't hit one particular project. Is Git's entire test-suite just 'the Linux kernel changelog'?
Even if you choose to keep three days worth of changes uncommitted, you're still doing local backups of your machine anyways, right? He'd be facing the same amount of information loss if his hard drive died.
If you're on OS X, Time Machine will get you back to where you were recently (except if your home directory is encrypted, then it backs up on logout). Or use Dropbox/SpiderOak/other to keep the last n versions of your changes.
At first, it seemed that this was another rant about a misbehaving piece of software.
But I was impressed that, unlike so many others (myself included), the author went beyond just complaining. He actually made a real effort to identify the conditions under which the issue occurs. But I was blown away when he actually examined the source code and identified when the bug was introduced. Great work!
Not to mention this "microcommit" approach in your feature branch allows you a fine level of control over your working branch without polluting the master branch (because of the squash). We use this one at work, and I do on my own personal projects as well. Never have I been happier with a VCS than I am with git after moving to this workflow.
So you never commit on master, only pull (fast-forward merge) from origin/master and merge from your feature branch? Does merging your squashed commit from MyFeatureBranch into master create a merge commit, or is it always a fast-forward?
I suspect that certain types of people (including myself, at times), actually want continuous backups being taken on the state of their working copy prior to actually performing a commit. Bugs or not, Git doesn't do very well with unversioned changes: an accidental 'git reset --hard' can easily blow out lots of work (happened to me), even if that was exactly what the command was supposed to do. The correct thing to do is commit early and commit often (git commit -am "Wibble"; git reset HEAD~ works well for me) but from a user experience standpoint this ought to be automatic.
Yes, it would be nice to have the assurance that the WC state was stored in the reflog before every git command that could possibly change it. This would probably be a relatively easy patch to write, although it could also slow git down too much to be accepted.
Git can do that, with a bit of help from your editor (assuming your editor is configurable enough, i.e. you use emacs or vi): https://github.com/bartman/git-wip
I admit I skimmed through the article, but "git destroyed my data"... hmm... how come?
Git is a versioning system that doesn't free you from making backups of your central master. And by master I mean the global central reference repository or whatever you call it.
So you screw up your repository using an unconventional workflow and now you and your co-worker don't trust git anymore?
Well, maybe you shouldn't have blindly trusted it in the first place. It's a better tool than many, but it is still just a tool you should use with care. As any tool. It has bugs.
That said, I feel your pain. Finding bugs in other people's tools can be a very frustrating experience. Well, shit happens :-)
Git can be a beast conceptually speaking if you don't learn it the right way. This was undoubtedly a bug but the author's story surfaced some suboptimal git habits.
The way I avoid problems like this is:
1 - Always do new work in a branch off whatever branch you plan to commit to eventually.
2 - commit often
3 - Use rebase -i to squash commits when everything is looking good
4 - Use rebase parent-branch to replay your commits on top of whatever new stuff is in the parent and resolve any conflicts
5 - Only then go back and merge the working branch back into the parent-branch
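Roughly, that looks like this (branch names are just placeholders):
git checkout -b my-feature master      # 1: new work on a branch off the target branch
# ...hack, and commit often (2)...
git rebase -i master                   # 3: squash/tidy the commits when they look good
git checkout master && git pull        # bring the parent branch up to date
git checkout my-feature
git rebase master                      # 4: replay your commits on top of the new stuff
git checkout master && git merge my-feature   # 5: merge the working branch back in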
But this is really a small corner case and I can see how it went unnoticed for a year.
I almost never use 'git pull' (I do a git fetch and then "git merge" or "git rebase" depending on the results), but more importantly I never, ever pull when I have changes in my working directory. I commit, and then I pull or pull --rebase etc. This way I'm really sure that my data is safe, as git has a lot of safety features for committed stuff: all files are stored as objects, and there is a reflog to help if you lose track of a rebased branch, etc.
Another thing that I sometimes do is 'git stash' before pull/merge/rebase etc. git stash apply later is also a very safe op.
The biggest issue here is not committing before pulling. Always commit all of your changes before updating your history (either through git pull, git fetch/merge or git pull --rebase).
Likewise, I keep all my git repos backed up to Time Machine. As long as I'm at my desk, I can never lose more than an hour's worth of work. I don't think I'll ever understand git well enough to solely rely on it.
Your browser -- and my browser -- did not autodetect the character encoding correctly. Switch to Unicode (UTF-8), and it should look fine. I used Firefox.
The reason our browsers don't autodetect the character encoding as utf-8 is because that page doesn't say it's utf-8. In that case browsers are allowed to assume it's latin-1.
If you want to get pedantic about it, the content is being served as text/html (as opposed to application/xhtml+xml) which means the XML declaration isn't valid.
Well, I'd like to know why it doesn't work. If that's pedantic, so be it.
Edit: I think people took this the wrong way. I mean it's not pedantic to find the real problem. We're all better off because the poster I'm responding to showed us the real problem.
It's not cool that the guy lost some data, but as he points out - all software has bugs. This is just one more reason to make your commits as atomic as possible - which it sounds like he wasn't doing at all.
While he could improve his workflow, git had a legit bug in this case. If he was complaining that "OMFG, git reset --hard wiped out my changes!" then we could laugh at him, but git did a Bad Thing here by any reasonable measure.
I appreciate the replies but down-voting is silly.
Look, I have to deal directly with co-workers that misuse git. When I see them doing something bad, I tell them about it. The biggest resistance has been from those that refuse to change.
Here's a real-world example: "git messed up my merge again". What? I go to look into it -- well it turns out they don't understand git and work around it by copying files out of their sandbox, run "git pull", and then copy stuff back in. This is such a recipe for disaster (it easily can and will lose others' changes) that not telling him to "improve his workflow" is disastrous for the project as a whole.
Yes, git messed up his merge. But as others have noted, so could have a bad Makefile rule. All I'm suggesting is that by improving his workflow (committing early and often) then there is no chance of work getting lost, ever.
I don't think anyone takes issue with the "fix your workflow" part of your statement; bad workflows should be fixed. They aren't an excuse for bugs in the software, though. A version control system should never unrecoverably wipe out data unless you explicitly ask it to.
Heya, yes, bugs are completely unacceptable. Trust me, the git project completely understands this.
Sorry if my comments seemed rude or snarky. Actually, I tend to dislike "tl;dr" comments altogether. I don't think my comment added much to this discussion. If I could I'd delete it. cheers
I'm not sure where you get the idea that I'm not committing early?
If it was the comment saying '3 days of straight hacking' then I apologise for my prose being unclear. I was only bringing that up because, after 3 days of hacking, I was tired and my first assumption was that it was indeed my fault.
It was a couple of hours worth of code that needed committing.
And doing a pull, which is meant to safely pull in unrelated changes, to test some things before committing, especially when it was a very simple change on head, is completely reasonable.
I'm not ignorant about the many ways in which git can be used, the post wasn't even particularly blame placing. Tools have bugs, shit happens. Most of the post was tracking down the bug, which is much more useful. Thankfully it was already fixed, so I didn't need to fix it myself.
> And doing a pull, which is meant to safely pull in unrelated changes, to test some things before committing, especially when it was a very simple change on head, is completely reasonable.
I don't think it's reasonable. 100% of posters here don't think it's reasonable. You just demonstrated why it's not reasonable. Yet you say it's reasonable.
Here's my rule for version control: if an operation can go wrong, it will go wrong.
... "and I was getting ready to commit a series of important changes" ... Before doing so, I want to merge in the recent changes from the remote master, so I do the familiar git pull. ... "maybe I’m going slightly crazy after 3 days straight hacking" ...
Do I interpret this correctly as that the author has not commited any changes for 3 days?
With SVN there may be an excuse for this, but with Git the right way is to commit as often as possible, and then squash your commits before pushing them. With such a workflow the problem would have been a non-problem - just use git reflog and checkout your previous version.
Of course you wouldn't use a git pull then, but just rebase your local commits on top of master.
Learn how to use your tools, instead of complaining about them!