What happened: he had Dropbox synced to two computers, but one of them hadn't been used in a while. When he turned the old computer back on, Dropbox apparently interpreted all the files missing from it as deletions. Of course, the files weren't there because they hadn't been synced yet!
So when the old computer was turned on, Dropbox decided to delete tens of thousands of files from my colleague's account.
He reached out to them for help restoring. It took several days (this happened on a weekend), but they did eventually restore it. Unfortunately, since it was difficult for them to pinpoint exactly when the event started, some files were not recovered (I'm not sure exactly why this was difficult).
I love Dropbox, but this was a serious blow to how much I trust them with my files. Just look at my comment history - I've raved about them several times on HN.
Just to dive in a bit for the HN crowd: We base the changes that should be applied to your Dropbox on whether the client has successfully synced the previous version of the file. The server doesn't allow the client to apply changes unless the client is fully up to date. And in the case of deletions, the desktop client is only permitted to issue deletes to files that have been successfully created on disk and then later deleted.
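For illustration only (a toy sketch with made-up names, not Dropbox's actual code), the invariant described above - that a client may only issue deletes for files it previously synced, and only while fully up to date - might look roughly like this:

```python
# Toy sketch of the delete-guard invariant described above. All names
# here are hypothetical; this is not Dropbox's actual implementation.

class SyncServer:
    def __init__(self):
        self.files = {}          # path -> current version on the server
        self.client_synced = {}  # client_id -> {path: version the client acked}

    def ack_sync(self, client_id, path, version):
        """Client confirms it has successfully written this version to disk."""
        self.client_synced.setdefault(client_id, {})[path] = version

    def request_delete(self, client_id, path):
        """Honor a delete only from a client that synced the current version.

        A file merely *absent* on a stale client is not treated as deleted.
        """
        synced = self.client_synced.get(client_id, {})
        if path not in synced:
            return False  # client never had the file: reject
        if synced[path] != self.files.get(path):
            return False  # client is not up to date: reject
        del self.files[path]
        return True
```

Under a scheme like this, a stale computer's deletes would be rejected, because it never acknowledged syncing the files in the first place.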
We also have a proper support team to catch things that fall through the cracks, and if you have issues, you should reach out to them first (and as soon as you notice; fixing things becomes harder the more changes you make to your account).
Again, not commenting on the above issue, but I did want to point out that we take the integrity of your data very, very seriously.
sean_lynch is not kidding about Dropbox's support staff. They are great, very helpful, and according to my colleague, are very serious about trying to restore everything. The issue he had was a software bug, and apparently some things are problematic to restore because of other technical reasons, but Dropbox has been very helpful.
And to make this clear - they were very helpful when this happened last week, before this thread - you don't need to write a blog post to get their help and attention.
The guy lost everything because a temporary team membership was revoked. And now you're coming back to talk about your rigor, and how you catch things that fall through the cracks.
This really comes off, to me at least, the way LinkedIn did after their breach: when it was revealed they weren't even salting passwords, they tried to brag about their security as a way to sidestep owning up.
I think you should consider this tone very carefully.
The guy lost everything because a temporary team membership was revoked.
What would you like him to do, apologize profusely and robotically without knowing the details of this incident? Dropbox is a decent-sized organization and I am pretty sure they have guys already responsible for addressing specific issues like this. To expect every employee to know details of every issue seems unreasonable.
If you had stopped there, I'd have said yes.
> To expect every employee to know details of every issue
I ... had no such expectation?
Sorry. Thanks for explaining. `:(`
I've read through a lot of this thread, and I'd like to follow up with a few thoughts.
First of all, just to make it clear: Dropbox support has been GREAT. Though it took a few days because this happened on the weekend, Dropbox support did contact my colleague shortly after, and they have made it clear that they will work hard to restore everything that can be restored. The difficulty in working out what is left to restore seems to be technical (again, I'm unsure of the details). The support staff themselves have helped a lot. In addition, someone from Dropbox support has reached out to me because of this thread, to try and find the root cause of the problem - I hope we managed to help Dropbox find whatever caused this, so it doesn't happen to anyone else.
Another misconception I saw, about "Dropbox is not a backup solution":
My colleague and I both realize Dropbox is not for backup. The most important files were backed up, at least to some degree. Having said that, Dropbox is used by myself and many others as a semi-backup solution. My personal strategy is to backup the whole Dropbox folder to an external disk every few weeks, but rely on Dropbox for everything else. This is not a good backup solution, but realise that for most people, it's not "Dropbox vs. a real solution", it's "Dropbox vs. no solution at all".
I'll close with the same message I give to most people - Dropbox is brilliant, most of the time. It's not a great backup solution, but it's better than what most people do. If you're not using anything else, Dropbox is a life-changer in terms of ease-of-accessing-your-files-anywhere, and feeling secure that everything is probably backed up.
(I have an old machine that I hardly use now; it has problems with the motherboard battery, so sometimes it travels in time to a funny date like 2000 and all the security certificates are reported as invalid.)
I pray they wouldn't ever rely on users' own system clocks.
Finally, we noticed the user's system time was set to several weeks in the future. "Oh yeah, that?" He showed us how he uses the system calendar to check dates in the future. When he was done, he'd click the "Ok" button, thereby setting the current date in Windows to some future day.
I.e., Unix time started then.
If the time is just used to check for certificate expiration, maybe nobody considers it worth worrying about. Which I guess is fine as far as it goes... but the other day I couldn't get Dropbox to do its thing when my clock was only a few hours off. It didn't seem likely that an SSL date check was involved in that case.
It's a common thing on school/university computers, where for various reasons the system time is always wrong and changing it requires Administrator privileges (I can't understand why).
> You can relax knowing that Dropbox always has you covered, and none of your stuff will ever be lost.
I thought Dropbox kept a history of each file going back a month, though...
I have a local Time Machine backup that is going continually. This allows me to quickly recover from most problems. However, if my house ever blew up, it wouldn't be sufficient, so in addition...
I have a Backblaze account. $5/month for everything on one computer, which is fine because Dropbox syncs my laptop to that computer. That backup runs continuously and keeps 30 days history (so if I delete a file and need to get it back, I can). Since I only care about disaster recovery here, if my house does blow up, they'll send me a hard drive with everything for around $100.
The combination means I feel pretty comfortable. I just hope my house doesn't blow up!
Once a month do a full image on both Windows and OSX systems. Save the images to another backup drive and take that backup to a safety deposit box at your bank.
If you want to be really paranoid, find a provider that offers fireproof storage for media. If you can't, go to your local storage facility and rent a small unit (maybe $30 to $75 per month, depending on where you live). Buy a fireproof safe and stick it in there. Once a month (or whatever) rotate drives through the safe.
You can also avoid the storage unit rental expense and locate a fire-proof safe at a trusted friend's or family members home or garage. Be creative.
In other words, creating a system that will virtually ensure the security of your data isn't hard or expensive --particularly when compared to the value of the data you are trying to protect.
You need to have context before you say something like that. Just ONE of our engineering machines probably takes two or three weeks to set up from scratch due to the amount of software and configuration it requires. It has some 600GB of valuable data, not counting the OS and installed applications.
So, in that context, taking the time to make multiple redundant disk images is invaluable and, yes, very efficient. If the system drive dies and no other hardware failure is present, you can be back up and running within an hour or two. That's worth money.
> over kill for most people
Absolutely true. Every situation is different.
On Windows you can get the same effect using VSS - http://en.wikipedia.org/wiki/Shadow_Copy
Also, since you seem to be knowledgeable about Linux, why do you use Dropbox at all, instead of just git or rsync or scp or whatever?
Works fine for me. I use it on every system: a desktop, a laptop, a server and an HTPC. It runs on SSDs, HDDs (SATA and USB), RAID 0, RAID 1 and over dm-crypt. The only place I do not use it is one filesystem for MongoDB, and it would probably work fine there too if I disabled COW.
The builtin checksum (and scrub) functionality is crucial to me. I had a drive start to fail with ext4, and the only reasonable way to do a scrub would have been to take it offline, which I calculated would take 23 hours (plus a huge amount of extra work to map bad sectors back to the relevant files). At some point a failed sector had also led to ext4 giving back zeroes when the containing file was copied. btrfs keeps two copies of metadata by default, so an unfortunately placed bad sector is less likely to wipe out knowledge of entire trees or files (that happened to me with ext4 as well, when a directory disappeared).
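The idea behind a scrub can be sketched in userspace: keep a manifest of content checksums and periodically re-hash everything to catch silent corruption. btrfs does this at the block level with its own checksums; the per-file version below only illustrates the principle:

```python
# Userspace approximation of a scrub: detect files whose bytes changed
# since the manifest was built (silent corruption, bit rot, etc.).
import hashlib
import os

def build_manifest(root):
    """Map each file (relative path) under root to its SHA-256 digest."""
    manifest = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest

def scrub(root, manifest):
    """Return files whose current contents no longer match the manifest."""
    current = build_manifest(root)
    return sorted(p for p in manifest if current.get(p) != manifest[p])
```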
The volume management is great too. It is really easy to add and remove devices/partitions without having to take systems down, change between RAID levels, etc. You can do the same with LVM (which I was using before), but it is a lot of fiddly work to get the right commands and options, plus dealing with physical vs. logical volumes. With btrfs it is one command to add or remove a partition, and it is trivial to work out what the command is from 'btrfs --help'. (LVM requires several different commands which I always had to look up, and I often had a lot of difficulty, such as when running RAID 0 with partitions of different sizes.) btrfs also only operates on the actually used portions: e.g. if I change from RAID 0 to RAID 1 on a filesystem where I am using 1GB of 1TB of space, it will only worry about that 1GB, not the whole 1TB. LVM can't see into the filesystem to know what is actually used.
I have compression (LZO) turned on everywhere. You can't (currently) find out how effective it has been, but I do have a lot of data files that are highly compressible (eg CSV files, SQL dumps). It is more convenient to have them expanded than to teach every single program that accesses them how to decompress on the fly.
I used to use rsync and hard links to make snapshots. These completely hammered the machine when run, causing large amounts of I/O, since all metadata has to be scanned plus all changed/new files copied. Consequently I only made daily snapshots. With btrfs, making snapshots is virtually instantaneous and is unaffected by how much data has changed, volume size, I/O speed, etc. I now make snapshots hourly.
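The rsync-and-hard-links scheme works roughly like this sketch (flat directory only, for brevity): unchanged files are hard-linked into the new snapshot so they cost no extra space, but every file still has to be stat'ed and compared - which is exactly the metadata scan that hammers the machine, and what btrfs snapshots avoid.

```python
# Minimal sketch of hard-link-based snapshots (what rsync --link-dest
# automates). Handles a flat directory only, for brevity.
import filecmp
import os
import shutil

def snapshot(source, prev_snap, new_snap):
    """Create new_snap from source, hard-linking files unchanged since prev_snap."""
    os.makedirs(new_snap)
    for name in os.listdir(source):
        src = os.path.join(source, name)
        dst = os.path.join(new_snap, name)
        old = os.path.join(prev_snap, name) if prev_snap else None
        if old and os.path.exists(old) and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)       # unchanged: hard link, no extra space
        else:
            shutil.copy2(src, dst)  # new or changed: real copy
```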
Ultimately I have things such that I will proactively find out about bad sectors and similar low level corruption, have snapshots to deal with issues over time, and have data replicated over machines (mostly via Dropbox and git/hg), both local and remote. I am not relying on the filesystem of any one machine to always be perfect.
> Also, since you seem to be knowledgeable about Linux, why do you use Dropbox at all, instead of just git or rsync or scp or whatever?
Because Dropbox just works. Many of the alternatives haven't figured that out yet, assuming they even support Linux. The importance of actually working can't be overstated. It also requires no administration or maintenance. Sync happens automatically whenever machines are on and there is appropriate network connectivity, without requiring any babysitting from me.
I use git/hg for source which is their sweet spot. Using something like rsync is a pain once you have more than two machines, and it requires a full blown system accessible offsite for an offsite copy which is yet more administration and maintenance.
git/rsync/scp aren't usable from mobile devices. I put things like documents, ebooks, photos, and music into Dropbox, which makes them present on everything, no messing around needed. For that kind of content, Dropbox behaves very much like a DVCS doing an N-way sync, with a history (you get the last 30 days by default, more if you pay more or use a team account).
Finally Dropbox allows collaboration. You can easily share files and directories. I can do a software build for Android, put the APK in a shared folder and a colleague can install the app on their device without hassle. Various tools run periodic reports and put their output in shared folders which makes it available to everyone even if they then happen to jump on a plane.
There is no other simple option that "just works". So my choice is Dropbox + encfs (to encrypt everything). But I also keep another local copy of the entire dropbox folder in case the sync fails and decides to delete everything.
I used to use Mozy but when I tried a trial restore it was taking forever and missing some files.
Also it inserts these .PART files into your system that can almost double the size of the folders you're trying to back up.
I looked at backblaze but it doesn't work with server computers.
So I'm now trying crashplan.
I'm really interested in CrashPlan's feature that allows automated backups to other computers. The only problem is that I haven't been able to get this to work so far. Backing up to the cloud is working well, though, and it's very competitively priced. I've yet to try a restore.
For servers, I've heard good things about Tarsnap, but never used it myself.
No, that's a tough one, especially with a backup the size of yours.
Every time you do a restore, that's just confirmation that a particular backup was or was not sound.
Wouldn't it be neat if the backup program had functionality that let you restore a random sample of your files (and maybe then compare these to the originals) and gave you a likelihood that your entire backup is sound?
A sort of sampling quality control process.
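A sketch of that sampling idea (the `restore_path_for` mapping is hypothetical; how restored files are located depends entirely on the backup tool): restore a random subset, hash both copies, and report mismatches. If all n sampled files match and the true fraction of bad files were p, the chance of a clean sample is (1 - p)^n, so e.g. 50 clean samples let a 5% corruption rate go unnoticed only about 8% of the time.

```python
# Sketch of sampling-based backup verification: compare checksums of a
# random sample of originals against their restored copies.
import hashlib
import random

def sha256_file(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def spot_check(original_paths, restore_path_for, sample_size, seed=None):
    """Return the sampled originals whose restored copy does not match.

    restore_path_for maps an original path to where the backup tool
    restored it (this mapping is tool-specific and assumed here).
    """
    rng = random.Random(seed)
    sample = rng.sample(list(original_paths), min(sample_size, len(original_paths)))
    return [p for p in sample
            if sha256_file(p) != sha256_file(restore_path_for(p))]
```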
Though I'm switching off of them - not sure where to yet. There's some sort of issue in their client software that murders my home network, rendering the entire thing unusable while a backup is running. Their support was... less than helpful in trying to remedy the situation. For $5 a month and with the amount of stuff I was backing up, I can't blame them.
I don't have any QoS set up on my home network. I found two fixes: running the backup manually before going to sleep (maybe I can write an AppleScript and just cron it), or turning Automatic Throttle off and giving it the minimum available (for me) 20 kbps.
If you accidentally delete a file in Dropbox on one computer, Dropbox will helpfully and immediately propagate that deletion to all of your other computers. It can be used for backup, but it's not very good at it.
Why not come up with some sort of open sync protocol where you could have your data mirrored by many Dropbox-like services at once?
Seems Dropbox really does believe that if a file isn't there, it's been deleted.
My approach for Dropbox on both Windows and OSX is to have the Dropbox folder on a drive that will never see development work, even if that means creating a separate partition for it. In most cases the system drive works fine (you do have separate system and data drives on your machines, right?).
This forces a COPY operation rather than a file MOVE if you drag-and-drop files into any Dropbox folder. Which means that everything in the Dropbox folder could be trashed tomorrow and nothing whatsoever would be lost.
On Windows you can also do this by dragging files while holding down the RMB and on OSX while holding down the Option key. This, however, is fraught with issues because it is easy to forget and move something in or out of Dropbox accidentally.
In addition to that, daily system and data backups to external media on EVERY SYSTEM in the office ensures that even if you took a sledge-hammer to any given machine all the data could be recovered to within a 24 hour boundary.
So, no, this is NOT a "Dropbox horror story" as far as I am concerned. At best this is an unfortunate "pilot error" and at worst this is a sign of incompetence.
Sorry, a little harsh there, but, as a student, I lost six months of heavy coding work once in college. That really hurt. And that was probably the best time and place to learn that lesson. That is all it took for me to become completely insane about having backups of backups of anything important. Now, many years later, you will not find any critical system in my office that is not backed up at least once and usually twice.
These days there's almost no excuse for losing your work. Don't go around blaming Dropbox; it's not their problem.
I am not associated with Dropbox in any way other than being just another user.
FEATURE REQUEST: It would be very nice if Dropbox client software could have an option to auto-magically COPY files in and out of Dropbox rather than allowing any files to be moved. The above-mentioned hack works fine but it'd be nice to not have to use it. Also, I'd venture to guess that most Dropbox users don't do this, which opens them to unintended loss.
It is in no way "pilot error". If the post you replied to is correct, the dropbox software performed an action it should never perform - deleting files the user did not tell the OS to delete. What's more, every dropbox plan I'm aware of keeps deleted files for 30 days, so if it wasn't possible to recover some of the files, dropbox failed not once but twice.
It's all very well blaming the user for not having multiple redundant backups, but that doesn't change the fact that deleting user data without explicitly being told to do so is one of the most egregious sins software can commit. It's not "pilot error" when the software performs an action that should never occur.
Let me put it into a financial context. I did a project that added up to some $800K in cost. Two developers over about a year. Do you think that for even a microsecond I would trust Dropbox as a backup mechanism without extensive qualification and testing? And, would I have that as my sole approach to backup?
And this isn't me coming down on Dropbox or suggesting that they are unreliable. I use Dropbox and I am very happy with the service. However, based on experience gotten the hard way, I do not use it for anything that is mission critical. The important stuff is backed-up locally, sometimes with redundant backups.
I don't have any sympathy for an engineer who uses a service and relies on it without the proper testing and qualification phase. Behaving that way is most-definitely, as I said, to be kind, "pilot error".
It is completely unfair to blame Dropbox for anything. This is like the idea of cutting-and-pasting code from the Internet and trusting it with a mission-critical aspect of your application without dissecting the code, testing it and fully qualifying it for what it is you need it to do.
There are tons of examples of this behavior; perhaps the most common one in the web development community is email address validation. How many careless developers copy some regex off the internet to validate email addresses and don't think twice about it?
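The specific regex being referenced isn't preserved here, but a hypothetical naive pattern of the kind that gets copied around makes the point: it looks plausible, yet rejects plenty of perfectly valid addresses.

```python
# A hypothetical example of a naive email regex (not any particular one
# referenced in this thread). It looks plausible but rejects many valid
# addresses.
import re

NAIVE_EMAIL = re.compile(r"^[A-Za-z0-9_]+@[A-Za-z0-9]+\.[A-Za-z]{2,3}$")

assert NAIVE_EMAIL.match("user@example.com")              # accepted
assert not NAIVE_EMAIL.match("first.last@example.com")    # dot in local part rejected
assert not NAIVE_EMAIL.match("user+tag@example.com")      # plus-addressing rejected
assert not NAIVE_EMAIL.match("user@mail.example.museum")  # subdomains / long TLDs rejected
```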
How many engineers (and, as a result, websites) fail to test and research and end-up with questionable solutions?
Would you blame the regex writer for this or the engineer who chose to use it without doing any testing at all?
It doesn't matter what the regex author claims or how authoritative the website featuring the regex might be: you have to test it before deploying it! If you don't test, deploy, and then lose customers because it is a bad regex, it is YOUR fault, not the author's. Blaming them is nonsensical.
In case you are curious, here's "the" solution:
Not only is it well documented, the author offers test vectors and all of the relevant code and references so that you can take the time to qualify the solution.
So, yes, to be redundant, if an engineer does not do his job I am perfectly happy calling it "pilot error" and even going as far as calling it "incompetence".
Dropbox is a consumer service, also useful for businesses. Nobody's trusting it with the nuclear launch codes. But no matter what, it should never, ever delete your files, unless you do it yourself. Period. And if it's ever unsure, it should ask you.
It's meant to be used with the primary copies of your files, not relegated to a special drive. If it deletes your files wrongly, the fault is clearly with Dropbox. It's marketed as an easy-to-use product you just install and use.
I cannot blame Dropbox if I lose data. Maybe the service has issues, but losing my data is MY ISSUE, not theirs. The minute you trust service X with your data without full qualification and testing is the minute you lose your data. Don't blame them for it.
> Dropbox is a consumer service, also useful for businesses.
Same point. Same issues. If you put all of your documents and financial data on Dropbox (or service X) and don't take the steps to have local (and possibly redundant) backups you could easily be confused for a moron. Blaming service X for it is a cop-out. They had nothing to do with the total loss. Yes, they had everything to do with the partial loss.
It is my contention that it is your responsibility to fully qualify such services rather than using them blindly.
I use Dropbox extensively and have ZERO concerns about data loss.
Maybe experience has made me less trusting with what is important to me than a typical user? I could not imagine saving my kids' pictures on service X without any other backups. That would be insane. And if, for any reason, service X loses my pictures, I was the moron who allowed that to happen. They might suck, but it was my fault, not theirs, that I lost EVERYTHING.
Maybe that's the real point I am trying to make: Losing EVERYTHING is YOUR fault. Losing WHAT IS STORED IN DROPBOX could be their fault. If you don't have backups for what you store in Dropbox (or service X), you are responsible for the TOTAL LOSS, not service X.
These are the kinds of lessons that, for some reason, don't transfer person-to-person like they should; they only sink in once they've happened to you (at a guess, that's why you're being downvoted). Whether it's corporate data worth $800K or photographs of your first child taken hours after birth, up until the moment you realize it is all gone, data is priceless.
In the end it all boils down to the same thing: you can't outsource your responsibility on this one, and a single service holding your data still counts as a single point of failure.
Personal responsibility and the acceptance thereof is a major item on the checklist, anybody downvoting robomartin in this thread still has some learning to do on that front. Don't shoot the messenger if you don't like the message. Dropbox is not a backup, no matter what their marketing literature says and no matter how safe you think they are. In the end you and you alone are responsible for your data and its safety.
A back-up that you have not tried to restore might as well not be there.
There are two separate things going on in this discussion.
You have a personal responsibility to keep your data safe. I don't think anyone is disputing that.
Independently, software and services which handle user-created files should never, ever delete them without a direct instruction from the user. Regardless of your backup procedures, it is completely unacceptable for this to happen. Even if I had a copy of my data in a fire-proof safe on every continent, I would be appalled to find that a file syncing and backup service like Dropbox had deleted some of my files and was unable to restore them, because i) it's an unacceptable thing for any software to do, and ii) it completely contradicts all of their company messaging.
Show me a business that is willing to offer you that guarantee and I'll show you a business that is out of business.
What if there's a fire? Or malicious destruction? Or massive hardware failure?
Anyhow, the only people who would down-vote some of the comments I am making are those who have never suffered the pain and agony of data loss. If you've ever been in those shoes you know, without an iota of doubt, that what I am saying is 100% on point.
There's nothing wrong with Dropbox. The problem is in how naively people are choosing to use the service.
Could they be better? Of course. Would I still have my own redundant backups and recommend that Dropbox users do the same and fully understand the service. Absolutely!
Personal responsibility is a bitch. It's much easier to blame someone else. Guess what? Blaming someone else for your failings will not make things right or bring back your data.
Look at bug fixing. There's a list of bug severity.
Crashes machine, needing a reboot
Crashes program but not machine (loses data, possibly saved data)
Mangles saved data
Feature not working
Part of a feature not working properly
> There's nothing wrong with Dropbox.
There very much is something wrong with dropbox if it deletes user data without that user expecting it; ideally the user should give clear explicit permission for data to be deleted.
We're 100% in agreement on that. But since you can never be sure that the software that you're using is bug free (and in the case of dropbox there is now at least one more datapoint that confirms that their software contained at least one bug) you have to more or less count on losing all your data.
Dropbox messaging is of course not going to highlight the fact that their software may contain bugs, it simply isn't in their best interest (and none of their competitors do so either).
Maybe they have a section to that effect in their fine print but even if they don't on hacker news we really should all know better. Unless there is someone here that only writes bug free software... In that case please send me your resume.
What if the message is liked but the messenger is behaving like a silly sausage?
Everyone knows that multiple backups are important. If I'd had multiple backups, and Dropbox had deleted items that I hadn't asked it to delete, I'd still be able to comment about that because data deletion is serious business.
This thread isn't (at least, shouldn't be) about data being deleted; it's about unpredictable behaviour from software. When that unpredictable behaviour includes deleting data, that's a valid, serious concern even if everyone has a great backup strategy.
I think that's mostly a format issue, the underlying message is solid.
> Everyone knows that multiple backups are important.
My experience and your experience are apparently not the same in this respect. And I look at a lot of companies every year. The number of times the question 'do you do trial restores of your back-ups' is answered with either 'what backups' or 'no' is way larger than what you'd expect.
And those are companies, not individuals where I'd expect the situation to be much worse.
> If I'd had multiple backups, and Dropbox had deleted items that I hadn't asked it to delete, I'd still be able to comment about that because data deletion is serious business.
Yes, absolutely it is. Dropbox dropped the ball here. The problem is that you can pretty much expect them to drop the ball on occasion; no large service has ever been 100% free of data loss. Amazon, Google, Dropbox, Microsoft - they've all lost some customer data at some point.
It's the rule, not the exception. When you're dealing with data, data loss is the thing you're working hard to prevent, but it'll never be 100% perfect. It can't be. There will always be edge cases, and the simpler you try to make things on the outside, the more complex the actual software becomes. Complexity leads to bugs, and bugs (can) lead to data loss.
> This thread isn't (at least, shouldn't) be about data being deleted; it's about unpredictable behaviour from software.
All software that I'm aware of contains bugs. The discipline and experience required to write bug-free software is universally claimed to be present in one industry only: aerospace/aviation. And even there they have bugs, just fewer of them at massive expense. Bugs are the norm, not the exception.
> When that unpredictable behaviour includes deleting data, that's a valid, serious concern even if everyone has a great backup strategy.
Yes, it is a valid concern. And the way to mitigate that concern is by looking it in the eye and saying 'I don't trust software'. Any software. Including dropbox.
He brought out the part of the HN crowd that wants things to be on-topic and/or useful instead of user-blaming. :)
Right. You obviously didn't read my comments, did you? Maybe you just skimmed them and chose to jump on the bandwagon?
I'll quote the most relevant and definitely constructive/useful parts here so you don't have to bother with the links if you don't want to. If you do, please slow down and read it all. Then think about the idea of losing, I don't know, a year's worth of work and blaming someone else for it.
Here it is (reformatted because HN does not have a much-needed block-quote markup):
My approach for Dropbox on both Windows and OSX is to have the
Dropbox folder on a drive that will never see development work,
even if that means creating a separate partition for it.
In most cases the system drive works fine (you do have separate
system and data drives on your machines, right?).
This forces a COPY operation rather than a file MOVE if you
drag-and-drop files into any Dropbox folder. Which means that
everything in the Dropbox folder could be trashed tomorrow and
nothing whatsoever would be lost.
Then I also said:
FEATURE REQUEST: It would be very nice if Dropbox client software
could have an option to auto-magically COPY files in and out of
Dropbox rather than allowing any files to be moved.
The above-mentioned hack works fine but it'd be nice to not have to
use it. Also, I'd venture to guess that most Dropbox users don't do
this, which opens them to unintended loss.
And then, trying to understand further, I asked this:
So, I'll say that overall I've been pretty constructive. If anyone adopts my approach of storing their Dropbox folder on an otherwise unused drive, they are almost guaranteed that Dropbox can't lose their data. That is worth money.
Please point me to a link or blog post with your immediate solutions to this problem. I'd love to compare notes. Maybe there's a better approach. Always interested in learning. I am sure there are lots of people who, tonight, might be interested in adopting your approach to ensuring that accidental data loss by a remote service of any kind does not mean total data loss locally.
Mark Twain: A man holding a cat by the tail learns something he can learn in no other way.
I've done that. I've held a number of cats by the tail over the years. And he is right: sometimes you don't learn until you have claws painfully sunk into your skin, no matter what anyone around you might say.
And, I love Dropbox. It's a great service. As far as I am concerned, there's nothing wrong with it. If you own your data Dropbox could go out of business tomorrow and you would not care.
How the heck do you effectively test a service like Dropbox fully, without making it your full-time job (i.e. an IT department)? Take the other comment on this article, where someone fired up an old machine and it wiped out his files. What test would have caught that?
And, if you can't reasonably expect to fully test and verify the reliability of a service, then you DO NOT rely on it for critical data you cannot lose. Choosing to do so and then crying foul when you do lose data isn't engineering; it's, at best, wishful thinking.
Again, this does not mean that Dropbox --or any other service for that matter-- is bad. Not at all. It just means that you have to understand what you are walking into and, if you can't, or don't, then take steps to safeguard from the potential for external data loss, because you just don't know.
I'm not sure knowledge of an obscure Windows-specific drag and drop quirk should be a requirement for using Dropbox safely.
> you do have separate system and data drives on your machines, right?
I'm not sure use of a partition table editor should be a requirement for using Dropbox safely. Most people use the operating system that came installed on their PC as-is.
> daily system and data backups to external media
Dropbox makes the following claims on their website:
"Your files are safe"
"Even if you accidentally spill a latte on your laptop, have no fear! You can relax knowing that Dropbox always has you covered, and none of your stuff will ever be lost."
"Even if your computer has a meltdown, your stuff is always safe in Dropbox and can be restored in a snap. Dropbox is like a time machine that lets you undo mistakes and even undelete files you accidentally trash. "
I think it's completely reasonable to expect Dropbox to be a reliable backup solution as-is.
> I'm not sure knowledge of an obscure Windows-specific drag and drop quirk should be a requirement for using Dropbox safely.
Not an "obscure Windows-specific drag and drop quirk". This is how the OS works. If you are a developer you need to know your OS.
> I'm not sure use of a partition table editor should be a requirement for using Dropbox safely.
It isn't, but if you are smart you'll take my advice and go implement it right now.
> Most people use the operating system that came installed on their PC as-is.
Developers are not "most people" and should certainly not behave as such. Your data is the only thing that is important. Your work product. Investing in separate drives (not partitions, separate physical drives) on every development machine is invaluable. And backing them up with disk images on a daily basis is just as important. If the system drive goes out, your data is intact. If the data drive goes out, the system is intact. Recovery is an academic exercise. If both drives go out it is a little more of a pain in the ass, but at most you'll lose a day's worth of work, not a lifetime or a whole year's worth.
> Dropbox makes the following claims on their website:
Pardon the raw-ness: I don't give a shit about what anyone says or claims about any service. Neither should you or any serious developer. You, and only you, are responsible for the safety of your data. You HAVE TO assume failures, not just locally, but also remotely. Hardware fails. Software fucks up. Even if Dropbox (or name your service) guaranteed 500% redundancy I'd still have my own backups. It would be irresponsible to do otherwise.
Then there's the practicality of the whole thing. The machine I am using to type this has the following physical drives (not partitions on a single drive, these are independent pieces of hardware in the chassis):
System drive
Data drive
Library drive
Development drive
Internal backup drive
External backup drive (USB)
The data drive alone has over 300GB of data right now. The Library and Development drives are about 150GB of data each. That's 600GB of valuable work in three drives probably representing millions of dollars of work-product and value.
So, Dropbox is backup, right? Well, uploading 600GB to your "backup" would take somewhere in the order of 100 days of continuous 24/7 upload with my current DSL connection (~600Kbps upload speed). That's a third of a year. Nope, it's not backup.
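The arithmetic behind that estimate is easy to check. A quick sketch, assuming decimal gigabytes and the ~600 Kbps uplink figure above (real-world throughput would be even lower due to protocol overhead):

```python
def upload_days(data_gb: float, uplink_kbps: float) -> float:
    """Days of continuous 24/7 upload for data_gb at uplink_kbps."""
    bits = data_gb * 8 * 10**9           # decimal gigabytes -> bits
    seconds = bits / (uplink_kbps * 1000)  # kilobits/sec -> bits/sec
    return seconds / 86400               # seconds per day

# 600 GB over a ~600 Kbps DSL uplink:
print(round(upload_days(600, 600)))      # roughly 93 days
```

So "somewhere in the order of 100 days" is, if anything, optimistic once overhead and daytime usage are factored in.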
Also, since this is HN, let's talk about how we build software, as in, put yourself in the position of the Dropbox developer and instead of just the end user. Most of us, I imagine, build software used by non-developers. I hope your attitude towards these things isn't "fuck it, if they were a competent developer like me, they'd have known better", and instead, "damn, I really need to go bulletproof my software so it doesn't delete people's data like that." In that context, there's plenty to criticize and learn from in Dropbox's mistakes here.
Dropbox for teams is specifically marketed towards businesses.
Did you read the article this thread is about?
You could, of course, take the approach that Dropbox should be there to protect everyone from themselves.
I have, in other comments, given the example of my own usage, which guarantees that no data can ever be lost by Dropbox. It's simple and it works. I've been constructive here. If you implement my approach you should be safe. Still, don't take my word for it, implement and verify. That's engineering.
I have also, in another comment, posted about a feature request that might help ensure that this does not happen to casual users:
Dropbox, if possible, should modify their client software such that files can only be copied in and out of Dropbox folders and never moved. For the adventurous, they could make this an option that is set by default but can be disabled. A huge warning about the potential for data loss would accompany the act of disabling this feature.
In that regard, I've been as constructive as one could be in this thread.
In the end, you can't protect everyone from their own mistakes, incompetence or carelessness. And, no, I don't think everyone is stupid. We all make mistakes. I had a very painful data loss event in college, so I learned my lesson very early on. I have never lost valuable data since.
I am not going to blame a service for something that is the responsibility of the user. Engineers, in particular, have no excuse for data loss. A 2TB external hard drive is about $100. Please.
You could also argue that business users in this day and age cannot be computer illiterate. Being "literate" does not mean being able to open a browser and push a mouse around. If a business owner is not capable beyond that they ought to be smart enough to hire a qualified IT service to help them manage and secure their systems. If they do have in-house IT then data backup and security ought to be one of their top functions.
Private users are a little bit of a different story. There's a huge chunk of them that, today, could lose every digital asset they own and have no way to ever recover it. This is still a business opportunity for a startup somewhere. Services like Carbonite and others abound:
At one point you have to point your finger at the data's owner and say "You, and only you, are responsible for your data".
I can guarantee you that if you read the TOS of any online data backup service there's a clause there about the potential for accidental loss. There is no way in hell that an attorney would advise anyone not to have that clause there. And, no matter how good of an engineer you might be, there is no way you would risk it all to offer some kind of an absolute no-loss guarantee.
The test is simple: Would you, personally, be willing to be sued into absolute poverty for issuing that guarantee? Probably not.
So, we can sit here in righteous indignation at me daring to suggest that users are ultimately responsible for their data and lie to each other, or we can accept that reality, developer or not.
Down-vote away, but I am right.
If we go back to the very original article that this thread is about, the programmer who joined the startup team seems to have not had any explicitly-created local backup of the data in question. He got lucky. He said "The client left all the files on my machines, so I didn’t lose any personal data - it wasn’t a catastrophic failure.". That's luck, that is not planning. So, he did OK; he talks about having to re-upload some 2GB of data to a new account. He was off and running after that. I can bet you that, if he's smart, he now has a private local backup of his data in case something like this happens again. And that's the right way to handle it.
Don't trust anyone with your data unless you can independently verify the ruggedness and reliability of their solution. Period. If you did not do that, engineer or "civilian", total data loss is on you, not the service.
Apparently you've never heard of incremental backup? Dropbox may not be a backup but there are many services that backup via the internet. Assuming you don't change backup systems every month, 100 days for the initial sync is perfectly acceptable.
> Dropbox may not be a backup but there are many services that backup via the internet. Assuming you don't change backup systems every month, 100 days for the initial sync is perfectly acceptable.
Oh, please. Again.
If you don't have local backup, for a third of a year you have no backup whatsoever. Even after that, depending on the nature of your work, even incremental backups could have you unprepared for failure for days.
Not saying at all that remote backup is a bad idea. Not at all.
Remote backup BY ITSELF, without local high-speed backup that you control is a very bad idea.
The best pattern is to have single or redundant backup under your control (yup, use that "incremental" thing I didn't know about) and remote backup. Don't expect any one backup destination to be 100% reliable, not even local backup. If your data is important you need multiple redundant backups, local and remote. Then you can sleep at night.
My local backup strategy is a collection of external --and these days inexpensive-- hard drives as well as a large rack-mount NAS RAID array. Each of about a dozen systems has its own local external backup drive right on the desk next to the computer. Some have dual local backup drives. We are talking in the order of $100 for a couple of terabytes today. Then, a number of systems also back up to NAS. Every so often we rotate drives for longer term storage at a fireproof external location. It'd take a lot more than Dropbox or any service having a glitch for me to lose any data.
I really don't understand folks who don't, at the very least, have one external USB backup drive on their system. On OSX you have Time Machine which is ridiculously easy to use. On Windows you can spend a few bucks and get Norton Ghost and you are good to go. All-up, probably not more than $200 per system and maybe half an hour to set it up.
Do that, plus my recommendation to host your Dropbox location on a dedicated partition in order to force a copy operation during drag-and-drop (both Windows and OSX), and you won't have to care about anything that happens at Dropbox or any other service.
It's about engineering a system to protect your data, not hoping for one.
As far as remote backup is concerned. I'd be interested in a system that might allow me to send them encrypted disk images on physical media for backup while providing some online access to the same.
Even with incremental backup you have to do a full backup every so often. In the case of our Windows systems running Norton Ghost, they are set up to do full backups the first day of the month and incremental backups every day after that. It's dead-easy, reliable and works great. Saved my hide a number of times.
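The schedule described above is trivial to express. A sketch (the function name is illustrative, not Norton Ghost's actual configuration):

```python
import datetime

def backup_kind(today: datetime.date) -> str:
    """Full image on the 1st of the month, incremental every other day."""
    return "full" if today.day == 1 else "incremental"

print(backup_kind(datetime.date(2012, 7, 1)))   # full
print(backup_kind(datetime.date(2012, 7, 15)))  # incremental
```

The incrementals stay small and fast because each one only captures what changed since the last run, while the monthly full image bounds how long a restore chain can get.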
A full backup of about 600GB happens in --I think-- about three or four hours. That's the problem with remote backup: the same full image would require a third of a year on a typical DSL connection available in the US today. Actually, it could take twice as long, two thirds of a year, because you would have to interrupt your backup in order to get your bandwidth back for use during business hours. So, if it takes you nearly a whole year to back up this much data the whole thing is just about useless as implemented. Your incremental backups are likely to take days and you can't even consider the idea of doing full images every thirty or sixty days. That's what's broken about the concept of remote backups without even looking at the issues with potential software bugs at the various providers that could lead to data loss.
A more usable system would be one that, as I said, would receive my full images in physical media to absorb into their storage arrays for both backup and remote access purposes. If you needed to recover a few files here and there you could easily do so over a decent DSL connection. Full recovery would require physical media being shipped to you at a greater cost. Every x number of days you'd send a new full image set and go incremental after that.
The game changer here will be if we ever get to 100Gb network connectivity to the home and office. That would change the landscape in amazing ways. You could talk to remote storage probably as fast as you talk to local storage. At that point in time, having multiple redundant and geographically separate remote backup locations might very well be the most sensible approach to an organization's backup strategy. Such a system could even talk to a locally installed "backup server" in order to make sure that if connectivity is compromised in some way you still have access to your organization's data during the blackout.
The topic of backup is conceptually very simple but becomes really complex when you consider the multiple potential points of failure and how to deal with them.
This is why I don't consider any issues at Dropbox to be serious. I obviously don't think of them as backup. And they can't convince me to think that way no matter what they do or say. This isn't to say that I think the service is bad. Not at all. It's because I've been around and I've seen too many failures (some of my own) that I take a very careful and guarded approach to my data. And that's healthy. I use Dropbox for team communications. I almost think of it as a really neat way to "FTP" stuff around. So, instead of setting up my own FTP server and having to manage my users and storage I can use Dropbox. No data is ever moved to Dropbox. All data is copied to Dropbox. That means that the data remains locally stored and, more importantly, locally backed up every night. So, through engineering, failures at Dropbox or anywhere between my DSL connection and their data centers are of no consequence whatsoever.
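A rough sketch of that copy-never-move workflow (the helper name is hypothetical, and it assumes Python 3.8+ for `dirs_exist_ok`): the working copy stays where the nightly local backups already cover it, and Dropbox only ever holds a duplicate.

```python
import shutil
from pathlib import Path

def publish_to_dropbox(project: Path, dropbox: Path) -> None:
    """Mirror a project folder into Dropbox by copying, never moving.

    The working tree under `project` is untouched, so local backups
    still cover it; the Dropbox folder only ever holds a duplicate
    that can be lost without consequence.
    """
    shutil.copytree(project, dropbox / project.name, dirs_exist_ok=True)
```

Run it whenever you want to "FTP" the current state of a project to the team; a glitch on the service side can only ever eat the copy.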
The thing is, you don't need independent full backups within a single online backup provider, and they would probably deduplicate your backups anyway. Incremental backups work just fine, and will always get your data backed up that night unless you are the kind of person that generates many gigabytes of content in a single day.
I use Dropbox for team communications. I almost think of it as a really neat way to "FTP" stuff around.
Yeah, I can understand that. I was considering replying to one of your other posts with the comparison, but I didn't want to start too many threads at once. The problem is that when you use Dropbox as a better FTP, you lose out on a lot of the benefits of syncing. No longer can you go to a different computer and pick up right where you were, if you didn't happen to copy in the files again since the last change. If you're sharing files with a coworker you no longer have any idea where the most recent version is. You don't have a full list of file versions. And your FTP-method of Dropbox doesn't really have anything to do with backups. You could move files into the Dropbox folder and then set it to be backed up every five minutes if you wanted to.
I just couldn't bring myself to using it that way. Our files are our work product. There is no way I could consider having the only copies of the files connected to a service that could cause total loss of data. It is, in my world at least, an absolute non-starter.
With regards to the idea of sharing files with a coworker and not knowing where the latest files are, well, that's what Git is for, isn't it?
As a matter of principle I have a bad reaction to the idea of calling a data loss event "Another Dropbox horror story". And this has nothing to do with Dropbox or any other service. Dropbox is not responsible for your data. You are. If the data is important enough that total loss would be catastrophic, what are you doing placing all the eggs in one basket? It isn't a Dropbox horror story, it's a story about someone who didn't care enough about their data to safeguard it and then the blame is shifted towards whoever last held the data. That just ain't cool. I own my fuck-ups. It isn't fun, but I stopped blaming others for my crap a long time ago. In the end it works out better that way.
Does Dropbox have issues to fix? Sure. What piece of software doesn't? Knowing that software is imperfect is more of a reason to not, again, place your valuable eggs in one basket.
I have a feeling that there are a lot of young and inexperienced people surfing HN who have never lost anything of significance. They come across someone like me who calls bullshit when he sees it and is willing to stand for what is a solid position and they don't understand it. The only tool they have to express themselves is to mindlessly down-vote cargo-cult style. That's fine, with that approach they'll learn soon enough.
I don't know of anyone --not ONE person-- with say, twenty or thirty years of successful work in computing who would trust the only copy of their projects to any one service --no matter what they claim or how good they might be. Hell, most of these people, just like me, would not trust the only copy of their projects to any single device, local or not. Shit happens. And some of us have seen it happen and have had to clean-up the mess on more than one occasion. Eventually you learn.
I do love Dropbox, but they are not going to come and redo two years worth of work if something goes wrong. And that can happen. And, if that were to happen, I would not blame them. I'd blame myself for being a moron and not having my own backups. Maybe I'll figure out a way to use their active sync technology while still satisfying the requirement for solid redundant backups. Until then, I have enough problems with other stuff to have to worry about having only one copy of our projects stored anywhere, local or not.
I own more than one business and we also do work for other businesses, so, generally speaking, the "Data" drive holds business data partitioned-off into appropriately-named directories.
I also have what I call "support" directories there. For example, I have a "SolidWorks Support" directory with models, macros, tools and other items that are there to support work with that software. The same is true of Photoshop, Altium Designer, Web Design and other segmentable tasks.
Then there's more mundane stuff, like "Digital Photography", "Personal", "Scans", "Temp", "Outlook Data", etc.
Projects started to kind of pollute the data drive in some form. Also, the directory hierarchy started to become a little crazy. For example:
D:\<company name>\Projects\<project name>\Design\Mechanical\<project work folder>\Suppliers
D:\<company name>\Clients\<client name>\...
At one point I thought that I should segregate active development from finished or shipping product. In other words, make a distinction between "project" and "product". That's why I introduced the "Development" drive.
There's another reason. When doing FEA thermal and/or flow simulation for various projects the Data drive started to become clogged-up with simulation data. As you navigate through many iterations of design ideas and simulations you can easily fill-up gigabytes. I didn't think that this belonged anywhere near the "Data" drive.
Finally, some tools (Xilinx ISE back then) did not take kindly to paths with spaces. So I always had to have a separate folder for Xilinx projects which has always bothered me.
This "Development" drive now allows me to create a simple folder with a short path.
I also have an "xampp-sites" folder on there for web projects. The projects can live in their own directories in the "Development" drive:
Once a "project" graduates to being a "product" it is zipped-up and copied or moved into a "Products" folder under the appropriate company directory in the "Data" drive. As an example, if we are talking about an iOS project it turns into a product once it is submitted to the app store and approved.
The jury is still out on whether or not this is a good idea or simply a reflection of being insanely anal about organizing data. I don't know. I am always open to new and interesting ideas on this front. I try to think of the "Development" drive as a dirty or work drive of sorts where I can mess with a lot of things without polluting the "Data" drive. I have this implemented in one machine and have not been compelled to do so in other machines yet. Still thinking about it.
With that said, I think Dropbox is a pretty amazing service and I credit them for getting Google and Microsoft to wake up and make some similar services which are solid and different in some respects (Google Drive and SkyDrive).
So, you can travel two paths: One where you believe that service X can absolutely-positively not lose your data for whatever reason. Or, another, where you understand that your data can be lost at any time and for any reason by remote service X or even your local $100 USB drive.
If you choose the first path, you are going to get stung sooner or later. And, it is my contention that blaming the service provider for your total data loss is nothing more than not wanting to admit the truth of the matter.
If you choose the second path, which could sound really paranoid but is actually very realistic, you take steps towards creating enough redundancy that a single point of failure isn't going to burn days, months or years of data. And, while no absolutes exist in this world, it doesn't take much in this day and age to have a system that will almost guarantee that your data is safe from most loss-inducing events.
I consider problems like data loss to be serious engineering problems. The difference is that I choose to look in the mirror first and blame myself first before pointing the finger at someone else.
This removes the main point of Dropbox, that when you edit your files they're automatically continually re-uploaded. If you're only making copies into Dropbox, you might as well just mount an SFTP server and save a bunch of money.
It's certainly neither 'dumb' and nor is mentioning a friend who did this down-vote worthy.
The down-vote wasn't for posting about his friend's experience. It was for blaming Dropbox for something that was 100% his friend's responsibility.
Let's bring it down to a "skin in the game experience".
- Save-up $250K
- Invest all of it in a startup
- Hire a couple of programmers to help you out
- None of you set up any local backups
- You choose to rely on web service X for backups
- You don't test anything 'cause everyone is doing it this way
- You still don't have local backups
- A year later something goes wrong and you lose all of your work
- You just lost all of your time, money and the startup tanks
- You go online and blame service X for your loss
Clearly service X had nothing to do with the series of decisions and actions that led to your loss. Your loss was due to "pilot error" and plain-old home-grown incompetence. Nothing more, nothing less. Blaming someone else might feel good, but the reality is that a less-than-professional treatment of the matter is what caused the loss. Service X was just along for the ride.
I quit blaming others a long, long time ago. In my experience you can do forensics in almost all of these situations and identify someone who was either dumb, lazy or incompetent who ended-up giving you the gift of irrecoverable data loss.
Like I said in my original post, I was EXTREMELY lucky to have learned this lesson while in college. I lost six months of work of a project for the Physics department to a drive failure. Horribly painful. Ugly. I am so thankful for having learned that lesson in that context. It would have sucked to have learned it outside of academia and while working on projects where data loss could have resulted in significant financial loss.
My sentiment stands: With local storage being so plentiful and inexpensive these days there is no excuse for not having multiple redundant backups of anything that is important. You can even have a policy of shipping physical redundant backups to another storage location to mitigate the possibility of your local backups being compromised by something like a building-wide fire.
In another post I treat the other fundamental issue of backups over connectivity such as DSL.
I reached out to Dropbox support but haven't gotten a response yet.