
Dropbox Bug Can Permanently Lose Your Files - joshuacc
http://konklone.com/post/dropbox-bug-can-permanently-lose-your-files
======
jpadvo
It's always important to remember the difference between a _syncing_ service
and a _backup_ service. A syncing service sometimes feels like a backup,
because you can use it to recover files if a local device is destroyed or
lost. HOWEVER, any service capable of syncing files is equally capable of
destroying them.

It's important to have an automated _one-way_ backup system that you can
manually restore from. Something like Tarsnap [1] looks like a really good
possibility (I haven't used it myself, but it seems solid)

[1] <http://www.tarsnap.com/gettingstarted.html>

~~~
fusiongyro
Tarsnap is amazing for Unix systems. Flat out, one of the coolest services
I've ever used, and one of the cheapest. I'm backing up my VPS with it, and I
can't recommend it enough.

I have a friend who strongly recommends CrashPlan, but I haven't tried it out
yet on my Mac. I'm curious to, though.

~~~
bitcartel
Doesn't Tarsnap store your data on their S3 account, rather than your own? If
so, how do you get your data back from Amazon if Tarsnap vanished tomorrow?

~~~
acabal
You can say that about any service--what if Crashplan's data center caught
fire tomorrow? Tarsnap might be run by a single guy (I think?) but saying S3
is "more" or "less" reliable than any other private company isn't a great
comparison. In any case a massive company like Amazon is the most likely to be
reliable in these cases, I imagine.

~~~
bitcartel
That's why I cobbled together a poor man's cloud RAID :-)
<http://news.ycombinator.com/item?id=4689238>

My earlier point was that I think data is stored on Colin Percival's S3
account (he is the creator of Tarsnap) and therefore you might lose access to
the data (if he couldn't pay the bills or got hit by a bus) even though S3
itself is fine.

------
matt_holden
(Hi all, I'm the PM for the Dropbox desktop client team.)

I just wanted to let you all know that we take any claims like this really
seriously. There aren't any known bugs on the Dropbox side that would cause
this, and unfortunately there are potential causes such as hardware errors,
filesystem corruption, and other OS issues (including those like
<http://www.phoronix.com/scan.php?page=news_item&px=MTIxNDQ>
which another poster pointed out) that can corrupt data or create zero byte
files.

Nevertheless we will continue to look into this just to be sure, and we also
work hard to find ways for Dropbox to shield users even when the OS, disk or
other components fail (our undelete/file revisions and Packrat are among
these).

~~~
CamperBob2
Text of originalgeek's hellbanned(?) reply to this subthread:

    
    
      I ran the find command suggested by the OP, 
      and it came up with a long list of files -- all 
      names that I had intentionally and manually deleted. 
      It seems when one of my "other" machines booted up, 
      it put 0 byte files back in their place. A review of 
      the file's history, by clicking Dropbox -> Browse on 
      Dropbox Website shows the original file, the day I deleted 
      it, then a few minutes later, a 0 byte file added back.
    

Edit: posted because it seemed like the sort of information I'd want if I
worked at Dropbox and were looking for clues to the nature of the bug. If this
isn't kosher let me know (rather than simply downvoting) and I'll delete the
reposted comment, if possible.

~~~
peterhajas
Why was this reply hell banned?

~~~
CamperBob2
The user who posted it presumably ran afoul of a rule (possibly an unwritten
one) in an earlier comment. You'd need to dig through his/her comment history
to try to figure out what happened.

~~~
Dylan16807
Sometimes. There is also a flawed system that can auto-hellban people unlucky
enough to get a bunch of downvotes on an early comment, before they have a
karma buffer.

------
apike
I have all my data, media, and documents in Dropbox (80GB) and while I have
11255 zero-byte files, none of them are likely Dropbox's fault. Most of them
are empty logfiles, .svn and .git noise in old projects, and the like.

------
goostavos
How many of you is this affecting? I'd be curious to know how many keep the
_only_ copy of their files on Dropbox, or any cloud service for that matter.
I've never trusted any service enough to be solely responsible for my
important files.

I use Dropbox primarily as a tool to sync content between my
desktop/laptop/phone, but any significant change I make to those files gets
saved locally 100% of the time.

I am not a very trusting man.

~~~
blaines
Somewhere between 2000-2892 Files...

I'm turning Dropbox off now and disabling it on startup.

I'm thankful for this post, now to figure out what's going on.

~~~
blaines
My photos folder was zeroed... =/

------
uptown
Dropbox for syncing.

Crashplan for backing up.

That combo hasn't failed me yet.

~~~
bitcartel
I quite like Crashplan, but they have lost data for customers before.

<http://jeffreydonenfeld.com/blog/2011/12/crashplan-online-backup-lost-my-entire-backup-archive/>

~~~
ejdyksen
PSA: with CrashPlan, you can simultaneously back up to your own server(s) or
external hard drives right alongside their service. You can also back up to a
friend's computer (this was actually their original business model--P2P
backup).

~~~
usefulcat
Clarification: you can back up to your own servers iff said servers are also
running Crashplan. What you can't do, unfortunately, is back up to an ordinary
shared folder.

~~~
ejdyksen
Further clarification: this is correct, but (if someone is reading this and
thinking about doing it), you can install the free version of CrashPlan and it
works just fine as a backup server.

~~~
Dylan16807
It's kind of a ram hog, though.

------
c16
Yep, I'm affected by this bug too. Thankfully nothing important, but it makes me
wonder if I should leave my TrueCrypt folder on Dropbox.

~~~
gte910h
You should both distribute the files you want to keep across several devices
and use a service whose job it is to back things up.

Dropbox is only the former.

~~~
c16
I have a backup on a USB drive, but updating that daily is a hassle, which is
why I used Dropbox. Their added 2-step auth was also a big bonus, but if I lose
that file I'll have a hard time getting my passwords back.

As jpadvo said, I might look into making some one way backups to S3 or
something.

~~~
gte910h
Buy a Time Capsule and back it up locally or get a subscription to something
like CrashPlan Pro.

------
mech4bg
It surprises me that someone would rely entirely on Dropbox as their backup
tool AND primary file location, especially when it happily manipulates the files
locally.

I back up to an external HDD and to the cloud and still have the originals (as
well as having extra copies again of my music and photos synced across my
computers) - the more redundancy you have the better.

It sucks that so many people need bad stuff to happen to them to do something
about it - I'm so thankful that storage became so cheap before anything really
catastrophic happened to me. I've lost data in the past but it was back when
so many things were offline, nowadays it's CRITICAL to have a good backup
plan.

~~~
konklone
I didn't go into detail on my backup strategy in the post, but it's (slightly)
better than that. I only use Dropbox for media - music, images, photos. Some
of it is irreplaceable, but none of it is life-crucial. That stuff is backed
up separately, and copied to private space on my web server.

~~~
mech4bg
I think I consider photos to be some of the most important stuff to be backed
up... obviously you can live without them, but as you said, some photos are
irreplaceable, and can hold so much meaning. At least with Facebook / Picasa
Web Albums many of the most important photos have been backed up (potentially
at a lower quality).

To be fair, I never would have considered such a bug when using Dropbox - I
would probably have considered it safe given that you have a local copy of
your data, especially since, as others pointed out, they present themselves as a
backup service.

I almost got hit by this locally actually, I synchronize folders against
multiple PCs, and the exact same thing happened, and a number of files had
their bytes zeroed out and then this was propagated through the network.
Thankfully I spotted it before all copies were overwritten and fixed it, but
that's where you also want something like Time Machine.

Ugh, so many ways to lose data, even when you're doing the 'right' thing!

~~~
konklone
You're right, photos can be irreplaceable and I'm probably going to change how
I back them up now. My choice of what is and isn't in Dropbox isn't actually
based on importance, but on what I both want on my laptop and would want to
restore if the laptop were stolen. Kind of weird criteria, but it (at one
time) made sense to me.

------
M4v3R
Isn't this related to the ext4 bug that was found recently? [1]

[1] <http://www.phoronix.com/scan.php?page=news_item&px=MTIxNDQ>

------
hnriot
Doesn't this seem like something that can be easily solved? Dropbox or any
rsync+rcs scheme will never destroy a file unless either one endpoint of the
sync has truncated the file or there is a file system corruption. In the case
of the former, the rcs portion will take care of things: restore the
pre-truncated file and have that synced around. In the case of a file system
corruption, Dropbox should be doing file integrity checks. For example, some
simple things that spring to mind: most file types have built-in signatures
(header blocks) that validate the file's extension type. If the file fails to
"look like" the claimed extension type, then don't sync it; flag it for user
attention.
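
A rough Python sketch of that kind of signature check, for illustration only.
The signature table covers just a handful of well-known formats, the function
names are made up for this example, and nothing here reflects how Dropbox
actually validates files:

```python
from pathlib import Path

# Well-known leading bytes ("magic numbers") for a few common formats.
# This table is deliberately tiny; a real check would need many more entries.
SIGNATURES = {
    ".png": [b"\x89PNG\r\n\x1a\n"],
    ".jpg": [b"\xff\xd8\xff"],
    ".jpeg": [b"\xff\xd8\xff"],
    ".gif": [b"GIF87a", b"GIF89a"],
    ".pdf": [b"%PDF-"],
    ".zip": [b"PK\x03\x04", b"PK\x05\x06"],
}


def looks_like_claimed_type(path):
    """Return True or False when the extension is known, None when it is not."""
    magics = SIGNATURES.get(Path(path).suffix.lower())
    if magics is None:
        return None  # no signature on record, so no opinion
    with open(path, "rb") as fh:
        head = fh.read(max(len(m) for m in magics))
    return any(head.startswith(m) for m in magics)


def files_to_flag(root):
    """Yield files under root whose content does not match their extension."""
    for p in sorted(Path(root).rglob("*")):
        if p.is_file() and looks_like_claimed_type(p) is False:
            yield p
```

A zero-byte `.jpg` fails this check, while a zero-byte `.txt` is left alone
because plain text has no signature to compare against.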

I do Dropbox manually: I rsync a list of folders to a Linode instance running
Mercurial. It's simpler, scriptable, more flexible and as fail-safe as
Dropbox. If something does get corrupted, it's really simple to go back in the
version history. I can't remember the last time I had a file system corruption
with ext3. I suppose they still happen, but not to me in solid use for years.
Obviously my Mercurial repository is also backed up regularly.

I don't agree that a backup system has to be restore tested periodically,
instead I believe that the restore process has to be an integral part of the
workflow. In my scenario, I rsync to hg, commit and push, then from other
machines in my workflow (or more likely vm instances) I pull+rsync back. This
way the backup and restore cycles are just part of the workflow and everything
is version controlled at the same time.

------
duncans
If you're on Windows, the following PowerShell may help diagnose whether you've
been affected:

    
    
        gci $env:USERPROFILE\Dropbox -r | where { $_.Length -eq 0 }

------
tnorthcutt
Wow.

15 Files affected here. I haven't checked to see if any of them are
unrecoverable (none of them are vital), but this does seem like a very bad
bug.

~~~
andreasvc
Are you sure those files were actually truncated? Because the command will
simply show files of 0 bytes, which is not necessarily an indication of a bug,
only if the file was supposed to contain something.

------
bitcartel
Although Dropbox isn't a backup service, it offers file versioning, so you
would think you could recover from such a situation. Google Drive also offers
file versioning, but the devil is in the detail - they do delete older
versions:

<https://support.google.com/drive/bin/answer.py?hl=en&answer=2375120>

For RAID like protection with Dropbox and other providers, you can roll your
own BRIC (Redundant Bunch of Independent Clouds). I did this using Tahoe-LAFS
to stripe data across storage providers. Requires a bit of set up, has some
caveats, but does work. If you use it with duplicity you have versioning on top
of a distributed, encrypted, redundant store.

<http://news.ycombinator.com/item?id=4689238>

~~~
konklone
Part of what makes this such a bad bug is that Dropbox's file versioning is
suffering from it. I have file versioning enabled, but not all of my files
were able to use it, because their pre-0-byte history had been wiped.

------
acabal
Never trust anyone but yourself with valuable files like family photos. I
don't care if they promise 110% uptime and reliability. The files you really
care about, in the very practical end, are your own responsibility regardless
of how much money you pay to other people. You can sue them till you're blue
but it won't get your files back.

To that end I keep my important files backed up not just in Dropbox, but
also to Crashplan, and to a spinning-rust hard drive especially kept for
backups that I protect in my home. That's three points of failure I can
recover from if something goes wrong; and if all three fail at once, then I
probably have worse things to worry about, like the zombie apocalypse.

~~~
andrewljohnson
The last person I trust with valuable files like family photos is myself.

I am a bad sysadmin, bad at back-ups, bad at security, bad at redundancy. And
I would guess that describes 99.99% of people who care about family photos.

~~~
acabal
Well, that's why I trust two other companies, plus myself.

I don't know, doing basic backups isn't super-hard for someone who's reading
HN. My setup is really just a regular Linux box with a 1TB HDD sitting in a
closet of mine, with dynamic DNS pointing to my own domain (not even a strict
necessity), and I rsync to it whenever I have new data like photos. That box in
turn syncs to Crashplan and Dropbox, which is automatic.

Yeah, rsync and the concept of a backup PC are beyond mere mortals, but for
those of us here--if Dropbox (or any service) is your only backup, I think you
can only blame yourself.

------
shangaslammi
I've also had files zeroed out on a few occasions. I run Dropbox over a wide
variety of machines and while this is completely anecdotal, the zero update
has always come from one of my Ubuntu machines, which leads me to believe that
this is specific to the Linux version of Dropbox.

Also, I had several occurrences throughout 2011 but so far haven't lost
anything during 2012, so either something has been fixed or I've just been
lucky. :)

So, if you've noticed zero length files lately, can you check the timestamp on
the last update? Is it recent or over a year ago? You can also go to that date
in your event log in the Dropbox web UI and see what was happening around that
time.
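
For anyone who wants those timestamps in one list, here is a minimal Python
sketch. The function name and the skip list are just illustrative choices; the
idea of ignoring `.git` and `.svn` comes from the "noise" mentioned elsewhere
in this thread.

```python
import os
import stat
from datetime import datetime

# Directories that routinely hold legitimate empty files (an assumption for
# this sketch; adjust to taste).
SKIP_DIRS = {".git", ".svn"}


def zero_byte_files(root):
    """Return (path, last-modified datetime) for empty regular files, oldest first."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # vanished or unreadable; nothing to report
            if stat.S_ISREG(st.st_mode) and st.st_size == 0:
                hits.append((path, datetime.fromtimestamp(st.st_mtime)))
    return sorted(hits, key=lambda hit: hit[1])
```

Calling `zero_byte_files(os.path.expanduser("~/Dropbox"))` and printing each
pair should show whether the empty files cluster around one date. Note that
the local modification time is only a hint; the event log in the web UI is
the better record of when the zero-byte version arrived.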

------
ChuckMcM
Thinking about how dropbox might be implemented I wonder if the following
series of events is possible:

1) Process which syncs files to the server fails to get access to a local
file, sees it as length 0

2) Process proceeds to tell the server file is length 0 and the server updates
the file to be 0 length.

3) Access to the local file is restored and client notices that the server
file is 'newer' than the local client version and it was length zero so it
truncates the client copy to length 0.

The thing is, I can imagine a number of scenarios where the local file might
seem to be zero length (oplocks on NTFS volumes being one)
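
To make the three steps concrete, here is a toy Python model. It is purely
hypothetical: `ToyServer`, `naive_sync` and `guarded_sync` are invented for
this sketch and say nothing about how the real client or server behave.

```python
class ToyServer:
    """Keeps every uploaded version of each file, newest last."""

    def __init__(self):
        self.versions = {}

    def upload(self, name, data):
        self.versions.setdefault(name, []).append(data)

    def latest(self, name):
        return self.versions[name][-1]


def naive_sync(server, name, read_local):
    """Steps 1-3: upload whatever the read returned, then mirror the server copy."""
    server.upload(name, read_local())
    return server.latest(name)  # this becomes the new local content


def guarded_sync(server, name, read_local):
    """Same flow, but a non-empty file that suddenly reads as empty is not uploaded."""
    data = read_local()
    history = server.versions.get(name, [])
    if data == b"" and history and history[-1] != b"":
        # Suspicious truncation: keep the last good copy. A real client would
        # also need to flag this for the user rather than decide silently.
        return history[-1]
    server.upload(name, data)
    return server.latest(name)
```

With a `read_local` that returns `b""` because the read failed, `naive_sync`
ends with an empty file on both sides, while `guarded_sync` leaves the server
history untouched. The obvious cost is that a guard like this also blocks
files the user emptied on purpose, so it could only ever be a prompt, not a
silent rule.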

------
intellegacy
Happened to my mother and her friend a week ago, only they lost all their
files. The files weren't permanently lost, thankfully, only deleted.

The trust my mother had in dropbox is now gone, and probably will remain so
for the next couple years.

~~~
keithpeter
As an end user myself on Linux, I was worried to read this. I keep teaching
materials in the Dropbox folder so only a small file size, but they are
important. As mentioned below at <http://news.ycombinator.com/item?id=4704176>
I have developed the bad habit of _editing_ files directly in Dropbox. I may
need to revisit this and use Dropbox as a sync only, edit elsewhere and copy
over.

The command

    
    
       find /home/keith/Dropbox -size 0
    

shows only cache files for deleted files, and some backup files that I saved
while empty (I know those _should_ be zero bytes).

A personal workaround is a simple bash script that copies the Dropbox directory
to another one with the date as the directory name. I'm running this once a day
or so. Then my normal old-school backup onto an external drive will catch each
day's Dropbox.
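
The same dated-copy idea, sketched in Python rather than bash. This is not the
script described above, just an illustration; the `snapshot` function and its
parameters are made up for the example.

```python
import shutil
from datetime import date
from pathlib import Path


def snapshot(source, backup_root, today=None):
    """Copy source into backup_root/<YYYY-MM-DD>/ and return the new directory."""
    stamp = (today or date.today()).isoformat()
    target = Path(backup_root) / stamp
    if target.exists():
        # Never overwrite an earlier snapshot: it may hold the last good copy.
        raise FileExistsError(f"snapshot already exists: {target}")
    shutil.copytree(source, target)
    return target
```

Something like `snapshot(Path.home() / "Dropbox", Path.home() / "dropbox-snapshots")`
run once a day gives one full copy per day. It trades disk space for
simplicity, so old snapshots need pruning by hand.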

Surprising how convenient I found automatic file sync, and how quickly I came
to trust the dropbox daemon running in the background on 3 computers!

------
wikwocket
If you're on Windows and want to quickly check if you have 0-byte files in
your Dropbox, open the Dropbox folder in Explorer and type "size:empty" in the
search box, and hit enter. It will show any 0-byte files.

If you'd like a batch file to check this, that you can run periodically
(perhaps programmatically), try this:

    
    
      @echo off
      for /r C:\Users\username\Dropbox %%F in (*.*) do (if %%~zF LSS 1 echo %%F)
      pause
    

If you'd like to echo that list to a file, change the middle line to:

    
    
      (for /r C:\Users\username\Dropbox %%F in (*.*) do (if %%~zF LSS 1 echo %%F)) > C:\Temp\emptyfiles.txt

------
piyush_soni
Dropbox has lost my files as well. I had kept some very important private
files there which I just wanted to keep there without the need of updating
them/syncing them. After uploading, I removed the local copy of that private
top level folder, as I didn't want to keep a local copy in my computer. With
the next sync, this STUPID SH*T removed the folder from their server as well,
without any warning of any sort! Any system which deletes the files without
asking, should just not be used or depended on.

~~~
djrogers
It sounds like you don't quite understand how Dropbox is supposed to work - it
replicates whatever you have locally to the cloud and back down to your other
machines. Obviously with a design like that, _deleting files_ from your local
dropbox would delete them from other machines.

This isn't a case of dropbox losing your files, it's a case of you not
understanding how the tool was designed and how it works.

~~~
piyush_soni
Many sync programs offer multiple types of sync (e.g. MS SyncToy), and many
are extremely cautious before deleting multiple files. Dropbox didn't do that.
Yes, I failed to fully understand how it worked, as I deleted one of the
root level folders and expected it would not sync a folder that is not there on
the local machine, but one of the duties of software is to prevent or try to
reduce common human errors.

~~~
Dylan16807
Dropbox is way simpler than that. It syncs precisely one folder.

There is really no way for it to not sync file deletions without making a
massive clutter.

------
mike-cardwell
Just added the following to the crontab on my laptop so when it boots up, it
emails me about any zero byte files.

    
    
      # Look for zero byte files in Dropbox
      MAILTO=my.email.address@example.com
      @reboot /usr/bin/find ~/Dropbox/ -type f -size 0
    

I also have full system backups going back three months, with two hourly
incremental backups for the last two. So if I get an email about any zero byte
files, I should be fine restoring them from my own backups.

------
gcv
No problems on my system, but I make a point of backing up everything
sensitive to an external hard drive occasionally, and using Arq to do hourly
backups to S3.

------
larme
For all the people saying you need a local backup: in this case you not only
need to keep a local backup, it also has to be a versioned one.

~~~
bruceboughton
Backup isn't backup unless it's versioned. Otherwise, how do you restore that
file you accidentally deleted a few days ago?

------
iceron
FYI: I'm pretty sure Windows users can navigate to their Dropbox folder and
type 0 bytes in the search bar to see if they were affected.

~~~
nosecreek
I had to type `size: 0 bytes`

------
nosecreek
Can someone clarify the pricing for Dropbox's Packrat? Is it free for Pro
users, with free users having the option to pay $3.99/month to get it, or do you
have to be a Pro user and then pay $3.99/month to get it? I think it's the
latter, but I'm finding the Dropbox help page for it really unclear (it sounds
like it is free for all Pro users).

------
whichdan
How viable would it be to keep all of your data in a git repository? Let's say
I'm backing up 25gb of music, could I have Dropbox sync everything but the
.git folder, and just do a git revert if shit goes down? Will it end up using
twice as much space?

~~~
rmc
Look at git-annex, it works on top of git and is good at handling large files,
and will not require twice the space.

If you use plain git, you'll probably need twice the space. Files stored in
git are compressed, but music is already compressed, so there will be little
savings. You'll also need a lot of memory if you want to copy files via git.

~~~
whichdan
Thanks for the suggestion - it also led me to some other cool projects like
git-media.

If git doesn't keep its own copy of the data, a Dropbox "failure" would still
be unrecoverable, right?

~~~
rmc
_If git doesn't keep its own copy of the data_

Plain regular git _does_ keep a copy of its data in your .git folder, and
every checkout/clone/copy of that repository stores the data, and it stores
all old copies of all files. That's how git works. It also makes it a bit
unwieldy for large files like that.

What's cool (about git) is that the hash revisions (i.e. what git uses instead
of version numbers) are basically a checksum of every file and every old
version of every file. So if an old version of a file changed, the checksum
would change and you'd be on a different branch!
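
As a small illustration of the checksum part, this Python sketch reproduces
the ID git gives a single file's content (a "blob"), assuming the default
SHA-1 object format. The function name is made up; the `blob <size>\0` header
is how git frames file content before hashing.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 that git assigns to a file's content, assuming SHA-1 object format."""
    # git hashes a small header ("blob", the byte length, a NUL) plus the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()
```

The result should match `git hash-object <file>` for the same bytes. Because a
commit ID is derived in turn from the IDs of everything it contains, a file
silently truncated to zero bytes can't keep its old ID.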

~~~
whichdan
That makes a lot of sense, thanks!

------
mikegirouard
_> 2 other files (precious family photos) were also affected, but it happened
recently enough to be recovered manually by Dropbox engineers._

It's awful that it had to come to that, but it's reassuring that they will be
willing to work with you on that level.

------
eli
Huh. Is there an easy way to confirm files affected versus files that were
intentionally set to zero bytes?

------
Splines
My content is fine, I have about 5.5 GB of content, and usually push content
using the Windows client.

------
bloaf
Remember, cloud storage doesn't count as a backup if it's the only place you've
stored your files.

------
alttab
Well, ... it _is_ DROPbox...?

------
joejohnson
This guy didn't back up his computer before upgrading the OS? That was his
first mistake.

~~~
konklone
I backed up everything else. :) I usually back up my Dropbox just in case,
too, but I just didn't this time. I won't make that mistake again.

~~~
kwijibob
I do this all the time. It is the strength of dropbox that I can lean on it to
do a fresh install of the OS and simply get my files back via DB.

------
ssebro
I've had a similar thing happen to me, but I was able to use a Python script to
pull the (hidden, old) files from a cache on my machine.
<http://www.dropboxwiki.com/TipsAndTricks/RestoreFilesAndDirectories>

------
daniel-cussen
Everything can.

------
wissler
At least one of my files was affected and not recoverable (hopefully the
zero-size test is reliable).

~~~
zalambar
0 byte files are not necessarily a sign of a bug. There are many cases where
you, or the tools you use, may intentionally create 0 byte files which would
appear in DropBox. There is a problem only if you see empty files which you
expected to contain data.

~~~
wissler
I mean, I know there are idiots on the Internet, but at some point your
assumption that someone is an idiot is idiotic. I know where this file came
from and I haven't touched it. Dropbox hammered it.

~~~
zalambar
Take it easy, I wasn't trying to insult you or suggest that I believe you don't
know what you are doing.

I read "hopefully the zero-size test is reliable" as "is a reliable way to
detect if there is a problem with my files" and that is what I was trying to
comment on. Apparently I misunderstood.

