
Dropbox confirms that a bug within Selective Sync may have caused data loss - ghuntley
https://gist.githubusercontent.com/ghuntley/42803b4cabb181098063/raw/2e06230c31018cefbc706dca5e7b12ef9692d87e/gistfile1.txt
======
ghuntley
Additional info from Dropbox support:

    
    
        We received several reports from users who used a Dropbox feature called Selective Sync and couldn’t locate certain files they’d saved in Dropbox. 
        When we took a closer look, we discovered that older versions of the Dropbox client had introduced an issue affecting a small number of users whose Dropbox application shut down or restarted while users were applying Selective Sync settings.
    
        In light of all of this, we've taken the following steps to ensure the Selective Sync bug won’t affect anyone else going forward:
    
        1) we've patched our desktop client so this issue doesn't exist in Dropbox anymore;
        2) we've made sure all our users are running an updated version of the Dropbox client; and
        3) we've retired all affected versions of the Dropbox client so no one can use them.
    
        We've also put additional testing in place to prevent this from happening in the future.
    
        We’re very sorry about this issue and the trouble it might have caused. We’ll keep doing our best to ensure our users' data is always safe and available to them.

~~~
Ma8ee
Just so you folks don't have to scroll sideways:

We received several reports from users who used a Dropbox feature called
Selective Sync and couldn’t locate certain files they’d saved in Dropbox.

When we took a closer look, we discovered that older versions of the Dropbox
client had introduced an issue affecting a small number of users whose Dropbox
application shut down or restarted while users were applying Selective Sync
settings.

In light of all of this, we've taken the following steps to ensure the
Selective Sync bug won’t affect anyone else going forward:

1) we've patched our desktop client so this issue doesn't exist in Dropbox
anymore;

2) we've made sure all our users are running an updated version of the Dropbox
client; and

3) we've retired all affected versions of the Dropbox client so no one can use
them.

We've also put additional testing in place to prevent this from happening in
the future.

We’re very sorry about this issue and the trouble it might have caused. We’ll
keep doing our best to ensure our users' data is always safe and available to
them.

~~~
andy_ppp
Is there a way we can contribute some CSS fixes to the HN code base? These
issues could be quick and permanent fixes. Also 12px Verdana? Mobile?

~~~
colinbartlett
They've been reluctant to change the markup because of the many scrapers.
Hence, the recently released API in preparation. An updated UI is incoming.

~~~
andy_ppp
The majority get a poorer experience because a few people are scraping? I'm
guessing 99% of traffic is from people hitting the site in browsers.

~~~
noblethrasher
HN has historically subscribed to a form of deontological ethics[1] rather
than utilitarianism.

[1]
[http://www.bbc.co.uk/ethics/introduction/duty_1.shtml](http://www.bbc.co.uk/ethics/introduction/duty_1.shtml)

(I happen to agree with HN’s ethical position, but I invite you to look at
the section “Bad points of duty-based ethics” in the link).

~~~
andy_ppp
You are assuming the rule "Never change the source code unless you have to" is
a deontological imperative.

It's not really, is it? It's just something you believe and are justifying with
obscure ethical arguments. Thanks for the reading :-D

------
NDizzle
I was affected by this, but I realized it at the time.

I have an older laptop that I turned on. It was a work laptop a few years ago,
linked to my dropbox account, etc. Since then I had added a bunch of things
like a bunch of git repos to a folder included in dropbox.

I turned on that laptop and Dropbox started using 100% cpu after a few
minutes. Then the fan kicked on and it was annoyingly loud so I looked at
dropbox and saw it was chugging along in the repos directory. I went ahead and
clicked on selective sync, unchecked repos, and left it alone for about 5
minutes.

It was still 100% cpu, so I killed the dropbox task and restarted it.

Minutes later, on another machine, I went to fetch from one of the repos and
it had a gnarly error. So I went about investigating.

I found my way to the dropbox events tab (on the website - the desktop client
doesn't have this feature) and saw an event where dropbox decided to delete
7,800 files.

I submitted a support request, but before they responded I had figured out it
was (mostly) in the repos directory, which I fixed by simply deleting the
repos and pulling from one of my servers.

Anyways. There's my real world run in with this bug.

------
phren0logy
This is exactly why sync is not a commodity. Dropbox is the very best at what
they do, and even they have bugs. So when someone offers to sync your files
for less, ask why.

~~~
zvrba
> This is exactly why sync is not a commodity. Dropbox is the very best at
> what they do,

The very best? I use OneDrive across all of my Windows machines and I don't
even notice it exists; never had any problems. I just access all my files
everywhere. If you buy a windows phone you even get a decent amount of space
for free (15GB). (Though I subscribe to Office 365 so I have virtually
unlimited space.)

~~~
danieldk
It is a cliche, but the plural of anecdote is not data. I never had any
problems with Dropbox, but had Office on OneDrive corrupt files. OneNote has
also rendered some notes unreadable.

The bottom line is that errors happen. You should prepare for that and make
backups.

Also, Dropbox are still among the very best when it comes to syncing. Many
useful synchronization features are implemented by Dropbox, but not the
competition. E.g., features that most competitors do not have:

- Modifying a large file on Dropbox will only resync modified chunks.

- Dropbox avoids re-uploads, both when uploading identical files and moving
files around:

[http://macography.net/2013/05/speed-test-dropbox-google-driv...](http://macography.net/2013/05/speed-test-dropbox-google-drive-box-skydrive-amazon-cloud-drive/)

- Dropbox does LAN sync. If a machine has to download a large file and
another machine on the network has the same file, chunks are provided peer to
peer. This makes using large files on multiple machines or in a team much
faster.

- Dropbox does streaming sync. A machine can already download chunks when
another machine is still uploading:

[https://blog.dropbox.com/2014/07/introducing-streaming-sync-...](https://blog.dropbox.com/2014/07/introducing-streaming-sync-supercharged-sync-for-large-files/)

Sure, OneDrive and Google Drive do have many useful functions that Dropbox
does not have, such as including complete office suites. But for the original
task, file syncing, Dropbox is still pretty much unbeaten.
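
To make the first two points concrete, here is a minimal Python sketch of
hash-based chunk deduplication. It is only an illustration of the general
technique, not Dropbox's actual protocol: the chunk size, the use of SHA-256,
and the names `chunk_hashes`, `chunks_to_upload` and `server_has` are all
assumptions made for the example.

```python
import hashlib

# Assumed chunk size, chosen for illustration only.
CHUNK_SIZE = 4 * 1024 * 1024


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return the SHA-256 of each."""
    return [
        hashlib.sha256(data[start:start + chunk_size]).hexdigest()
        for start in range(0, len(data), chunk_size)
    ]


def chunks_to_upload(
    data: bytes, server_has: set[str], chunk_size: int = CHUNK_SIZE
) -> list[int]:
    """Return the indexes of chunks whose hash the server does not store yet.

    `server_has` is a hypothetical set of chunk hashes already uploaded.
    """
    return [
        index
        for index, digest in enumerate(chunk_hashes(data, chunk_size))
        if digest not in server_has
    ]
```

With a store keyed by chunk hash, editing one chunk of a large file means only
that chunk is sent again, and a file that is moved or uploaded a second time
needs no upload at all, because every one of its chunks is already known.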

~~~
calinet6
I think you're attributing too much of Dropbox's success to simple technical
reliability. It really isn't that difficult a problem, and many services and
projects do it right. I have an rsync script that has been syncing my files
reliably to an offsite location for 6 years.

It's certain that Dropbox has a high quality syncing service, but there are
other factors. Think, for example, how this case was handled: a fault in their
core product, a breach of user trust in their service, and they understood
that it needed more than a technical solution. None of this was part of their
core sync reliability: it was part of a more broad quality, which is closer to
their true reason for success.

~~~
danieldk
_I think you're attributing too much of Dropbox's success to simple technical
reliability._

I did not say anything about their reasons for success. Only what the
technical advantages are compared to some of the other file sync services.

_It really isn't that difficult a problem,_

Difficult enough that some of its useful features are not matched by other
services yet.

 _I have an rsync script that has been syncing my files reliably to an offsite
location for 6 years._

That's great. But that is one-way sync and not something my parents could use.
Dropbox is successful because they made sync technology that is relatively
flawless to the average user. Also, there is a network effect.

In the longer term, it will be interesting to see if they survive, since
Microsoft and Google have been undercutting prices heavily, and as far as I
know there is no online Office suite on the horizon (only Microsoft Office
integration for business users).

------
darrenkopp
I aggressively use selective sync, and have for as long as I can remember,
yet I haven't gotten an email like this, so it may only affect specific users.

~~~
bradleyland
It appears the circumstance is more specific than simply using selective sync.

> This problem occurred when the Dropbox desktop application shut down or
> restarted while users were applying Selective Sync settings.

So, you must be in the midst of applying selective sync settings while the app
shuts down or restarts. Although I'm not sure what they mean when they say,
"while users were applying selective sync settings." I'm not sure if this
means:

A) Changes made in the selection dialog box, but not committed (by clicking
OK).

or

B) Changes committed, but still syncing.

The former is an edge case, the latter, not so much.

~~~
darrenkopp
Interesting. I know a few times I've had to kill the dropbox process while
changing my selective sync settings before.

------
general_failure
Dropbox should have understood that people are using it as a backup service. I
mean, Carousel and other use cases sort of encourage and imply this. With that
in mind, it's baffling they didn't have any proper backups for user data.

~~~
danieldk
It is a shame that they don't offer the unlimited packrat option anymore. It's
still not backup, but at the very least people would be able to recover files
in such cases.

Also, if I understand correctly, Google Drive has a better policy here:
removed files are just placed in the trash until you remove them from the
trash. Of course, trash takes space up as well, but it protects better against
such cases.

I guess Dropbox is trying to maximize its profits with its 'remove after 30
days' policy.

~~~
grey_golem
I have been using the "packrat" feature for more than a year, and Dropbox sent
me a similar notification today to tell me they lost several thousand files,
of which 816 could not be restored. They were "lost" around 8 months ago, so
"packrat" didn't save me at all.

As it turns out, I have other backups of most of the files, and the rest of
them weren't important. So I was lucky. Still, my confidence in the product is
unlikely to recover.

I want to note that I had been aware of the "dropbox is not backup" chorus,
but that argument usually is just "sync is not backup", which is sort of
obvious. The packrat feature pretty much addressed this issue, so dropbox with
packrat WAS a backup solution. So the lesson here is never to rely on any ONE
backup provider.

------
Lazare
A good reminder that Dropbox is not a backup client, and should not be relied
on for backups, any more than RAID should be.

~~~
Dylan16807
It's about as good as any online backup system. It's much much better than
raid.

If you want to be picky you shouldn't _rely_ on backups unless you have
multiple independent backup systems, at least one offsite, at least one
offline.

~~~
m_mueller
Sorry, but this is just wrong. Any system that offers multi user sync is
inherently more complex than it needs to be as a backup solution. A backup
should generally be

- convenient enough that you do it without thinking about it.

- technically as simple as possible, so it's easy to understand and review.

- secure.

Dropbox fulfills the first point, but not the second, and the third is
debatable. Spideroak as a counterexample is just as convenient, has a pure
incremental backup mode and is client-side encrypted, the gold standard of
security.

~~~
Dylan16807
You really don't need any of those to be a solid backup system. What matters
is that you make backups, the backups last long enough, and there's testing of
backups.

Also from what I've seen spideroak is significantly more complex than dropbox.

------
sanyo
We provide a self hosted sync offering for businesses. It is currently used by
close to 1000 businesses. It took us almost 18 months from our launch to get
the sync right. There are simply too many edge cases and the development team
needs to closely work with the customers to identify and fix them. Even then our
complexity is much less than dropbox. Our largest customer has 10,000
users.

Short story: if you plan to develop a sync product from scratch, be prepared
to spend at least 2 years or hire core developers from Dropbox sync team. Even
now dropbox has issues with handling large numbers of small files. Try to stuff
200,000 to 300,000 files and see how it works.

------
chdir
How old is this issue? The release notes don't spell out clearly if this bug
was fixed in the past 2-3 updates (using v2.10.30 on Win 7)

[https://www.dropbox.com/release_notes](https://www.dropbox.com/release_notes)

------
waverunner
I was notified of the potential data loss and checked my data on the
'personalized web page.' Of the 12,000 files that may have been affected, I
found only a subfolder of a few dozen photos that may've been removed.

The problem is that when I clicked 'restore all' from within the subfolder,
Dropbox restored all 12,000 files rather than just the files within the
folder.

Note to DB's UX team: when you place a Restore All checkbox above the lefthand
file selection column, it means 'select and restore all files on the page',
not 'lift the roof off my house and dump in all the shit I spent months
decluttering.'

~~~
mayneack
I've been hoping for a 'restore folder to date X' for a long time too.

------
andy_ppp
Ha, dropbox deleted my files the other day, presumably due to this bug. I
ranted on Twitter and they came back with "the dropbox client can't delete
files". Hmmmm. Seems I was correct :-/

------
kwijibob
I have all my digital life on dropbox, a few hundred gig.

One of my greatest fears is that thousands of files might disappear without me
noticing for years.

I use selective sync, and twice I was looking for something that had
disappeared and had to restore it. I assumed maybe my wife accidentally
deleted some files, but maybe it was dropbox?

What is the solution to this anxiety?

~~~
pbhjpbhj
Perhaps you can have a cron job run against your local dropbox folder(s) and
do an "md5deep" reporting only differences (or sha1deep or whatever, perhaps
test which uses least resources, maybe nice it heavily too). Then you could
have the output report saved to a folder (not a dropbox one!). Perhaps add
another job to email/alert you if the "count" of lines in the report is
greater than a certain number? Crude, for sure.
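
Something along these lines could be sketched in Python instead of md5deep.
This is a rough, untuned illustration: the function names, the JSON state
file and the choice of SHA-256 are my own assumptions, and the state file
should live outside the dropbox folder.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB blocks so large files are not loaded whole.
                for block in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(block)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def diff_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which paths went missing, appeared, or changed content."""
    return {
        "missing": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


def check(root: Path, state_file: Path) -> dict[str, list[str]]:
    """Compare root with the manifest saved by the previous run, then save the new one."""
    new = build_manifest(root)
    # On the first run there is nothing to compare against, so the report is empty.
    old = json.loads(state_file.read_text()) if state_file.exists() else new
    state_file.write_text(json.dumps(new))
    return diff_manifests(old, new)
```

A cron job could call `check()` once a day and send an alert when
`len(report["missing"])` is above whatever threshold you are comfortable with.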

------
h43k3r
I remember someone posting on HN about this, a month ago. Can't seem to find
the link.

~~~
dmdeller
Here it is:
[https://news.ycombinator.com/item?id=8441230](https://news.ycombinator.com/item?id=8441230)

------
andrea_s
This manifested for me as a large number of "conflicted copies" everywhere
inside my main visual studio solution. Thankfully, source control saved the
day... But I was really annoyed at Dropbox for a little while.

------
sdizdar
As founder of cloudHQ, I have to jump into this. Software products will have
bugs. And people will make mistakes. We are all human.

So even if you store data in Dropbox - it is smart to have one extra copy in
some other cloud storage. Like Google Drive. Or Box. Or Egnyte. So if data is
deleted in Dropbox (accidentally, maliciously, or due to a bug) you can
restore it from another cloud.

Of course, cloudHQ is the system which can do that:
[http://chq.io/hnsc](http://chq.io/hnsc)

------
paulhauggis
I stopped using Dropbox because of this. I booted my system up one day and a
ton of my files were deleted (locally). Luckily, this didn't affect the sync
on my other systems.

------
copper_rose
Yikes. All those nines of durability that Amazon provided for
Dropbox...brought to naught by a bug in Dropbox's software.

------
nintendo1889
I think this calls for an aggressively distributed, user-controlled backup
system. Perhaps tahoe lafs based.

