
Dropbox Is Probably Not Stealing Your Files - talon88
https://one.darrenpmeyer.com/blog/dropbox-is-problably-not-stealing-all-your-files.html
======
tptacek
This is a fine post, but all I can think about this situation is "or, you
could just reverse the Dropbox client and find out for sure".

Speculation about Dropbox stealing files seems premised on the idea that you
can't know what the client is doing. But that's not even close to true. People
reverse much, much harder targets than Dropbox for fun. If any version of
Dropbox published to its user base ever did anything like this, we'd all know
soon enough.

~~~
sarciszewski
I would be _very_ surprised if a workplace name like Dropbox has never been
reverse engineered by a bored hacker on a lazy weekend.

Surprised and disappointed.

~~~
geographomics
I reversed it back when it was version 1.1.something, it was basically all
compiled Python modules with custom encrypted code objects and non-standard
opcode mappings for the bytecode.

Quite interesting to see how it worked, and useful to get the key for the
encrypted logs, to see what it actually did while running. Back then you
could intercept the https connections as well as they hadn't pinned the
certificates yet, to get an even fuller picture.

There was nothing obviously nefarious going on back then, but that was quite a
few years ago of course.

~~~
saalweachter
So what you're saying is that, if I wanted to launch a nefarious file-stealing
Dropbox-like application, I should first launch the non-nefarious version, and
then when it gets up to 3.6 or so, turn evil?

~~~
0942v8653
Yes. That's exactly the right time to turn evil.

[https://xkcd.com/792/](https://xkcd.com/792/)

------
chinathrow
"This is, by necessity, a system-wide process."

This is, by design, a fucking huge defect of the underlying system calls.

Give the OS a list of folders to watch. Dropbox should not even get a callback
for a file it's not supposed to watch.

~~~
raverbashing
See: inotify in Linux

~~~
krakensden
inotify is by far the best of the filesystem notification APIs. It's wildly
reasonable.

------
lutusp
Quote:

"A simple protocol can give us an idea of whether data is being sent to
Dropbox:

1\. Create a large-ish file (1MB) outside of the Dropbox folder

2\. Monitor the network usage of the Dropbox application to see if it sends
enough data that it could be that file

3\. Repeat with many different files, etc.

Doing exactly that, Dropbox only sent a few hundred KB after “accessing” the
target file. Seems unlikely that Dropbox is uploading files outside your
Dropbox folder."

This test approach has a problem. A more realistic test would be to place a
well-compressed file, one that by definition cannot be made smaller, on
Dropbox and see what the system traffic size is for that file. For an
optimally compressed file, if the system is reading the entire file, the read
size will more or less equal the file size.
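
One way to build such a file, sketched in Python under the assumption that random bytes are a fair stand-in for "optimally compressed" data. The `compression_ratio` helper and the choice of zlib are illustrative only, not anything Dropbox is known to use:

```python
import os
import zlib

SIZE = 1024 * 1024  # 1 MiB, matching the article's "large-ish" test file


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (zlib, default level)."""
    return len(zlib.compress(data)) / len(data)


random_payload = os.urandom(SIZE)  # effectively incompressible
zero_payload = bytes(SIZE)         # highly compressible, for contrast

random_ratio = compression_ratio(random_payload)
zero_ratio = compression_ratio(zero_payload)

print(f"random bytes: ratio {random_ratio:.4f}")
print(f"zero bytes:   ratio {zero_ratio:.4f}")
```

A ratio near 1.0 for the random payload means compression on the wire can't hide an upload of that file, so observed traffic far below 1 MiB would argue against the whole file being sent.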

~~~
Retric
You’re still missing the possibility of DB uploading a hash vs standard file
compression. Even if they never upload the full file plenty of people would
love to know if any of your files was on some list. (Classified information,
piracy, etc.)

------
radiospiel
Well, dropbox could just listen for fs events inside the DropBox folder; and
it should, from a performance perspective as well as from a privacy point of
view. And then "sends a few 100 kByte"? I hope this is a typo; if not, I would
like to know what these are. (also: the OP's largish file (1MB) could easily
fit into "a few 100kByte" after compression)

~~~
sigil
_Well, dropbox could just listen for fs events inside the DropBox folder..._

On Linux at least [1], this is exactly what the Dropbox client does. It only
registers inotify watchers on the $HOME/Dropbox directory and subdirectories.
To verify:

    strace -f -e trace=inotify_add_watch dropboxd

You could also strace open/stat/read/write syscalls to verify that, aside from
shared libraries and the like, the Dropbox Linux client doesn't access files
outside of your Dropbox directory.

Other OSes have different file monitoring capabilities though. Anyone up on
file monitoring on Windows / OS X? Is directory-specific monitoring possible?

[1]
[https://www.dropbox.com/install?os=lnx](https://www.dropbox.com/install?os=lnx)

~~~
radiospiel
> Other OSes have different file monitoring capabilities though. Anyone up on
> file monitoring on Windows / OS X? Is directory-specific monitoring
> possible?

yes

~~~
MrDosu
But ReadDirectoryChangesW notoriously misses updates. Also it would scale
horribly to large numbers of files.

NTFS has a feature called Change Journals where you can view a volume as a
stream of changes.

------
copsarebastards
The cycle of security is this:

1\. Security experts see a security hole and note that it could only be used
by a widely trusted company or government.

2\. People note that it's also possible that it's _not_ happening, and claim
that the widely trusted company or government would never use the security
hole.

3\. It is discovered that the widely trusted company or government has been
using the security hole.

~~~
Dylan16807
"Touches files it doesn't have to" is not a security hole.

~~~
copsarebastards
"Users install program that touches files the users might not want it to" is a
security hole. The unknown is whether Dropbox comes with malware which takes
advantage of that security hole.

~~~
Dylan16807
A hidden function or update could enable malicious behavior on all files
whether or not it had preexisting behavior of touching all files. Only in
certain detailed permission structures would preexisting behavior matter.

~~~
copsarebastards
If a hidden function enabled malicious behavior, causing it to touch all
files, the hidden function would very quickly cease to be hidden.

Are you seriously arguing that it's okay for Dropbox to touch files you didn't
give it permission to touch? This is ridiculous.

~~~
Dylan16807
>If a hidden function enabled malicious behavior, causing it to touch all
files, the hidden function would very quickly cease to be hidden.

I'm not sure where you're going with this. Yes, a security hole would become
much more visible after it was exploited. That doesn't imply that anything
visibly weird Dropbox does is a security hole.

The only notable flaw in security here is that it's a program on a normal OS
outside a sandbox. This is a huge flaw but it applies to most programs.

>Are you seriously arguing that it's okay for Dropbox to touch files you
didn't give it permission to touch? This is ridiculous.

I am. Touching files does not mean taking information from files. And between
the Explorer extension and the way file monitoring works on Windows it's going
to be fed a list of your files no matter what.

Security holes are a subcategory of "things a program can do, but shouldn't be
able to do". They are described entirely in terms of potential behavior, not
current behavior.

~~~
copsarebastards
Okay, if you want to be pedantic about the meaning of the words "security
hole" instead of addressing the actual concerns people have with Dropbox, then
we can just call Dropbox "potential malware" and be done with it. Does that
address your terminology concerns? Can we move on to talking about the
important stuff now?

------
chisleu
Read their TOS and compare it to Google Drive's TOS.

The insane rights the Google TOS grants to Google are why it costs ~ 1/2 as
much.

It is also an indicator that Dropbox is less shady. They don't grant
themselves rights to do anything with your data outside of the normal things
you need them to do to offer the dropbox service for your use.

Unlike Google, which could for instance, use your personal photos of your kid
eating ice cream to try to sell you ice cream via road side LED billboards.

~~~
eli
I think you're misunderstanding the TOS. I do not think Google is subsidizing
their storage costs by abusing your private photos.

~~~
chisleu
I think you aren't actually reading it. I do not think Dropbox is less good at
running storage than Google or Amazon, yet Google is 1/2 the cost.

~~~
skuhn
The reality here is that the cost to operate doesn't matter to Google. They
can keep subsidizing Drive with revenue from Search Ads long after they have
pushed Dropbox out of business, if they choose to do so.

This is why Dropbox (smartly) does not want to compete on price -- instead
they want to compete on quality. There's no way to be cheaper than Google,
Microsoft, Apple and Amazon in the long run, because they each have other
businesses that are licenses to print money. If one or more of them decide to
heavily subsidize their online storage product, they can outlast you.

However, it turns out that all of the money in the world can't magically make
a great product. You still have to actually do the hard work.

------
whizzkid
" 1 - Create a large-ish file (1MB) outside of the Dropbox folder

2 - Monitor the network usage of the Dropbox application to see if it sends
enough data that it could be that file

"

I cannot really say whether Dropbox steals them or not. But if I were a Dropbox
engineer and wanted to know about those newly created files, I wouldn't want to
send the whole file to the server at all.

\- Send file name with its extension

\- Send file size

Compare these to Dropbox's blacklist file (imagination only) on another
server. If there are any matches, mark the user as "whateveryouwant".

As long as there is network activity when a new file is created, it is and will
always be suspicious to its users.
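
To make the idea concrete, here is a purely hypothetical sketch in Python. `WATCH_LIST`, `file_metadata`, and `is_flagged` are invented names, and nothing here reflects what the Dropbox client actually does. It only shows that a name-and-size check needs a `stat` call, not a read of the file's contents:

```python
import os
import tempfile

# Hypothetical watch list of (file name, size in bytes) pairs; illustration only.
WATCH_LIST = {("secret-report.pdf", 2048)}


def file_metadata(path):
    """Return (base name, size) using stat only; the contents are never read."""
    return os.path.basename(path), os.stat(path).st_size


def is_flagged(path, watch_list=WATCH_LIST):
    """True if the file's name and size both match an entry in the watch list."""
    return file_metadata(path) in watch_list


with tempfile.TemporaryDirectory() as tmp:
    match = os.path.join(tmp, "secret-report.pdf")
    other = os.path.join(tmp, "holiday.jpg")
    for path in (match, other):
        with open(path, "wb") as f:
            f.write(b"\0" * 2048)  # same size, different names
    match_flagged = is_flagged(match)
    other_flagged = is_flagged(other)

print(match_flagged, other_flagged)
```

The payload for such a check is a few dozen bytes per file, which is why traffic volume alone can't rule it out.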

~~~
ProAm
And they don't have to steal the whole file at once. Why not chunk it out and
do bits at a time over a few days? This is tinfoil hat territory, but they'd
still get the file and network traffic wouldn't be overly suspicious.

~~~
jamiesonbecker
You don't need to retrieve the file if you already have it.

------
rayiner
As an aside, I've been using Sparkleshare (built on Git and SSH) lately. It's
pretty good, and sucks up less battery on my MBP than the Dropbox client
(maybe because it's not watching every file in the system!) And not only is it
open source, but you can see a log of all the git commands and fix things
manually if necessary.

My only lament is that it doesn't work that well over intermittent
connections. It'd be neat to have something robust like mosh
([https://mosh.mit.edu](https://mosh.mit.edu)) for file sync.

~~~
mbq
There is also unison; it is not automatic (a custom inotify script is required
to mimic Dropbox, and it will force manual intervention upon conflicts), nor
does it store history, but it handles large blobs, many clients and crappy
connections quite well.

------
Animats
The original article is titled "Dropbox Is Probably Not Stealing All Your
Files". From bandwidth consumption, you can tell it's not stealing all of
them. Whether it does so selectively, on command from the mothership, is
another matter.

------
bebbiwebbi
"The Dropbox application uses a filesystem monitor to detect when changes are
made by monitoring filesystem write events. This is, by necessity, a system-
wide process. So DLP alerting that Dropbox is “acccessing” a new file
shouldn’t be surprising."

I think this SHOULD be surprising to any competent software engineer. That
isn't how the file system watcher works.

------
MrDosu
Just about every form of backup software will use change journals to identify
what to back up and how it changed.

Change journals are streams that are per volume (so to monitor some directory
in C:\ I have to monitor the C:\ change stream).

It's just how NTFS works. It's shocking that this was allowed to reach this
kind of publicity, because it's just a guy attaching a diagnostic tool to a
system where he doesn't know what's happening and then proceeding to freak out.

Software like this will have plenty of file access for metadata, not only on
the backed up files.

~~~
TheLoneWolfling
And this is where having the API actually support, say, monitoring only items
in a single directory would be good.

~~~
MrDosu
Well there is, but it's unreliable in certain edge cases. At least not
reliable enough for a backup solution.

~~~
TheLoneWolfling
In which case the API is broken and should be fixed.

Also, what edge cases?

~~~
MrDosu
For example, you have to allocate a buffer to hold the information you receive.
If that's too small, you miss stuff. There is also a lot of intricacy with
permissions, and you can mess up a lot when multithreading if you don't know
how to properly interact with the OS.

~~~
TheLoneWolfling
Wouldn't that be worse, if anything, for the whole-drive case?

------
Someone1234
This article makes me irrationally annoyed by how lazy the author was. I was
able to produce a test in under 5 minutes that disproves the article's core
assumption:

> The Dropbox application uses a filesystem monitor to detect when changes are
> made by monitoring filesystem write events. This is, by necessity, a system-
> wide process. So DLP alerting that Dropbox is “acccessing” a new file
> shouldn’t be surprising.

THAT IS NOT HOW THAT WORKS!

Sorry, I am calm now. As someone who has spent quite a lot of time using
Windows' File System Watcher functionality, I know that that is nonsense.
Windows monitoring/watching is conducted in the kernel: when an IO operation
occurs that hits a registered monitor, it fires off an event (Windows message)
to that process to let it know. The process itself never accesses that file
directly.

But just test it for yourself.

1) Download Process Monitor [0]

2) Start Process Monitor, turn off Registry, Network, Profiling, and Process
events.

3) Set the include (included processes to monitor) to [whatever executable you
build]

4) Build this (see examples section) [1] in C#/VB.net and run it

5) Set the process name in #4 in the include in #3

6) Write to a file in C:\ (that's the default in the example program/source)

7) You should see some Console.WriteLine() output indicating the file watcher
is working. If not, run as administrator.

8) There you go. As you can see, no direct file accesses to the file. The
monitor events are fired as you can see, but the file remains untouched
directly by your program.

The author could have done this. Why didn't they? It isn't like I had to even
write one line of code or have some kind of specialist knowledge of low level
kernel functionality...

PS - I don't know/care if DropBox is stealing your stuff. I just wish the
article's author had at least fact-checked before they claimed that "that is
how this works!!!" when in reality that is untrue. That is how it works for
Anti-Virus because AV scans within files to see contents, it isn't how it
works for most processes which just use the file watcher functionality. If
DropBox chooses to look inside files, then why? There is no need for that.

PPS - If DropBox does have a system wide file watcher, that is just lazy. It
will reduce system performance, and they could have just as easily set it up
to point just to folders DropBox is configured to watch.

[0] [https://technet.microsoft.com/en-us/sysinternals/bb896645](https://technet.microsoft.com/en-us/sysinternals/bb896645)

[1] [https://msdn.microsoft.com/en-us/library/system.io.filesyste...](https://msdn.microsoft.com/en-us/library/system.io.filesystemeventhandler%28v=vs.110%29.aspx)

------
xer
It's very unlikely that dropbox would upload every changed file from your
computer, that would not go unnoticed.

A desirable capability would be on-demand upload or download of any file on
the clients system. For that you would need the entire filetree+checksums so,
imo, that's what it's syncing.

------
jostmey
Dropbox has too much to lose and not enough to gain by stealing your files.
The accusation borders on paranoia. That said, Dropbox is a closed system, and
I always trust open systems more.

------
jinushaun
The tin foil hat is strong with this thread. People should read up on Windows
Explorer shell extensions before making comments. It's like saying regexing
email addresses to check for valid input is the same as stealing email
addresses.

------
venomsnake
If Dropbox gets to send metadata about files outside of the folder, that could
be damning enough.

Hey, you have 3 new files per day in your downloads folder whose file names
hint they were sent by FB user X. I could make an educated guess about their
content.

------
deciplex
Reminder that Dropbox was mentioned, by name, in NSA documents released _two
years ago_ as the next target they intended to subvert. Also reminder that not
long after _that_ they named Condoleezza Rice, celebrated apologist for
warrantless wiretapping during the Bush administration, to their board of
directors.

Yes, it's _possible_ they named her to their board in good faith, and it's
_possible_ they also resisted the NSA somehow, where Google and Microsoft and
Yahoo and countless others failed. But, do you consider it likely enough to
bet your privacy on it? It seems to me you would be foolish to do so.

The only other excuse for it I can think of is that it's _so obviously
corrupt_ that it proves they aren't corrupt after all - that no one could be
that stupid. I reject such meta-reasoning. They are simply corrupt.

------
darkhorn
If you are an ordinary guy then even if Dropbox steals a file it won't matter
much. If you are a government, Airbus, Snowden, Aselsan, or Comodo you should
not install Dropbox even if you trust Dropbox.

------
sarciszewski
Pure speculation follows.

I wonder if they could be calculating hashes of files and sending them off?
That would be useful for automated exfiltration and targeting.

For example:

    1. Calculate the SHA-256 hashes for files in places of interest.
    2. Report the hashes upstream.
    3. Hey, this file matches one that the FBI/NSA is looking for via NSL.
    4. Download more stuff. Also identify the person and their location.
    5. Send agents/drones after them.

This is unlikely, but still in the realm of possibility. It's also untestable
without more information. (Packet captures from the DLP device would be far
more helpful in determining if anything of the sort is happening.)
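
A minimal sketch of steps 1 and 2 in Python, to show the trade-off involved. The `sha256_of_stream` helper is a made-up name, and an in-memory stream stands in for a real file. The digest that would go upstream is only 32 bytes, but producing it means reading every byte of the input:

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=64 * 1024):
    """Hash a binary stream in chunks; return (hex digest, total bytes read)."""
    digest = hashlib.sha256()
    bytes_read = 0
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
        bytes_read += len(chunk)
    return digest.hexdigest(), bytes_read


payload = b"x" * (1024 * 1024)  # stand-in for a 1 MiB file
hex_digest, bytes_read = sha256_of_stream(io.BytesIO(payload))

digest_size = len(bytes.fromhex(hex_digest))
print(f"read {bytes_read} bytes locally to produce a {digest_size}-byte digest")
```

So a scheme like this would be nearly invisible in network traffic but would show up as full-file reads in a syscall or I/O trace.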

~~~
ikeboy
To calculate the hash, it needs to read the whole file, which this post claims
it isn't doing.

~~~
aselzer
Did the author actually verify this with strace (or the mac/windows
equivalent)?

It sounds like he guessed this based on I/O activity of the process. It could
be enough to hash the beginning of the files, and compare the rest if a match
is found in the database.
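
A hypothetical sketch of that two-stage check in Python. `PREFIX_BYTES`, `prefix_digest`, `full_digest`, and `matches_known` are all invented for illustration, and nothing suggests Dropbox does this. The point is only that most files would see a single small read:

```python
import hashlib
import os
import tempfile

PREFIX_BYTES = 4096  # assumed prefix length, chosen only for illustration


def prefix_digest(path, n=PREFIX_BYTES):
    """Hash only the first n bytes of the file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read(n)).hexdigest()


def full_digest(path):
    """Hash the entire file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def matches_known(path, known):
    """known maps prefix digest -> full digest; read it all only on a prefix hit."""
    expected_full = known.get(prefix_digest(path))
    return expected_full is not None and full_digest(path) == expected_full


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.bin")
    lookalike = os.path.join(tmp, "lookalike.bin")
    head = os.urandom(PREFIX_BYTES)
    with open(target, "wb") as f:
        f.write(head + b"tail A")
    with open(lookalike, "wb") as f:
        f.write(head + b"tail B")  # same prefix, different remainder
    known = {prefix_digest(target): full_digest(target)}
    target_matches = matches_known(target, known)
    lookalike_matches = matches_known(lookalike, known)

print(target_matches, lookalike_matches)
```

Even the prefix read is still a read of file contents, so a trace of open/read syscalls would catch it.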

~~~
cremno
Dropbox doesn't read the file content. There is also no proof that Dropbox
directly accesses those files.

------
pwnna
Speaking of which, does anyone run apps like this with a different user than
their own?

I'm thinking of something like

    /home/dropbox drwxrwx--- dropbox <youruser>
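
A rough sketch of setting up that mode in Python on a POSIX system. A temporary directory stands in for `/home/dropbox`, and the ownership change is left as a comment because it needs root and a real `dropbox` account, both of which are assumptions here:

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as parent:
    sync_dir = os.path.join(parent, "dropbox")  # stand-in for /home/dropbox
    os.mkdir(sync_dir)

    # rwx for owner and group, nothing for everyone else.
    os.chmod(sync_dir, 0o770)

    # As root, ownership would then be handed to the dedicated account, e.g.:
    # shutil.chown(sync_dir, user="dropbox", group="<youruser>")

    mode_string = stat.filemode(os.stat(sync_dir).st_mode)

print(mode_string)
```

With the client running as the `dropbox` user, ordinary file permissions would then keep it out of anything in your own home directory that isn't group- or world-readable.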

------
vacri
Separate from the privacy concerns, what about network traffic? "a few hundred
kilobytes" for every file you create adds up.

------
lalos
Any thoughts on SafeMonk, which claims end-to-end encryption and piggybacks
on Dropbox?

------
jamwt
Uhh... no, no we are not. s/probably //

------
meira
Dropbox Is Probably Ready To Do So As They Want

------
anonbanker
I stopped trusting Dropbox when Condoleezza Rice got added to the board.

------
lucozade
Surely tradition requires that the title of this piece be "Is Dropbox stealing
your files?". I mean, it's a complete waste of a Betteridge event.

~~~
quarterto
It's in response to an earlier-submitted article:
[https://news.ycombinator.com/item?id=9136546](https://news.ycombinator.com/item?id=9136546)

~~~
lucozade
That is, in fact, titled in the correct manner. Maybe Betteridge doesn't count
if the thrust of the article is actually to point out that "no" is the correct
answer. I hadn't thought of that.

------
rlx0x
You upload your files to dropbox servers, how in the world can you come to the
delusion that you would notice when they accessed/searched/data mined your
files?!

~~~
3JPLW
(The article is about uploading files outside your dropbox folder)

