Speculation about Dropbox stealing files seems premised on the idea that you can't know what the client is doing. But that's not even close to true. People reverse much, much harder targets than Dropbox for fun. If any version of Dropbox published to its user base ever did anything like this, we'd all know soon enough.
Firstly to address files being accessed outside of Dropbox - this is true, but literally all it does is read the file attributes: https://i.imgur.com/TADvHp1.png. Moving up the call stack and disassembling the calling function, we can see that it's part of the Python runtime: https://i.imgur.com/1TBong4.png (presumably python27_lockdown.dll is Dropbox's custom hardened copy). A bit later on it does a comparison to ".bat", which identifies it as the function win32_stat() in Modules/posixmodule.c - the ensuing behaviour of this function corresponds to QueryBasicInformation as shown on the original author's Process Monitor dump. Why the Dropbox client calls stat() on files outside of the Dropbox folder (but on the same drive) is not clear, but, as the article above also mentions, that is all it does, so no problem there.
Secondly, the original author also posted evidence of Dropbox accessing various shell folders - Desktop, Documents, Music, Pictures, and so on. This is true, but again it's a side effect of an innocent function call, this time SHGetFolderPathW(): https://i.imgur.com/uXN31BI.png. It's actually SHELL32.DLL that is responsible for opening the folder and querying its attributes, not the Dropbox client: https://i.imgur.com/YCyTwNe.png.
Without reversing the entire program we can't say for sure that Dropbox isn't siphoning out data in some other sneakier way, but the accusations of data theft from these file events are simply not true.
Full analysis here: https://news.ycombinator.com/item?id=9139657
Surprised and disappointed.
Quite interesting to see how it worked, and useful to get the key for the encrypted logs, to see what it actually did while running. Back then you could intercept the HTTPS connections as well, since they hadn't pinned the certificates yet, to get an even fuller picture.
There was nothing obviously nefarious going on back then, but that was quite a few years ago of course.
For examples see pyrasite, code.interact, etc.
If you specifically want to know what files Dropbox reads there are easy ways to observe this, like strace.
Even without having to go open IDA, I'm sure Windows has enough system monitoring tools that you should be able to tell what Dropbox actually reads outside of its own data, if anything.
Besides, Dropbox does much nastier stuff than look at your files; it bloody hooks into your shell (Finder/Explorer) and manipulates the icons. It could decide to replace an .exe icon with the icon for a Word Document, for example.
This is, by design, a fucking huge defect of the underlying system calls.
Give the OS a list of folders to watch. Dropbox should not even get a callback for a file it's not supposed to watch.
If Dropbox gets to interpose itself in every file update systemwide, that sounds like it's getting too much of my information. Encrypted/Stego drives don't matter - once you open it up to the system, Dropbox sees when you touch it.
All I can gather from this discussion is that I'm glad I'm not using Windows and Dropbox together.
"A simple protocol can give us an idea of whether data is being sent to Dropbox:
1. Create a large-ish file (1MB) outside of the Dropbox folder
2. Monitor the network usage of the Dropbox application to see if it sends enough data that it could be that file
3. Repeat with many different files, etc.
Doing exactly that, Dropbox only sent a few hundred KB after “accessing” the target file. Seems unlikely that Dropbox is uploading files outside your Dropbox folder."
This test approach has a problem. A more realistic test would be to place a well-compressed file, one that by definition cannot be made smaller, on Dropbox and see what the system traffic size is for that file. For an optimally compressed file, if the system is reading the entire file, the read size will more or less equal the file size.
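A rough sketch of how one might build the test file described above. Random bytes stand in for an "optimally compressed" file here (that substitution is my assumption, since random data has no redundancy for a compressor to remove); the function names are made up for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (zlib, max level)."""
    return len(zlib.compress(data, 9)) / len(data)


def make_test_payloads(size: int = 1024 * 1024):
    """Build one highly compressible and one effectively incompressible payload."""
    compressible = b"A" * size          # repetitive data shrinks dramatically
    incompressible = os.urandom(size)   # random bytes cannot be made smaller
    return compressible, incompressible


if __name__ == "__main__":
    easy, hard = make_test_payloads()
    print(f"repetitive payload ratio: {compression_ratio(easy):.4f}")
    print(f"random payload ratio:     {compression_ratio(hard):.4f}")
```

If the client compresses before sending, a repetitive 1 MB file could hide inside a few hundred KB of traffic, while the random one could not. That is the reason to use the second kind for the test.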
To be fair, I added this section after several people pointed this out, so you may not have seen it.
On Linux at least, this is exactly what the Dropbox client does. It only registers inotify watchers on the $HOME/Dropbox directory and subdirectories. To verify:
strace -f -e trace=inotify_add_watch dropboxd
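If you save that strace output to a file (for example with strace's -o option), something like the sketch below could list any watched path that falls outside the Dropbox directory. The helper names are mine, and the regex assumes the usual strace line shape of `inotify_add_watch(fd, "path", mask) = wd`.

```python
import re
from typing import List

# Matches the quoted path in lines such as:
#   [pid  4242] inotify_add_watch(12, "/home/alice/Dropbox/docs", IN_MODIFY) = 3
WATCH_RE = re.compile(r'inotify_add_watch\(\d+, "((?:[^"\\]|\\.)*)"')


def watched_paths(strace_output: str) -> List[str]:
    """Extract every path passed to inotify_add_watch, in order of appearance."""
    return WATCH_RE.findall(strace_output)


def paths_outside(paths: List[str], root: str) -> List[str]:
    """Return the paths that are neither root itself nor located below it."""
    root = root.rstrip("/")
    return [p for p in paths if p != root and not p.startswith(root + "/")]
```

An empty result from paths_outside() would be consistent with the client only watching its own folder; anything else is worth a closer look.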
Other OSes have different file monitoring capabilities though. Anyone up on file monitoring on Windows / OS X? Is directory-specific monitoring possible?
NTFS has a feature called Change Journals where you can view a volume as a stream of changes.
1. Security experts see a security hole and note that it could only be used by a widely trusted company or government.
2. People note that it's also possible that it's not happening, and claim that the widely trusted company or government would never use the security hole.
3. It is discovered that the widely trusted company or government has been using the security hole.
Exceeding authorization is exploiting an existing hole.
Are you seriously arguing that it's okay for Dropbox to touch files you didn't give it permission to touch? This is ridiculous.
I'm not sure where you're going with this. Yes, a security hole would become much more visible after it was exploited. That doesn't imply that anything visibly weird Dropbox does is a security hole.
The only notable flaw in security here is that it's a program on a normal OS outside a sandbox. This is a huge flaw but it applies to most programs.
>Are you seriously arguing that it's okay for Dropbox to touch files you didn't give it permission to touch? This is ridiculous.
I am. Touching files does not mean taking information from files. And between the explorer extension and the way file monitoring works on windows it's going to be fed a list of your files no matter what.
Security holes are a subcategory of "things a program can do, but shouldn't be able to do". They are described entirely in terms of potential behavior, not current behavior.
The insane rights the Google TOS grants to Google are why it costs ~ 1/2 as much.
It is also an indicator that Dropbox is less shady. They don't grant themselves rights to do anything with your data outside of the normal things you need them to do to offer the dropbox service for your use.
Unlike Google, which could for instance, use your personal photos of your kid eating ice cream to try to sell you ice cream via road side LED billboards.
> Some of our Services allow you to upload, submit, store, send or receive content. You retain ownership of any intellectual property rights that you hold in that content. In short, what belongs to you stays yours.
Do you have any proof of this ever happening? Do you have any legal case that support your claim? Can you please point to the text in their TOS that leads you to believe this?
I don't need proof. You need to go read the TOS I mentioned already.
"When you upload, submit, store, send or receive content to or through our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content . The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones. This license continues even if you stop using our Services (for example, for a business listing you have added to Google Maps). "
They changed the TOS to rephrase, but didn't remove the infringing portion. It used to say any service offered now or in the future, and now it says "to make new services."
The fact is, my original statement stands.
They can quite literally use your content for any reason because they can use it to develop new services (such as, say, personalized road side advertising.)
There is nothing legally stopping them from doing it, and if you have been paying attention to the issues highlighted by Snowden and others, there is little backlash to them as a company for doing very evil things such as leaving inter-datacenter communication unencrypted allowing the NSA and others to snoop upon Gmail and all other google services as the data replicates across locations.
Good times talking to you. Appreciate the down votes.
Once again, make sure you actually read the damned thing in question before even replying to a comment about it. Your ignorant, uninformed points are worthless.
Also, here is the dropbox TOS:
" Your Stuff is yours. These Terms don't give us any rights to Your Stuff except for the limited rights that enable us to offer the Services.
We need your permission to do things like hosting Your Stuff, backing it up, and sharing it when you ask us to. Our Services also provide you with features like photo thumbnails, document previews, email organization, easy sorting, editing, sharing and searching. These and other features may require our systems to access, store and scan Your Stuff."
REMARKABLY less ambiguous, and in fact enumerates the Services they offer. They literally only want to host your data. Google wants to data mine your data, even if you delete the data and your google account, google wants to keep using your data, forever.
This isn't the same thing.
In order to make copies (distributed storage, network traffic, etc.) and show copies to people -- including you, or anyone you choose to share a file or post with -- they basically have to have a license from you, or they're potentially on the wrong side of copyright law.
This is why Dropbox (smartly) does not want to compete on price -- instead they want to compete on quality. There's no way to be cheaper than Google, Microsoft, Apple and Amazon in the long run, because they each have other businesses that are licenses to print money. If one or more of them decide to heavily subsidize their online storage product, they can outlast you.
However, it turns out that all of the money in the world can't magically make a great product. You still have to actually do the hard work.
"We collect information to provide better services to all of our users – from figuring out basic stuff like which language you speak, to more complex things like which ads you’ll find most useful"
even explained thus:
"For example, if you frequently visit websites and blogs about gardening, you may see ads related to gardening as you browse the web."
and they say that they base it on:
"Information we get from your use of our services."
2 - Monitor the network usage of the Dropbox application to see if it sends enough data that it could be that file
I can't really say whether Dropbox steals them or not. But if I were a Dropbox engineer and wanted to know about those newly created files, I wouldn't want to send the whole file to the server at all.
- Send file name with its extension
- Send file size
Compare these to dropbox's blacklist file (imagination only) in another server. If there are any matches, mark user as "whateveryouwant"
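To make the point concrete, here is a purely hypothetical sketch of such a metadata check. Nothing here is known Dropbox behaviour; FileMeta, report_payload and the blacklist are all invented for illustration. What it shows is how small the report would be compared to the file itself.

```python
import json
import os
from typing import NamedTuple, Set


class FileMeta(NamedTuple):
    """The only two facts the imagined client reports about a file."""
    name: str
    size: int


def describe(path: str) -> FileMeta:
    """Collect file name (with extension) and size without reading the contents."""
    return FileMeta(os.path.basename(path), os.path.getsize(path))


def report_payload(meta: FileMeta) -> bytes:
    """Serialize the metadata as it might be sent over the wire (hypothetical format)."""
    return json.dumps({"name": meta.name, "size": meta.size}).encode("utf-8")


def is_flagged(meta: FileMeta, blacklist: Set[FileMeta]) -> bool:
    """Server-side check: does this (name, size) pair appear on the imagined list?"""
    return meta in blacklist
```

A payload like this is a few dozen bytes, so measuring upload volume against file size would never catch it.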
As long as there is network activity when a new file is created, it is and will always be suspicious to its users.
My only lament is that it doesn't work that well over intermittent connections. It'd be neat to have something robust like mosh (https://mosh.mit.edu) for file sync.
I think this SHOULD be surprising to any competent software engineers. That isn't how the file system watcher works.
Change journals are streams that are per volume (so to monitor some directory in C:\ i have to monitor the C:\ change stream).
It's just how NTFS works. It's shocking that this was allowed to reach this kind of publicity because it's just a guy attaching a diagnostic tool to a system where he doesn't know what's happening and then proceeding to freak out.
Software like this will have plenty of file access for metadata, not only on the backed up files.
Also, what edge cases?
> The Dropbox application uses a filesystem monitor to detect when changes are made by monitoring filesystem write events. This is, by necessity, a system-wide process. So DLP alerting that Dropbox is “accessing” a new file shouldn’t be surprising.
THAT IS NOT HOW THAT WORKS!
Sorry, I am calm now. As someone who has spent quite a lot of time using Windows' File System Watcher functionality, I know that that is nonsense. Windows monitoring/watching is conducted at the kernel, when an IO operation occurs that hits a registered monitor it fires off an event (windows message) to that process to let it know, the process itself never accesses that file directly.
But just test it for yourself.
1) Download Process Monitor 
2) Start Process Monitor, turn off Registry, Network, Profiling, and Process events.
3) Set the include (included processes to monitor) to [whatever executable you build]
4) Build this (see examples section)  in C#/VB.net and run it
5) Set the process name in #4 in the include in #3
6) Write to a file in C:\ (that's the default in the example program/source)
7) You should see some Console.WriteLine() output indicating the file watcher is working. If not run as administrator.
8) There you go. As you can see, no direct file accesses to the file. The monitor events are fired as you can see, but the file remains untouched directly by your program.
The author could have done this. Why didn't they? It isn't like I had to even write one line of code or have some kind of specialist knowledge of low level kernel functionality...
PS - I don't know/care if DropBox is stealing your stuff. I just wish the article's author had at least fact-checked before they claimed that "that is how this works!!!" when in reality that is untrue. That is how it works for Anti-Virus because AV scans within files to see contents, it isn't how it works for most processes which just use the file watcher functionality. If DropBox chooses to look inside files, then why? There is no need for that.
PPS - If DropBox do have a system wide file watcher, that is just lazy. It will reduce system performance, and they could have just as easily set it up to point just to folders DropBox is configured to watch.
A desirable capability would be on-demand upload or download of any file on the clients system. For that you would need the entire filetree+checksums so, imo, that's what it's syncing.
Hey, you have in your downloads folder 3 new files per day whose file names hint they were sent by FB user X. I could make an educated guess about their content.
Yes, it's possible they named her to their board in good faith, and it's possible they also resisted the NSA somehow, where Google and Microsoft and Yahoo and countless others failed. But, do you consider it likely enough to bet your privacy on it? It seems to me you would be foolish to do so.
The only other excuse for it I can think of is that it's so obviously corrupt that it proves they aren't corrupt after all - that no one could be that stupid. I reject such meta-reasoning. They are simply corrupt.
I wonder if they could be calculating hashes of files and sending them off? That would be useful for automated exfiltration and targeting.
1. Calculate the SHA-256 hashes for files in places of interest.
2. Report the hashes upstream.
3. Hey, this file matches one that the FBI/NSA is looking for via NSL.
4. Download more stuff. Also identify the person and their location.
5. Send agents/drones after them.
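Steps 1 to 3 are technically trivial, which is part of why the scenario worries people. A minimal sketch, assuming a local set of "wanted" hashes stands in for whatever lookup would happen upstream (the function names and the watchlist are hypothetical):

```python
import hashlib
import os
from typing import Iterable, List


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files are never held in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matching_files(root: str, wanted_hashes: Iterable[str]) -> List[str]:
    """Walk root and return the sorted paths whose SHA-256 is in wanted_hashes."""
    wanted = set(wanted_hashes)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if sha256_of_file(path) in wanted:
                    matches.append(path)
            except OSError:
                continue  # unreadable file: skip rather than abort the walk
    return sorted(matches)
```

Note that this reads every byte of every file, so it would show up plainly as read I/O in Process Monitor or strace, unlike a bare attribute query.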
It sounds like he guessed this based on I/O activity of the process. It could be enough to hash the beginning of the files, and compare the rest if a match is found in the database.
edit: here i was bored enough > http://pastebin.com/NJEvnG1d
I think in the end I just started taking the data from the end of the file, but if you're going with subsets, it's probably better to use a pseudo-randomly selected subset rather than a sequential subset. It doesn't have to be a different pseudo-random subset for each file, but I imagine there's an ideal noise profile in the sampling (maybe white noise is best).
File > 12 KB: First 4 KB, last 4 KB, middle 4 KB
File <= 12 KB: Just hash the damn file
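That rule can be sketched in a few lines. The choice of SHA-256 and the exact position of the "middle" window are my assumptions; the samples are read in file order (first, middle, last) to keep the seeks sequential.

```python
import hashlib
import os

SAMPLE = 4 * 1024  # 4 KB per sampled window


def quick_fingerprint(path: str) -> str:
    """Hash small files whole; for larger ones hash only three 4 KB windows."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        if size > 3 * SAMPLE:
            middle = (size - SAMPLE) // 2
            for offset in (0, middle, size - SAMPLE):
                handle.seek(offset)
                digest.update(handle.read(SAMPLE))
        else:
            digest.update(handle.read())  # 12 KB or less: just hash the file
    return digest.hexdigest()
```

Two files that differ only outside the sampled windows get the same fingerprint, so a match would still need a full comparison before being trusted.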
Or why not more mundane: another user shares a file with you. Dropbox knows that you already have the same file somewhere on your filesystem outside the Dropbox folder (or a partial match). It doesn't have to transfer that data to you.
But I agree with those saying it's probably a result of some implementation issue (Finder extension or working around some shortcoming in monitoring just the Dropbox directory).
Of course, this gave rise to the ability to transfer files (even non-public files) quickly between Dropbox accounts provided knowledge of the hashes of its chunks, and Dropbox has since changed their deduplication. See https://github.com/driverdan/dropship
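For anyone who hasn't seen how hash-based deduplication works, here is a toy sketch. BlockStore is a made-up in-memory stand-in for the server, not Dropbox's API, and the 4 MB default block size is an assumption. The point is that a client only uploads blocks the store has never seen, and that the list of hashes alone is enough to reassemble the file.

```python
import hashlib
from typing import Dict, List, Tuple


class BlockStore:
    """Toy server-side store keyed by block hash (hypothetical)."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def missing(self, hashes: List[str]) -> List[str]:
        """Return the hashes the store has no block for yet."""
        return [h for h in hashes if h not in self.blocks]

    def put(self, block: bytes) -> None:
        self.blocks[hashlib.sha256(block).hexdigest()] = block

    def assemble(self, hashes: List[str]) -> bytes:
        """Rebuild a file from its ordered block hashes alone."""
        return b"".join(self.blocks[h] for h in hashes)


def upload(store: BlockStore, data: bytes,
           block_size: int = 4 * 1024 * 1024) -> Tuple[List[str], int]:
    """Upload data, sending only unknown blocks; return (hashes, bytes sent)."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    hashes = [hashlib.sha256(block).hexdigest() for block in blocks]
    needed = set(store.missing(hashes))
    sent = 0
    for block, block_hash in zip(blocks, hashes):
        if block_hash in needed:
            store.put(block)
            needed.discard(block_hash)  # identical blocks are sent only once
            sent += len(block)
    return hashes, sent
```

In this model anyone holding the hash list can call assemble() without ever having had the data, which is the weakness a tool like dropship relied on.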
Also, the typo "Drobpox" was fun, that's a good alias when feeling suspicious. :)
I'm thinking of something like
/home/dropbox drwxrwx--- dropbox <youruser>
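I read that layout as a directory owned by a dedicated dropbox user, with your own user as the group and no access for anyone else. A small sketch of the mode part only, assuming a POSIX system; changing the owner and group would need root and is left out, and the function names are invented.

```python
import os
import stat


def lock_down(path: str) -> str:
    """Give owner and group full access, others none, and return the mode string."""
    os.chmod(path, 0o770)
    return stat.filemode(os.stat(path).st_mode)


def others_have_access(path: str) -> bool:
    """True if any read, write or execute bit is set for 'other' users."""
    return bool(os.stat(path).st_mode & stat.S_IRWXO)
```

On a directory, lock_down() should report drwxrwx---, matching the listing above.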
> Betteridge's law of headlines is an adage that states: "Any headline which ends in a question mark can be answered by the word no." It is named after Ian Betteridge, a British technology journalist