
I still don't get it. Anyone willing to explain?



Dropbox avoids having to store multiple copies of huge files by detecting duplicated files, storing only one copy, and letting every user that stores the file download from that one copy. Dropship exploits this system for filesharing by lying to the Dropbox servers and saying that it already owns a copy of the file.

For example, Person A wants to distribute a copy of a CD or something. They upload the file to Dropbox normally. They then use Dropship to create something describing that file, which they then publish. Persons B and C download that descriptor and feed it to Dropship, which tricks Dropbox into thinking that they also own copies of the file. Dropbox then lets Person B and Person C download the file that Person A wanted to distribute, and mission accomplished.

It's all very clever. I like it.


Somewhere on a Dropbox server is a file that you want. Normally the file's owner would have to share that file with you in order for it to appear in your Dropbox. But if you know the hash of the file, you can trick Dropbox into thinking you already have the file and are just adding it to your Dropbox. Apparently Dropbox notices they already have that file, and instead of making you upload it they just make it appear in your account. Then you can download it.

I wasn't impressed by the OP, but this is actually a really cool hack.


I'll try.

Let's say you want to upload files A and B to Dropbox from your computer. A is a 3 MB file, and B is a 12 MB file.

The Dropbox client first looks at A, sees that it's under 4 MB, and so hashes[1] the whole file. That means that it runs a function which turns the file into a 256-bit string (a "hash") which is unique[2] to that file.

The client then sends that hash to the server, which checks to see if it has already seen that hash. If it has, then the server assumes that it already has the file, and just copies it from the previous location where it stored the block with that hash. If it hasn't, then the client goes ahead and uploads the file.

The process for uploading file B is very similar, except that the client breaks it into three 4 MB blocks, hashes each of those, and sends the hashes to the server to see if it's already received those blocks.
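
Here's a rough sketch of that client-side step. It assumes the hash is SHA-256 and that blocks are exactly 4 MB; the function name block_hashes is made up for illustration and is not anything from the real Dropbox client.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the block size described above (assumed exact)

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and return one SHA-256 hex digest per block.

    Hypothetical helper: the real client's hashing details are an assumption here.
    """
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

file_a = b"a" * (3 * 1024 * 1024)   # 3 MB: fits in a single block
file_b = b"b" * (12 * 1024 * 1024)  # 12 MB: splits into three 4 MB blocks

print(len(block_hashes(file_a)))  # 1
print(len(block_hashes(file_b)))  # 3
```

Each digest is 64 hex characters (256 bits), no matter how big the block is, which is why sending hashes is so much cheaper than sending the data.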

Phew. OK, now we can get to why Dropship is (was?) a neat hack. The idea is, if Alfred has uploaded file C, and Barbara wants to get a hold of file C, but doesn't want to download it, she can just send the Dropbox server the hashes for each 4 MB block of file C.

The server will see each hash, say "Aha! I've already got the block represented by that hash, so I won't make you upload it!", and put the file in Barbara's Dropbox.
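
If it helps, here's a toy simulation of the whole trick. Everything in it is an assumption for illustration: ToyServer, its methods, and the tiny 4-byte block size are invented, and this is not Dropbox's actual protocol or API. The point is only that a server which trusts hashes alone can't tell "I have this block" from "I know this block's hash".

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is easy to follow (real blocks: 4 MB)

class ToyServer:
    """Hypothetical deduplicating store keyed purely on block hashes."""

    def __init__(self):
        self.blocks = {}    # hash -> block bytes, each block stored once
        self.accounts = {}  # user -> {filename: [block hashes]}

    def missing(self, hashes):
        # Hashes the server has never seen; only these would need uploading.
        return [h for h in hashes if h not in self.blocks]

    def upload_block(self, block):
        self.blocks[hashlib.sha256(block).hexdigest()] = block

    def commit(self, user, name, hashes):
        # The flaw: knowing the hashes is treated as proof of owning the data.
        if self.missing(hashes):
            raise ValueError("server does not have every block")
        self.accounts.setdefault(user, {})[name] = list(hashes)

    def download(self, user, name):
        return b"".join(self.blocks[h] for h in self.accounts[user][name])

def split(data):
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

server = ToyServer()

# Alfred uploads file C normally.
content = b"file C contents"
blocks = split(content)
hashes = [hashlib.sha256(b).hexdigest() for b in blocks]
for b in blocks:
    server.upload_block(b)
server.commit("alfred", "c.bin", hashes)

# Barbara never sees the data, only the published list of hashes.
server.commit("barbara", "c.bin", hashes)
print(server.download("barbara", "c.bin") == content)  # True
```

A server could close the hole by asking the client to prove it holds the actual bytes before committing, but that is a design note, not a claim about what Dropbox does.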

Does that make sense?

[1]: http://en.wikipedia.org/wiki/Hash_function

[2]: Not really unique, but the idea of hash functions is that we turn each input into a "hash" which is really really really likely to be unique, so likely that we can treat it as unique.



