
I'll try.

Let's say you want to upload files A and B to Dropbox from your computer. A is a 3 MB file, and B is a 12 MB file.

The Dropbox client first looks at A, sees that it's under 4 MB, and so hashes[1] the whole file. That means it runs a function which turns the file into a short fixed-length string (a "hash", 256 bits long if the function is SHA-256) which is unique[2] to that file.
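
To make "hashing a file" concrete, here is a minimal sketch. It assumes the hash function is SHA-256 (the comment doesn't name one), and file_hash is just a name made up for this example.

```python
import hashlib

def file_hash(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string.
    SHA-256 is an assumption; the comment does not name the function."""
    return hashlib.sha256(data).hexdigest()

a = file_hash(b"same bytes")
b = file_hash(b"same bytes")
c = file_hash(b"same bytes!")

print(a == b)  # identical content always gives the identical hash
print(a == c)  # changing a single character gives a different hash
```

The point is only that the hash depends on nothing but the content, so two people hashing the same file get the same string.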

The client then sends that hash to the server, which checks whether it has seen that hash before. If it has, the server assumes it already has the file and just copies it from wherever it previously stored the block with that hash. If it hasn't, the client goes ahead and uploads the file.
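
Roughly, the "send the hash first, upload only if needed" exchange could look like the sketch below. ToyServer, has_block, store_block and upload are hypothetical names for an in-memory toy, not Dropbox's real protocol or API, and SHA-256 is assumed.

```python
import hashlib

class ToyServer:
    """Hypothetical in-memory stand-in for the server's block store."""

    def __init__(self):
        self.blocks = {}  # hash (hex string) -> block bytes

    def has_block(self, digest: str) -> bool:
        return digest in self.blocks

    def store_block(self, data: bytes) -> None:
        self.blocks[hashlib.sha256(data).hexdigest()] = data

def upload(server: ToyServer, data: bytes) -> int:
    """Send the hash first; send the bytes only if the server lacks them.
    Returns the number of content bytes actually transferred."""
    digest = hashlib.sha256(data).hexdigest()
    if server.has_block(digest):
        return 0  # server already has it, nothing to upload
    server.store_block(data)
    return len(data)

server = ToyServer()
first = upload(server, b"holiday photo bytes")   # new content: bytes get sent
second = upload(server, b"holiday photo bytes")  # same content again: only the hash is sent
```

Here the second upload transfers zero content bytes, which is the whole bandwidth-saving trick.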

The process for uploading file B is very similar, except that the client breaks it into three 4 MB blocks, hashes each of those, and sends each hash to the server to see whether it has already received that block.
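
The block splitting can be sketched like this, assuming "4 MB" means 4 MiB and the hash is SHA-256 (both assumptions). block_hashes is a made-up helper name.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB, assumed from the block size described above

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return one SHA-256 hex digest per block.
    The last block may be shorter than block_size."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

file_a = bytes(3 * 1024 * 1024)   # 3 MiB of zero bytes standing in for file A
file_b = bytes(12 * 1024 * 1024)  # 12 MiB of zero bytes standing in for file B

print(len(block_hashes(file_a)))  # 1
print(len(block_hashes(file_b)))  # 3
```

Since the stand-in for B is all zeros, its three blocks are identical and hash to the same value; a real file would normally give three different hashes.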

Phew. OK, now we can get to why Dropship is (was?) a neat hack. The idea is, if Alfred has uploaded file C, and Barbara wants to get hold of file C without having the file sent to her, she can just send the Dropbox server the hashes for each 4 MB block of file C (Alfred only has to pass her the hashes, which are tiny compared to the file).

The server will see each hash, say "aha! I've already got the block represented by that hash, so I won't make you upload it!", and put the file in Barbara's Dropbox.
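
Here's a toy end-to-end version of that trick. Everything in it (ToyServer, upload, claim, download, the account names) is hypothetical and only models the behavior described here; it is not how Dropbox or Dropship were actually implemented.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed 4 MiB blocks

class ToyServer:
    """Hypothetical model of a deduplicating server keyed on block hashes."""

    def __init__(self):
        self.blocks = {}    # hash -> block bytes, shared across all users
        self.accounts = {}  # user -> {filename: [block hashes]}

    def upload(self, user, name, data):
        """Normal upload: store each block under its hash, record the file."""
        hashes = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            hashes.append(digest)
        self.accounts.setdefault(user, {})[name] = hashes
        return hashes

    def claim(self, user, name, hashes):
        """Register a file from hashes alone.
        Returns the hashes the server does not have (empty list means success)."""
        missing = [h for h in hashes if h not in self.blocks]
        if not missing:
            self.accounts.setdefault(user, {})[name] = list(hashes)
        return missing

    def download(self, user, name):
        return b"".join(self.blocks[h] for h in self.accounts[user][name])

server = ToyServer()
file_c = b"C" * (5 * 1024 * 1024)                    # stand-in for file C
hashes_c = server.upload("alfred", "c.bin", file_c)  # Alfred uploads the real bytes

# Barbara never has the file; she only sends the hashes Alfred gave her.
missing = server.claim("barbara", "c.bin", hashes_c)
barbara_copy = server.download("barbara", "c.bin")
```

Notice the server in this sketch never checks that Barbara actually possesses the data, which is exactly what makes the hack work.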

Does that make sense?

[1]: http://en.wikipedia.org/wiki/Hash_function

[2]: Not really unique, but the idea of hash functions is that we turn each input into a "hash" which is really really really likely to be unique, so likely that we can treat it as unique.
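
To put a rough number on "really really really likely", here is the standard birthday-bound estimate. It assumes a 256-bit hash that behaves like a random function, with n distinct blocks stored; the example value of n is made up for illustration.

```latex
% Probability p that any two of n distinct blocks share a 256-bit hash
p \approx \frac{n(n-1)}{2 \cdot 2^{256}} \approx \frac{n^{2}}{2^{257}}

% Example: n = 2^{40} blocks (about 4 EiB of unique data at 4 MiB per block)
p \approx \frac{2^{80}}{2^{257}} = 2^{-177}
```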



