
It's also a method Dropbox can use to save storage space. Inevitably, two or more users will put the same file in their Dropbox. By computing and comparing hashes, they can store just one copy of the file and point all users to that single instance (they may also do an exact byte-for-byte check once the hashes match, since two different files can occasionally produce the same hash).
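A rough sketch of that idea, not Dropbox's actual implementation: the `DedupStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one blob per unique file content."""

    def __init__(self):
        self.blobs = {}  # hex digest -> file bytes (stored once)
        self.files = {}  # (user, path) -> hex digest (cheap pointer)

    def put(self, user, path, data):
        digest = hashlib.sha256(data).hexdigest()
        existing = self.blobs.get(digest)
        if existing is None:
            # First time this content is seen: actually store it.
            self.blobs[digest] = data
        elif existing != data:
            # Same hash, different bytes: an unintended collision.
            raise ValueError("hash collision between different files")
        # Either way, the user's entry just points at the shared blob.
        self.files[(user, path)] = digest
        return digest

    def get(self, user, path):
        return self.blobs[self.files[(user, path)]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())
```

If two users `put` the same bytes, `stored_bytes()` counts them once, while each user still has their own entry in `files`.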

I have no idea how much efficiency this buys them, but the same mechanism can be used to check files against a blacklist of known copyrighted or illegal material. In this case, having a file whose hash matches something on the copyrighted-material list AND sharing it is what triggers the action.
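The "matches the list AND is shared" condition could look something like this. It's only a sketch: `should_flag`, `blocked_hashes`, and the use of SHA-256 are hypothetical, not anything Dropbox has documented.

```python
import hashlib


def should_flag(data: bytes, is_shared: bool, blocked_hashes: set) -> bool:
    """Flag a file only if it is being shared AND its hash is on the list.

    blocked_hashes is a hypothetical set of hex SHA-256 digests supplied
    by whoever maintains the blacklist; the service never needs the
    original files, only their hashes.
    """
    digest = hashlib.sha256(data).hexdigest()
    return is_shared and digest in blocked_hashes
```

A private, unshared copy of a listed file would not trigger anything under this rule, and neither would sharing a file that isn't on the list.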

Just as easily, they could check uploads against hashes of known illegal material, such as child porn, and have a match trigger an automated notice to a law enforcement agency or something similar.

This can even be used for national security without revealing the contents of classified documents. Say another Snowden type grabs a bunch of classified documents. The agency could give Dropbox a list of hashes with instructions to "call us" if a user uploads a bunch of leaked documents to their Dropbox. They could even do this blind: just send Dropbox a hash of every classified document or piece of data they ever produce, and catch leakers before anybody in the agency even knew something had leaked.

Say a company has 1,000 accounts at 1TB each, with ~600GB per user in real-life usage (about 0.6PB in total). Since there's going to be a lot of crossover between those users' files, they could save as much as ~0.5PB if most of that data overlaps.
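To put numbers on that, here's a quick back-of-the-envelope calculation. It assumes decimal units (1PB = 1,000,000GB); the function names are made up for illustration, and the overlap figure is just what the 0.5PB estimate would imply, not a measured value.

```python
GB_PER_PB = 1_000_000  # decimal units: 1PB = 1,000TB = 1,000,000GB


def total_usage_pb(users: int, used_gb_per_user: float) -> float:
    """Total data stored if every user's files are kept separately."""
    return users * used_gb_per_user / GB_PER_PB


def implied_duplicate_fraction(saved_pb: float, total_pb: float) -> float:
    """Share of the stored data that must be duplicate to save saved_pb."""
    return saved_pb / total_pb


total = total_usage_pb(1000, 600)               # 0.6PB across the company
needed = implied_duplicate_fraction(0.5, total)  # roughly 0.83
print(f"total: {total}PB, overlap needed for 0.5PB saved: {needed:.0%}")
```

So a 0.5PB saving requires a bit over 80% of what users store to be copies of something already stored, which is the optimistic end of the range.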
