
Ask HN: For Facebook, what's the most efficient way to erase data from backups? - ggregoire
Mark Zuckerberg just told Congress that Facebook erases users' data from its backups when a user deletes their account.

For one small database, I'd restore the dump, delete the data, and dump again without it. But at Facebook's scale, what's the most efficient way to achieve this?

I don't know how many MySQL instances my data is in, or what the backup frequency and retention time are. But let's say my data is in one MySQL instance, backed up every day, with backups kept for one week. That makes 7 dumps of several petabytes each. Let's say I also have some data in other DBs (Cassandra, Redis, etc.). And let's say 10,000 users delete their accounts every day. How does Facebook do it?
======
dmlittle
If you interpret "deleting" data as making the data inaccessible, would
discarding the encryption key of encrypted data count as it being deleted?

For example, let's say I encrypt each user's data with a different, unique
encryption key. In order to access the data, I need to fetch the contents for
that user and then use their decryption key to decrypt them. The encrypted
data can be regularly backed up and archived. If I ever need to delete a
particular user's data, I could simply discard that user's decryption key.
While I would still hold ciphertext that could in principle be transformed
back into the user's data, I would no longer have the ability to read it
(assuming nobody has the power to easily crack the encryption).
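
To make the idea concrete, here is a rough sketch in Python. Everything in it is hypothetical and only for illustration: the `CryptoShredStore` class, its method names, and the in-memory dicts standing in for a key store and a database are my own assumptions, not anything Facebook is known to use. The cipher is a toy XOR keystream built from the standard library so the example runs without extra packages; a real system would use an authenticated cipher such as AES-GCM and a proper key management service.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher for illustration only (NOT production crypto):
    # XOR the data with SHAKE-256(key || nonce) output. Applying it twice
    # with the same key and nonce returns the original bytes.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class CryptoShredStore:
    """Hypothetical store: per-user keys live apart from the backed-up data."""

    def __init__(self):
        self.keys = {}     # user_id -> key; assumed to be kept out of backups
        self.records = {}  # user_id -> (nonce, ciphertext); this gets backed up

    def put(self, user_id, plaintext: bytes):
        # Create the user's key on first write, then store only ciphertext.
        key = self.keys.setdefault(user_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        self.records[user_id] = (nonce, _keystream_xor(key, nonce, plaintext))

    def get(self, user_id, records=None):
        # Read from live records by default, or from a restored backup.
        records = self.records if records is None else records
        key = self.keys[user_id]  # raises KeyError once the key is discarded
        nonce, ciphertext = records[user_id]
        return _keystream_xor(key, nonce, ciphertext)

    def backup(self):
        # A backup copies ciphertext only; keys are deliberately excluded.
        return dict(self.records)

    def forget(self, user_id):
        # "Delete" the user: drop the key and the live record. Old backups
        # still contain the ciphertext, but nothing can decrypt it anymore.
        self.keys.pop(user_id, None)
        self.records.pop(user_id, None)


store = CryptoShredStore()
store.put("alice", b"private posts")
snapshot = store.backup()   # taken before the account is deleted
store.forget("alice")       # the backup is never touched

try:
    store.get("alice", records=snapshot)
except KeyError:
    print("alice's backed-up data is unreadable: key is gone")
```

The point of the design is that deleting an account becomes one small write to the key store instead of a rewrite of every multi-petabyte dump. The catch is that the key store itself must not be backed up with long retention, otherwise the discarded key can be restored too.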

