I can tell you that GDPR is going to cause issues with block-based backups. Many hosting providers don't separate customers onto different block devices, so when you back up a block device you end up with snapshots that hold many different organizations' data.
Part of making good backups is knowing that the backup can't change. The only solution now is to add paths to go back and modify those backups to remove customer data when asked to.
Reading https://ec.europa.eu/info/law/law-topic/data-protection/refo... I would agree, provided of course that the identifier is not in some other database that maps it to a person. If you have just IDs in a backup and you remove the person-to-ID mapping, this should be fine.
I've seen a lot of people talk about having a separate table of IDs that should be removed when you restore a backup. It seems like a pretty viable solution, at least from what I've seen.
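To make the "separate table of IDs" idea concrete, here is a rough sketch in Python. The names (`restore_with_erasures`, `customer_id`, the sample rows) are made up for illustration, and it assumes the erasure list lives outside the backup so the backup itself never has to change.

```python
def restore_with_erasures(backup_rows, erased_ids):
    """Return the restored rows, skipping customers erased after the backup was taken."""
    erased = set(erased_ids)
    return [row for row in backup_rows if row["customer_id"] not in erased]


# Hypothetical immutable backup, taken before any erasure request arrived.
backup_rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.com"},
    {"customer_id": 3, "email": "c@example.com"},
]

# Hypothetical erasure list, stored in the live system rather than in the backup.
erased_ids = [2]

restored = restore_with_erasures(backup_rows, erased_ids)
print([row["customer_id"] for row in restored])
```

The backup still contains customer 2, so this only works if every restore path goes through the filter.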
The conventional solution to that problem I’ve heard for the last couple of decades is to use encryption, so the backup doesn’t need to be altered ahead of your normal rotation schedule as long as you can provably drop a customer’s key on demand.
The backups are encrypted, but there is no way for the backup software to tell one client's data from another's. It's block-based, so all it sees is a volume.
Most hosting providers, or anybody really, don't create a new volume for each customer. They simply have a directory per client. Once you start needing to know more about the file system, you sort of waste all the benefits block-based backups provide.
By block-based I mean volume-based, where we simply copy the allocated blocks of the file system that changed between each backup.
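As a rough illustration of why the backup software can't tell customers apart, here is a small Python sketch of a changed-block copy. It is an assumption-laden toy: real products usually track changed blocks at the driver or file system level rather than hashing the whole volume, and `BLOCK_SIZE`, `block_hashes`, and `changed_blocks` are names invented for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_hashes(volume: bytes):
    """Hash every fixed-size block of the volume."""
    return [
        hashlib.sha256(volume[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(volume), BLOCK_SIZE)
    ]


def changed_blocks(previous_hashes, volume: bytes):
    """Return {block_index: block_bytes} for blocks that differ from the last backup."""
    changed = {}
    for index, digest in enumerate(block_hashes(volume)):
        if index >= len(previous_hashes) or previous_hashes[index] != digest:
            start = index * BLOCK_SIZE
            changed[index] = volume[start:start + BLOCK_SIZE]
    return changed


# Two versions of a tiny three-block "volume"; only the middle block changed.
volume_v1 = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
volume_v2 = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE

delta = changed_blocks(block_hashes(volume_v1), volume_v2)
print(sorted(delta))
```

Nothing in `delta` says which customer owns block 1. All the backup has is an index and some bytes.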
I think the parent means encrypting each customer's data with a key specific to that customer. When you erase that customer's key, their data becomes irreversibly unreadable.
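A minimal sketch of that key-per-customer idea, in Python with only the standard library. Big caveat: the XOR keystream below is a toy stand-in so the example runs anywhere, not something to use on real data. A real system would use an authenticated cipher such as AES-GCM from a vetted library and keep keys in a proper key store. All names here (`customer_keys`, `read_from_backup`, the customer names) are hypothetical.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode. Illustration only, NOT secure practice."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


# One key per customer, stored separately from the backup.
customer_keys = {
    "acme": secrets.token_bytes(32),
    "globex": secrets.token_bytes(32),
}

# The backup only ever holds ciphertext and is never modified afterwards.
backup = {
    name: encrypt(key, f"{name} records".encode())
    for name, key in customer_keys.items()
}

# Erasure request from "acme": drop the key, leave the backup untouched.
del customer_keys["acme"]


def read_from_backup(name: str):
    """Return the plaintext if the customer's key still exists, otherwise None."""
    key = customer_keys.get(name)
    return None if key is None else decrypt(key, backup[name])


print(read_from_backup("globex"))
print(read_from_backup("acme"))
```

The ciphertext for the erased customer is still sitting in the backup, but without the key there is nothing to decrypt it with.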
I get that, but the problem is that the way data is stored today, many customers share a single volume, and when that volume is backed up there is normally one key per volume.
I guess the real issue is who will be responsible for ensuring backups are stored in a way that keeps different clients isolated.
As somebody who makes backup software I know the burden will at some point be on my plate.
That being said, if people stored data differently and actually had a key per customer, then the backup software wouldn't matter because, like the parent and you said, you just delete the key. But nothing really works like that today, and it will require a massive amount of software to be rewritten to handle this sort of stuff. So until then, either you can't back up your data or you make the backup provider figure it out.
Part of making good backups is knowing that the backup can't change. The only solution now is to add paths to go back and modify those backups to remove customer data when asked to.
That is my plight anyways.