
Ask HN: How are you implementing GDPR-compliant soft deletes? - xstartup
The idea: a customer requests account or item deletion, you set it to "deleted" in the DB without actually deleting it, and this helps for documentation purposes should a dispute arise over some issue in future.
======
snowwolf
Soft deletes as you describe aren't allowed under GDPR.

But a possible solution may be to disassociate the data from the customer (as
long as the data itself isn't considered 'personal data'). For example, if the
reason the data falls under GDPR is that it is connected to the customer's
email address, you could clear the email address. But that wouldn't let you
ever re-associate it. You could maybe (but don't take my word for it) one-way
hash the email address (bcrypt, etc.) and, if a dispute arises in the future,
scan the hashes for a match with the email address of the person raising the
dispute in order to re-associate the data.
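
A minimal sketch of the hash-and-rescan idea above, in Python. It swaps
bcrypt for an HMAC-SHA256 keyed hash from the standard library so that it runs
without extra packages; `SECRET_KEY`, `pseudonymise` and `find_records` are
hypothetical names. Whether such a hash still counts as personal data is a
legal question the code does not settle.

```python
import hashlib
import hmac

# Assumption: a server-side secret stored apart from the data it protects.
SECRET_KEY = b"replace-with-a-real-secret"


def normalise(email: str) -> str:
    """Trim and lower-case so the same address always hashes identically."""
    return email.strip().lower()


def pseudonymise(email: str) -> str:
    """Return a keyed one-way hash of the email address."""
    digest = hmac.new(SECRET_KEY, normalise(email).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def find_records(records, email):
    """Scan stored hashes for a match with the disputing person's email."""
    wanted = pseudonymise(email)
    return [r for r in records if hmac.compare_digest(r["email_hash"], wanted)]


# The email column has been cleared; only the hash is kept.
records = [
    {"order_id": 1, "email_hash": pseudonymise("alice@example.com")},
    {"order_id": 2, "email_hash": pseudonymise("bob@example.com")},
]
matches = find_records(records, "Alice@Example.com ")
```

Because the same input always yields the same hash, anyone holding the key and
a candidate email address can re-identify the record.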

~~~
samsari
My understanding, as someone implementing the GDPR-compliance for my company
right now, is that if you could produce the same one-way hash a second time
from the same input email address then the hash is still considered PI.

~~~
snowwolf
Fair point.

I can see the purpose of "right to be forgotten", but I think in some
circumstances it is going to be abused. Any service that "bans" users for
fraudulent/abusive activity and stores data about the banned user to prevent
them from creating new accounts is going to have a problem. Banned user can
just request to be forgotten and then create a new account. Unless there is
some exception within GDPR that will support this use case.

~~~
mattmanser
This is probably covered by article 17, 3, e:

[https://gdpr-info.eu/art-17-gdpr/](https://gdpr-info.eu/art-17-gdpr/)

You're allowed to ignore deletion rules for the purposes of:

 _3 Paragraphs 1 and 2 [the rights to be forgotten] shall not apply to the
extent that processing is necessary:

(e) for the establishment, exercise or defence of legal claims._

------
eb0la
Consult your Data Protection Officer first.

GDPR says you _must_ delete information about the customer; but there are
cases where you still might need to have that data available.

If your customer can interact with another one inside your app/platform,
he/she can commit a crime, and you might be required by court (and by law) to
disclose some information (even conversations! inside the platform).

Setting something to "deleted" might not be the best way to do the actual soft
delete.

Sometimes you can "delete" that user moving it to a separate part of an LDAP
branch (where nobody except someone with authority can access).

In other cases, you can add the "deleted" flag on the table. If so, _MAKE
SURE_ your app accesses the data through a view of the table where the
'deleted' users are not present. Even better: partition the underlying table
based on the "deleted" field to physically separate active and deleted users.
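
A rough sketch of the flag-plus-view approach, using Python's built-in
`sqlite3` module. The table, column and view names (`users`, `deleted`,
`active_users`) are assumptions made for illustration, and SQLite has no table
partitioning, so that part is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0
    );
    -- The application only ever queries this view, never the base table.
    CREATE VIEW active_users AS
        SELECT id, email FROM users WHERE deleted = 0;
""")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@example.com",), ("bob@example.com",)],
)


def soft_delete(conn, user_id):
    """Flag the row as deleted; it then disappears from active_users."""
    conn.execute("UPDATE users SET deleted = 1 WHERE id = ?", (user_id,))


soft_delete(conn, 1)
visible = [row[0] for row in conn.execute("SELECT email FROM active_users ORDER BY id")]
```

The flagged row is still physically present in `users`, which is exactly why
access to the base table needs to be restricted.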

But whatever you do, ask your Data Protection Officer first.

~~~
roel_v
Is there anyone reading this whose company has a DPO already? Is it an
internal or external person? How technical are they? I'm a developer and I
have a law degree; would that put me in an advantageous position to become
one? Is there a market for 'consulting DPO's', like companies hire
accountants, if that's allowed? Or do the big consultancy firms have the GDPR
market cornered already? I wouldn't want to go in a direction where I would
become what today's 'security auditors' do - go through a checklist of mostly
irrelevant topics, drum up a list of 'recommendations' that usually aren't
relevant or misunderstand the situation but nobody cares anyway because
it's all just busywork to get 'certified' for this or that (or insurance
requires it). But if it would be actually working with technical teams on
questions like this, that would be interesting.

~~~
jacquesm
> Is there anyone reading this whose company has a DPO already?

I've seen one in all of 2017 (out of ~20 companies).

> Is it an internal or external person?

In that case it was internal

> How technical are they?

More legal than technical, but that's a very small sample.

> I'm a developer and I have a law degree; would that put me in an
> advantageous position to become one?

Yes. In fact that's probably one of the most lucrative combinations of fields.

> Is there a market for 'consulting DPO's', like companies hire accountants,
> if that's allowed?

YES! In fact if you are halfway decent this would be an extremely lucrative
thing to do, but it probably will become less so over time as the knowledge
gets diffused.

------
Ezku
> it helps for documentation purpose should the dispute arise over some issue
> in future.

If you are required to hold on to the data for legal purposes such as dispute
settlement, there is no issue. The customer can request you delete such data
but you have no obligation to do so. Issues arise when holding on to the data
is no longer "necessary". At that point soft deletion is not enough and you
must be able to remove personal data regarding that customer from your
systems. Source: IP lawyer I talked with last week on this.

From what I can tell, one good way of going about this in case you really do
not want to throw away data – such as when you're doing event journaling –
would be to have all personal data encrypted. When the data reaches its
expiration point or is requested for deletion, throw away the decryption key
and all you're left with is the metadata which you are allowed to hold on
to for purposes of running your business.

EDIT: This concept is called 'cryptographic deletion'. Here's one whitepaper
on the subject.
[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.397....](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.397.857)
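
The cryptographic deletion idea above could look roughly like this in Python.
To stay self-contained it uses a toy stream cipher (XOR with a SHAKE-256
keystream), which is a stand-in for illustration only; a real system would use
an authenticated cipher such as AES-GCM from a vetted library. `CryptoShredder`
and its methods are hypothetical names.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy cipher for illustration only: XOR the data with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class CryptoShredder:
    """Keeps one key per user, stored apart from the encrypted journal."""

    def __init__(self):
        self._keys = {}  # user_id -> key

    def encrypt(self, user_id, plaintext: str):
        key = self._keys.setdefault(user_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)  # fresh nonce for every value
        return nonce, _keystream_xor(key, nonce, plaintext.encode("utf-8"))

    def decrypt(self, user_id, nonce: bytes, ciphertext: bytes):
        key = self._keys.get(user_id)
        if key is None:
            return None  # key was thrown away: the value is unrecoverable
        return _keystream_xor(key, nonce, ciphertext).decode("utf-8")

    def forget(self, user_id):
        """'Delete' every value for this user by discarding their key."""
        self._keys.pop(user_id, None)


shredder = CryptoShredder()
nonce, blob = shredder.encrypt("user-42", "alice@example.com")
journal = [{"event": "signup", "user": "user-42", "email": (nonce, blob)}]
before = shredder.decrypt("user-42", nonce, blob)
shredder.forget("user-42")
after = shredder.decrypt("user-42", nonce, blob)
```

The journal entry is never rewritten; only the key store changes, which is what
makes this attractive for append-only event logs. Key backups have to be purged
too, or the data remains recoverable.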

~~~
andriesvh
Are you sure that throwing away encryption keys is sufficient to be GDPR
compliant? Does this come from the IP lawyer as well?

~~~
rbf
If the information no longer is possible to decrypt, it would no longer be
considered personal data.

------
zimpenfish
I don't think "soft deletes" are actually allowed, are they?

[https://gdpr-info.eu/art-17-gdpr/](https://gdpr-info.eu/art-17-gdpr/)

> the controller shall have the obligation to erase personal data without
> undue delay

------
raptorcomp
Some comments on the legal aspect of deletion under the GDPR: 1. Deletion can
generally only be requested if the personal data is being processed under the
individual's consent. Thus, other personal data, such as data processed under
legitimate interest or for execution of a contract, does not fall under it.
2. The rule on deleting the data is not absolute, as data retention laws
prevail over this rule. Thus, only if no data retention law mandates the
storing of the data (which is often the case for business communication) are
you obliged to delete the data or anonymize it. Hope this clarifies the
non-technical aspect.
Dominic Staiger
[https://www.raptorcompliance.com/en](https://www.raptorcompliance.com/en)

------
apw808
OK. Here goes... Advice...

Firstly, download the regulation itself. At the very least, read article 17.
Article 17 concerns something called "The right to be forgotten". It is about
25 lines of legalese. Once you have read it, have a think about it. Then read
it again and have a longer think about its implications. Then, just to be sure
you have not gone stark staring bonkers, read the rest. Carefully. Be under no
illusion, the GDPR is a game changer in information management terms, let
alone anything else.

------
jjoergensen
Deletion of backup-data is also an interesting topic

~~~
jakozaur
The law itself was written by someone unaware of that. A lot of
interpretations:

1\. The most extreme, go back to all of your backups and delete them too.

2\. You don't need to do anything, if you do not touch the backups and truly
treat them for disaster recovery.

3\. Your backups need to have reasonable retention (e.g. two years) and a way
to re-apply erasure requests after recovery.

4\. A lot of in between.

5\. My personal interpretation is that in the first year of GDPR there will be
so many companies that are not even trying to be compliant. Any company showing
reasonable effort will just be left alone and at worst hear some
recommendations. Of course ad-tracking companies might get screwed, but their
business model seems to be incompatible with GDPR.
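
Interpretation 3 in the list above (re-applying erasure requests after a
restore) might be sketched like this in Python. The erasure log is assumed to
live outside the backed-up database and to hold only opaque user IDs; all names
here are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: this log is stored separately, so a restore does not roll it back.
erasure_log = []


def record_erasure(user_id, when=None):
    """Note that a user was erased, keeping only the opaque ID and a timestamp."""
    erasure_log.append({"user_id": user_id, "at": when or datetime.now(timezone.utc)})


def reapply_erasures(restored_users, log):
    """Drop every restored user whose erasure was recorded in the log."""
    erased = {entry["user_id"] for entry in log}
    return {uid: data for uid, data in restored_users.items() if uid not in erased}


# A backup taken before user 1 asked to be forgotten.
backup = {1: {"email": "alice@example.com"}, 2: {"email": "bob@example.com"}}
record_erasure(1)

# Disaster strikes, the backup is restored, and the log is replayed over it.
live = reapply_erasures(backup, erasure_log)
```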

Also, the right to erasure can be tricky (e.g. what if you keep records for
support/warranty purposes?). What should you do if someone exercises their
right to be forgotten and then asks you for a refund?

~~~
ZenoArrow
In what world is two years worth of backups required for reasonable retention?
Either the backups are tiny or the company involved has got more money than
sense. I'd see no reason, in any company, for backups to be held for longer
than 6 months, and that would be an outside estimate (many companies could get
away with only having a couple of months worth of backups).

~~~
tyfon
In Norway you need to keep accounting data for ten years to be compliant with
accounting laws, so if you lose eight of them you would be in violation.

Edit: I see you mean retention. I guess that matters if you discover after a
year that your backup routine has been malfunctioning and you need older
backups, but that is not that common, I'd guess.

~~~
moviuro
Backups and archives are a different matter. Archives are meant to not be
altered; ever (WORM disks[0]). Backups are made to be rewritten, ~~lost~~, or
used for restoration.

[0]
[https://en.wikipedia.org/wiki/Write_once_read_many](https://en.wikipedia.org/wiki/Write_once_read_many)

------
unkoman
We're building a consent framework API so our customers can consent for
personal data use. Data is then cleaned and transformed (ETL) from personally
identifiable to pseudo-anonymised. The data is also separated into two
separate encrypted storages for anonymous and pseudo-anonymised data for
generalisation and separation. Random (important) hashed identifiers are
created and put into a metadata service which is used as a lookup-table. If
right to be forgotten is invoked, the data is disassociated from the pseudo-
anonymised and personally identifiable data thus making it anonymous.
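
A small Python sketch of the lookup-table disassociation described above, as
one reading of it rather than our actual service: random identifiers stand in
for the person, and dropping the mapping cuts the link. `PseudonymRegistry` is
a hypothetical name, and whether the leftover events are truly anonymous
depends on what else they contain.

```python
import secrets


class PseudonymRegistry:
    """Lookup table from a person to a random identifier (the metadata service)."""

    def __init__(self):
        self._by_person = {}  # person_id -> random pseudonym

    def pseudonym_for(self, person_id):
        if person_id not in self._by_person:
            # Random rather than derived from the person, so it cannot be recomputed.
            self._by_person[person_id] = secrets.token_hex(16)
        return self._by_person[person_id]

    def knows(self, person_id):
        return person_id in self._by_person

    def forget(self, person_id):
        """Right to be forgotten: drop the mapping, leaving the events unlinked."""
        return self._by_person.pop(person_id, None) is not None


registry = PseudonymRegistry()
events = [
    {"subject": registry.pseudonym_for("alice@example.com"), "action": "login"},
    {"subject": registry.pseudonym_for("alice@example.com"), "action": "purchase"},
]
forgotten = registry.forget("alice@example.com")
```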

Important is also how you handle data analytics and this is why we're
deploying high restrictions on raw data. Analytics will only be able to be
done through an analytics service which can give the employees access to only
certain parts of the data which is approved for the use-case. We're using
Apache Sentry for fine grained role based authorisation to data and metadata
and a directory services for user auth.

Things we've learned:

* Minimise data usage

* Don't use personally identifiable data

* You will need to be able to prove consent when it comes to data usage and it cannot be consent by default, it has to be opt-in

* Log all data access so that use cases can be proved. This needs to be evaluated and audited

* Encrypt in transit and at rest

* Centralise mapping for all data

------
rimeice
It may be that the future issues you envisage can still be met with anonymous
data i.e. some overall audit of your service use. Anonymous data retains a
small link to the original but is exempt from GDPR. One method is to
generalise records in a database: for example, mask or remove direct
identifiers like names, put ages into age brackets and fuzz spatial data by x
distance. This can be used to deal with erasure requests, subject access
requests and sharing data in general. There's a lot of advice on this from the ICO -
[https://ico.org.uk/media/1061/anonymisation-code.pdf](https://ico.org.uk/media/1061/anonymisation-code.pdf)
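
A minimal Python sketch of the generalisation steps above: drop the name,
bucket the age and jitter the coordinates. The bracket width and the offset
(here in degrees) are arbitrary assumptions, and the function names are
hypothetical; this alone does not guarantee that a record cannot be
re-identified.

```python
import random


def age_bracket(age, width=10):
    """Map an exact age to a bracket such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def fuzz_coordinate(value, max_offset, rng):
    """Shift a coordinate by a random amount of at most max_offset."""
    return value + rng.uniform(-max_offset, max_offset)


def generalise(record, rng, max_offset=0.01):
    """Return a copy with direct identifiers removed and the rest coarsened."""
    return {
        # 'name' is deliberately not copied across.
        "age_bracket": age_bracket(record["age"]),
        "lat": fuzz_coordinate(record["lat"], max_offset, rng),
        "lon": fuzz_coordinate(record["lon"], max_offset, rng),
    }


rng = random.Random()
original = {"name": "Alice Example", "age": 34, "lat": 51.5074, "lon": -0.1278}
anonymised = generalise(original, rng)
```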

Disclaimer: I'm a fan of anonymisation because I'm working on a project to
bundle this into a service - [https://anon.ai](https://anon.ai) - would be
great to understand more about your use case.

------
he0001
All data under GDPR which also falls under another law can be stored, but NOT
used for any other purpose than of the overlaying laws. So just because you
have the information doesn’t mean you can use it. Having the same information,
even “soft deleted” would violate GDPR since you are not allowed to store it
like that. Having information grouped like that still implies information
which will be under GDPR. Like if you store this information in the “customer”
database that will imply that the customer was a customer at some time, and
then illegal by GDPR. If you move this information to another database which
stores “transactions” which you require to do by law that’s ok. But you are
not allowed to query that information to answer the question, “who have been
my customers”.

------
apw808
Hiya. If I may, a couple more things to add to considerations... Firstly,
while you rightly concentrate on the design impact of article 17, I guess most
of you live and work in the US. What that almost certainly means, if you
deliver SaaS, is that personal data is exported from the EU for use. You guys
really need to understand GDPR Chap 5, articles 44 to 50 inclusive and get
hold of something called "treaty 108" which is about the idea of "adequacy".

And secondly, I am going to ask someone I know who is a legal eagle to sign up
to this site. She is very clever and knows this stuff backwards. She also has
an aspiration to come and work in the US....

------
yread
What if you outsource the PII? For example use a payment processor and only
store their reference. You can always go back to transaction in the payment
processor in case of disputes, but you don't store the personal information.

~~~
jacquesm
Then you will need a data processing agreement with that IPSP and they will
need to prove to you that they are in compliance. The good news is that in
that situation there is probably more knowledge and funding available to get
things right (but it is obviously not a guarantee).

------
apw808
Because the GDPR is a tad vague, given the interconnected world we live in
nowadays (I am a Brit living in a (very) small town called Tidworth, which is a
pin prick on the map), there is something called "Working Party 29" which is
providing some fairly detailed and verbose refinements of the definitions of
key phrases and terms. This is an attorney's beanfeast as far as I am
concerned, because of the vagueness if nothing else.

~~~
jacquesm
You can 'edit' your comments until they are one hour old, so you don't have to
make many different top level comments.

Also, if you want to make a larger comment it is usually possible to resize
the input field.

------
romanovcode
Your idea is not allowed under GDPR. You need to unassociate the account from
the user.

That is: Remove all personal information - email, name, surname, etc..

------
mr1976
This is how it was explained to me, and the approach we take. YMMV.

Understand what your lawful basis for storing and processing the data is, and
that dictates how you need to handle it. Plenty of people throw around large
fines and rights to be forgotten and make sweeping statements about "you can't
keep X data/we can hash things!/delete all the things". You have a number of
potential reasons for storing or processing data. Legal reasons, contractual
reasons, consent based reasons - data subjects have different rights depending
on the reason you're storing the data (there is a list of these reasons; see
below.)

I may be keeping data because I have a contract with a client, and I require
the data. An example of this would be an email address stored as a login. My
legal basis for storing the information is contractual (I have a contract with
my client). Does the data subject (in this case, the user whose email it is)
have a right to erasure? No. Can I store the information in my Amazon RDS
instance in Virginia? Yes. Provided I've explicitly stated it, and been
transparent about how and where I'm storing and processing the data, and my
client has agreed. Do I need to secure and look after the data? Yes.
Obviously. Do I need to get consent from the user? No. That wasn't the
legal basis for storing the information.

What about consent based stuff? I may have someone subscribe to my mailing
list. I get their consent. Therefore, the legal basis for storing and
processing is consent. That consent should be time limited, and I should be
transparent about it. I need to give them the means to review, withdraw and
act on their consent should they wish to.

What about keeping a record of a person you've deleted because they have
requested it? You can store this. Lawful basis is that it's a legal
requirement. If you go and use the data for any other purpose, it's not
allowed - because that would require their consent.

If you want to understand lawful basis, this is a good overview -
[https://ico.org.uk/for-organisations/guide-to-the-general-
da...](https://ico.org.uk/for-organisations/guide-to-the-general-data-
protection-regulation-gdpr/lawful-basis-for-processing/)

Trying to wade through one half of the GDPR and its requirements without
understanding lawful basis leads to confusion, because you'll keep hitting
cases that seem completely unreasonable (because they may not be required).
Trying to paint the law vague and unreasonable defeats the point - it will
become less vague over time. It makes privacy a first class citizen (something
we sorely need), and will become more specific as it's tested in the courts,
just like any other law in jurisdictions that value legal precedent.

Get to know it and work with it - this is not your mother's EU cookie law.

~~~
socalnate1
This is by far the best and most cogent answer in the entire comment section.
Pity it's so far down the list.

------
apw808
But it is only part of a series of new bits of legislation being introduced in
Europe related to computer security. Another acronym is "PECR", something all
marketeers should have a look at as it governs things like email
distribution.

------
gtsteve
Another pattern you might consider is setting an expiry date on a database
row. You only show values where the expiry date is null. Every night you have
a cronjob or some other process that deletes all objects that have expired.
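
Something along these lines, sketched with Python's built-in `sqlite3`. The
table and column names and the 30-day grace period are assumptions; the purge
function is what the nightly cronjob would call.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT, expires_at TEXT)")


def iso(dt):
    # Fixed-width format, so string comparison matches chronological order.
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def request_deletion(conn, item_id, now, grace=timedelta(days=30)):
    """Mark the row as expiring instead of deleting it straight away."""
    conn.execute("UPDATE items SET expires_at = ? WHERE id = ?", (iso(now + grace), item_id))


def visible_items(conn):
    """Only rows without an expiry date are shown to the application."""
    rows = conn.execute("SELECT id FROM items WHERE expires_at IS NULL ORDER BY id")
    return [r[0] for r in rows]


def purge_expired(conn, now):
    """Nightly job: hard-delete every row whose expiry date has passed."""
    cur = conn.execute(
        "DELETE FROM items WHERE expires_at IS NOT NULL AND expires_at <= ?",
        (iso(now),),
    )
    return cur.rowcount


now = datetime(2018, 2, 1, tzinfo=timezone.utc)
conn.executemany("INSERT INTO items (body) VALUES (?)", [("first",), ("second",)])
request_deletion(conn, 1, now)
```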

~~~
pan69
You probably already add a "time created" column/field to each row/document
(at least, I personally do this to all data I store in a database).

If you want to expire something, you can now easily calculate it and you can
even easily change the timeout/lifespan without having to update the data in
the database.

~~~
raverbashing
How would that help? Would you set time created to zero?

~~~
pan69
Rather than having an expire date, with a creation time you can e.g. delete
everything older than 1 day, or what ever. I.e. you don't hard code the expire
time in the data itself. If you want to change the timeout to 2 days it's a
change in your delete logic, not your data.
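
A short Python sketch of what I mean, using the built-in `sqlite3` module. The
`events` table and the timestamps are made up for illustration; the point is
that the lifespan is an argument to the delete, not a value stored in each row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL)")

DAY = 24 * 60 * 60  # seconds


def purge_older_than(conn, now, max_age_seconds):
    """Delete rows created more than max_age_seconds before 'now'."""
    cur = conn.execute("DELETE FROM events WHERE created_at < ?", (now - max_age_seconds,))
    return cur.rowcount


now = 1_518_000_000  # a fixed Unix timestamp, so the example is repeatable
conn.executemany(
    "INSERT INTO events (created_at) VALUES (?)",
    [(now - 3 * DAY,), (now - 36 * 60 * 60,), (now - 60,)],
)

# Changing the lifespan is a change to the argument, not to the stored data.
removed_at_two_days = purge_older_than(conn, now, 2 * DAY)
removed_at_one_day = purge_older_than(conn, now, 1 * DAY)
```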

------
chesimov
What would be the sort of dispute you envisage?

~~~
CamperBob2
Government: "You know that data you were required to delete when $(USER)
requested to be forgotten? We require you to provide it in connection with our
ongoing investigation of $(USER)."

~~~
devdad
Is this a real issue though? If I comply with regulations to remove data as
required by law, I'd be surprised if a government body could require me to
provide data I am supposed to have deleted.

~~~
kasey_junk
This is a _very real_ GDPR fear. Some of its mandates run counter to other
local data retention mandates.

It’s not clear yet how that is going to shake out.

~~~
jacquesm
As HN'er detaro notes in this comment:

[https://news.ycombinator.com/item?id=16366864](https://news.ycombinator.com/item?id=16366864)

There are some provisions for those situations.

And on the subject of backups, those are typically exempt but there are some
obvious problems there when you restore a backup at a later time.

To me the big ticket items in the GDPR are the notification duty and the data
processing agreement 'chain' that gives some level of certainty that the
companies you deal with are going to take this seriously.

The implementation details and all the moving bits and pieces are most likely
not going to be the parts where the real tests will be in the first year or
two.

~~~
kasey_junk
I agree with your assessment but the penalty part of GDPR is making lawyers
more jumpy than any regulation I’ve seen.

I’m putting essentially everything in the we’ll see category.

~~~
jacquesm
I see that as good news :)

It looks like the GDPR at least gets people's attention.

------
no1youknowz
About the GDPR, can anyone recommend a company in the UK they have dealt with,
that brought them up to compliance?

~~~
jacquesm
If you aim to do this before May 25th you will find that anybody that is
capable is fully booked for the remainder of 2018.

~~~
no1youknowz
> you will find that anybody that is capable

Define capable. Look at this thread as an example. Many answers contradict
each other. There are so many ways to interpret the guidelines, which in many
cases have not been thought through.

I have engaged in discussions with 5 companies located in the UK. All gave
differing answers on specific questions relating to data for marketing,
finance, and fraud.

~~~
outsideoflife
General compliance advice

1) Act in good faith. DPA fines seem to have gone to people who had a blatant
disregard for data protection and their customers, not those who tried hard
but committed some technical breach.

2) Whenever new rules come out there is a long period of interpretation.
Unless you are in a very high risk category I wouldn't 'throw the baby out
with the bath-water' in the interim.

3) Documentation wins court cases.

4) Personally I was already trying not to have my data stolen, so I am not
overly concerned by GDPR. I am updating some policies, employee handbooks and
terms. I will watch how other companies deal with it before I act too rashly.

~~~
jacquesm
The most important bit that you can do that is actionable and that will not be
open to interpretation is to have someone competent write a clause into
employment contracts regarding data confidentiality, to put in place a
protocol on how to deal with various levels of breaches and to review your
site's privacy policy to ensure that it is still applicable (this is something
you should be doing regularly anyway).

On the whole I think your approach is a very balanced and reasonable one,
especially the 'act in good faith' bit. What surprises me is that plenty of
companies explicitly do not act in good faith and try to interpret the
regulation creatively so that they can continue to do what they were already
doing without modification. That's asking for trouble in my opinion, some
companies in that bracket will find themselves in the un-enviable position of
being used to educate the rest.

Especially in adtech and marketing there will be a lot of tension between
business goals and the law as written and the finer you want to ride that line
the more important it becomes to have competent guidance.

------
apw808
Ladies, gents... The kinds of things you are saying/asking about GDPR are the
same kinds of things being asked on this side of the pond...

