
Data at rest must be encrypted.

Field level encryption. Just like password files. Salt and hash any potentially identifying information.
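A minimal sketch of what "salt and hash" could look like for one identifying field, using only Python's standard library. The function names (`new_salt`, `blind_field`, `matches`), the normalization step, and the iteration count are my own assumptions for illustration, not something taken from the book. Note that a unique salt per record (as in a password file) means you need some other key to find the row, while one salt per column allows equality lookups but lets identical values collide.

```python
import hashlib
import hmac
import os


def new_salt() -> bytes:
    """Return 16 random bytes to use as a salt (hypothetical helper)."""
    return os.urandom(16)


def blind_field(value: str, salt: bytes, iterations: int = 100_000) -> str:
    """Replace an identifying value with a salted, stretched hash.

    The value is normalized (trimmed, lower-cased) first so that
    "Jane Doe" and " jane doe " map to the same stored digest.
    The iteration count is an assumed value, not a recommendation.
    """
    normalized = value.strip().lower().encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", normalized, salt, iterations).hex()


def matches(candidate: str, salt: bytes, stored: str) -> bool:
    """Check a candidate value against a stored digest in constant time."""
    return hmac.compare_digest(blind_field(candidate, salt), stored)


if __name__ == "__main__":
    salt = new_salt()
    stored = blind_field("Jane Doe", salt)
    # The database keeps only `stored`; the name itself is never written.
    print(matches("jane doe", salt, stored))
    print(matches("John Roe", salt, stored))
```

Someone who already knows the name can confirm a match, but a dump of the table alone does not reveal it. Low-entropy identifiers (dates of birth, ID numbers) can still be guessed by brute force, so the salt or an additional secret key would have to live somewhere the attacker cannot reach.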

Translucent Databases shows how. https://www.amazon.com/gp/product/1441421343

Source: Created some of the first digital medical records exchanges (NYCLIX, BHIX, etc.) in the mid-2000s. Worked very hard to figure out how to protect patient privacy. This breach and the subsequent blackmail was one of our nightmare scenarios. FWIW, nothing (nothing) has improved since.

I'm kinda curious about what some of the recent advances in differential privacy in ML could imply for this field. Taking a look at FB's Opacus [1], they basically add noise so that the data "in transit" is not really the data "at rest". I've also toyed with the idea of storing working data and data at rest differently for sensitive information, with the latter encrypted under a user's OTP so that an operator breach cannot do much.

The problem with this approach is the question "how much noise?" Too little and an adversary can mount some kind of regression-based attack. Too much and the "anonymized" data fails to be a useful substitute in the analysis and business logic that previously operated on the raw data. There's also the whole field of homomorphic encryption to get into. It looks like there have been some interesting breakthroughs from the crypto nut side as well, specifically with zk-SNARKs. I can't say I understand the underlying tech well enough to comment, though. Fascinating subject.
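To make the "how much noise?" trade-off concrete, here is a rough sketch of the classic Laplace mechanism on a single count query. This is a swapped-in textbook technique, not what Opacus does (Opacus adds noise to gradients during training). The function names, the default sensitivity of 1, and the epsilon values tried are all assumptions for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: float, epsilon: float,
                sensitivity: float = 1.0, rng: random.Random = None) -> float:
    """Return the count plus Laplace noise of scale sensitivity / epsilon.

    Smaller epsilon means more noise (more privacy, less accuracy).
    random.Random is not a cryptographic RNG; a real deployment would
    need a secure noise source and care with floating point.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count: float, epsilon: float,
                   trials: int = 2000, seed: int = 0) -> float:
    """Average absolute error of the noisy answer over many trials."""
    rng = random.Random(seed)
    errors = [abs(noisy_count(true_count, epsilon, rng=rng) - true_count)
              for _ in range(trials)]
    return sum(errors) / trials


if __name__ == "__main__":
    # Hypothetical query: "how many patients have condition X?" -> 100
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(100, eps):.2f}")
```

The expected error grows as 1/epsilon, so the privacy budget directly sets how far the noisy answers drift from the raw ones. That is one knob for one query; it says nothing about correlation with outside data sets.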

[1] https://ai.facebook.com/blog/introducing-opacus-a-high-speed...

Thanks. Learning zero knowledge is on my to do list.

I'm skeptical of differential privacy for patient medical records. I assume big data style deanon wins once you can correlate with other data.

Differential privacy is probably fine for "data dumps", like when study data is extracted from many patients. Since those dumps are not used for ongoing care, they can be more aggressively scrubbed.

Total aside:

Even encrypting data at rest is insufficient, because merely preserving chronological order (even with no timestamps) is enough for deanon. But last week I had a new notion for hiding time and order. I am a total crypto noob, so I'm still trying to figure out if this is a new idea.

As a defender of patient and voter privacy, and therefore a very vocal critic of electronic voting, I will be deeply chagrined if I figure out how to make it work.

I think (I might be wrong) you are talking about SSE (Searchable Symmetric Encryption), which by and large lets you search over encrypted data and order by timestamp.
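A toy sketch of the keyword-search half of that idea, assuming the simplest possible design: the client derives an opaque token per keyword with a secret key, and the server stores only tokens mapped to record IDs. `search_token` and `BlindIndex` are hypothetical names, and real SSE schemes do considerably more (encrypting the records themselves, limiting what repeated queries leak). Ordering by timestamp is not covered here.

```python
import hashlib
import hmac
from collections import defaultdict


def search_token(key: bytes, keyword: str) -> str:
    """Client side: derive an opaque token for a keyword with a secret key."""
    return hmac.new(key, keyword.lower().encode("utf-8"),
                    hashlib.sha256).hexdigest()


class BlindIndex:
    """Server side: maps opaque tokens to record IDs, never sees keywords."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    def add(self, token: str, record_id: str) -> None:
        self._postings[token].add(record_id)

    def lookup(self, token: str) -> set:
        # Return a copy so callers cannot mutate the index.
        return set(self._postings.get(token, set()))


if __name__ == "__main__":
    key = b"demo-key-not-for-real-use-000000"  # assumption: held by the client only
    index = BlindIndex()
    index.add(search_token(key, "diabetes"), "record-17")
    index.add(search_token(key, "asthma"), "record-42")
    print(index.lookup(search_token(key, "Diabetes")))
```

The server can answer "which records match this token" without learning the keyword, but it still sees which tokens repeat and which records they hit, which is exactly the kind of pattern leakage a correlation attack can exploit.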

Hadn't heard the term SSE. I'm vaguely aware of tools for searching compressed data. Will learn more about SSE. Thank you.
