> This strategic partnership will combine Google’s cloud and AI capabilities and Mayo’s world-leading clinical expertise to improve the health of people—and entire communities—through the transformative impact of understanding insights at scale. Ultimately, we will work together to solve humanity’s most serious and complex medical challenges.
I'm sorry, but this is pure Silicon Valley speak. Are Mayo's patients really going to know what Google is doing with their clinical data? When I hear "partnering with Google to create machine-learning models for serious and complex disease", I have a hard time believing Mayo patients know what they are signing away when they consent to this (if they consent at all, which isn't mentioned).
I have two rare diseases, one of which was discovered via NIH funds at the Mayo Clinic in the early 2000s. I also have type 1 diabetes, and I can attest to the veracity of the claims made in the blog post.
For somebody like me, the situation is unwinnable, if I want to live. HIPAA is a joke because it is perfectly legal to combine other data with the HIPAA anonymized source to identify the individual. Every day, leaving the US looks better.
Consider the following de-identified data sets:
- [date, time, clinic, procedure or test being done, insurer] - as collected by the clinic chain so that it can get money from insurers
- [month, clinic, test name, test result] - for all tests made in the last year, collected for statistical purposes
- [date, time, latitude, longitude, phone number] - because AFAIR telcos sell this data
- [name, surname, phone number, ...] - some insurance company's list of customers
If you can get your hands on these datasets, you can trivially re-identify patients and even assign test results to them with high probability (depending on how many tests of a given type are done at a given clinic per unit of time used to group the second data set).
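The linkage attack described above can be sketched in a few lines. Everything here is invented: the records, field names, and time threshold are hypothetical stand-ins for the four datasets listed in the comment.

```python
def minutes(t):
    # "09:15" -> 555 minutes since midnight
    h, m = t.split(":")
    return int(h) * 60 + int(m)

visits = [   # clinic billing feed: dates and tests, no names
    {"date": "2019-09-10", "time": "09:15", "clinic": "Elm St", "test": "HbA1c"},
]
pings = [    # location data keyed by phone number: no health info
    {"date": "2019-09-10", "time": "09:12", "place": "Elm St", "phone": "555-0199"},
]
customers = [  # insurer's customer list: names and phones, no health info
    {"name": "J. Doe", "phone": "555-0199"},
]

matches = []
for v in visits:
    for p in pings:
        # Same clinic and date, phone seen within 10 minutes of check-in.
        if (v["date"] == p["date"] and v["clinic"] == p["place"]
                and abs(minutes(v["time"]) - minutes(p["time"])) <= 10):
            name = next(c["name"] for c in customers if c["phone"] == p["phone"])
            matches.append((name, v["test"], v["date"]))

# A name is now attached to a test that none of the inputs disclosed.
print(matches)
```

No single input dataset contains both a name and health information; the join is what produces it.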
Real-world data sets may be less clear-cut than this, but there are more of them, and you can apply statistical methods to find correlations. You don't need to be 100% sure customer X has diabetes for the information to be useful; 70% or 60% is useful too.
"The following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:
(B) All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of the ZIP code
(C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older
This is the "Safe Harbor" method.
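A minimal sketch of the two Safe Harbor rules quoted above, with a toy record and hypothetical field names. The full rule covers 18 identifier types, and the three-digit ZIP prefix must itself be replaced for sparsely populated areas, which this sketch omits.

```python
def scrub(rec):
    """Apply the (B) geography and (C) dates rules to one record."""
    out = dict(rec)
    # (B): no geography smaller than a state, except the first three
    # digits of the ZIP code.
    out["zip"] = rec["zip"][:3]
    # (C): only the year of any date; ages over 89 collapse to "90+".
    out["admitted"] = rec["admitted"][:4]
    out["age"] = "90+" if rec["age"] > 89 else rec["age"]
    return out

print(scrub({"zip": "55905", "admitted": "2019-09-10", "age": 92}))
```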
You could use the "Expert Determination" method. However, date + time + location attached to health information in your first data set definitely doesn't meet the criteria. I'll eat my hat if you find a supposed "non-PHI" data set with those.
In fact, the criteria for expert determination is literally that re-identification cannot be performed (without already having PHI-type information).
HIPAA may be a joke, but not for this reason.
If information can be re-identified as PHI in any way (including matching phone numbers, birth date, IP addresses, patient account #s, etc.) it doesn't meet the de-identification standard.
You must remove the 18 types of identifiers, or receive a certification:
"A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:
Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information;"
Moreover, your information can only be used for research if you give written permission (Section 164.508). If you have given this permission, you may revoke it for the future.
Personally, I hope that my data could help people. I am not so concerned with my privacy that I would withhold my medical records if they could help. But, in exchange for that contribution, I do want protection against misuse of that data by my insurers or other institutions. I recognize it isn't possible for me to enforce that use in any other way than withholding my data entirely.
The reality is that we cannot, as a society, afford to have no trust in our institutions. We need instead to focus on providing oversight and guidance of them such that they can actively contribute to the public good. Private contributions to institutions are mandatory for the public to succeed. Maintaining that trust is the responsibility of those institutions, and they should seek whatever audits and oversight they can to keep that trust and fulfill their charters.
I will give up my data for "research" as long as I am 100% certain that my medical information will not be abused in a way that can harm me or others.
Until code is routinely audited and conforms to a strict code of ethics, you can count me out.
If you think it is "unfair", consider my position: I have a rare disease that has case reports and (small) cohorts at best in the literature. Publications on it are sparse.
Since I can be re-identified with literally zero effort, due to a lack of ethics in the STEM community, I have a target on my back.
I want to stay alive, and not be targeted by insurance companies and/or the medical-industrial complex.
Sorry, but if people want to benefit from me, then don't force unethical experimentation on me and others via big data and the wide sharing of that data.
I worry that one tradeoff of more privacy and less trust is that researchers won’t get the information they need to produce cures and treatments. It’s a faustian bargain that people who are sick have to either risk having their information leak, or risk science ignoring their condition entirely.
You have data sharers to blame for that. They're the ones that are destroying possible cooperation (everyone with that condition sharing data).
> It’s a faustian bargain that people who are sick have to either risk having their information leak, or risk science ignoring their condition entirely.
Indeed, and I think the way to solve it is to go after the leakers and the sharers and the "entrepreneurs". If I give my data for medical research, I mean bona fide research, as in scientists and labs and tax-funded scientific papers, and not "research" into lowering operational costs by selling data, or "research" done by startups partnering with the clinic.
I opted out of EHCR here (apparently I was one of the very few who did from a conversation with my GP receptionist) because I simply don't trust "It'll save us money" as a reason (I also don't trust my gov to be competent).
E.g. Google provides hardware and software.
Mayo clinic runs the hardware and software, while Google sees none of the data?
But you're right, Google is a marketing, search, and advertising GIANT. Even if they stay HIPAA compliant, I would want to see more deliberate separation of interest.
Another possible use would be to identify septic or nearly septic patients and alert a clinician to intervene.
Neither tech nor healthcare (pharma, insurance, and many hospitals) wants anything more than money, and fending off sharks is not something I want to deal with at my most vulnerable points imaginable (most people will be fairly sick at some point in their life, even if it's only near death).
This sounds wild, but it is true: rare diseases are an absolute cash cow, and everyone should watch this. Our healthcare system in the US will be unsustainable if orphan drugs are not regulated (which is why I naturalized as a European Union citizen, in addition to being American; I fret and worry about getting proper access to medical care every single day): https://www.nytimes.com/2019/08/23/the-weekly/rare-diseases-...
(I do not believe that healthcare for all is unsustainable, but an unregulated free market will make it unsustainable.)
Just in case anyone was wondering, it is common to have a rare disease, and they are unfathomably expensive to have. In the US, the definition of rare disease (which really should be called "orphan conditions" based on the law) is tied to "orphan drugs", which in theory can collectively benefit 10% of the general population. There are a ton of orphan drugs being approved at the moment, which cost between hundreds of thousands of dollars per year and millions per year in the US. The European estimate on rare diseases is more realistic: 6-8% of the general population has a rare disease.
So, do not think that it cannot happen to you. You are naive to believe otherwise.
But not the same rare disease.
Orphan drugs collectively may benefit 10% of the population, but not individually.
Not that I disagree with what I think is your main point: a profit-driven medical industry hurts those at the extremes relative to those in the median. I don't know how awful that is or isn't. 100 years ago, those folks would just have suffered. It's a profit motive, at least in part, which has fueled advanced treatments. If all healthcare and healthcare research were socialized, maybe the expensive treatments wouldn't exist at all.
What I'm not sure about is if you are critical of, or in favor of, this collaboration. Bringing "commercial-grade" AI to healthcare sounds like a good thing to me on its face. I've read here and there (perhaps it's sensational, but still) how some AI can be order or orders of magnitude more accurate than doctors when evaluating x-rays, or scans, or other diagnostics.
My worry here is in the profit motive of Google and the fact that, well, they suck these days in that they do not care about user privacy.
Obviously. Clearly you do not understand what it means to have a rare disease and that rare diseases play by a totally different set of rules than common health conditions.
> AI can be order or orders of magnitude more accurate than doctors when evaluating x-rays, or scans, or other diagnostics
Rare diseases manifest much more dynamically than something that can be trained to an objective via AI.
There are huge ethical issues involved.
> If all healthcare and healthcare research were socialized, maybe the expensive treatments wouldn't exist at all.
I do not know what you mean by socialized. Most of the groundbreaking research in general occurs via grants from governments. If that is what you mean by socialized, then I support that.
Your post seems critical of Google, but I don't know if that makes sense. Isn't your main criticism with the healthcare system and laws surrounding it? I don't know if we can blame companies that operate within existing law. The blame should instead be passed to policy makers and their voters.
Amazon: People who shopped for MRIs also shopped for...
Facebook: You have been tagged in this x-ray...
Apple: Glucose Monitor Pro, starting at $999/$42 monthly (gold/64GB, Apple Pencil and iCloud storage sold separately...)
Google/YouTube: Dr. Oz, Stephen Fry, and Ben Shapiro DESTROY healthcare in amazing TEDx talk...
What AI projects? What part of the cloud platform will they use? What will it be used for?
It sounds like a nice partnership, but it's not clear what Mayo will be doing with it.
Three projects announced at the moment.
I feel fairly comfortable with these, since they are opt in and not aimed at users who are suffering a healthcare crisis while deciding whether to share their data.
Epic sat on its hands too long and missed whatever calling it had in this space.
In practice, that's probably still true for "safe harbor" deidentification, but less true for "expert determination" deidentification, which doesn't need the safe harbor rules. The latter option should be eliminated.
You can find more details about the situation by just searching for: "Enfamil" site:news.ycombinator.com
Then on the news.ycombinator.com URL ctrl/command + f "Spooky23"
Anyways it is extremely bad.
That's probably the biggest problem with HIPAA, not the law or supporting regs (which have problems, like the one I address upthread), but that most people's first and only complaint of a problem will be to the wrongdoer themselves, not anyone with an interest in enforcing the law. (In Spooky23’s case, there was some effort to go beyond that, but not to an entity actually responsible for enforcing the law in question, or even an agency of the right sovereign entity.)
In any case, while the practices spooky23 raises are, legal or not, a real concern, they in no way justify characterizing my criticism of the specific problems with HIPAA deidentification rules as naive in the context of a pre-existing discussion of reidentification of deidentified data (which is a completely different issue than sharing, legally or not, data which is not deidentified, as is the issue in spooky23's case). Again, it's a real issue, just not a germane one to where it was aggressively thrown into the discussion.
"Yes, you are. The events surrounding what happened to my wife was very painful (an ectopic pregnancy that nearly killed her), and a thoughtless reminder was very unwelcome. I still feel violated and betrayed.
In our case, I found out the marketing list from Enfamil and bought it for my zip code. _I complained to the hospitals’ privacy officer and the state regulator and found that everything was legal._
There is a lot of data on the topic...
Prescriptions: https://www.theguardian.com/technology/2017/jan/10/medical-d.... Linkage to lifestyle data: https://www.statnews.com/2018/07/18/health-insurers-personal....
In our case, the hospital pharmacy issued drugs to her indicative of a pregnancy. The pharmacy or insurer provides that information in real time to data brokers. The pharmaceutical companies assign quotas and send salespeople for certain drugs. There are other ways for data to get out that we’re not certain of. Perhaps the insurer “anonymizes” and sells subrogation information. Or the lab. In any case, they knew that my wife was admitted to an OB floor of a hospital, but didn’t know the outcome.
It’s not going away. The US government uses these same techniques with companies like Google to combat extremism or terrorist conversions — they actually use factors like this to target potential recruits with counter-information via ads. "
Yeah, as you can tell by the fact that I responded to the post pointing out that the two people complained to were:
(1) A person whose job it is to make sure the hospital doesn't get sued, who is never going to admit wrongdoing, and
(2) An official from the wrong agency (and even the wrong government) when it comes to the law in question.
Also, note that the link you've copied that isn't a 404 is only tangentially related, as it is about gathering and sharing data that never comes under the protection of HIPAA, not resharing PHI as addressed in spooky23's post, which again is a different issue than reidentification of HIPAA-deidentified data that I was responding to here. There are lots of different issues around health data, and it isn't helpful to conflate them, much less to hurl abuse at people for failing to conflate the different issues.
That's from 2012, it's no surprise that the tools for identification have gotten better, even without doing things that are illegal.
I don't know the specifics about the data brokers you're describing; this is a huge and complicated area, but I think it's correct to say that companies cannot legally re-identify de-identified data and then resell it as identified data under HIPAA.
You can't generate PHI if you aren't a covered entity.
> By the terms of the law (of which I am far too familiar), if you did this you would generate PHI
You are wrong. If an entity that is not a covered entity acquires deidentified data and reidentifies it, it can do whatever it wants with it under HIPAA.
"""Health Care Clearinghouse – A public or private entity, including a billing service, repricing company, community health management information system or community health information system, and “valueadded” networks and switches that either process or facilitate the processing of health information received from another entity in a nonstandard format or containing nonstandard data content into standard data elements or a standard transaction, or receive a standard transaction from another entity and process or facilitate the processing of health information into a nonstandard format or nonstandard data content for the receiving entity."""
My read is that the entities I'm describing would fall under this. If you can point to a specific example which you believe violates this (not an anecdote; I'm talking about investigative journalism, a court case, or an academic with credentials in this area), I'd love to hear about it.
No, a clearinghouse is (to summarize the definition you posted from the regs) an intermediary between providers and/or payers in handling transactions for which standards exist under HIPAA.
They receive PHI in either standard or nonstandard forms, transform it to or from standard forms if necessary, and transmit it on; it's PHI the whole time through that function.
An entity acquiring deidentified data (which is explicitly not PHI under HIPAA, that's the whole point of deidentification) is not (for that reason) a clearinghouse, and if they can get other data and reidentify the deidentified data, they can do whatever they want with it.
The theory of deidentification is that the risk of this is minimal (indeed, other than scrubbing virtually everything that could possibly be used to reassociate the data, the only way for PHI to be deidentified is to get a notionally-qualified expert to certify a very low risk of reidentification.)
The problem is that all such certifications are based on a faulty premise: if data is not completely scrubbed so that reidentification without having essentially the equivalent to the original PHI is impossible, the risk of reassociation is almost never very low, because the process is automatable and the marginal cost is near zero.
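That premise can be checked mechanically. A toy sketch with invented records: count how many records share each quasi-identifier combination; any record with a count of one is re-identifiable by a recipient who holds an outside dataset carrying the same fields.

```python
from collections import Counter

# (zip3, birth_year, condition): already "deidentified" per the rules.
records = [
    ("559", "1948", "type 2 diabetes"),
    ("559", "1948", "type 2 diabetes"),
    ("559", "1972", "rare disease"),
]

# Group sizes for each quasi-identifier combination.
counts = Counter((z, y) for z, y, _ in records)

# Records whose combination is unique in the dataset.
unique = [r for r in records if counts[(r[0], r[1])] == 1]
print(unique)  # the rare-disease record stands alone
```

Running this over any real release is automatable, which is why the marginal cost of reassociation is near zero.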
Specifically: can I go to a data broker, today, in the US, and obtain records under my name that were derived from entirely de-identified data, that has been re-identified by the data broker?
> from entirely de-identified data
What do you mean by “entirely de-identified”? That sounds like you are referring to the HIPAA safe harbor option (which specifies an extensive array of things which must be completely purged), rather than the alternative HIPAA “expert certification of low risk” option. The problem is that the latter has the exact same legal effect as the former, though the only reason to ever use it is because the data isn't entirely de-identified.
The risk is with legally de-identified data, which is not restricted to entirely de-identified data.
You trust them to do the right thing for HIPAA in regards to a multi billion dollar enterprise?
I heard a keynote at NACL a few years ago that was a call to arms to solve these problems.
The for profit status of treatment in "western" medicine.
Laws like HIPAA that are well-intentioned, but written by lobbyists and by out-of-field, out-of-date lawyers/politicians who don't understand the actual nature of data protection or the need for a patient to be in control of their medical records in meaningful ways.
There's also the lack of a national / international identity and legal / data security infrastructure: this makes it very difficult to associate government issued IDs to patient records and requests / authorizations for limited sharing of those records.
In a less crazy world the outcome might look something like this:
Everyone has a Digital ID; this is a government-issued or government-signed, PKI-based contract-approval key. It would be stored in a dedicated wallet (open hardware, firmware, and software) used only for making strong signatures.
The Digital ID allows the patient to log in to government websites and associate their healthcare coverage (ideally single payer, but if they're rich and have a luxury plan that could be linked as well) at various medical centers to their (emergency) care records. They can also actively choose to, or passively allow, the sharing of specific records from one provider to anyone else, as well as obtain personal copies of all of their records from all of their providers. Any time a provider is no longer covering a given patient stewardship of those records transfers to the government agency providing this service (and is paid for out of a general fund based on taxing providers so they don't have to deal with this).
A management matrix might also allow for general records access approval, in the case that the patient just wants their entire medical history and ongoing updates to be provided to their pool of physicians.
Through that framework outside entities can also obtain access keys and links for the records at other providers which they are authorized to view the records at.
Also, of course, all of the records would be required to be in "open, patent-free, free-to-implement record formats as standardized by the medical industry software and equipment providers"; a specific format wouldn't be legally mandated, but the use of formats that are intended to be interchangeable would be.
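The signed-authorization piece of the outline above can be sketched as follows. Everything is hypothetical: a real Digital ID would be an asymmetric key pair (e.g. Ed25519) in a hardware wallet, while HMAC is used here only as a stdlib stand-in so the sketch runs anywhere; it is not a substitute for PKI signatures.

```python
import hashlib
import hmac
import json

# Hypothetical secret; in the real scheme this lives in the wallet.
patient_key = b"kept-in-the-patient's-hardware-wallet"

def sign(key, authorization):
    # Canonicalize the authorization so signing is deterministic.
    payload = json.dumps(authorization, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify(key, authorization, signature):
    return hmac.compare_digest(sign(key, authorization), signature)

auth = {"patient": "anon-1234", "grantee": "clinic-42",
        "records": ["labs-2019"], "expires": "2020-01-01"}
sig = sign(patient_key, auth)

ok_before = verify(patient_key, auth, sig)   # grant accepted
auth["records"].append("full-history")       # tampering...
ok_after = verify(patient_key, auth, sig)    # ...is detected
print(ok_before, ok_after)
```

The point of the design is that a provider can hold the signed grant as proof of scope and expiry, and any alteration of the grant invalidates the signature.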
There are many what-ifs. The intent of the system I outlined is to make good data-hygiene practices easier and thus more likely.
I'll also point out that most EHR systems aren't 'airgapped' like paper records of old, but are still connected to the internet at least loosely for security updates if not limited remote access.
If there's some specific attack scenario that you feel is worthy of discussing as a topic that positively enhances knowledge and the exchange of information please outline such a concern in a proper venue; which might or might not be this comments thread depending on the specific concerns. I merely provided a back-of-napkin idea to start from.
I suspect Mayo also hopes Google can break some new ground in analytics beyond munging mere patterns. No doubt Mayo would love to explore all kinds of advances in medical practice using novel monitoring and instrumentation, esp. in the clinic.
No telling what Google has proposed: maybe a lot, maybe only a little. Their announcement says nought.
They, and the people they were marketing to, would have been appalled if they could have seen at the time what it became. At least in the United States.