
Google’s DeepMind made ‘inexcusable’ errors handling UK health data, says report - aaronyy
http://www.theverge.com/2017/3/16/14932764/deepmind-google-uk-nhs-health-data-analysis
======
aub3bhat
Frankly, these post-docs are engaging in an outright witch-hunt against
DeepMind, driven by nothing but a pure political agenda.

The amount of data obtained is frankly tiny compared to what is regularly
available to researchers in the US. For example, as part of my PhD research I
have access to de-identified data on 40 million patients spanning 5 years
from several states.

Not to forget programs like the CMS Qualified Entity program, which provide
companies with access to identifiable data on ALL Medicare enrollees. Such
arrangements have been in place for decades. The only reason DeepMind is
being persecuted is that it is a juicy target, and fear of AI sells very well
these days in academic circles.

The report does not list any specific cases where violations occurred, but
rather makes broad hand-waving claims about cabining or advertising.

For those interested, here is the original paper, instead of the Verge
article.

[https://link.springer.com/article/10.1007%2Fs12553-017-0179-...](https://link.springer.com/article/10.1007%2Fs12553-017-0179-1)

If they really had a substantive argument (one that indicated real malice on
the part of DeepMind), it would have easily been published in The Lancet, the
BMJ, or even the New England Journal of Medicine. Instead, the fact that it
is published in a subject-specific journal should tell you something about
the concreteness of their "findings".

~~~
mtgx
1) The UK is not the US, and the EU has much stricter privacy laws.

2) There is no such thing as "anonymized data" gathered by companies. Several
studies show that with only 4 data points you can pinpoint someone in an
"anonymized" database with 90% accuracy. Even Apple's differential privacy
could probably be reverse-engineered to find people. Just because _you_ don't
care to do that doesn't mean others won't.
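
As a crude illustration, here is one way to measure how many records in a
table are pinned down by some combination of four attributes. This is just a
sketch of the idea, not any particular study's method, and the field names
are whatever your data happens to have:

    from collections import Counter
    from itertools import combinations

    def fraction_pinpointable(records, fields, k=4):
        """Fraction of records made unique by at least one combination
        of k attributes: a rough proxy for re-identifiability."""
        unique = set()
        for combo in combinations(fields, k):
            counts = Counter(tuple(r[f] for f in combo) for r in records)
            unique.update(i for i, r in enumerate(records)
                          if counts[tuple(r[f] for f in combo)] == 1)
        return len(unique) / len(records)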

~~~
aub3bhat
1\. The UK is not in the EU. The argument is that such arrangements are
commonplace, and I am sure one can find other examples of such data sharing
in the UK.

2\. LOL yeah, you are so damn clueless that I won't even bother replying.
Let's just say there is a difference between reading some popular press
article about privacy and anonymity, and knowing the intricate details of how
business/research is conducted.

~~~
Silhouette
From your comments here, it appears that you're doing PhD research dealing
with large sets of healthcare data.

It also appears that you are casually dismissive of concerns about the risks
of that data being de-anonymised. Your best rebuttal seems to be some vague
allusion that the parent poster, who was essentially correct, didn't know what
they were talking about. (This is something of a digression anyway, because in
the actual case we're talking about, as the report notes, the data supplied
was already identifiable anyway.)

You have accused the authors of the report of going on a witch-hunt, but again
you have offered little real argument except that such things are going on in
other places, as if that makes those practices above criticism or
automatically acceptable.

Do you realise that you are providing a near perfect, real-time example of the
need for stricter controls on medical data and who is allowed to access it?

~~~
aub3bhat
Concerns about de-anonymization are NOT AT ALL relevant in cases where
patient-level data is shared, since it is common knowledge that this data is
ripe for misuse and abuse. As a result, government agencies have developed a
set of legal requirements and contracts to be used when sharing such data.
Talking about differential privacy and "de-identification is not
anonymization" is meaningless in this context, since all parties are acutely
aware of the potential for misuse of this data.

And regarding your concern about me: if anything, I should be the one
engaging in this ridiculous witch-hunt against DeepMind, since during my PhD
I have developed an open-source, transparent analytics platform for data on
millions of patients.

But unlike the authors, I want real debate and real systems that can be used,
not faux outrage over another clickbait article.

[http://www.computationalhealthcare.com](http://www.computationalhealthcare.com)

~~~
throwaway729
_> But unlike the authors, I want real debate_

Substantively improving patient privacy protections in a _concrete case_ by
forcing a large corporation to agree to strong privacy protections and
auditing regimes seems like a "real" contribution spurring a "real" debate.

For 1.6 million people, the results stemming from this paper are much more
"real" than any number of git commits to a software system.

~~~
aub3bhat
Number of git commits!!!

hahhahah.

Chill dude, chill.

Software changes the world.

~~~
throwaway729
_> Number of git commits!!!_

Well, it's true. Differential privacy is a nice idea, but in this particular
case, you're disparaging a style of research that -- to date -- has had a
much greater impact on improving actual, real-world privacy than all the
fanciest query engines in the world.

 _> Software changes the world._

The most important thing to know about differential privacy is that when it
comes to privacy, software always plays second fiddle to policy and politics.

Differential privacy algorithms are literally nothing other than the
implementation of a legal spec. Without the law, the algorithms are pointless.
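
To be concrete, here is a minimal sketch of the Laplace mechanism, the
textbook differential-privacy primitive, applied to a counting query. Note
how little of it is "algorithm": the substantive decisions (what epsilon,
which queries are permitted, who audits) are all policy.

    import random

    def laplace_noise(scale):
        # The difference of two i.i.d. Exp(1) draws, times the scale,
        # is Laplace(0, scale)-distributed.
        return scale * (random.expovariate(1.0) - random.expovariate(1.0))

    def private_count(records, predicate, epsilon):
        """Epsilon-DP count. A count has sensitivity 1 (adding or
        removing one person changes it by at most 1), so Laplace noise
        with scale 1/epsilon suffices."""
        true_count = sum(1 for r in records if predicate(r))
        return true_count + laplace_noise(1.0 / epsilon)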

Mind you, I don't intend to disparage differential privacy work in any way!
But disparaging policy research while holding up differential privacy systems
as the answer massively misses the point...

~~~
aub3bhat
LOL

I actually don't use differential privacy at all!

My software implements the legal requirements stated by the agency providing
the data, which are equivalent to a stronger version of k-anonymity.
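
For readers unfamiliar with the term, here is a minimal sketch of a
k-anonymity check. The quasi-identifier columns are hypothetical, and real
de-identification pipelines also generalize and suppress values:

    from collections import Counter

    # Hypothetical quasi-identifier columns; real schemas vary.
    QUASI_IDENTIFIERS = ("age_bracket", "zip3", "sex", "admission_year")

    def is_k_anonymous(records, k):
        """A table is k-anonymous if every combination of
        quasi-identifier values is shared by at least k records."""
        groups = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS)
                         for r in records)
        return all(count >= k for count in groups.values())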

I am not at all against Policy Research; the issue is that this particular
paper is a spectacularly BAD example of Policy Research. Policy research
should not be driven by FUD around AI (e.g. the quote "did not constrain the
company from using AI analytical techniques on the data") or around some
corporation.

~~~
throwaway729
_> I actually don't use differential privacy at all!_

I was just going by your own description on the product website. Anyway, it's
kind of irrelevant, since your "software implements legal requirements as
stated by the agency providing the data". That's basically my whole point.

 _> ...is a spectacularly BAD example of Policy Research_

I tend to judge research by its merit and its impact.

Merit is discussed at length elsewhere, and IMO you're flat-out wrong about
the normalcy of this particular agreement. But we can leave that to other
threads.

In _this_ thread, you're disparaging the research based upon its _impact_
("real debate", "real systems"), when in fact it _has_ had a direct impact.

------
akamaka
You can skip reading the article, as it does not list any "errors" that have
happened. It merely questions whether the agreement under which the data is
shared has adequate protections.

~~~
3JPLW
Indeed. The paper itself details seven "transgressions":

> 1) We do not know––and have no power to find out––what Google and DeepMind
> are really doing with NHS patient data, nor the extent of Royal Free’s
> meaningful control over what Google and DeepMind are doing;

> 2) Any assurances about use of the dataset come from public relations
> statements, rather than independent oversight or legally binding documents;

> 3) The amount of data transferred is far in excess of the requirements of
> those publicly stated needs, but not in excess of the information sharing
> agreement and broader memorandum of understanding governing the deal, both
> of which were kept private for many months;

> 4) The data transfer was done without consulting relevant regulatory bodies,
> with only one superficial assessment of server security, combined with a
> post-hoc and inadequate privacy impact assessment;

> 5) None of the millions of identified individuals in the dataset were either
> informed of the impending transfer to DeepMind, nor asked for their consent;

> 6) The transfer relies on an argument that DeepMind is in a “direct care”
> relationship with each patient that has been admitted to Royal Free
> constituent hospitals, even though DeepMind is developing an app that will
> only conceivably be used in the treatment of one sixth of those individuals;
> and

> 7) More than 12 months into the deal being made, no regulator had issued any
> comment or pushback.

Quite a few of these strike me as rather absurd, but I don't know the
regulatory environment in the UK.

~~~
throwaway729
_> Quite a few of these strike me as rather absurd_

In a "that can't possibly be true" sense? Well, yeah, that's kind of the
point...

1-3 seem like the sorts of things that even the least privacy-sensitive person
can agree are troublesome.

If Google is willing to give anyone who signs a set of modest legal
agreements carte blanche, unaudited access to data stored on their servers,
I'll begin to even remotely consider entertaining the claim that 1-3 aren't
important.

5 in particular is _blatantly illegal_ in the UK unless DeepMind is providing
direct care. They claim apps == care (IMO absurd).

6 should just straight-up be illegal.

~~~
3JPLW
Amusingly, I find point #4 (which you skipped) completely reprehensible and
unambiguously the worst offender. All these sorts of arrangements are done
within a legal context. If they satisfied the legal requirements, then the
other points lose their punch.

3\. If you want to learn new insights, you — by definition — need to include
data that a priori don't seem directly related. This point even notes that the
data was technically and legally well-scoped.

5\. Sounds like they were within the terms of the existing data privacy
agreements given 6. Again, I don't know UK privacy laws.

6\. My impression from the paper is that they're not only trying to manage AKI
but also improve the detection of it. Ok, sure, they're not going to improve
detection in deceased or transferred patients. Those probably should have been
minimized.

~~~
throwaway729
As you noted, the fundamental problem underlying 3, 5, and 6 is that they're
trying to _detect_ AKI. That requires everyone's data.

I'm not opposed to that _in general_, but there really ought to be 1) an
opt-in, or at least a well-advertised opt-out mechanism; and 2) an
independent audit of how the data is used.

FWIW I think this was a healthy push-back against "just trust us" and hope the
result is a cleaner template for future similar projects.

------
MistahKoala
This has the whiff of what might kindly be called activist research,
particularly given the track record of the author(s). There may be legitimate
questions and concerns, but I'm inclined to trust the Wellcome
representative's analysis of the situation.

------
koolba
> The data-sharing agreement — which was signed in 2015 and has since been
> superseded by a new contract — allows DeepMind access to medical records
> from 1.6 million patients attending London hospitals run by the NHS Royal
> Free Trust. Although at the time Google presented the deal as primarily
> about finding patients at risk from a condition known as acute kidney injury
> or AKI, the actual terms of the agreement, revealed in April 2016 by a New
> Scientist investigation, were more broad.

> The report notes that DeepMind was given access not only to relevant blood
> tests and diagnostics, but historical medical records dating back five
> years, including information on HIV diagnoses, drug overdoses, and
> abortions. The report also says the wording of the 2015 deal did not
> constrain the company from using AI analytical techniques on the data
> (something DeepMind disputes).

What's the legal status and overall vibe of something like this in the UK?

It's a bit different in the USA, as we don't really have an NHS here with
_everybody's_ data; it would be done directly with multiple insurers or
medical providers. I'm guessing this would violate some type of patient
privacy law as well.

Given the option, I bet many people, including your humble commenter, would
opt out too (or just not opt in, if we're lucky).

~~~
throwaway729
[https://www.mib.com/](https://www.mib.com/)

 _> including your humble commenter, would opt out too (or just not opt in,
if we're lucky)._

At this moment, do you know what organizations own a copy of your medical
records -- in whole or in part -- and what laws apply to each of those
partial records? If not, then how can you possibly hope to opt out?

~~~
koolba
> At this moment, do you know what organizations own a copy of your medical
> records -- in whole or in part -- and what laws apply to each of those
> partial records? If not, then how can you possibly hope to opt out?

I don't, and that's partly why I prefixed that line with " _Given the option
..._ "

~~~
throwaway729
I see. I thought you meant "given the option to opt out", as in the option to
opt out is the missing thing, rather than knowing whom you should even ask
for an opt-out.

------
cryptoz
Note: the errors were made by people and in business dealings, not by an AI
analyzing health data.

------
joatmon-snoo
Errors in transparency and oversight. Fix the title, please.

Still a very legitimate concern, especially in an age of privacy worries.

------
tomxor
I don't want my privacy invaded any more than the next person, but arguing
about the broadness of medical history in this context is pretty stupid.

The data needs to be broad if you are interested in finding out things you
don't already know; that's why it's being fed into a machine learning
algorithm in the first place. If you get selective, it's not going to be very
useful: how do you limit history to what's relevant when you don't know
what's relevant?

There is, however, an interesting difference from most data mining on the
web, which is trying to sell advertising. In this instance the data should be
fully anonymisable: only the doctor should be allowed to see that patient ID
e76f57a is John Smith.
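
A minimal sketch of that kind of pseudonymization, using a keyed hash so that
only the key holder (the hospital, say) can recompute the mapping. The key
and the identifier format here are made up:

    import hashlib
    import hmac

    # Secret held only by the data controller, never by the analyst.
    PSEUDONYM_KEY = b"hypothetical-secret-key"

    def pseudonymize(nhs_number: str) -> str:
        """Map a real identifier to a stable pseudonym. Without the key,
        the mapping can be neither recomputed nor reversed."""
        digest = hmac.new(PSEUDONYM_KEY, nhs_number.encode(),
                          hashlib.sha256)
        return digest.hexdigest()[:7]  # e.g. something like "e76f57a"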

~~~
3JPLW
Note that there's a big difference between de-identifying and fully
anonymizing data. In general, it's extremely hard to fully anonymize a
dataset. De-identification gets you most of the way there, but it's
frequently possible for a dedicated attacker to re-identify users by
combining the data with public datasets. For example, patient ID e76f57b may
be a 38-year-old woman in Smalltown. Her medical record states that she gave
birth on 3-20-2017. Find all public birth announcements from that day in that
town with a 38-year-old mother. Once you have a small set of candidates, it's
not hard to narrow things down further.
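
A toy version of that linkage attack, with invented data, just to show how
mechanical it is:

    # Join a "de-identified" table with a public dataset on the shared
    # quasi-identifiers. All records here are made up.
    medical = [
        {"id": "e76f57b", "age": 38, "town": "Smalltown",
         "event": "birth", "date": "3-20-2017"},
    ]
    announcements = [  # scraped from the local paper
        {"mother": "Jane Doe", "mother_age": 38,
         "town": "Smalltown", "date": "3-20-2017"},
    ]

    matches = [(a["mother"], m["id"])
               for m in medical for a in announcements
               if (m["age"], m["town"], m["date"]) ==
                  (a["mother_age"], a["town"], a["date"])]
    print(matches)  # [('Jane Doe', 'e76f57b')]: record re-identified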

------
mikecb
Interestingly, DeepMind's Ben Laurie has been working on
certificate-transparency-like tech to permit a verifiable audit log of access
to data like this, precisely for this purpose: [https://qz.com/929833/googles-
goog-deepmind-is-using-blockch...](https://qz.com/929833/googles-goog-
deepmind-is-using-blockchain-technology-to-handle-nhs-medical-data/)

If you want to explore the underlying technology, take a look at
[https://github.com/google/trillian](https://github.com/google/trillian)
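
Not Trillian itself, but a toy illustration of the underlying idea, a
tamper-evident append-only log: each entry's hash covers the previous one, so
an auditor who remembers the latest head hash can detect any rewrite of
history.

    import hashlib
    import json

    class AuditLog:
        def __init__(self):
            self.entries = []
            self.head = "0" * 64  # stands in for the empty-log hash

        def append(self, event: dict) -> str:
            """Record an access event and return the new head hash,
            which an auditor should store out-of-band."""
            payload = json.dumps(event, sort_keys=True)
            self.head = hashlib.sha256(
                (self.head + payload).encode()).hexdigest()
            self.entries.append((payload, self.head))
            return self.head

    log = AuditLog()
    log.append({"who": "analytics-svc", "read": "patient e76f57b"})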

------
Gatsky
"In July 2015, clinicians from British public hospitals within the Royal Free
London NHS Foundation Trust approached Google DeepMind Technologies Limited,
an artificial intelligence company with no experience in providing healthcare
services, about developing software using patient data from the Trust."

Actually, the institution which collects and stores the data handed it over
without due process. The article keeps trying to blame DeepMind; I guess
criticising the NHS is a little stale. I think this is in a journal because
it is too long for an op-ed but not substantial enough for a long-form
article.

------
killjoywashere
I would urge folks writing these deals to make sure to separate the rights
assigned to 1) algorithms, 2) data, and 3) the models they generate. There is
a clear joint interest in the models, and that seems to get missed in most of
these articles.

------
KCFforecast
Under Spain's personal data protection law, any file with personal
information must be accessible to the person, and the person has the right to
know, modify, and delete that file, and to deny access to that information
for any purpose.

~~~
desas
There must be qualifiers; you can't tell your bank to forget the money you
owe it.

~~~
KCFforecast
You are right, more information here:
[http://uk.practicallaw.com/1-520-8264](http://uk.practicallaw.com/1-520-8264)

------
biggio
I don't care or why should I care? As long as there's progress in making NHS
more efficient in treating me that's fine. If they need it I will personally
go to their offices and give blood samples or whatever they need on a daily
basis!

