
What Surveillance Valley knows about you - eggspurt
http://pando.com/2013/12/22/a-peek-into-surveillance-valley/
======
krakensden
The Google bait in the lede and the failure to address it at all are
wonderfully classic Yasha Levine. Here is the link from the article on the
event that caused this; it's slightly more succinct:

[http://blogs.wsj.com/digits/2013/12/19/data-broker-
removes-r...](http://blogs.wsj.com/digits/2013/12/19/data-broker-removes-rape-
victims-list-after-journal-inquiry/)

~~~
eggspurt
Hmm, would it be possible to actually detect whether Google has information
about a visitor by creating a web page with Google ads and then snooping on
which targeted ads get shown (which is possible through DOM traversal)?
This way, information can leak.

~~~
yetfeo
If the ads are shown in an iframe on a different domain you won't be able to
traverse them due to cross origin restrictions.
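
To make the restriction concrete: two documents count as same-origin only when
scheme, host, and port all match. What follows is a minimal sketch of that
comparison, not the browser's actual implementation; the URLs and the helper
names `origin` and `same_origin` are hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports, so that https://example.com and https://example.com:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple that defines an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True only when scheme, host, and port all match."""
    return origin(a) == origin(b)


# Hypothetical URLs: a publisher page and an ad iframe served from another host.
page = "https://publisher.example/article"
ad_frame = "https://ads.example.net/frame"
print(same_origin(page, ad_frame))
```

A script on the page can see that the iframe element exists, but when this
check fails it cannot read the frame's document.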

~~~
eggspurt
Malware that can hijack the users' cookies can do that, as described in
[https://news.ycombinator.com/item?id=6950758](https://news.ycombinator.com/item?id=6950758)
at 15m40s
([http://www.youtube.com/watch?v=xv6K2GqyijM#t=15m40](http://www.youtube.com/watch?v=xv6K2GqyijM#t=15m40))

~~~
yetfeo
"Malware that can hijack the users' cookies" is a little different to "which
is possible through DOM traversal" though. I wanted to address the point in
case some got the impression just using the DOM via JS would allow it.

------
yummyfajitas
Let's be realistic about how the "rape sufferers" incident happened. They
started with some list of standard medical conditions joined against users:

    
    
        condition       user
        dermatitis      1
        rape            2
        athlete's foot  3
    

Then they did this:

    
    
        SELECT DISTINCT condition FROM user_sufferers
    
        <h1> Lists for sale </h1>
        <ul>
        {% for condition in conditions %}
          <li> {{condition}} sufferers </li>
        {% endfor %}
        </ul>
    

Fundamentally no different from

    
    
        [f"Keep Calm and {verb} a Lot t-shirt" for verb in list_of_verbs]
    

[http://singularityhub.com/2013/03/20/keep-calm-and-rape-a-
lo...](http://singularityhub.com/2013/03/20/keep-calm-and-rape-a-lot-t-shirts-
show-automation-growing-pains/)

The fact that things like rape, pregnancy [1], etc. are detectable by computers
makes great headlines. But we shouldn't act as if MedBase was deliberately
targeting [{{condition}} sufferers for condition in hot_button_list] - that's
fundamentally dishonest.

[1] See the case when Target knew some teenager was interested in products X,
Y, Z, where the correlation was driven by the underlying causation of
pregnancy.
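
The pipeline speculated about above can be sketched end to end as follows.
This is an illustration under stated assumptions, not MedBase's actual code:
the table contents, the `user_id` column name, and the `list_titles` helper are
all hypothetical. The point it shows is that no step looks at what a condition
is before turning it into a product title.

```python
import sqlite3

# Hypothetical rows mirroring the example table; not real data.
rows = [
    ("dermatitis", 1),
    ("rape", 2),
    ("athlete's foot", 3),
    ("dermatitis", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_sufferers (condition TEXT, user_id INTEGER)")
conn.executemany("INSERT INTO user_sufferers VALUES (?, ?)", rows)

# Every distinct condition, with no human review and no blocklist.
conditions = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT condition FROM user_sufferers ORDER BY condition"
    )
]


def list_titles(conditions):
    """Blindly turn each condition into a mailing-list title."""
    return [f"{condition} sufferers" for condition in conditions]


titles = list_titles(conditions)
for title in titles:
    print(title)
```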

~~~
mortov
Are you saying you _know_ this for a fact, or are you speculating about a
benign explanation without any genuine knowledge of this particular case (by
referring to the keep calm t-shirts, which are unrelated and irrelevant to this
situation)?

edit: Plus I've never heard of "rape" being a diagnosed medical condition -
why would that be in a list of conditions ? I find your 'explanation' quite
troubling.

~~~
louthy
> edit: Plus I've never heard of "rape" being a diagnosed medical condition -
> why would that be in a list of conditions ? I find your 'explanation' quite
> troubling

The SNOMED CT coding system has an incredible number of coded clinical terms.
Including rape:

[http://phinvads.cdc.gov/vads/http:/phinvads.cdc.gov/vads/Vie...](http://phinvads.cdc.gov/vads/http:/phinvads.cdc.gov/vads/ViewCodeSystemConcept.action?oid=2.16.840.1.113883.6.96&code=54231004)

~~~
mortov
The condition it lists in your example is "Rape trauma syndrome" - that would
be an acceptable medical diagnosis as it is the trauma associated with being
raped (and medical support and intervention could be necessary to deal with
this). The rape itself is not a medical condition, and I think that definition
listed makes that clear.

I still would not like the idea of a list of rape trauma victims being sold as
that almost by definition would add to the trauma they have suffered if they
knew this happened.

~~~
louthy
Sorry, I should have dug a bit deeper for the exact code. SNOMED CT covers
non-clinical terms along with clinical ones. It covers tons and tons of
'stuff' from measurements, drugs, finding areas, diagnoses, substances, social
contexts, foods, etc. The point was to create a coding system which captured
more than just clinical information.

SNOMED CT is used to build 'clinical statements' which are combinations of
SNOMED CT-terms to help build a more in-depth patient record. Many of those
terms are non-clinical, but can be used in combination with clinical-terms.

Here's 'Victim of Statutory Rape':
[http://bioportal.bioontology.org/ontologies/SNOMEDCT?p=class...](http://bioportal.bioontology.org/ontologies/SNOMEDCT?p=classes&conceptid=http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FSNOMEDCT%2F17044007)

I run a company that develops a web-based medical practice management system.
We use NLP techniques combined with the SNOMED CT database (and others) to
extract clinical-statements from all text inputted against a medical record.
Whether it's a complaint, treatment, diagnosis, or even an email to the
patient. So I don't doubt for a second that there are other systems out there
that do a similar thing and have auto-associated patients with various
clinical and non-clinical terms.

------
jlgaddis
"How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did"
--[http://www.forbes.com/sites/kashmirhill/2012/02/16/how-
targe...](http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-
figured-out-a-teen-girl-was-pregnant-before-her-father-did/)

This event is referenced in TFA. I remember reading it when it first came out
and was (a) amazed at the ability to analyze purchasing data and come to
conclusions such as this and (b) worried.

Previously, I never really cared for carrying cash so I used a credit card for
almost everything I purchased -- online, of course, but also in person.
Nowadays, I pay cash anytime I can: at the gas station, the supermarket, at
restaurants, etc. I also stopped using my "loyalty" card at the supermarket
and obtained a new one that was not connected to me in any way.

I'm not buying anything I shouldn't be, of course. I'm just of the "it's none
of their business" mindset. This has carried over into other parts of my life
as well, such as logging as little personal information as possible at work
(ISP) and retaining it for as short a period as possible.

------
gabemart

        > Normally, such detailed health information would fall under
        > federal law and could not be disclosed or sold without
        > consent. But because these data harvesters rely on indirect
        > sources of information instead of medical records, they’re
        > able to sidestep regulations put in place to protect the
        > privacy of people’s health data.
    

What "indirect sources of information" could be used to identify thousands of
rape victims?

~~~
thesausageking
One simple way is search data. If you have the data from a browser toolbar,
you could mine for searches like "chances rape victim getting pregnant", "what
is a rape kit", etc. and visits to certain pages of WebMD. This will be plenty
accurate to use for targeting.
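
A rough sketch of that kind of mining, assuming a log of (user, query) pairs
is available. The keyword lists, the sample log, and the `infer_topics` helper
are hypothetical, and deliberately use innocuous conditions; a real system
would presumably be more elaborate.

```python
from collections import defaultdict

# Hypothetical keyword lists per topic; a real broker's lists are unknown.
TOPIC_KEYWORDS = {
    "athlete's foot": ["athlete's foot", "antifungal cream"],
    "dermatitis": ["dermatitis", "eczema cream"],
}


def infer_topics(search_log):
    """Map each user to the topics whose keywords appear in their queries.

    search_log is an iterable of (user_id, query) pairs.
    """
    topics = defaultdict(set)
    for user_id, query in search_log:
        lowered = query.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                topics[user_id].add(topic)
    return dict(topics)


# Hypothetical toolbar log.
log = [
    (1, "best antifungal cream for toes"),
    (2, "weather tomorrow"),
    (1, "Dermatitis symptoms"),
]
result = infer_topics(log)
print(result)
```

Plain substring matching like this needs no medical records at all, which is
how such inferences can stay outside health-privacy rules.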

------
navait
Is it possible to purchase all the information they have on you? I'm curious
to see the extent of what they have on me.

