
Most Americans don’t realize what companies can predict from their data - pseudolus
https://theconversation.com/most-americans-dont-realize-what-companies-can-predict-from-their-data-110760
======
rixrax
I’m increasingly thinking that targeted ads are eerily similar to Isaac Asimov’s
psychohistory[0]. I.e., you cannot reliably predict individual behavior, but
with the right (or enough) data you can reliably predict how a large enough
population will act.
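
A toy simulation makes the individual-vs-population gap concrete. Everything in it is made up for illustration: the 30% purchase probability, the population size, and the function name are assumptions, not anything from a real ad system.

```python
import random

def simulate(population: int, p_buy: float, seed: int = 0):
    """Toy model: each person independently buys with probability p_buy."""
    rng = random.Random(seed)
    bought = [rng.random() < p_buy for _ in range(population)]

    # Best possible guess for any single person: always predict the
    # more likely outcome ("won't buy" when p_buy < 0.5).
    guess = p_buy >= 0.5
    individual_accuracy = sum(b == guess for b in bought) / population

    # Prediction for the whole group: the expected share of buyers is p_buy.
    actual_share = sum(bought) / population
    share_error = abs(actual_share - p_buy)
    return individual_accuracy, share_error

individual_accuracy, share_error = simulate(population=100_000, p_buy=0.3)
print(f"per-person accuracy:  {individual_accuracy:.3f}")
print(f"error on group share: {share_error:.4f}")
```

In this setup the best per-person guess is still wrong about 30% of the time, while the predicted share of buyers across the whole group is off by well under one percentage point.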

This is why individually we often feel that they’re off the mark, or we’re
thrifty enough to ignore the ads or political or other targeting. But like
others have pointed out, data is out there, and ‘they’ have infinite tries to
get it right. And more importantly, it works already today. And it’s impacting
everyone, so as individuals, we also get impacted in indirect and subtle
ways when, e.g., a friend of ours raves about a new toy she bought without even
realizing that she chose this product over the other because of all the ads
that she never clicked.

[0] https://en.m.wikipedia.org/wiki/Psychohistory_(fictional)

~~~
hari_seldon_
Great analogy.

------
mattkrause
I’m not thrilled about the amount of information being hoovered up, but....the
predictions being made with it aren’t terribly impressive.

Facebook should know a great deal about me, but the advertising categories it
puts me in are either completely obvious (based on locations and group
memberships that I’ve explicitly told it), or bonkers. It currently thinks I’m
part of multiple wildly incompatible religions and political groups and am
interested in a weird collection of abstract concepts (“decay”?)

Amazon thinks that my interests are dominated by textbooks and vacuum cleaners
(if only!)

Twitter has correctly sussed out that someone who mostly follows scientists
might be interested in science....or dogs.

~~~
luckylion
> I’m not thrilled about the amount of information being hoovered up,
> but....the predictions being made with it aren’t terribly impressive.

Conspiracy theory: what if they intentionally throw random garbage in there so
you don't get paranoid? The things they want to hit you with will still be
there, but they'll be surrounded by misses, and you're inclined to think "wow,
they sent me an ad for a new dishwasher just as soon as mine broke, but they
also sent me an ad for cat toys knowing full well I'm allergic to cats"? The
dishwasher is still a great hit, but less suspiciously so.

~~~
ardy42
> Conspiracy theory: what if they intentionally throw random garbage in there
> so you don't get paranoid?

That's not a paranoid conspiracy theory (and certainly not something that
should be down-voted to oblivion). It's well documented that Target did that
exact same thing to throw off its customers, so they wouldn't be weirded-out
by its pregnancy-prediction algorithm.

https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/:

> “Then we started mixing in all these ads for things we knew pregnant women
> would never buy, so the baby ads looked random. We’d put an ad for a lawn
> mower next to diapers. We’d put a coupon for wineglasses next to infant
> clothes. That way, it looked like all the products were chosen by chance.

> “And we found out that as long as a pregnant woman thinks she hasn’t been
> spied on, she’ll use the coupons. She just assumes that everyone else on her
> block got the same mailer for diapers and cribs. As long as we don’t spook
> her, it works.”

~~~
mattkrause
Predicting a pregnancy sounds so creepy and personal, but it actually seems
like it should be one of the easiest life events for a retailer to predict.

The story plays up changes in consumption (scented->unscented lotion) but
there are also whole categories of products that are used only by pregnant
women (e.g., maternity clothes, pre/perinatal vitamins). Another huge swath of
the store is devoted to infants (clothes, diapers, toys, formula). There are
also really strong demographic priors (women only, 18-40 or so, though you
could fine-tune that age bracket much more precisely with socioeconomic status
(credit score?) or zip code).

------
crispyambulance
The thing that makes me increasingly concerned is the possibility of an entity
using this data-surveillance NOT so they can sell us more crap, but for
ulterior malicious purposes.

We already got a taste of what this can mean with Cambridge Analytica.

But what if some hate group (or other extremist org) with deep pockets decided
to buy up and use such data in more sinister ways, targeting individuals or
organizations at large scale, developing "Stasi style" dossiers to use as
leverage for future actions?

The information would not need to be "perfect" but it could get increasingly
more accurate over time depending on how much attention they focus on their
targets.

~~~
bilbo0s
Just imagine if it's just some tech guy, maybe unemployed or something, tired
of being broke, who just needs to pay rent or something? That's what's
concerning, people don't even need to have a cause, they can just be
desperate. The government will be tracking the people who have a cause. You
can go to the government to get help against those people because those people
have something to lose. What about the people with _nothing_ to lose? More and
more people are out of work or underemployed, but I'm pretty sure the number
of people who need to pay their bills remains constant.

All of a sudden all this data starts to make well off people look more like
meal tickets. Imagine how easy it would be to get money out of that rich
looking lawyer guy who's having an affair? Or maybe the well off looking
doctor lady who voiced some views about blacks that her hospital, and the
local NAACP, might find interesting?

This economy, combined with massive data retention and security breaches, will
make for some real perverse incentives in the future. We could conceivably get
to the point where all you'd need to be is some guy with internet access who
needs to pay rent by the end of the month.

~~~
wholinator2
Right, and what if you can also fabricate the evidence used for the blackmail?
Image generation techniques are improving year over year and we've gotten to
the point where generating a video of a person saying anything you want in
their voice with their face is coming closer to the realm of home computing
power. Anyone with the time could focus on hacking social media accounts or
even just faking a screenshot. The media doesn't need much inclination or
evidence to start the slander.

~~~
bilbo0s
Yeah, what I'm getting at more is the stuff that doesn't make it to the level
of the media, but some guy could still use to make money. The media won't care
about some random lawyer having an affair, but the lawyer's wife would. The
media won't care about some random doctor who talked about how she could off
the sub-human blacks if she wanted, but it'll be about 5 seconds before
lawyers and, more importantly, medical review boards, start looking into
statistics at her hospital.

You have to be prominent before the media cares, but there are plenty of
people with money who are _NOT_ prominent, and I suspect they'd make very
tempting targets. In fact, I'd bet the people at that level would actually be
more likely to pay up.

------
pdkl95
> General interest data | .... political leanings, magazine and catalog
> subscriptions .... preferred {celebrities,movie genres,music genres} ...
> {Bible,New Age/organic} lifestyle ...

If they wanted to get people's attention, this list should have included
"preferred pornography" and maybe even "the other type of pornography that is
viewed when your spouse's cell phone and your cell phone are in different zip
codes".

------
lettergram
I wrote about this relatively recently, basically there’s now enough data and
enough good systems out there that companies can start predicting what you’ll
do next.

This has been a thing since credit scores. However, now it’s to the point they
can even mimic your voice and predict how you’ll respond to situations.

We are walking dangerously and blindly into a nightmare right now, and no one
seems to realize it.

~~~
echevil
Seems like it. I tend to watch a specific type of video at a specific time of day
on YouTube. Even though I watch tons of other videos on the same account,
YouTube can make a pretty good prediction at that time of day and present me the
videos I'm going to watch. I find it so useful!

------
Brahma111
Have mentioned it before. Watch any recent videos of Yuval Noah Harari and he
almost always talks about the concept of 'The Hackable Human'. Forget about
what data they have. Soon they will know more about you than you do. It has
far-reaching effects, but let's hope humans find a way to keep outsmarting
technology.

------
FlowNote
If data collection can be sifted to discover behavior of groups...

And groups of people can have that behavior correlated to cultural and ethnic
and racial factors...

Then, technically, isn't all of Silicon Valley violating the Civil Rights Acts
of the 1960s? For example, let's say black males statistically swipe phones a
certain length and certain time... doesn't this mean ads targeting them can be
engaging in disparate impact?

~~~
darkpuma
Maybe. _If_ such a swipe gesture discrepancy existed, ML could pick up on it
as a proxy for race, despite having no concept of race and despite no human
directing it to do so. One example I've heard of is lending software learning
to use zip codes as a proxy for race, then systematically denying loans to
minorities.
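
A minimal sketch of that proxy effect, using entirely synthetic data: the zip codes, the group labels "A" and "B", and the `approve` rule are all invented for illustration and do not describe any real lender. The decision rule never sees the group column, yet the outcomes split along it because the groups are unevenly distributed across zip codes.

```python
from collections import defaultdict

# Synthetic applicants: (zip_code, group). The "group" column is never
# shown to the decision rule; it is kept only to audit outcomes afterwards.
applicants = (
    [("11111", "A")] * 80 + [("22222", "A")] * 20 +
    [("11111", "B")] * 20 + [("22222", "B")] * 80
)

def approve(zip_code: str) -> bool:
    """Hypothetical learned rule that only ever looks at the zip code."""
    return zip_code == "11111"

def approval_rates(rows):
    """Audit step: approval rate per group, computed after the fact."""
    totals, approved = defaultdict(int), defaultdict(int)
    for zip_code, group in rows:
        totals[group] += 1
        approved[group] += approve(zip_code)
    return {g: approved[g] / totals[g] for g in totals}

rates = approval_rates(applicants)
print(rates)  # {'A': 0.8, 'B': 0.2}
```

Dropping the sensitive column from the inputs does nothing here; the disparity only shows up when you audit outcomes by group.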

Much more egregious than this, though, is that for years Facebook was apparently
allowing realtors to target only certain ethnicities. This was a case of
deliberate human-driven discrimination, and as far as I know nobody has been
held accountable for it. So far the tech industry has proven itself pretty
good at getting around the law.

------
theNJR
Funny how the brain works.

People do realize something is happening. It’s why so many think Facebook is
“listening” to their conversations then showing ads for “products I’ve never
searched for but talked about with a friend”. No, FB inferred you would
buy it because your friend just did.

It’s hard to comprehend the effects of data collection. Which, of course,
makes it even more powerful.

~~~
drdeadringer
Just yesterday I read a post on Reddit where the OP was wondering about an
online ad being mere coincidence or some deep data-collection plot.

Per the story: They had purchased ice cream at the grocery store using a
credit card "never used for online purchases", and then at home they see an
online ad for that very brand/flavor of ice cream. This raised alarm bells,
hence the post to sanity-check.

Sometimes it's like we shouldn't fear the Terminator but the access terminal
in our pocket. Other times it seems like both, or neither.

~~~
mindslight
That isn't an "access terminal" in your pocket, but a fully-fledged computer
and sensor platform operated by hostile actors.

The ice cream is straightforwardly due to the full purchase data being
backhauled to the surveillance companies. The real wtf here is assuming that a
card not being "used on the Internet" affects anything. People want to cling
to this really weird "if I don't see it, it can't be happening" model, as
opposed to realizing that the surveillance industry is based around operating
without your involvement or consent.

------
NeedMoreTea
Is this the only reason the current house of cards stays up and functions? I
think so.

Almost no one realises what companies predict and infer from their data, or
the extent of its collection. Once they see some of the surface effects they
start calling it creepy or scary.

If people ever start realising the true extent, expect a backlash, surely?

~~~
mtgx
Yes it is. And I always mention it on this board and others whenever people
start concluding that "people just don't care about privacy."

It's not that they don't care, they just _don't understand_ the true
implications of having someone like Google or Facebook have pixel tracking on
all websites on the web tracking you, or them tracking wherever you go, and
the thousand ways in which that data could be misused by them, their partners,
or people stealing that data from those companies.

I've noticed from other older stories that even pro-surveillance politicians
don't understand what they are pushing for, as some of them were later
"shocked" to discover that those very powers could _also_ be used to gather
information on them. And then they started singing a different tune about the
surveillance powers spy agencies should be given.

~~~
echevil
> It's not that they don't care, they just don't understand the true
> implications

Just to give you a data point since you drew that conclusion. I'm fully aware
of the tons of mechanisms websites use to track you and all of their
implications. I am not concerned about it at all.

------
specialist
Everything about every person, living or dead, is known in near real-time.

Seisint (bought by LexisNexis) was being used to solve cold cases in the mid
naughts. Just by using fragmentary data and sifting thru millions of
demographic profiles to see who matched.

[FWIW, that "What data brokers know" table is pretty good.]

--

If we choose to protect people's privacy, give individuals control over what
is publicly known about them, we'll need to encrypt demographic data _at
rest_.

Meaning translucent database strategies. Just like how password files are
salted and then hashed. You need your pass
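
A rough sketch of the idea, under stated assumptions: the `TranslucentStore` class, the identifiers, and the work factor are hypothetical, and this follows the password-file analogy (salted hashing) rather than any particular product. Each row is keyed by a salted hash of the person's identifier, so a row can only be matched by someone who presents that identifier. The payload is left in the clear to keep the example in the standard library; a real system would presumably also encrypt it.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for this demo

def _digest(identifier: str, salt: bytes) -> bytes:
    """Salted, deliberately slow hash of an identifier (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", identifier.encode("utf-8"), salt, ITERATIONS)

class TranslucentStore:
    """Rows are stored as (salt, digest, payload); the identifier is never stored."""

    def __init__(self) -> None:
        self._rows = []

    def put(self, identifier: str, payload: str) -> None:
        salt = os.urandom(16)  # fresh salt per row, as in a password file
        self._rows.append((salt, _digest(identifier, salt), payload))

    def get(self, identifier: str) -> list:
        # Re-hash the presented identifier with each row's salt and compare.
        return [payload for salt, digest, payload in self._rows
                if hmac.compare_digest(digest, _digest(identifier, salt))]

store = TranslucentStore()
store.put("alice-id-123", "zip 90210, gardening magazine subscriber")
store.put("bob-id-456", "zip 10001, owns a vacuum cleaner")
print(store.get("alice-id-123"))   # Alice's row only
print(store.get("mallory-guess"))  # no match: []
```

The per-row salt makes lookups linear in the number of rows, which is a deliberate trade-off in this sketch: it also means nobody can precompute one table of hashes and join it against the whole database.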

Meaning using universal identifiers. Like implementing Real ID.

It is counterintuitive that identifying (cataloging) everyone is how we protect
everyone. But if there's another way, I haven't heard of it.

--

There will be some upsides to finally having one master identifier.

Data quality will dramatically improve.

Truly portable health records.

Nearly 100% accurate voter registration (& eligibility).

The government "census" will be just running a report.

We'll daylight all the bad data broking actors.

~~~
lioeters
Setting aside the ethical/political implications of a universal identifier for
everyone, your description makes me wonder about the technical implementation
of the ID itself.

My first thought was whether it's possible to design the syntax of the IDs, so
that they're not just sequential or random, but have some inherent properties
that make them easier to organize, i.e., for sorting/categorizing. Kind of
like Open Location Code [0], but for people. Since they should be immutable
(same ID for lifetime), I suppose it could encode birth date/location, or
maybe genetic "markers".. (Edit: On the other hand, that would by itself be a
leak of private data..)
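
To make the trade-off concrete, here is a sketch of one possible layout. The `YYYYMMDD-REGION-SERIAL` format and both function names are invented for illustration and are not a proposal from any standard; the region code is assumed to contain no hyphen.

```python
from datetime import date

def make_id(birth: date, region: str, serial: int) -> str:
    """Hypothetical layout: YYYYMMDD-REGION-SERIAL."""
    return f"{birth:%Y%m%d}-{region.upper()}-{serial:06d}"

def parse_id(person_id: str) -> dict:
    """Decode the fields back out; no key or database access is needed."""
    born, region, serial = person_id.split("-")
    return {
        "birth": date(int(born[:4]), int(born[4:6]), int(born[6:8])),
        "region": region,
        "serial": int(serial),
    }

ids = [
    make_id(date(1990, 5, 17), "nl", 42),
    make_id(date(1985, 1, 3), "us", 7),
]
print(sorted(ids))       # plain string sort orders people by birth date
print(parse_id(ids[0]))  # anyone holding the ID can read these fields
```

The sorting and grouping come for free, but so does the leak: `parse_id` works for whoever sees the ID, which is exactly the problem noted in the edit above.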

Once that's globally practiced (easier said than done!), there could be a
searchable database of all registered individuals on the planet. I could see
the practical advantages of having such a system, but it sure does have a hint
of dystopian future.

[0] https://en.wikipedia.org/wiki/Open_Location_Code

------
sct202
Layer in the fact that on Facebook--and other ad platforms--advertisers can
import lists of people by name/contact info to target specifically using data
from brokers, it gets creepier. I'm less bothered by an algorithm that tracks
me and serves ads as long as I'm aggregated in the result tracking, but when
they're tracking/analyzing me as a specific person, it's super creepy.

Edit: If you check out your Facebook > Ad Preferences you can see which
companies have added you specifically to their ads by email/phone number. My
list has a ton of car dealerships and real estate brokers.

------
roganp
I think the primary problem is glossed over a bit in the article. Even though
most people are unaware of the kinds of information gathered about them, 36%
are "Somewhat comfortable" or "Very comfortable" with the kinds of profiles
being built. This seems (to me) a depressingly high number, one that indicates
that there will be no grassroots rebellion against the surveillance economy.

------
imgabe
> For example, data about a mobile phone’s past location and movement patterns
> can be used to predict where a person lives, who their employer is, where
> they attend religious services and the age range of their children based on
> where they drop them off for school.

Is it me? I don't consider any of these to be particularly sensitive
information.

Where you live: We used to have these things called "phone books" where they
listed the name, address, and phone number of everyone in town. The world
didn't collapse.

Who their employer is: My name and picture are listed on my employer's public
website. Not exactly hard to find.

Where they attend religious services: I don't, but of the people I know who
do, nobody has ever considered it something they need to hide. Many would want
to tell you and ask you to join them.

Age range of their children: So? If somebody knows you have a kid aged 5-10,
then they can...what?

I mean, I still try to limit how much information I expose online, but if
anything this makes me less worried rather than more.

~~~
jacquesm
That's a rehash of a number of silly 'nothing to hide' pseudo arguments.

Phone books were not instantly searchable in bulk all across the globe. Who
your employer is is not important, but in bulk to know who your employer
employs and to be able to access that information in bulk and within a couple
of milliseconds gives a lot of power to outsiders. I should know because I use
that power regularly for my work, and trust me, lots of people we read up on
would do better to keep a much lower profile online. It does not benefit their
employers either.

Whether you attend religious services or not has been used to target (and
kill) people in the past, and if you are willing to extrapolate a bit, was
used for mass murder.

The age range of your children may not be so important, the fact that you have
children may be, depending on your station in life.

The fact that you personally have not been inconvenienced by any of this - yet
- is not a datapoint worth recording.

~~~
imgabe
> Whether you attend religious services or not has been used to target (and
> kill) people in the past, and if you are willing to extrapolate a bit, was
> used for mass murder.

Right, it was, long before the ability to aggregate this information en masse
existed. The fundamental problem there is that there are people who want to
kill other people because of their religion. Taking away information from them
doesn't solve the problem, it just delays it a little. It's a form of security
by obscurity.

If anything, they can still go and target the actual place of worship itself,
which is not hidden. If all they care about is killing people of a certain
religion, they don't need to know who they are particularly, just go to the
place of worship for that religion.

The larger point here is that if we can't trust the authority figures, the
government, the corporations, etc., it doesn't matter how well we hide. The
problem of a fascist government can't be solved by hiding. It has to be solved
by enacting a government that's not fascist.

Edit: This is not a "nothing to hide" argument. The information listed is not
the sort that is private information. Your address is not a special secret
that only you can know. There are numerous legitimate reasons that the
government or a corporation might need to know your address. The government
needs to collect taxes and keep property records. Companies need to deliver
products that you order. All of that is going to require sharing your address.

~~~
Profan
That went well for all the Dutch Jews in WW2, where the government also held
detailed records about religious affiliation. Sometimes it is not about what
is now, but what will or might be later.

~~~
_jal
Exactly. People forget the role of IT ("data processing") at all of our peril.

https://en.wikipedia.org/wiki/IBM_during_World_War_II

------
nkkollaw
I'm curious, who do data brokers sell this data to, and how is it eventually
used?

For instance, if they know where I live and who my family is, do real estate
companies buy data of people that died recently to contact my family and get a
deal on my house?

------
buboard
Question: if Google asked you explicitly whether you are OK with it using such
info, would that be OK with most people?

------
johnchristopher
They don't because it doesn't provide them any value in their day-to-day
lives.

------
trumped
That's one reason why most people don't care about privacy.

