
Ask HN: Who really gives your personal info to Intelius, Instant Checkmate, etc? - mehrdadn
Please excuse the frustrated tone; the "secrecy" all over the internet about
the whole issue has been driving me insane.

The usual useless BS is "oh, these companies get it from governmental public
records". Yeah, right. I'm pretty _darn_ sure that some of the personal
information I can find online on MyLife, Intelius, InstantCheckmate, Spokeo,
etc. is not in some government agency's public record, and regardless, surely
there's no way that hundreds of companies are repeating each other's work over
and over again when they could just buy the information from someone.

_Someone_ (or a few) hidden underneath has to be doing the heavy lifting of
scraping people's data from sketchy sources and selling it to third-party
companies while staying hidden. My question is, who are these, and (where it
is possible to know) whom are they selling to? How can I find out? Surely
someone knows, and I'm tired of playing this goose chase where those who
_don't_ know just make random guesses as to how the information must be coming
from some public records, and those who do know say hardly anything beyond
"you have to know where to look".

I'm not looking for just 1 pointer, though I would appreciate it. I'm tired of
pointer chasing. I'm just looking for as comprehensive a list as possible. It
_has_ to exist somewhere... after all, when a court needs to order that
someone's information be purged (for whatever reason, e.g. for safety), it's
_got_ to have a list of these data aggregators somewhere, so I'm sure some
people must know. So how do I find out? I'm hoping to also learn to fish in
addition to being given the fish.

Thank you!
======
monodeldiablo
I've had the misfortune to be present for an in-person demo of a verification
service nearly a decade ago. It involved those relationship questions ("Which
one of the following people have you not lived with in the last 5 years?")
that are incredibly creepy.

I was shocked that they had so much data on me -- I have no debt, no credit
cards, no house, no car, no bills, and I had always entered informal rent
agreements (I was poor) up to that point -- yet the rep was easily able to
list all the places I had resided from college to present, along with a host
of off-the-books housemates.

"Where the fsck did you get all this?!" I demanded.

"Have you ever ordered a pizza?"

Turns out, some fast food chains do a brisk business in reselling customer
data. I had ordered from Domino's _once_, but that was enough to link my name
to a specific location.

This experience has made me _extremely_ sensitive about the information I give
while making a purchase. When I lived in the US, I stopped having food
delivered, paid in cash, and never signed up for branded credit cards. I
rented informally and, whenever possible, tried to just pay the landlord my
share of utilities in cash. Anything to keep a lower profile.

Far more companies than you realize are collecting as much data as possible
about you, your habits, and your relations. So I'm afraid your search for a
canonical list of data sources is ultimately fruitless. In this new economy,
you are _always_ the product.

~~~
ams6110
Do we forget that not long ago everyone's name, address, and phone number were
published in the phone book (unless you paid extra for an unlisted number),
and hospital admissions were published in the daily newspaper? It used to be
unremarkable, and now people shriek "privacy!!" when it's discovered that some
mundane detail about their life is not a closely held secret.

~~~
intended
No, we don't forget.

Those systems were analog and could only be scaled to a certain extent, after
which you ran into management overhead.

So in the example given by the OP, Domino's would not be updating your name in
the directory.

That would be data that evaporated and never made it to record.

To increase the contrast - whole new types of clustering and analysis are now
possible, at the _moment_ a new data point is received.

~~~
tankenmate
Indeed, the amount and types of published information haven't broadly changed;
what has changed is the information horizon: the ability to search for
information and analyse it. The world hasn't become smaller; our ability to
see is much, much wider. The information horizon is much further away from zero
than it used to be (thanks to the computer's / big data's ability to
communicate and analyse), and its impact on society is much more than just
privacy.

------
pandabear187
I have insider knowledge as I used to work for one of these companies.

Depending on the product you purchase the data comes from multiple sources.
Also these companies have sophisticated machine learning capabilities to build
a profile based on various attributes found in seemingly unrelated pieces of
data.

So the list consists of credit reporting agencies, public records, your online
profiles with public access, court records, aggregators like LexisNexis and
dozens like them.

This heavy lifting you speak of is done differently by each company and
consists of combining multiple sources to enrich your profile. These
companies spend millions on data and engineering and make even more. Whatever
preconceived notion you have about courts ordering your records sealed, it
doesn't happen in a centralized fashion; you would need to contact each data
vendor individually to be removed. But it would be like playing whack-a-mole.

~~~
mehrdadn
Thank you for the reply. Can you actually give some kind of a list though? The
entire problem here is everyone explains the _how_ but no one is willing to
explain the _who_. I certainly understand it's "multiple sources", I'm asking
who are these sources I keep hearing about. There can't be nearly as many
sources as there are sites who buy from them. If you'd like to not name the
company you worked for yourself then could you at least please list as many
other ones as possible? That would be far, far more helpful than just saying
they use machine learning and that they use multiple sources, etc.

~~~
blackbagboys
There is no such public list. Most of these companies are privately held,
their methods are trade secrets, and there is no real form of legal recourse.
Your best bet would be to buy a book like this:
[https://www.amazon.com/Hiding-Internet-Eliminating-Personal-...](https://www.amazon.com/Hiding-Internet-Eliminating-Personal-Information/dp/1500397814)
and follow the recommendations, but even that is only likely to be
half-effective.

Much of the public information is mined from sources like credit headers, your
court records, utility bills, property and tax assessment records, voter
registration lists, motor vehicle registrations, etc.

Unfortunately, the legal and technological landscape is such that 'hiding'
from these kinds of services is effectively impossible.

~~~
mehrdadn
> There is no such public list. Most of these companies are privately held,
> their methods are trade secrets, and there is no real form of legal
> recourse.

How can there not be a public list of these data miners? When e.g. a court
needs to control someone's information surely they know who these people are
and they can let them know? Is there a secret list in every courthouse or
something?

Or when someone wants to start another one of the higher-level companies --
how do they know which core aggregators to buy from? If that's a secret then
how would they find out? Surely someone's gotta be willing to tell?

~~~
dsp1234
_a court needs to control someone's information surely they know who these
people are and they can let them know?_

I think you have a misunderstanding of what a US court can do. A court can
only tell a specific party to take some action, and generally only if that
party is somehow related to the legal action (such as being a defendant).
Generally, there is no judgement that a court can make that can affect unnamed
parties (unless they are John Does, which later have to be named).

Theoretically, you could sue each company with your data, and a court could
tell each of those companies to remove your information. But it would have to
be for each one, and the judgement is only binding on those companies.

_Surely someone's gotta be willing to tell?_

The techniques used are generally trade secrets, and amount to competitive
advantage. There is little incentive for a company to reveal this information
(or for an employee to do so, and thus open themselves up to legal liability).

~~~
mehrdadn
What about this though:

>> Or when someone wants to start another one of the higher-level companies --
how do they know which core aggregators to buy from? If that's a secret then
how would they find out?

~~~
cosmie

       > If that's a secret then how would they find out?
    

You don't. I've worked in data acquisition in the past, both buying data and
selling it. Sometimes as the original source of truth and sometimes as a
middleman that does data cleaning, standardization, appending (from other
sources), then selling the derived product downstream.

Companies in that space guard their upstream sources quite heavily, because
they don't want to be cut out of the process. You won't find a centralized
list of independent data feeds and providers specifically because of that. In
one scenario, we were dealing with a substantial rate increase from one
supplier. We spent time attempting to source an alternate supplier of that
particular type of data, and could only find sources that were several months
more stale than we were currently getting (i.e. these people were getting the
feed several hops after we were). In the end we paid the rate increase because
we couldn't find an alternate source that was as close to the original data
provider as our current source. And without knowing who the original data
provider _was_, we couldn't go around our supplier.

The lack of a centralized directory isn't just done to make things opaque for
end users, it's done to make things opaque for business competitors as well.
It's an industry that's very, very reliant on networking and introductions.

Edited to add: You're also asking a lot of people in here to name specific
companies even if they can't give you huge lists. This space is super heavy on
NDAs (and trigger happy on enforcing them). If you've actually worked in it,
there's simply no way you're able to name drop legally.

~~~
mehrdadn
+1 thanks for the explanation!

And regarding this:

> Edited to add: You're also asking a lot of people in here to name specific
> companies even if they can't give you huge lists. This space is super heavy
> on NDAs (and trigger happy on enforcing them). If you've actually worked in
> it, there's simply no way you're able to name drop legally.

I understand that an NDA would prevent you from naming your own company or
your suppliers and clients, but surely it doesn't prevent you from listing
some other companies in this space that you know of (including but not limited
to your competitors)? I don't understand why you shouldn't be able to name
_any_ company just because you've worked at _one_ of them.

~~~
cosmie
Every company I know in the space is a company that we've had at least
preliminary conversations with (seeing if there's any potential relationship
to either purchase from or sell to that company).

Just having that conversation required getting a mutual NDA in place, since
the conversation involves revealing your capabilities (even if not your
sources). And that's assuming you're even aware of all the NDAs your company
has signed with other companies, which isn't always the case. Speculation or
name dropping in public could violate an NDA you're not even aware of, then
you find yourself having to defend your speculation as just that, rather than
as revealing proprietary knowledge (that you didn't actually have but your
company did).

At the end of the day, it's easier to default to speaking in generalizations
rather than risk the potential repercussions of _not_ doing that. :-/

~~~
mehrdadn
Okay, I see. But somehow someone in your company found out about them before
they could sign mutual NDAs, right? How does that happen? Do you just need a
higher-up who's friends with the right people?

~~~
lithos
Imagine the legal liability you'd incur if the layperson could track the
source of inaccurate data, then proved in court it kept them from getting a
loan.

Those are provable damages, and maybe slander if they can convince the court
they're a publication.

Another reason to be NDAed up.

------
JoshTriplett
If anyone reading this thread is interested: I would pay non-trivial amounts
of money on a regular basis for a service that systematically worked to
eliminate records like these (and the sources they draw from), as well as
chasing down sources of junk mail and the lists they ultimately draw from.

The value would depend on effectiveness, and on the degree to which the
service clearly reported exactly what they did. Calling and unsubscribing from
sources of junk mail would be a moderate time-saver, but finding out _where_
they got their names and addresses from and destroying _those_ would be far
more valuable.

It'd take some optimization and batching of the process to figure out how to
avoid taking an excessive amount of time per person.

~~~
jameslk
I've considered working on this problem from a business standpoint, but I
couldn't figure out a good business model for it. I don't think too many
people will pay a monthly fee to have their information removed from these
services. My guess is that they would sign up for a month and after their
information has been removed, immediately cancel their subscription until they
needed to do it again. And a yearly fee seemed like it would cost too much for
mass adoption.

There's also the problem that you'll often need to get the customer to "opt
out" by providing their own information to verify they own it or they will
need to click on a link from an email or receive an SMS text verification
code. This gets really messy as an automated service.

~~~
JoshTriplett
It's possible that the business could be so successful that everyone uses it,
the services selling this information all run out of customers and go out of
business, none of them come up with newer and more evil ways to do this, and
you run out of potential customers. In which case: _mission accomplished_,
retire on your giant pile of money and bask in the knowledge that you made the
world a far better place. (Avoid scenarios in which you have perverse
incentives to allow the problem to continue.)

But in the meantime, tens of millions of potential customers times any
reasonable fee seems more than enough to build a substantial business on.

You could tempt people in with a cheap fee to let them send in a few pictures
of junk mail and stop those. (As you get more, find the biggest sources and
automate or batch them so that they cost you almost nothing, which will pay
for the higher-effort ones. Have an upper bound on effort expended, and tell
people that they don't pay if you can't remove them.) You could then track
down the underlying sources, and if you successfully identify them, contact
the customer, and give them enough information to decide to pay you for a
higher-end service to get them removed from those sources (and keep them
removed).

The value that gets people to keep paying you would be a steady stream of
reports of "we found this source leaking/selling your information, here's what
we did about it". It'll take you _years_ to track down all such sources and
find paths to remove them; you will likely end up having to fund some legal
work and possibly even a lawsuit or two, which will give you a giant pile of
publicity.

(As one example of something much easier for a company optimized for the
process to do than an individual: the USPS has a detailed process for formally
putting a company on notice for mailing someone who has specifically
unsubscribed, and that process ends in massive fines for continued mailing to
that person. I read a report of someone doing that to stop receiving
persistent Dell catalogs.)

If you're sufficiently creative, you could even pitch this as a _service_ to
marketing companies. You have a list of people who will not buy anything via
direct mail, and who will despise any company that they receive such mail
from. Convince the sources of postal spam that removing those people from
their list makes the rest of their list _more_ valuable. Convince the
downstream customers of those sources that using your list directly is far
more convenient for them than dealing with opt-outs from every individual on
it.

That also gives people a continued incentive to pay to remain on that list.

~~~
TrinaryWorksToo
>the USPS has a detailed process for formally putting a company on notice for
mailing someone who has specifically unsubscribed, and that process ends in
massive fines for continued mailing to that person. I read a report of someone
doing that to stop receiving persistent Dell catalogs.)

I would love to learn this process! I've repeatedly asked for a certain
mailing to stop and it hasn't ceased.

~~~
edwhitesell
I'd rather see much larger bulk rates via the USPS for something like
"environmental impact".

For example, I'm currently getting no less than 3 letters per week from
Spectrum (formerly TWC) promoting their new triple-play plans. I drop all of
them in the recycling bin.

I couldn't care less about their bottom line, but I do care about the
environmental impact.

At 3x per week, it probably costs them around USD$0.80/week (postage, paper,
printing, etc.) to send those 3 letters. Call it USD$1 to make the math
easier. I currently pay about $9.03/week. If I upgraded, it would be at least
3x that amount.

I'm not sure how to calculate the profit they would gain after an upgrade, but
I can't imagine it would take more than 2-3 months for the new rates to more
than cover the mailing fees for an entire year of their letters.

And yet, I'll never upgrade and I hate the waste caused by their practices.
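
For what it's worth, the estimate above can be sketched as a quick
calculation. It uses only the figures quoted in this comment (about $1/week in
mailing costs, a $9.03/week bill, an upgrade at 3x). Treating all of the added
revenue as available to cover the mailings is an assumption made purely for
illustration; the real profit margin is unknown, so the result is a lower
bound on the payback time.

```python
# Figures from the comment above. Counting all added revenue toward the
# mailing cost is an assumption; actual margins are not known.
MAILING_COST_PER_WEEK = 1.00   # USD, rounded up from roughly 0.80
CURRENT_BILL_PER_WEEK = 9.03   # USD
UPGRADE_MULTIPLIER = 3         # "at least 3x that amount"


def weeks_to_recoup(mailing_cost_per_week, current_bill, multiplier,
                    weeks_of_mail=52):
    """Weeks of upgraded billing needed to pay for `weeks_of_mail` of letters."""
    total_mailing_cost = mailing_cost_per_week * weeks_of_mail
    added_revenue_per_week = current_bill * multiplier - current_bill
    return total_mailing_cost / added_revenue_per_week


weeks = weeks_to_recoup(MAILING_COST_PER_WEEK, CURRENT_BILL_PER_WEEK,
                        UPGRADE_MULTIPLIER)
print(f"Yearly mailing cost: ${MAILING_COST_PER_WEEK * 52:.2f}")
print(f"Weeks of upgraded billing to cover it: {weeks:.1f}")
```

On gross revenue alone the payback is about three weeks, which is consistent
with the 2-3 month guess if only a fraction of the added revenue is profit.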

~~~
jey
Why does this happen? Is there some subcontractor who just bills by the piece
so they don't care about de-duping the idiotic mailmerge that has 5 entries
for each entity I'm affiliated with?

~~~
edwhitesell
In the case I listed above, these are separate, distinct letters. Certainly
from an automated process, but clearly all part of a larger marketing program.

In the ones like Dell (which I've also experienced in the past), I suspect
there's some metric involved for getting the most contacts for "coverage". The
reality is unless they were tracking my moves to different companies, there
could be different people with the same name. In that sense, I'm very glad
they can't correlate with employment data.

There's also the idea that sending to multiple people within a company means
someone may see something they want and try to go through the procurement
process because of the catalog, rather than because "IT" says it's time for a
refresh.

Not that I agree with any of those practices, for a number of reasons, but I
could understand the case for them.

------
executive
You mean data brokers
([https://en.wikipedia.org/wiki/Information_broker](https://en.wikipedia.org/wiki/Information_broker))

Top US brokers:

- Acxiom
- Experian
- Epsilon
- CoreLogic
- Datalogix
- eBureau
- ID Analytics
- inome
- PeekYou
- Rapleaf
- Recorded Future

Protip: loyalty/reward cards are a gold mine, especially drug store purchase
receipt data

~~~
bks
Opt out links I found -

Acxiom -
[https://isapps.acxiom.com/optout/optout.aspx](https://isapps.acxiom.com/optout/optout.aspx)

Experian -
[http://www.experian.com/blogs/ask-experian/credit-education/...](http://www.experian.com/blogs/ask-experian/credit-education/preapproved-credit-offers/opt-out/)

DataLogix Holdings, Inc. -
[https://www.datalogix.com/privacy/#opt-out-landing](https://www.datalogix.com/privacy/#opt-out-landing)

Epsilon Data Management, LLC -
[http://www.epsilon.com/consumer-preference-center](http://www.epsilon.com/consumer-preference-center)

Equifax, Inc. -
[https://help.equifax.com/app/answers/detail/a_id/2/noInterce...](https://help.equifax.com/app/answers/detail/a_id/2/noIntercept/1/kw/prescreen)

Fair Isaac Corporation -
[http://www.myfico.com/policy/privacypolicy.aspx](http://www.myfico.com/policy/privacypolicy.aspx)

Intelius, Inc. -
[https://www.intelius.com/optout.php](https://www.intelius.com/optout.php)

LexisNexis Group -
[http://www.lexisnexis.com/privacy/for-consumers/opt-out-of-lexisnexis.aspx](http://www.lexisnexis.com/privacy/for-consumers/opt-out-of-lexisnexis.aspx)

TransUnion Corp. -
[http://www.transunion.com/corporate/business/datareporting/s...](http://www.transunion.com/corporate/business/datareporting/support/opt-out.page)

~~~
jibberia
Thanks, this is very useful.

The paranoid voice in my head is wondering if these forms don't actually opt
me out of anything, and instead just confirm to these companies that the
information they have on me is correct.

------
goshx
It pisses me off too. The US is so concerned about privacy, yet a LOT of your
private information is made public once you start opening bank accounts,
buying real estate, signing up for a gym, etc.

When I opened my first bank account they had a typo in my name, which I found
out when I received my debit card. I asked them to fix it immediately,
however, two to three weeks later I was already getting mail from stores
addressed to the misspelled name.

When I was buying my first house I immediately started receiving mail from
moving companies at my old address before I signed the closing. After I moved
I got a lot of junk mail with other kinds of offers. I even started getting
PHONE CALLS from a home monitoring/alarm company. When I asked them where they
got my number they hung up.

It is like all the information is up for sale somewhere.

~~~
confounded
Taking power away from capital to give to individuals is currently positioned
as un-patriotic in the US (e.g. the belief of pandabear187, above, that
regulation of data brokers will lead to fascism or communism, even though he
goes to great lengths to protect his own information).

------
VLM
"when a court needs to order that someone's information be purged (for
whatever reason, e.g. for safety)"

I believe that is the source of your confusion; that is mostly a Hollywood
fiction. If a collections agency is bugging you there is a way to resolve it
via the legal system, but it's very much a case-by-case, company-by-company
business. A judge can order one company whose officer or agent is present in
the courtroom to do something to one record. A judge can purge his own legal
system's record of an arrest if he wants to. Believing courts can do this in
general is analogous to non-computer people believing in the CSI TV show or
Hollywood hacking.

~~~
mehrdadn
What about the new higher-level companies that pop up akin to Instant
Checkmate? How do they know which lower-level companies to buy your
information from? There's no way they ALL do the heavy lifting themselves.
Someone's gotta be making money off doing the real work and others must be
buying from them.

------
criddell
Last year I was getting constant calls and snail mail about buying an extended
car warranty on a car that I no longer own. I asked the place where I bought
my car if they sell that information and they claimed not to.

So where do these sleazy companies get that data? The DMV?

This year, I'm getting two or three calls every week about buying a home
security system and monitoring.

I don't understand why these calls aren't easier to block. Somebody knows where
they are originating from. Why can't I get that information too?

~~~
szc
The "home security system" calls could be social engineering to find out if
your home is protected or not. Just by listening to what they say and not
saying you already have one is enough to tell the caller what they want to
know.

~~~
criddell
I would never tell them anything, but I have listened to the recording and
pressed '1' to speak to a representative and when I do that, nobody ever picks
up. It's baffling.

------
showkiller
I believe, based on the links below, that the data is sold by different
entities to companies like Intelius, etc.

[https://www.scientificamerican.com/article/how-data-brokers-...](https://www.scientificamerican.com/article/how-data-brokers-make-money-off-your-medical-records/)

[http://triblive.com/news/allegheny/8690215-74/drivers-inform...](http://triblive.com/news/allegheny/8690215-74/drivers-information-companies)

[http://spectrum.ieee.org/riskfactor/computing/it/us-states-s...](http://spectrum.ieee.org/riskfactor/computing/it/us-states-selling-hospital-data-that-puts-patients-privacy-at-risk)

------
e0m
If you have your own domain name with a wildcard, it's really helpful to
enter: someservice@mydomain.com as your email. That way if it leaks you'll
know who did it and can set up much more robust rules to block. I'll use the
site's domain name as the local part of the address so I remember which name
goes to which site.

For physical address mailings, you can add the service to your name as a
hyphenated part (or a middle name), e.g. First Service-Last as the addressee
name. While harder to set up "mail rules" for, at least you'll know who to
never trust again.

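A minimal sketch of that tagging scheme, assuming a catch-all ("wildcard")
mailbox on a domain you control. The function names here are hypothetical
helpers for illustration, not part of any mail software, and `mydomain.com`
is the placeholder from the comment above.

```python
def alias_for(service: str, domain: str) -> str:
    """Build a per-service address such as 'someservice@mydomain.com'."""
    # Keep only characters that are safe in the local part of an address.
    local = "".join(ch for ch in service.lower() if ch.isalnum() or ch in "-_.")
    if not local:
        raise ValueError("service name has no usable characters")
    return f"{local}@{domain}"


def leaked_by(recipient: str, domain: str):
    """Return the service tag if `recipient` is one of our aliases, else None."""
    local, _, rcpt_domain = recipient.strip().lower().rpartition("@")
    if rcpt_domain != domain.lower() or not local:
        return None
    return local


def addressee_for(first: str, last: str, service: str) -> str:
    """Postal variant from the comment: 'First Service-Last'."""
    return f"{first} {service}-{last}"


print(alias_for("SomeService", "mydomain.com"))
print(leaked_by("someservice@mydomain.com", "mydomain.com"))
print(addressee_for("First", "Last", "Service"))
```

When spam arrives, the recipient address itself names the service the alias
was given to, which is all the lookup does. As noted in the replies, that
service may have been breached rather than having sold the address.
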
~~~
Sephr
I've been doing exactly this for a while. Here is my list of companies that
have leaked the email address I gave them to spammers:
[https://gist.github.com/eligrey/5084991](https://gist.github.com/eligrey/5084991)

~~~
mehrdadn
Note that (I think) Adobe was hacked, so that doesn't mean your email was
"leaked" by them per se, not in the sense we mean anyway.

Also, out of curiosity, how long did it take these companies to leak your
info, generally? Days, weeks, months, years...?

~~~
Sunset
Dropbox was hacked as well.

------
etree
What surprised me was learning that virtually all health insurance
companies sell your personal health information. Most people think this is
illegal because the data is sensitive. But it turns out that if it's generated
by a business transaction (i.e. a claim between your doctor and your insurance
company) then it's not considered PHI and it's not protected.

------
chrisgoman
For pre-employment screening, we had court runners literally sitting through
the courthouses going through paper records for each candidate on an "ad-hoc"
basis. Some companies do this in a more organized fashion by having a person
just data enter ALL the records (like in North Carolina if I recall) and since
they had this data, we just bought the company.

~~~
mehrdadn
Pre-employment screening is different though. They need your permission for a
legal background check and of course they will do everything necessary to do
it. I'm asking about the information that leaks without your permission.

------
NickBusey
Small tip: At the grocery store or anywhere else with a rewards account linked
to a phone number, rather than signing up for one just use (Your Local Area
Code)-867-5309

The number almost always exists and is a valid account. Get the discount,
don't get tracked. Thanks, Tommy Tutone.

~~~
Turing_Machine
If I can't conveniently avoid those things, I like to use the names of famous
serial killers, with a local address that would be in the ocean (if it
existed).

I've yet to have a sales clerk question it (or perhaps they just don't care).

~~~
ams6110
The clerks don't care. Why would they?

------
AdmiralAsshat
Have you ever gone to the doctor, signed up for a gym, or signed up for your
local grocery store's membership rewards program?

That's how they get your information.

~~~
engx
I went to Pavilions grocery store the other day, and when I got home, the
Facebook app said, "Have you been to Pavilions recently? Click here." I want
Facebook to find friends and events nearby me, not track where I go.

I can only imagine all the location data Google, Apple and Facebook are
collecting and what they're actually doing with it.

~~~
literallycancer
Google asks you to take a photo when it thinks you are somewhere like a newly
built mall, or anywhere where they don't have many photos in general.

------
xenadu02
Many clients of these services have agreements that require them to contribute
data back (or at least they get a discount for doing it).

I worked for a telephone company that used a service like this a long time ago
and we shoveled customer data back at the provider. We did NOT give them call
records though.

Almost any medium sized company you deal with is selling your data.

------
slv
LexisNexis is one of the aggregators that is frequently used.

~~~
mehrdadn
I thought it was a court case database, not a personal information aggregator!
+1 thank you!

~~~
daxelrod
LexisNexis provides all sorts of databases.

I had an experience where I was shopping for car insurance, and just before I
signed, my rate doubled from what was previously quoted by the same company.
It turned out that a LexisNexis database had an erroneous record claiming I
was at fault for an accident.

~~~
mehrdadn
Wow! How did you find out it was LexisNexis? Did the insurance company just
straight up tell you?

------
clumsysmurf
One book that covers the medical perspective is:

"Our Bodies, Our Data: How Companies Make Billions Selling Our Medical
Records"

[https://www.amazon.com/dp/B01EE08NXM](https://www.amazon.com/dp/B01EE08NXM)

~~~
snowpanda
I read this book, it should be a crime what they do.

------
trelliscoded
I saw a great talk by a private investigator at a security conference on this
topic, and did some research after the talk to confirm what he was saying.

The short version is what you're asking for is going to be an uphill battle.
Data aggregation companies don't want to disclose their sources because their
services are often used by debt collectors and other organizations who want to
find people who don't want to be found. If the sources were public knowledge,
debtors could avoid them to escape debt collection agencies, for example.

The comprehensive list you're looking for doesn't exist. Each data aggregation
company has a _different_ list which is the result of many private agreements
they have with their sources. If you want a list of the sources, you can't ask
the data aggregators.

You _can_ ask the sources themselves. Because you don't know who they are, you
have to guess. Any company that puts a card in your wallet would be a good
place to start. Under California Civil Code 1798.83, you can email companies
and ask them to provide you with a list of all the direct marketing companies
they sold your information to. Try making a request to your insurance agency.

1798.34 also allows you to ask California government agencies to provide you
with an accounting of everyone they disclosed your information to. A 1798.34
request to the DMV should be a rich source of data providers.

There are also lists of companies involved in the data aggregation ecosystem in
the Consumer Financial Protection Bureau's list of complaints:

[https://data.consumerfinance.gov/dataset/Consumer-Complaints...](https://data.consumerfinance.gov/dataset/Consumer-Complaints/s6ew-h6mp)
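
One way to turn a downloaded copy of that complaint data into a list of
company names is sketched below. It assumes a CSV export with a column named
`Company`; that column name is an assumption about the export layout, and the
sample rows are invented placeholders, not real complaint data.

```python
import csv
import io
from collections import Counter


def top_companies(csv_file, column="Company", limit=10):
    """Count how often each company name appears in a complaints CSV.

    `column` is an assumed header name; adjust it to match the real export.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        name = (row.get(column) or "").strip()
        if name:
            counts[name] += 1
    return counts.most_common(limit)


# Tiny inline sample standing in for a downloaded export (made-up rows).
sample = io.StringIO(
    "Product,Company\n"
    "Credit reporting,Example Bureau A\n"
    "Debt collection,Example Collector B\n"
    "Credit reporting,Example Bureau A\n"
)
print(top_companies(sample))
```

With a real export you would pass `open(path, newline="")` instead of the
inline sample; filtering on a product or category column first, if the export
has one, would narrow the list toward credit reporting and similar firms.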

One surprising thing I found out during the talk is that pizza chains are a
rich source of data for these companies. If you think about it, this makes
perfect sense. The data includes a guaranteed link between a person, a place,
payment information, and a phone number.

> when a court needs to order that someone's information be purged

I don't think this is a thing. A friend had to deal with a DV case and I got
to see how all the legal machinery involved works. The court doesn't even
bother trying to purge the victim's data when they move out and try to stay
away from the abuser. The court simply moved the victim and the DV support
organization told them to stay off social media and not to give the new
address to anyone (including pizza places.)

That being said, the federal government does have a comprehensive list of all
the agencies which are members of the federal privacy council. In theory,
these agencies are supposed to have a data integrity board which provides
oversight for any data they keep on Americans.

[https://www.fpc.gov/federal-agencies/](https://www.fpc.gov/federal-agencies/)

~~~
mehrdadn
Thank you for the legal information! That was enlightening.

The one question I have remaining though is: when these companies pop up, how
do they know whom to buy your information from if there's no list and nobody
tells them?

~~~
trelliscoded
It's all very incestuous. I suspect that part of what happens is that new
companies are made up of people who have previously worked in the industry,
so they have general knowledge of who the available providers are. Without
prior experience in the industry, I suspect you're not going to be in business
very long.

The other part of getting initial sources is probably calling companies in
retail, insurance, etc., and pitching them on how much they would make by
selling their customer data. If I had to do it, I'd probably get industry reports,
sort by annual revenue, start at the top and work my way down as I try to get
a hold of the consumer data department at each company.

I've also been approached by these guys when I was part of shutting a company
down. They wanted to buy our customer database. I guess when one of these
companies shuts down, some new one can buy the database and use that as a seed
for new operations.

------
accountface
It sounds like based on the other comments that there's no way to track down
one sole source because there are so many varying from public records to
machine learning...

That being said, is there a simple way to better obscure yourself? Like using
a business name and a PO box instead of your personal name when it comes to
bills/addresses?

------
tracker1
You don't think all those fake facebook/twitter profiles trying to
follow/friend you are just for the lulz do you... They're mostly for mining
your personal information. This extends to transaction processing agreements
with advertisers and merchants for the purpose of analytics and tracking, as
well as information sharing from credit card companies themselves, and peering
agreements for data.

Just your email address alone, let alone combined with IP information, can
result in being able to find a _lot_ of information about you... then you take
that and correlate it to public information, cc purchase history, online
profiles, it's a treasure trove. That doesn't even count extra data gleaned
from all the tracking cookies.

All said, I'm still far more concerned about government use of similar data
than I am private businesses.

------
tylercubell
As far as public records go, real estate data can be quite the treasure trove.
There are data brokers that collect, analyze, and resell this information all
day long.

For example, my startup can pull information on ownership, mortgage and sales
history, liens, and foreclosure records, among many other things, for a given
property. If you were to cross-reference the data with other public and
proprietary sources, it could get pretty, umm... what's the right word?,
"interesting", in terms of accuracy and level of detail.

------
nl
Take a look at this (2015) pic of the landscape:
[https://www.slideshare.net/RaviralaKarunakar/luma-display-
ad...](https://www.slideshare.net/RaviralaKarunakar/luma-display-adtech-
landscape)

The companies you are interested in are in the "DMP & Data Aggregators" and
"Data Suppliers" sections.

But these are only the largest, best-known companies. There are many, many more.

------
mxk17
There are far too many to name and they each collect data from different
sources - both primary and secondary. Then the data is shared between them
through intermediaries that would form a very long list. This
duplication/redundancy makes it impossible to remove your data completely from
all the owners. Some of the names: Experian, Acxiom, DataLogix, TransUnion,
Innovis, Woodbridge/Thomson Reuters, DNB.

------
ennuihenry
Easy one is they scrape Google for Linkedin bio info.

I'm paranoid that USPS sells your info once you fill out their change of
address form.

~~~
bb611
You're not paranoid, I don't know if money changes hands but my changes of
address have 100% been accompanied by spam snail mail.

I just don't change my mailing address anymore.

~~~
dredmorbius
Money does in fact change hands.

------
finid
Public records, which are accessible to anybody.

So paying to delete your info from one site is useless, because the next site
that somebody sets up will have your info, if they use the same public records
as the others.

------
tgarma1234
Reading this thread I can answer the question because I have worked in this
industry and given/sold data to Intelius specifically.

There are three main sources of ALL of the data:

1\. Acxiom [http://www.acxiom.com/](http://www.acxiom.com/)

2\. Experian [http://www.experian.com/](http://www.experian.com/)

3\. Neustar [https://www.neustar.biz/](https://www.neustar.biz/)

Acxiom got its big start by developing a way to copy phone books in the 90s
and they won a court case that sided with them saying the name and address
information was basically public info. Acxiom aggregates something like 800
different attributes for each named person at each address using third party
vendors and then resells the entire consumer database to list brokers who
often times will add additional detail for smaller subsets of the data. You
can opt out of Acxiom by going here

[http://www.acxiom.com/about-acxiom/privacy/consumer-data-
inf...](http://www.acxiom.com/about-acxiom/privacy/consumer-data-information/)

2\. Experian. Same as above but they have a lot more specific data about you
because you probably fill out forms related to credit and loan applications
correctly. Thus they know all of your previous addresses and they sell that to
companies like Intelius.

3\. Neustar: ditto.

The main thing to keep in mind is that each of those three companies has
slightly different channels through which they aggregate consumer data so your
info comes out a little differently in each database.

Almost any list broker or mailing house or telemarketer that you encounter is
getting their data ultimately from one of those three companies (and in many
cases they would buy data from all three sources).

Finally, a company like [http://www.criteo.com/](http://www.criteo.com/) uses
a process they call "database cookie-ization" to match your online browsing
history (hence interests and business) to those three databases via your email
addresses. So they know what you look at online, where you live, everything
you have ever requested credit for, etc etc.
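To make the email-matching idea concrete, here is a minimal sketch of joining
a browsing profile to an offline record on a normalised, hashed email address.
It is only an illustration: the record fields, the function names, and the
choice of SHA-256 are assumptions made for the example, not a description of
how any company named in this thread actually does it.

```python
import hashlib
from typing import Dict, List


def email_key(email: str) -> str:
    """Normalise an email address and hash it so it can be used as a join key."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def match_profiles(browsing: List[Dict], offline: List[Dict]) -> List[Dict]:
    """Join browsing profiles to offline records that share an email address.

    Both inputs are hypothetical: each record is a dict with an "email" key,
    browsing records carry "interests", offline records carry "address".
    """
    offline_by_key = {email_key(rec["email"]): rec for rec in offline}
    merged = []
    for profile in browsing:
        rec = offline_by_key.get(email_key(profile["email"]))
        if rec is not None:
            merged.append(
                {"interests": profile["interests"], "address": rec["address"]}
            )
    return merged
```

The point of the sketch is that a single shared identifier is enough to link
two otherwise unrelated datasets, which is why using a different email address
per service makes this kind of join harder.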

There are hundreds of smaller companies like these (below) feeding data into
those databases too:

[https://www.hgdata.com/](https://www.hgdata.com/)
[https://www.fullcontact.com/](https://www.fullcontact.com/)
[http://zetaglobal.com/](http://zetaglobal.com/)
[https://www.lotame.com/](https://www.lotame.com/)

------
dbg31415
If Google aggressively delisted shitty companies like this, the problem would
go away.

There's no value for society in these shit services that make people register
and pay to have their profiles taken down.

~~~
ddebernardy
Then again, Google and Facebook are the biggest aggregators among them.
Admittedly, they don't sell raw access to their data.

------
samstave
Lexis-Nexis? TRW/Experian/etc?

[https://www.lexisnexis.com/risk/](https://www.lexisnexis.com/risk/)

------
dredmorbius
Information acquired from a number of sources, including working in the data
industry (a decade or two back), privacy advocacy, working in Web space,
research of my own, stories over beers, legal experiences, etc.

There's a large information-brokerage industry. If you want to find it,
investigating the question from the consumption side (as in: who will sell me
this information) should turn up the larger players, most of whom are already
listed in this thread.
[https://news.ycombinator.com/item?id=13804795](https://news.ycombinator.com/item?id=13804795)

The big players are much of the business: Power laws work here as anywhere
else, and heading off the larger sources is pretty effective.

The value of _individual_ data isn't all that great. Which leads to one of the
major PITAs of this industry: there's a lot of invalid, false, or stale data
around. The incentives to fix it simply don't exist.

The now-defunct Internet Junkbuster used to have a print-your-own set of
letter templates which could be sent to various marketing organisations. Doing
that in the early 2000s dropped my own junk-mail volumes _tremendously_ , and
for years afterward. I suspect SafeShepherd operates somewhat similarly.
Finding and hitting the direct marketing association(s) was a big part of
that.

Putting a fraud hold on your credit reports (TransUnion, Equifax, Experian) is
useful.

Any account-based activities _or activities in which you are specifically
identified_ are fodder for capture. Credit cards, checks (Luddite! ... hang in
there), "loyalty" cards. Gyms and pizza, as noted.

Facebook, which should go without saying. LinkedIn profiles.

Any online information service which has ever been hacked. (For safety, assume
all of them.)

Online purchases. Through both the marketplace _and_ your credit card.

Court and other public records are manually reviewed and entered.

Various school and alumni associations. Organisations such as Classmates.com,
MyLife, etc., front-ended to skip-tracing and similar organisations (info via
direct communications).

Your auto smog testing station. There's an outfit known as ISO, Insurance
Services Office, who has a unit that tracks down odometer mileage data. They
glean that by buying the state smog check data, which is indexed by VIN and
driver's license in most cases. The notion is that miles driven is an excellent
proxy for insurance risk.
[https://en.m.wikipedia.org/wiki/Insurance_Services_Office](https://en.m.wikipedia.org/wiki/Insurance_Services_Office)

The US Post Office NCOA (change of address) form, as noted. File a _temporary_
COA to avoid getting listed.

Used to be you could submit a "pornographic materials" request to the USPO to
have circulars and such removed from your delivery. Though online sources
suggest it's possible to block 3rd class mail (Yahoo answers). That's more an
annoyance than privacy issue.

Magazine subscriptions.

Request of any organisations you do business with that they not share your
information. Use telltales to determine which do (additions to your address,
name, etc.).
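A telltale scheme can be as simple as a lookup table. The sketch below is one
hypothetical way to do it: each organisation gets your address with a unique
"Dept NNN" suffix, and when junk mail arrives you look the suffix up. The
class name, the suffix format, and the sample address are all assumptions
made for the example.

```python
from typing import Dict, Optional


class TelltaleRegistry:
    """Hand out a uniquely tagged address per organisation and trace it back."""

    def __init__(self, base_address: str) -> None:
        self.base_address = base_address
        self._by_tag: Dict[str, str] = {}

    def register(self, organisation: str) -> str:
        """Return the address variant to give to this organisation."""
        tag = f"{len(self._by_tag) + 1:03d}"
        self._by_tag[tag] = organisation
        return f"{self.base_address} Dept {tag}"

    def identify(self, address_on_mail: str) -> Optional[str]:
        """Return the organisation whose tag appears on a piece of mail, if any."""
        prefix = f"{self.base_address} Dept "
        if not address_on_mail.startswith(prefix):
            return None
        return self._by_tag.get(address_on_mail[len(prefix):].strip())
```

For example, `registry = TelltaleRegistry("12 Example St")` followed by
`registry.register("Gym")` gives you the address to hand to the gym, and
`registry.identify(...)` on the address printed on a piece of junk mail tells
you who passed it on. Real mail is often reformatted or upper-cased by the
sender, so a practical version would need looser matching than this exact
prefix check.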

And, _if the state of affairs bothers you, get on your government
representatives to do something about it._ Data are a liability, and there's far
too much of it floating around. The US in particular has taken an
_exceptionally_ piecemeal approach to the problem (video store rental records
are protected, bookstore and pharmacy records are not).

Request comprehensive data privacy regulations, with teeth.

------
yuhong
This reminds me of
[https://www.spamhaus.org/faq/section/Marketing%20FAQs#176](https://www.spamhaus.org/faq/section/Marketing%20FAQs#176)

------
bobzibub
Would it be possible to copyright our names and use DMCA takedown notices?
Just a random thought.

~~~
uiri
Names do not meet the threshold of originality test[0]. Beyond that, your
parents would be the copyright holders until their death when you would inherit
it via their estate.

They are potentially covered under trademark if you have a brand in a specific
industry and others are using your name or a similar name in a way that could
cause consumer confusion.

[0]
[https://en.wikipedia.org/wiki/Threshold_of_originality](https://en.wikipedia.org/wiki/Threshold_of_originality)
unrelated: the threshold for code is usually around 15 lines.

