
Inside the Largest US Voter Data Leak - danso
https://www.upguard.com/breaches/the-rnc-files
======
DannyBee
Speaking as a guy with a lot of experience with voter data (I built the first
"where do I vote" apps for Google and helped found the Voting Information
Project):

This is actually almost entirely public data. Yes, including addresses and
phone numbers and political affiliation. There are some states where it is not
public as part of the voter file, but you can still get it publicly in other
ways. For example: USPS, etc. Some states/players would make you sign
agreements not to use it for commercial purposes.

The modeling info included is not public.

Acquiring 50 state data can be a bit of a pain, but there are at least two
major players that will sell it to you. (I remember one of them literally
laughed when I told them we would want the databases without any personal info
included, because we just wanted the address to various political precinct
mapping.)
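
As a rough illustration of the "address to precinct mapping only" request, here is a minimal Python sketch that reduces a voter-file export to non-personal columns. The column names are borrowed from the header quoted further down this thread; the function name `address_to_precinct` and the sample rows are hypothetical, and a real vendor file would likely be laid out differently.

```python
import csv
import io

# Columns kept for an address -> precinct mapping. The names follow the
# header quoted in this thread; a real vendor file may use other names.
KEEP = ["RegistrationAddr1", "RegCity", "RegSta", "RegZip5", "Precinct", "PrecinctName"]


def address_to_precinct(csv_text):
    """Return de-duplicated rows holding only the KEEP columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    rows = []
    for record in reader:
        row = tuple(record[name] for name in KEEP)
        if row not in seen:  # voters sharing an address collapse to one row
            seen.add(row)
            rows.append(dict(zip(KEEP, row)))
    return rows


# Hypothetical sample: two made-up voters at one made-up address.
SAMPLE = (
    "FirstName,LastName,BirthYear,RegistrationAddr1,RegCity,RegSta,RegZip5,Precinct,PrecinctName\n"
    "Ann,Example,1970,1 Main St,Springfield,XX,00000,12,Ward 1\n"
    "Bob,Example,1968,1 Main St,Springfield,XX,00000,12,Ward 1\n"
)

mapping = address_to_precinct(SAMPLE)
print(mapping)
```

Names and birth years never reach the output, and the two sample voters collapse into a single address row.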

~~~
unityByFreedom
> This is actually almost entirely public data

Birthday is an included item. That's definitely private as it is often used to
confirm identity.

"almost public" is meaningless. One data item, like credit card number, or
birthday, can make this a dangerous leak.

~~~
nostrademons
Birthday is almost certainly public. You can get it easily from the DMV or
from any of several dozen commercial providers that resell government data:

[https://www.dmv.org/public-records/](https://www.dmv.org/public-records/)

That it's used to confirm identity shows how weak identity-theft protections
are at most institutions, not what's public information. (For that matter,
mother's maiden name is basically public information as well: you can get it
from genealogical records.)

~~~
jack_kc
The site you linked to is not affiliated with any state DMV. It links to a
sketchy background check service that appears to be a scam.

Edit: spelling

~~~
williamscales
The link was preceded by the words "from any of several dozen commercial
providers that resell government data:" so that's how I took it. It would be
at a .gov domain if it were state affiliated.

~~~
unityByFreedom
It said from the DMV. No evidence of that here.

Also, "reselling government data" implies that the government explicitly gave
permission to a business selling your data. I doubt that's true. More likely,
these entities gathered data from whatever entities they could, probably
various private companies from which you've made online purchases.

------
_Codemonkeyism
As long as the CEO of a company (RNC) that gives data to an outsourcer (Deep
Root Analytics) is not going to jail for giving data to an unqualified company,
nothing will change.

If the CEO goes to jail, things will change very rapidly (the CEO will manage
his CMO much more tightly, and the CMO will first want to see a security audit
not older than 6 months).

At least the CEOs I have reported to as CTO were very sensitive to
implementation issues in areas that could land them in jail.

Same for every other hack (e.g. Sony) or IT failure (e.g. the British Airways
DC crash).

~~~
strictnein
What law did they break, exactly? These aren't medical or financial records.

A careless programmer makes a bad choice and the CEO has to go to jail? Come
on.

~~~
criley2
> A careless programmer makes a bad choice and the CEO has to go to jail? Come
> on

An institutional failure of review, testing and security that will lead to
tens of billions of dollars of identity theft goes unpunished completely?

Come on.

A CEO is responsible for his organization. If you ruin lives, you have to pay
the price.

Can't handle the heat?

Don't take the job.

I hate how CEOs get hundred million dollar parachutes because the risk and
danger and difficulty of such a position warrants such extravagant pay.

But then we ask them to be responsible, to bear responsibility for the
organization which paid them a hundred million dollars to be responsible, and
we say "come on?"

Utterly ridiculous.

CEOs bear responsibility for their organizations, or the organization should
not exist. There must be responsibility for private organizations, lest the
concept of private organization be nothing more than a cheap trick to remove
criminal and civil liability from wrongdoing.

~~~
stale2002
This isn't social security numbers.

This is all publicly available, scraped data. Like your public Facebook
profile.

If you don't want that stuff to be leaked, then don't put your info publicly
on Facebook.

~~~
wutwutwutwut
I didn't put mine on Facebook. What's step 2?

~~~
stale2002
Are you registered to vote?

Then your info is publicly available for anyone to get. The government will
just give it to you.

------
wyldfire
I hate to be in the position of defending a leak such as this. But if what
they've done is "merely" compiling data that was available from our public
profiles, are they obligated to secure that compilation? I'm asking -- I don't
know for sure how the data was gathered, it just sounds like it was from
scraping public records + public web sites.

Also, can someone ask Troy Hunt whether he has or can get access to this data
so he can let us all know if we're on it? (But will it even matter if they
don't have an email address field?)

~~~
falcolas
Personally, while I understand that this is nothing illegal, I think it's
terribly wrong.

We just handed everyone, including the 5% of society who tend towards
sociopathy, a nicely tagged, collated (and yet probably slightly inaccurate)
list of minorities.

Hate women? You have a nice list which includes the names, addresses, and
telephone numbers of all those women.

Hate Muslims? Boy do I have just the list for you. Blacks? Republicans? White
Muslim men who live in the same neighborhood as you?

Let's omit the sociopaths for just a moment, and let's look at the ad
networks. Can you picture how much more accurate a picture those companies
have of you now? They no longer have to guess at your age, ethnicity or
religion - they now _know_. What could go wrong when _that_ list of "legally"
collated data gets combined with the RNC leak, and is subsequently itself
leaked?

So no. It's probably not illegal to compile these lists. It's probably not
even illegal that it was released. But it was, for certain, a damned immoral
thing to do, and there will be consequences.

~~~
nostrademons
I don't really disagree that this is a shitty data breach, but on the
consequences, I'll point out that the 5% of society who tend toward sociopathy
already have much lower-effort means to target minorities:

Hate women? Look for boobs.

Hate Muslims? Look for brown skin.

Hate Republicans? Look for MAGA hats.

That these are inaccurate signals is irrelevant: haters gonna hate, and I
really don't think they care whether the brown-skinned person they're
harassing is _actually_ a Muslim, they just want a target for their anger.

As for ad networks - they already have much more accurate models of age,
ethnicity, and religion than the RNC has. There's a lot more money involved in
targeting ads, and so they've put a lot more effort into it than political
consultancies. Worrying about them is like closing the barn door after the
horse is out.

~~~
falcolas
> lower-effort means to target minorities

It doesn't get much lower effort than downloading a list, popping it into
Excel, and sorting on a column (I'm willing to bet that some hate groups are
already doing this and will release very specific lists to their membership).
And with phone numbers and a couple of bucks, you don't even have to leave
your house to send them hate messages by the thousands.

Haters gonna hate - far too flippant a phrase to describe those who
emotionally and physically assault their targets.

As for the ad models - the RNC release has a very specific DOB, location,
gender, and phone number. Some of these the ad network could guess at, but
this provides concrete data.

~~~
nostrademons
It gets _a lot_ lower effort than downloading a list, popping it into Excel,
and sorting on a column (really, how many non-techies would do that?). Get on
the subway, find girls with brown skin, start screaming obscenities, pull
knife when confronted, murder.

Look, I don't want to minimize the impact of hate or harassment on victims. It
really is terrible, and should be challenged whenever possible. I _do_ want to
inject some realism into the discussion of the likely consequences of this
breach. The people who would go out and harm other people because of their
ethnicity or religion don't particularly care if they get the ethnicity or
religion right. (I'm reminded of a time when I was carrying my wife's purse
while she went shopping in a nearby store. I walked by a pickup truck, and the
guy inside was loudly muttering "Fucking faggots" over and over again. And then
I met my wife in the parking lot, gave her a kiss, handed her back her handbag,
and he laughed a big "Ha-ha!" of relief and drove off.)

And the people who would do mass harassment over the phone have much easier
ways to get this data, like e-mailing their state voter registry and asking
for it.

------
sschueller
Wow, not good:

    
    
       "State", "Juriscode", "Jurisname", "CountyFIPS", "MCD", "CNTY", "Town", "Ward", "Precinct", "Ballotbox", "PrecinctName", "NamePrefix", "FirstName", "MiddleName", "LastName", "NameSuffix", "Sex", "BirthYear", "BirthMonth", "BirthDay", "OfficialParty", "StateCalcParty", "RNCCalcParty", "StateVoterID", "JurisdictionVoterID", "LastActiveDate", "RegistrationDate", "VoterStatus", "SelfReportedDemographic", "ModeledEthnicity", "ModeledReligion", "ModeledEthnicGroup", "RegistrationAddr1", "RegistrationAddr2", "RegHouseNum", "RegHouseSfx", "RegStPrefix", "RegStName", "RegStType", "RegstPost", "RegUnitType", "RegUnitNumber", "RegCity", "RegSta", "RegZip5", "RegZip4", "RegLatitude", "RegLongitude", "RegGeocodeLevel", "ChangeOfAddress", "COADate", "COAType", "MailingAddr1", "MailingAddr2", "MailHouseNum", "MailHouseSfx", "MailStPrefix", "MailStName", "MailStType", "MailStPost", "MailUnitType", "MailUnitNumber", "MailCity", "MailSta", "MailZip5", "MailZip4", "MailSortCodeRoute", "MailDeliveryPt", "MailDeliveryPtChkDigit", "MailLineOfTravel", "MailLineOfTravelOrder", "MailDPVStatus", "MADR_LastCleanse", "MADR_LastCOA", "AreaCode", "TelephoneNUm", "TelSourceCode", "TelMatchLevel", "TelReliability", "FTC_DoNotCall"

~~~
LeifCarrotson
That does not even nearly fit on one line, so I broke it up. Yeah, it looks
pretty bad. Here's hoping it's only sparsely filled out.

State

Juriscode

Jurisname

CountyFIPS

MCD

CNTY

Town

Ward

Precinct

Ballotbox

PrecinctName

NamePrefix

FirstName

MiddleName

LastName

NameSuffix

Sex

BirthYear

BirthMonth

BirthDay

OfficialParty

StateCalcParty

RNCCalcParty

StateVoterID

JurisdictionVoterID

LastActiveDate

RegistrationDate

VoterStatus

SelfReportedDemographic

ModeledEthnicity

ModeledReligion

ModeledEthnicGroup

RegistrationAddr1

RegistrationAddr2

RegHouseNum

RegHouseSfx

RegStPrefix

RegStName

RegStType

RegstPost

RegUnitType

RegUnitNumber

RegCity

RegSta

RegZip5

RegZip4

RegLatitude

RegLongitude

RegGeocodeLevel

ChangeOfAddress

COADate

COAType

MailingAddr1

MailingAddr2

MailHouseNum

MailHouseSfx

MailStPrefix

MailStName

MailStType

MailStPost

MailUnitType

MailUnitNumber

MailCity

MailSta

MailZip5

MailZip4

MailSortCodeRoute

MailDeliveryPt

MailDeliveryPtChkDigit

MailLineOfTravel

MailLineOfTravelOrder

MailDPVStatus

MADR_LastCleanse

MADR_LastCOA

AreaCode

TelephoneNUm

TelSourceCode

TelMatchLevel

TelReliability

FTC_DoNotCall

~~~
BoiledCabbage
Plus the supposedly "startlingly accurate" preference and views modeling that
is linked to all personal details and publicly accessible. While someone
could always deny that it is correct, it again raises questions about
responsibility in data collection.

> RNC_RegID, State, 2012ObamaVoter_DRA_12_16, 2012RomneyVoter_DRA_12_16,
> 2016ClintonVoter_DRA_12_16, 2016TrumpVoter_DRA_12_16,
> AmericaFirstForeignPolicy_agree_DRA_12_16,
> AmericaFirstForeignPolicy_disagree_DRA_12_16,
> AutoCompaniesShipJobsOverseas_agree_DRA_12_16,
> AutoCompaniesShipJobsOverseas_disagree_DRA_12_16,
> CorpReputs_AmericanMakers_DRA_12_16, CorpReputs_DailyLives_DRA_12_16,
> CorpReputs_Egalitarians_DRA_12_16, CorpReputs_EnviroConscious_DRA_12_16,
> CorpReputs_OpportunitySeekers_DRA_12_16,
> CorpReputs_STEMSupporters_DRA_12_16, CorpReputs_SupplyChainers_DRA_12_16,
> CorpReputs_Unifers_DRA_12_16, DemLeadersStandUpToTrump_DRA_12_16,
> DemLeadersWorkWithTrump_DRA_12_16, DParty_DRA_12_16,
> FinancialServicesHarmful_agree_DRA_12_16,
> FinancialServicesHarmful_disagree_DRA_12_16,
> FinServicesCompany_Dreamers_DRA_12_16,
> FinServicesCompany_RiskMitigators_DRA_12_16,
> FossilFuelsImportantForUSEnergySecurity_DRA_12_16,
> FossilFuelsNeedToMoveAwayFrom_DRA_12_16,
> InvestInfrastructure_agree_DRA_12_16,
> InvestInfrastructure_disagree_DRA_12_16, LowerTaxes_agree_DRA_12_16,
> LowerTaxes_disagree_DRA_12_16, NonReluctantDJTVoter_DRA_12_16,
> NonReluctantHRCVoter_DRA_12_16, PharmaCompsDoGreatDamage_agree_DRA_12_16,
> PharmaCompsDoGreatDamage_disagree_DRA_12_16,
> ReformGovtRegulations_agree_DRA_12_16,
> ReformGovtRegulations_disagree_DRA_12_16, ReluctantDJT_Above.5_DRA_12_16,
> ReluctantHRCVoter_DRA_12_16, RepealObamacare_agree_DRA_12_16,
> RepealObamacare_disagree_DRA_12_16, RParty_DRA_12_16,
> StopIllegalImmigration_agree_DRA_12_16,
> StopIllegalImmigration_disagree_DRA_12_16, TrumpStandUpToDems_DRA_12_16,
> TrumpWorkWithDems_DRA_12_16, USAFinancialSituation_Optimistic_DRA_12_16,
> USAFinancialSituation_Pessimistic_DRA_12

~~~
enzanki_ars
Formatted:

    
    
        RNC_RegID
        
        State
        
        2012ObamaVoter_DRA_12_16
        
        2012RomneyVoter_DRA_12_16
        
        2016ClintonVoter_DRA_12_16
        
        2016TrumpVoter_DRA_12_16
        
        AmericaFirstForeignPolicy_agree_DRA_12_16
        
        AmericaFirstForeignPolicy_disagree_DRA_12_16
        
        AutoCompaniesShipJobsOverseas_agree_DRA_12_16
        
        AutoCompaniesShipJobsOverseas_disagree_DRA_12_16
        
        CorpReputs_AmericanMakers_DRA_12_16
        
        CorpReputs_DailyLives_DRA_12_16
        
        CorpReputs_Egalitarians_DRA_12_16
        
        CorpReputs_EnviroConscious_DRA_12_16
        
        CorpReputs_OpportunitySeekers_DRA_12_16
        
        CorpReputs_STEMSupporters_DRA_12_16
        
        CorpReputs_SupplyChainers_DRA_12_16
        
        CorpReputs_Unifers_DRA_12_16
        
        DemLeadersStandUpToTrump_DRA_12_16
        
        DemLeadersWorkWithTrump_DRA_12_16
        
        DParty_DRA_12_16
        
        FinancialServicesHarmful_agree_DRA_12_16 
        
        FinancialServicesHarmful_disagree_DRA_12_16 
        
        FinServicesCompany_Dreamers_DRA_12_16 
        
        FinServicesCompany_RiskMitigators_DRA_12_16 
        
        FossilFuelsImportantForUSEnergySecurity_DRA_12_16 
        
        FossilFuelsNeedToMoveAwayFrom_DRA_12_16
        
        InvestInfrastructure_agree_DRA_12_16
        
        InvestInfrastructure_disagree_DRA_12_16
        
        LowerTaxes_agree_DRA_12_16
        
        LowerTaxes_disagree_DRA_12_16
        
        NonReluctantDJTVoter_DRA_12_16
        
        NonReluctantHRCVoter_DRA_12_16
        
        PharmaCompsDoGreatDamage_agree_DRA_12_16
        
        PharmaCompsDoGreatDamage_disagree_DRA_12_16
        
        ReformGovtRegulations_agree_DRA_12_16
        
        ReformGovtRegulations_disagree_DRA_12_16
        
        ReluctantDJT_Above.5_DRA_12_16
        
        ReluctantHRCVoter_DRA_12_16
        
        RepealObamacare_agree_DRA_12_16
        
        RepealObamacare_disagree_DRA_12_16 
        
        RParty_DRA_12_16
        
        StopIllegalImmigration_agree_DRA_12_16
        
        StopIllegalImmigration_disagree_DRA_12_16
        
        TrumpStandUpToDems_DRA_12_16
        
        TrumpWorkWithDems_DRA_12_16
        
        USAFinancialSituation_Optimistic_DRA_12_16
        
        USAFinancialSituation_Pessimistic_DRA_12
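
The column names above appear to follow a naming convention: a topic, an optional `_agree`/`_disagree` stance, then a `_DRA_12_16` suffix. A minimal Python sketch of splitting them apart follows. The pattern is inferred from the listing only, the reading of the suffix (vendor plus a date) is a guess, and `split_column` is a hypothetical helper, not anything from the leaked files.

```python
import re

# Assumed convention, inferred from the column listing only:
#   <Topic>[_agree|_disagree]_DRA_<digits>_<digits>
PATTERN = re.compile(
    r"^(?P<topic>.+?)(?:_(?P<stance>agree|disagree))?_DRA_\d+_\d+$"
)


def split_column(name):
    """Return (topic, stance) for a model column, or None if it does not fit."""
    match = PATTERN.match(name.strip())
    if match is None:
        return None  # e.g. RNC_RegID, State, or a truncated name
    return match.group("topic"), match.group("stance")


columns = [
    "State",
    "2016TrumpVoter_DRA_12_16",
    "LowerTaxes_agree_DRA_12_16",
    "LowerTaxes_disagree_DRA_12_16",
    "ReluctantDJT_Above.5_DRA_12_16",
    "USAFinancialSituation_Pessimistic_DRA_12",  # truncated in the post
]

parsed = {name: split_column(name) for name in columns}
for name, result in parsed.items():
    print(name, "->", result)
```

Names that do not match, such as the identifier columns and the truncated last entry, come back as `None` rather than being guessed at.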

~~~
timdavila
How could one tell if a vote is "reluctant" or not, based on the available
data?

~~~
anigbrowl
Lots of ways. Perhaps you know they're a super-loyal Republican voter and you
have things that correlate with that, such as willingness to vote when
presented with an online poll. You might know they were reluctant because the
person shifted between the available alternatives during primary season -
perhaps supporting Jeb! Bush, then Marco Rubio, then Ted Cruz, then John
Kasich, before finally falling into line when Trump won the nomination. You
could then infer that Trump was the absolute last choice but that they would
still 'hold their nose' and vote for him in the general election, either due
to loyalty to GOP on particular policy issues or because of some long-standing
hatred of Hillary Clinton, or just a history of being very conformist in
political matters (as many, many people are).

------
Lagged2Death
_“‘Microtargeting is trying to unravel your political DNA,’ [Gage] said. ‘The
more information I have about you, the better.’ The more information [Gage]
has, the better he can group people into "target clusters" with names such as
‘Flag and Family Republicans’ or ‘Tax and Terrorism Moderates.’ Once a person
is defined, finding the right message from the campaign becomes fairly
simple.”_

Neal Stephenson wrote a book called _Interface_ which predicted a form of
tech-enabled micro-targeted politics over 20 years ago. It was disturbing at
the time; it's almost considered business-as-usual now.

I believe American democracy would benefit from including the study of such
techniques in our educational curriculum. When I was in school, we studied
advertising techniques to help us be skeptical. We need the same for targeted
political messages now.

~~~
anigbrowl
I agree, but citizen education should not imho be the only approach here. I'm
for _much_ more muscular privacy laws and a slightly narrower tolerance on
what's acceptable political speech.

Of course education is great, but look at the vast financial and operational
asymmetries between even the most informed individual and well-resourced
corporate actors like political parties. I have a super-strong political
immune system but being politically engaged and navigating social media is
exhausting. For the sake of objectivity I have to systematically expose myself
to opinions I find disagreeable lest I retreat into a bubble and be surrounded
by confirmation bias, but continuous exposure to countervailing political
ideologies is intellectually and morally tiring, given the intense
polarization and visceral rhetoric that prevails in today's political
discourse.

Despite not liking programming, I've been seriously thinking about building a
virtual assistant that I can train to pre-emptively tag people using my
peculiar ideological criteria so that I can avoid or at least prepare for
certain interactions that I know are going to be psychically difficult. By my
value calculus, tuning out of politics is irresponsible at best and suicidal
at worst; only communicating with people whose values you share exposes you to
confirmation bias, and inevitably exposes one to manipulation; observation
of and argumentation with antagonists is psychically expensive and potentially
dangerous.

So much as I agree with you on education, it's not something we can just put
on the to-do list and wait a generation to benefit from. And that would be
true even if we had a well-functioning educational sector rather than one that
fails a large number of children and adults by leaving them only semi-literate
and -numerate. People who can't read or reckon well are poorly positioned to
identify fallacious political discourse.

~~~
specialist
I revel in my filter bubble and labor to improve it.

For policy wonks and activists such as myself, discourse, persuasion, and
marketing are distractions from the real work of getting things done.

Firstly, because people vote their identity. Period. Almost no one votes on
the facts, the issues, the policy, the platforms, whatever. There are no
undecideds, no independents. Cite "Democracy for Realists".

Secondly, victory is achieved by mobilizing your supporters. You bring the
heat, whoever is sitting in the chair will see the light.

The only distinction is if a voter is willing or unwilling to bother casting a
ballot.

------
pdog
This raises so many questions...

Why is U.S. voter registration made public at the individual name/address
level?

Why do the states publish their voter registrations in the first place?

Why should private campaign operations (or anyone else) have access to this
data?

Shouldn't voters' privacy be protected by the states?

Is there a privacy policy you can review when you register to vote?

~~~
VonGuard
So that the two parties who voted for this to be the case can have unfettered
access to their potential voters, go to their houses, send them things, and
know for sure they're only hitting up people in their party, so as not to
mobilize the other side.

Nothing that would harm the 2-party system ever changes in the US, and nothing
ever will.

~~~
kevindqc
But why does it need to be public?

In the US, you register with the party, no? So the party you register with has
your information, and they can do what you said

> go to their houses, send them things, and know for sure they're only hitting
> up people in their party, so as not to mobilize the other side.

~~~
gregshap
Nope, in the US you register with the state and you can optionally tell the
state that you are a member of some party.

You can register to vote and self-identify as 'Dem' 'Republican' etc while
getting your driver's license.

Aside from the public voter file update, the Democratic Party doesn't get any
special notification if you pick them, and you don't need to apply or get
accepted in any way.

------
rattray
I don't typically don hats with this much tin foil, and I don't think this is
_likely_ , but...

The real danger of data like this, in my opinion, is illegal usage for voter
fraud.

Find people who are likely to vote against you and likely to have poor voter
registration documents, and remove them from the polls so they can't vote.

Find people who aren't likely to vote at all and vote on their behalf.
In person, the only verification required is name & address. By mail, the only
requirement is a signature, which can be obtained from receipts (I assume this
is available on black hat markets).

Leaving this S3 bucket as public-read allows for deniable coordination with
illegal actors. I can't imagine they did this on purpose but that could be an
explanation.

I don't know if it's possible, but I hope the FBI / Mueller team is able to
get access logs.
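
For context on what "public-read" means in practice, here is a minimal Python sketch that inspects an S3 ACL document for grants to everyone. It assumes the dictionary shape that, as far as I know, boto3's `get_bucket_acl` returns (`Grants` entries with a `Grantee` and a `Permission`); the sample ACL and the function name `public_permissions` are made up for illustration, and nothing here reflects the actual bucket's configuration.

```python
# URI that S3 ACLs use for the "everyone on the internet" group.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def public_permissions(acl):
    """Return the sorted permissions that an ACL grants to all users."""
    permissions = set()
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") == ALL_USERS_URI:
            permissions.add(grant.get("Permission", "UNKNOWN"))
    return sorted(permissions)


# Hypothetical ACL: the owner has full control and everyone can read.
SAMPLE_ACL = {
    "Owner": {"ID": "example-owner-id"},
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
            "Permission": "READ",
        },
    ],
}

exposed = public_permissions(SAMPLE_ACL)
print(exposed)
```

This covers ACL grants only. A bucket can also be exposed through a bucket policy, so an empty result here would not prove a bucket is private.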

~~~
rwc
No. The data that would be useful for wide-scale voter fraud is already widely
available from public/free sources, including state Secretaries of State or
Departments of Elections.

The loss here is all the very expensive extra modeling and demographic work
that isn't included on those files. But having that doesn't massively alter
the mechanics of the voter fraud effort you're describing.

~~~
rattray
The expensive modeling (and data collection) makes it much cheaper and more
feasible.

I agree that it doesn't change the fundamental mechanics, or enable otherwise
impossible attacks.

------
ploggingdev
I have this theory that the only way regular people will start caring about
privacy breaches such as this one is to use that data against them in a
malicious way. Tell the average Joe that the data of all US voters has been
leaked, "Hmmm. That's bad." and they move on with their lives as if nothing
happened. Instead, if this data is used to impersonate the average Joe on
social media or if it's used to trick their mobile carrier into porting out
their number, then they'll take notice. (I am _not_ suggesting people do this,
it was just part of a thought experiment)

Unless the company involved is sued to bankruptcy and the people involved are
prosecuted, sending a strong message to companies dealing with user data,
nothing will change. But that's unlikely to happen as this company is backed
by the RNC.

While we're on the topic of collecting personal data of people, there's a
simple solution: just don't collect it unless it's absolutely necessary. Stop
asking me to broadcast my address in my newsletter. Stop asking me to submit
my billing address when I make payments online. Stop asking me for my mobile
number when I visit a fast food restaurant. Most of the companies that collect
this data are not competent enough to keep it secure. The reason companies ask
for an address to broadcast in users' newsletters is some anti-spam act which
does not prevent the spammers from doing their job. I imagine it's also a
requirement for companies to collect a billing address for certain types of
online payments. Change the law to remove these poorly thought-out
requirements.

More generally, we need regulations on how user data is used by companies.
They should not be allowed to store user data indefinitely. If a user closes
an account with a company, retain the data for a short period (e.g. 1 year) and
then delete the data automatically. Companies should not be allowed to build
shadow profiles of users.
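
The retention rule in the previous paragraph could be sketched roughly as follows. This is a minimal Python illustration, assuming each account record carries a `closed_at` timestamp (or `None` while open); the record layout, the function name `purge_closed_accounts`, and the 365-day window are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# "A short period (e.g. 1 year)" from the comment, taken here as 365 days.
RETENTION = timedelta(days=365)


def purge_closed_accounts(accounts, now):
    """Keep open accounts and accounts closed less than RETENTION ago."""
    return [
        account
        for account in accounts
        if account["closed_at"] is None or now - account["closed_at"] < RETENTION
    ]


# Hypothetical records: one open, one recently closed, one closed long ago.
accounts = [
    {"id": 1, "closed_at": None},
    {"id": 2, "closed_at": datetime(2018, 1, 1)},
    {"id": 3, "closed_at": datetime(2017, 1, 1)},
]

kept = purge_closed_accounts(accounts, now=datetime(2018, 6, 1))
print([account["id"] for account in kept])
```

Passing `now` in explicitly keeps the rule easy to test; a scheduled job would supply the current time.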

~~~
ksk
>I have this theory that the only way regular people will start caring about
privacy breaches such as this one is to use that data against them in a
malicious way

The other thought experiment is to adopt the opposite point of view that
privacy is overrated. You can adopt various worldviews from 'everything is
grey' to 'this is good, that is bad' but punishing someone into caring about
the same things you care about is a pretty terrible approach. If you can't convince
someone, it doesn't mean they're stupid, it could also mean you just aren't
that good at communicating, or that perhaps what you think is important isn't
all that important.

>Unless the company involved is sued to bankruptcy and the people involved are
prosecuted, sending a strong message to companies dealing with user data,
nothing will change.

Or we can give them tools that make it easier to secure data. I've always
found that if you make it easy for someone, they almost always end up doing
the right thing. As it stands the security products/services domain is a
complicated maze where you have to be an expert to evaluate how various
products work internally and which services, if any, are worth purchasing.

~~~
dredmorbius
There's a pretty compelling argument though, that people _aren't_ good at
making long-term assessments of diffuse, but potent, risks, and/or are willing
to, or can be coerced into, arbitraging long-term interests with short-term
exigency.

Gresham's law, availability heuristics, optimism bias, distribution of
cognitive skills, various aspects of game theory, and more, strongly suggest
this.

Examples: global warming, pollution risks, resource depletion, moral and
morale hazard, just off the top of my head.

~~~
ksk
Yeah, and all those biases apply to people on HN too - You (the proverbial)
aren't good at making assessments either. Also, flipping it for argument's
sake, how come software companies are never penalized for introducing software
bugs? How would you like it if an accountant wanted to send you to jail
because you introduced a security bug and their critical data got wiped out
and they went out of business? Or should we go back to blaming them for not
having backups because we have an a-priori assumption that "shit happens" when
it comes to software? Well, the other side could say 'shit happens' too.

~~~
dredmorbius
Those are actually questions I'm actively exploring.

Some recent discussion (from myself and others) on this G+ thread:

[https://plus.google.com/+YonatanZunger/posts/267XsJzKM5B](https://plus.google.com/+YonatanZunger/posts/267XsJzKM5B)

I've discussed Gresham's Law dynamics numerous times at my subreddit/blog. See
particularly:

[https://www.reddit.com/r/dredmorbius/comments/2h63fp/is_gres...](https://www.reddit.com/r/dredmorbius/comments/2h63fp/is_greshams_law_a_special_instance_of_a_more/)

I've been meaning to write up a bit expanding market price dynamics beyond the
set of goods that Adam Smith defined: labour, capital, commodities, rents, and
(indirectly) interest.

In particular, the question of _risk pricing_ , which is treated almost wholly
as a _financial_ question rather than an economic one.

The question of pricing under duress is a key one -- the Backward-'S' bending
supply curve is a curious economic anomaly:

[https://www.reddit.com/r/dredmorbius/comments/53mcxn/backwar...](https://www.reddit.com/r/dredmorbius/comments/53mcxn/backward_s_shaped_supply_curves/)

Also the behaviour of natural resource stocks under supplier pressure -- the
price will fall to the lowest levels possible, and supplied volume will
_increase_ , if possible, for a number of highly perverse reasons. The
collapse of oil prices following the East Texas oilfield discovery, from
~$1/bbl to first $0.13/bbl, then $0.02/bbl, before wellhead production was
seized at force of arms by the Oklahoma and Texas national guard, and Texas
Rangers, comes to mind.

[https://www.reddit.com/r/dredmorbius/comments/6etqey/oil_is_...](https://www.reddit.com/r/dredmorbius/comments/6etqey/oil_is_not_a_drug_oil_dependence_is_not_an/)

------
Taniwha
I think people are missing the big legal liability .... this information has
been published to the world, it contains estimates of people's deeply held
political beliefs - some of it will be wildly wrong and those people might
consider that they have been libeled .... roll on the lawsuits

~~~
dragonwriter
Libel, in the US, requires the publisher either to know the claim is false or
to publish it with reckless disregard for the truth when it is, in fact,
false; I don't think that there is any way that this could be construed to meet
that, no matter what legal theory as to who the liable publisher is you use.

------
opensourcenews
I still don't understand the concern.

Want the name, age, gender, home address, mailing address, party of
registration, and voter history from every registered voter in North Carolina?
Here is the "leak" on Amazon S3.
[http://dl.ncsbe.gov/index.html?prefix=data/](http://dl.ncsbe.gov/index.html?prefix=data/)

Except, by leak, I mean, link I got from my state board of elections'
homepage.

------
jurry289
Related: This legally mandated "leak" happened years ago, and it even included
signatures. Most of the citizens pushing for the recall of a sitting
Republican governor were Democrats, so it seemed like punishment for a
Republican state assembly to pass this one-off policy. Especially for those
who have zero internet presence. The first thing many employers see when they
search for politically active Democrats is this information, paired with
"quick searches" of their names on criminal, pedophile and dangerous persons
databases. If you want to disincentivize political action, this is how you do
it. I won't link to that site, but can confirm it's still up.

[http://www.cnn.com/2012/02/01/politics/wisconsin-recall-peti...](http://www.cnn.com/2012/02/01/politics/wisconsin-recall-petition/index.html)

------
gruez
Are the leaked files floating around on the internet, or were they able to
shut it down before anyone else got to it?

~~~
strictnein
I've seen no evidence that anyone but the security researcher found this, but
maybe that'll change.

------
mlindner
Except this isn't a leak. This is publicly searchable information. I don't
know why people are blowing this out of proportion.

~~~
lr4444lr
Because technologists don't know everything about laws and society, even
though they like to think they do. You might add that not only is it
searchable, but the Freedom of Information Act and its derivative
implementations in state laws mandate that it be made accessible in this way
for no cost other than the expense in compiling the records.

People getting angry when "government transparency" is supposedly such a good
thing no one questions? Go figure...

------
vinhboy
> It would ultimately take days, from June 12th to June 14th, for Vickery to
> download 1.1 TB of publicly accessible files

Do security firms have special permission to do this? Because as a private
citizen, I am pretty sure I would go to jail if I tried this.

~~~
mrpoptart
IANAL, but I'm pretty sure if someone leaves a Top-Secret document on the
ground, and you pick it up, you don't go to jail for that. The data is
accessible to the public -- this isn't a hack, this is just downloading.

~~~
creepydata
The only people who can go to jail for mishandling classified information are
people with security clearances.

------
kjhughes
Being discussed here:

[https://news.ycombinator.com/item?id=14586920](https://news.ycombinator.com/item?id=14586920)

~~~
wyldfire
Great to merge the discussion but IMO the Upguard article has the
original/superior details.

~~~
kjhughes
Agreed -- the Upguard article is a better topic link for this.

------
dsfyu404ed
If you're a glass half full person at least they care about the do not call
list enough to give it its own column.

~~~
treehau5
They take the DNC list and subsequent enforcement _very_ seriously.
[https://www.ftc.gov/news-events/media-resources/do-not-
call-...](https://www.ftc.gov/news-events/media-resources/do-not-call-
registry/enforcement)

source: work at place with large call center. Avoiding DNC fines is one of our
top priorities.

secondary source: S.O. works at place with call center for a large bank. Same
thing.

Also anecdotally: If you file a complaint with the FTC about an unknown number
that keeps calling you back without giving you a chance to "opt out" (this is
most scammer numbers), they usually respond to the ticket within 2-3 days.
(Another funny thing -- they use Zendesk.) I stopped receiving the calls after
filing the report.

As to why they take it so seriously? My guess is it's easy money for them.
Kind of like traffic tickets for cops.

~~~
maxerickson
Political calls are exempt from the DNC list:

[https://www.donotcall.gov/faq/faqbusiness.aspx#who](https://www.donotcall.gov/faq/faqbusiness.aspx#who)

The parties take it seriously because they don't want to lose a vote by
pissing someone off.

------
azinman2
Is this available online in a searchable way? I want to see what it has on
me...

------
elihu
What are the legal implications of a campaign using this leaked modelling
data? Would it be breaking any existing laws to use data that was leaked to
the public without the permission of the original company that did the
modelling? (If nothing else, I suppose it's probably a copyright violation.)

Hypothetically, could one deliberately leak a trove of modelling data with
some fake voters inserted, and then monitor the mailbox associated with that
fake voter and sue any organization you don't like that sends campaign flyers
for using the data without permission?

------
fulldecent
TLDR. Was any of the information non-public?

~~~
political_tech
The results of data science models are contained in the data dump, and those
are non-public. The rest of the information is accessible via public records
or registries maintained by states.

------
apeace
> Spreadsheets containing this accumulated data—last updated around the
> January 2017 presidential inauguration—constitute a treasure trove of
> political data and modeled preferences used by the Trump campaign

Genuinely curious: can you really have 198 million rows in a spreadsheet?

~~~
azinman2
It’s probably a CSV, not meant to be used by Excel.

~~~
nothis
Correct:

> Each file, formatted as a comma separated value (.csv), lists an internal,
> 32-character alphanumeric “RNC ID”—such as, for example,
> 530C2598-6EF4-4A56-9A7X-2FCA466FX2E2—used to uniquely identify every potential
> voter in the database.
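
For what it's worth, the 198-million-row question mostly goes away once the files are treated as plain CSV rather than spreadsheets: a worksheet in current Excel tops out at 1,048,576 rows, but a CSV can be streamed one record at a time. A minimal Python sketch, assuming each file has a header row; the column names and the in-memory sample are made up for illustration and are not taken from the leaked files:

```python
import csv
import io

# Per-worksheet row limit in current versions of Excel.
EXCEL_MAX_ROWS = 1_048_576


def count_rows(lines):
    """Stream CSV records one at a time and count them.

    Only one record is held in memory at once, so the row count
    is not bounded by what a spreadsheet program can open.
    """
    reader = csv.reader(lines)
    header = next(reader, None)       # first record is assumed to be a header
    return header, sum(1 for _ in reader)


# Tiny in-memory stand-in for one file; column names are hypothetical.
sample = io.StringIO("rnc_id,state\nid-1,NC\nid-2,GA\n")
header, n = count_rows(sample)
print(header, n, n > EXCEL_MAX_ROWS)
```

Swapping the `io.StringIO` sample for `open(path, newline="")` would do the same over a real file without loading it into memory.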

------
MR4D
I'm just waiting to hear Trump tell Deep Root Analytics, "You're Fired!"

It would be good to see him make this a clear case of responsibility. Also,
someone on the RNC side needs to get fired, too. I'm not sure who, but errors
this big demand it.

~~~
political_tech
Trump only has "signaling authority" over the entities with whom the RNC
contracts; he can't actually force a contract termination, but he can express
distrust in an entity and encourage the RNC to terminate a contract. I don't
see that happening, for a myriad of reasons.

The people involved with the decision to start working with Deep Root are
mostly not with the RNC anymore. Even if they were, that's simply not how the
industry works.

------
goffchris
Is there any way to access the Deep Root data to see what it says about me?
Has anybody posted it online in a searchable format?

------
nothis
Doesn't this kinda make many (flawed but still in use) "security measures"
incredibly vulnerable to social engineering? That's _all_ the birth dates and
phone numbers. That's crazy!

------
mgleason_3
Is StateVoterID the same thing as your Social Security Number (SSN)?

~~~
ericcumbee
Not in Georgia, and I seriously doubt anywhere else either.

------
LinuxBender
If I jokingly suggested to the gov that AWS S3 be categorized by all firewall
vendors as "Anonymous File Sharing", would I get a yuge response?

------
SunnyCanuck
Has this been re-leaked onto the torrents yet?

~~~
reallydattrue
Can you clarify, re-leaked? I've never seen this online.

Can you post some names, links? Thanks

Looking for research purposes.

------
DelaneyM
Sounds like one heck of an AI training set.

DeepTrump?

------
arianvanp
So you think it is okay to model ethnicity and religion of 200 million
citizens? This is how my country got all their Jews killed in WWII [0]. The
mere thought of compiling such information here would lead to some hefty jail
time these days. I really cannot believe you can even defend such a thing.

[0]
[https://en.wikipedia.org/wiki/Civil_registration#Netherlands](https://en.wikipedia.org/wiki/Civil_registration#Netherlands)

~~~
wyldfire
Whoa! We got to Godwin super fast this time.

Please re-read what I wrote and consider that I might have the best intentions
to explore a discussion with HN. Note also that I discussed the nature of
securing the data and not its compilation.

Note that IMO several European countries forfeit some freedoms that I consider
valuable and critical for democracy. IMO it is your right to observe the world
as it is and note what you observe. Yes, I understand that this right
conflicts with privacy and find it an extremely unfortunate consequence,
especially given the emerging power of AI. However I think that it's also
possible to find people culpable for the evil intent of a compilation, though
it is difficult to prove.

~~~
arianvanp
It's not an attack on you personally. On the contrary. But I noticed that a
lot of comments found this kind of data collection 'normal', and apparently in
the US this is very much legal. And I as a person find this very troubling,
because in Europe these kinds of decisions caused a genocide, which is a result
that should not be taken lightly. Do you really want the RNC to collect
religious information about all 200 million voters? What happens if this
information falls into the wrong hands (in our case it was the Nazis)? You just
handed someone a potential 200 million long hit list.

So yes it went Godwin quickly, but it is the painful truth.

~~~
bryondowd
I'm no expert on the subject, so feel free to set me straight. It seems to me,
though, that this is a bit like cargo culting. An association between some
part of a process and the result that doesn't factor in the big picture. More
specifically, wouldn't the presence of such lists have proven a minor and
nonessential part of the genocide? If the Nazis didn't have those lists, would
they have just thrown up their hands and gone away? It seems they'd have just
gathered the information by the usual 'everyone accusing his neighbor' witch-
hunt means, with people using it to settle grudges or other nefarious reasons.
Maybe it would be less accurate, and maybe a little slower, but ultimately,
you'd probably have about the same amount of bloodshed, I would expect. The
real keys to these situations are a powerful group with ill intent and a
population that is at least willing to condone it. If those are in place, the
list doesn't really matter, and if they aren't in place, the list doesn't
really matter. So why does the list matter?

Note, this is an invitation to enlighten me, not a firm belief on my part.

------
rootsudo
In other news, already known information. Every US state has clear laws on
releasing who's registered with which party, past affiliation and address.

Clickbait.

~~~
317070
From the article:

> Along with home addresses, birthdates, and phone numbers, the records
> include advanced sentiment analyses used by political groups to predict
> where individual voters fall on hot-button issues such as gun ownership,
> stem cell research, and the right to abortion, as well as suspected
> religious affiliation and ethnicity.

In some Western countries, the mere existence of such a database would be
illegal. Especially relating to religion and ethnicity.
[http://www.newyorker.com/news/news-desk/can-the-french-
talk-...](http://www.newyorker.com/news/news-desk/can-the-french-talk-about-
race)

In the US, it is leaked online.

~~~
ryanlol
>In the US, it is leaked online.

It's not a "leak" when the data is entirely public already...

~~~
kakarot
It is a "leak" when additional data that was previously private, such as
modeling and analytic information, gets published against the will of its
owners.

However, it isn't a "leak" in the traditional colloquial sense that someone
stole the data and released it to the public. It's just a security leak.

------
80211
The reddit posts are odd. What value do they have?

And why/how did they leak?

Fortunately, they don't hold much personal data, but given that they're
looking to raise $$$, the fact that they had a security breach is interesting.
Especially if they haven't disclosed it.

