This is actually almost entirely public data. Yes, including addresses and phone numbers and political affiliation. There are some states where this is not public as part of the voter file, but you can still get it publicly in other ways. For example: USPS, etc. Some states/players will make you sign agreements not to use it for commercial purposes.
The modeling info included is not public.
Acquiring 50 state data can be a bit of a pain, but there are at least two major players that will sell it to you.
(I remember one of them literally laughed when I told them we would want the databases without any personal info included, because we just wanted the address to various political precinct mapping.)
Birthday is an included item. That's definitely private as it is often used to confirm identity.
"almost public" is meaningless. One data item, like credit card number, or birthday, can make this a dangerous leak.
That it's used to confirm identity shows how weak identity-theft protections are at most institutions, not what's public information. (For that matter, mother's maiden name is basically public information as well: you can get it from genealogical records.)
Also, "reselling government data" implies that the government explicitly gave permission to a business selling your data. I doubt that's true. More likely, these entities gathered data from whatever entities they could, probably various private companies from which you've made online purchases.
There's no perfect identifier. That's why we need security. This is a major lapse.
Pretty sneaky of the comment above yours to try to pass dmv.org as a government website.
I've never "consented" to "releasing" my address publicly, but real estate records are public information, so the real estate I own is easily queried from my town's website, and the real estate transaction was published in all the local newspapers. You can even look up my property tax bills on the town's website, along with my payment history for those taxes. If you're feeling generous, you can also pay my property taxes online.
Is the list of doctors you visit private? I could probably discover such information, but it is considered private.
Your address is usually public unless you go to lengths to obfuscate it. You can go to Hollywood and get a map of where all the movie stars live. And, the white pages list your phone number unless you opt out.
Birthday has never been publicly available until this voter data leak. Not good.
Yes, this is explicitly protected by law.
>I could probably discover such information
No you can't get this information, and certainly not legally.
>Your address is usually public
Exactly but I never "gave my consent to release my address," it just is.
>Birthday has never been publicly available until this voter data leak.
Open up a newspaper or open up yourlocalpaper.com. There's a whole birth announcements section!
How do you think the RNC got this information in the first place? Public records. The RNC is NOT the government.
I imagine they used campaign dollars to buy the information from private companies with whom you've done business. Maybe legally, maybe not.
The RNC certainly isn't offering up such information as to where they sourced the data they leaked.
Too bad. The government doesn't care about you consenting to things. In fact, making you do things without your consent is literally the entire point of organized government, even if we usually overlook that because it's for a good purpose (for example, taxation to pay for health care or national defense).
I don't care if you think paying for health care or national security is important or not. That's an unrelated issue to whether or not Birthday is private or public data.
If it's true that that information is out in the wild now, then I expect government to tighten their tech security procedures.
How do you think the RNC got birthdays for 200 million Americans in the first place? Public records. 200 million Americans are NOT affiliated with the RNC.
As I replied elsewhere, there is a market for reselling your information on the internet. In some cases that is legal, and in many it's probably not. As a tech person you should know this.
The solution to all of this data privacy hysteria is to change our approach. The possession of common facts about a person should not be sufficient to masquerade as that person.
I think the solution is better security. We'll only ever have basic facts to uniquely identify people. Biometrics can be hacked/copied too.
> DMV.org is a privately owned website that is not owned or operated by any state government agency.
> "Note: Residence address and SSN are confidential except when the requester is authorized by law to receive it."
>Confidential information is not considered public record. This includes certain DMV personnel matters, physical/mental information, residence address, social security number (SSN), incomplete findings from research, results of ongoing investigations, operation plans, and electronic data security controls.
DOB is listed on ID and not considered confidential information.
Nowhere does it say that. If you think they mean that by omission, I highly doubt it.
You cited California DMV which does not give up that information. They won't even give you someone's address unless you're legally entitled, like the police.
If there are websites out there sharing that private information, they're doing it without my consent.
Is your daily schedule (when you go to work, when you drop off your kids, what route you take to work) protected legally? No, but you probably wouldn't want to share that information publicly either. Yet, in the future, a data breach could reveal such information, and a business could seek to resell it.
> "According to the webmaster, there is a class action lawsuit that is trying to have the site shut down"
The article also argues that many types of workers ought to seek removal of their name from this list in the interest of personal security.
The story is making the case that this should be considered private information.
Lots of things that are actually not private data are used to confirm identity; “it is used to confirm identity” is not a disproof that a piece of information is public information.
I wouldn't. Therefore it is private.
This is a random guy. His name is ERNEST BELL JR and his birthdate is 10/05/1959.
Not sure why you're replying for someone else..
> This is a random guy. His name is ERNEST BELL JR and his birthdate is 10/05/1959.
Inmates obviously lose some rights when convicted. Data security is minor compared to losing the freedom of movement.
Also doesn't surprise me that Florida would make all information on its inmates public.
mylife.com publicly posts birthdays so apparently not
Just because some skeezy website shares all my personal information does not make it public information. It's private to me and I'd rather it remain undistributed, where possible.
Overall point being, this US voter data leak is bad. That appears to have released information on all of us that we consider private.
I kinda think it does mean exactly that. Just because you don't want something to be public doesn't mean that it isn't public.
Nothing about the definition of public or private restrict either to being protected by the government. I can consider some information private without it being protected in the legal sense.
That said, I expect that any government entity that released my birthday would be held responsible for a data leak.
Nope. What's your point?
> "Just because you don't want something to be public doesn't mean that it isn't public."
In the future, your location data could be tracked, shared/hacked, analyzed to produce more information, and breached just like this RNC dataset.
My point is, something private doesn't become public when a criminal exposes it. It's still private to you.
If in the future it is no longer private, then I won't consider it private. I may not be happy about it, but that doesn't change the fact that it is public at that point.
So it is with birthdays. There isn't any government organization who will distribute that information.
If all credit card numbers were stolen and distributed, we would still have considered them private before that.
The same is true for these birthdays. They were private, then stolen and made public. That they are now public doesn't retroactively make the theft okay. Theft is theft.
The lesson here is to tighten security, increase security education and awareness, and increase investigation into the most egregious of these crimes so that violators can be brought to justice.
So taking that modeled data, loading it into Cambridge Analytica (which I understand is somewhat of a DMP in this sense), and leveraging it with highly-customized creative targeted against Custom Audience uploads using this modeled data would be insanely valuable to a political player with the capital to deploy this info weapon.
Should I be able to find the public records for any citizen, anywhere in the country, at the snap of a finger? Or should I have to go to e.g. the courthouse and ask for the records in person, deal with a phone system, &c.
As for "For free", states are generally required by law to give it to you if you ask. Some charge fees.
Only two have crazy fees (5k and 30k) (though if you challenge them, ...etc)
There is a GitHub project I'm aware of to put together all of the data:
 https://transition.fec.gov/pages/brochures/saleuse.shtml#anc... (publication is from 1992, but still accurate per disclaimer at the top)
If the CEO goes to jail, things will change very rapidly (the CEO will manage his CMO much more tightly, and the CMO will first want to see a security audit not older than 6 months).
At least the CEOs I have reported to as CTO were very sensitive to implementation issues in areas that could land them in jail.
Same for every other hack (e.g. Sony) or IT failure (e.g. the British Airways data center crash).
A careless programmer makes a bad choice and the CEO has to go to jail? Come on.
An institutional failure of review, testing and security that will lead to tens of billions of dollars of identity theft goes unpunished completely?
A CEO is responsible for his organization. If you ruin lives, you have to pay the price.
Can't handle the heat?
Don't take the job.
I hate how CEOs get hundred-million-dollar parachutes because the risk and danger and difficulty of such a position warrants such extravagant pay.
But then, when we ask them to be responsible, to bear responsibility for the organization which paid them a hundred million dollars to be responsible, we say "come on"?
CEO's bear responsibility for their organizations, or the organization should not exist. There must be responsibility for private organizations, lest the concept of private organization be nothing more than a cheap trick to remove criminal and civil liability from wrongdoing.
This is all publicly available data scraping stuff. Like your public Facebook profile.
If you don't want that stuff to be leaked, then don't put your info publicly on Facebook.
Then we can stop having this conversation constantly. None of this information is secret; I would put SSN into "quasi-secret" land, since it takes such minimal effort to get at it.
We've been relying on security through obscurity for far too long. If the only thing stopping mass identity theft is someone compiling a list of otherwise public information, it's far beyond time we re-evaluate where the true problem lies.
So yeah, I agree. At some point society is going to actually have to confront this in a useful manner vs. hysterics and patching over an obviously failed system.
Then your info is publicly available for anyone to get. The government will just give it to you.
I reject the argument that those who aggregate vast troves of data about people, publicly available or voluntarily shared though they may be, are exempt from any sort of responsibility for the curation and deployment of said data. Informational asymmetries lead to power imbalances, and sufficiently severe power imbalances lead to oppression.
If corporations face the prospect of a big bill, and the cost of that bill far exceeds the cost of keeping user data safe, a lot of the right things will start to happen.
As in, the company didn't want to distribute this data, so it's a breach, and the person who did that would be guilty of stealing the company's confidential information (i.e. the modelling info), but it seems quite likely that purely (re-)distributing the core data of people's names and addresses doesn't actually violate any US laws at all; US privacy laws (outside of medical data) are very lax compared to e.g. the EU's.
I could imagine that victims of a future identity theft might have a civil claim against the company if/when real losses have occurred, but it's quite possible that if the CEO personally published all this data, filmed all of this, and sent it to the prosecutor's office, no crime (according to current USA privacy laws) could be found there.
CEOs are not an oppressed class groaning under the burden of social structures that keep them locked up in the C-suite. Even if they are confronted with draconian penalties for naive misadventure, most CEOs of medium and large firms can afford A+ legal representation. If you're more worried about them than you are about the potential first and second-order effects upon tens or (in this case) hundreds of millions of people, then you are essentially choosing to be a pawn of the powerful.
Not to make overly sweeping generalizations, but 'hold on, let's think through all the ramifications here instead of being too hasty' is a great way to maintain the status quo while avoiding any responsibility for it. Who benefits? It sure ain't the general public.
To some extent this is a cultural divide; anglo-Saxon capitalism has an unspoken ethic of 'forge ahead, cross bridges when you come to them' while continental European capitalism is far more accommodating of social considerations and has a 'first do no harm' approach. There are upsides and downsides to both approaches - and of course these are very shallow and incomplete characterizations of complex economic and cultural factors, which I have no intention of trying to defend if someone complains about them.
If you break the law, you pay the price of jailtime. If you haven't broken the law, you might pay the price in the marketplace, but that's all.
If a law was broken, then of course whoever broke it should be prosecuted. But I don't think anyone disagrees with that, nor does it need to be explained in a lengthy HN comment. The only reason we need such an explanation is exactly because this isn't the way our society works. We set up the rules of the game and expect people to play within those rules, but we don't go around jailing people because we dislike them or disagree with their choices.
"CEO's bear responsibility for their organizations, or the organization should not exist."
Don't forget that a CEO is an employee, and just one employee. A particularly important and influential (and well-paid) one, but "just" an employee. An organization is not solely defined by its CEO, nor does it make sense to think of them as all-powerful in terms of what the organization does. A CEO who doesn't perform can (and probably will!) be fired at some point.
I'm going to agree with you in wanting to see someone punished for this, but I'm not sure I'm on the side of jail time in the absence of malicious intent.
How have you established that they didn't have a sufficient number of experts? What if they purchased a product or service and it simply didn't work? It's rather harsh to point fingers without having all of the information.
>We'll keep seeing things like this until our laws are such that stewards of data like these have some sort of incentive to protect them.
I think we need to give companies appliance-like products with a simple set of instructions that anyone can follow. Even a simple change where the data is stored in a 'vault' that requires the use of special tools with built-in access controls and auditing would prevent a lot of data breaches. This means you can't email files around or share them on Google Docs or whatever. I'm convinced that people will do the right thing if you make it easy enough for them.
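To make the 'vault' idea a bit more concrete, here is a minimal Python sketch of a store where every read goes through a permission check and lands in an audit log. Everything in it (the `Vault` class, the `acl` layout, the field names) is made up for illustration; it is not how any particular product works, just the shape of the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Vault:
    """Toy data vault: every read is permission-checked and logged."""
    records: dict                      # record_id -> {field name: value}
    acl: dict                          # user -> set of fields that user may read
    audit_log: list = field(default_factory=list)

    def read(self, user, record_id, fields):
        allowed = self.acl.get(user, set())
        denied = [f for f in fields if f not in allowed]
        # Log the attempt whether or not it succeeds.
        self.audit_log.append(
            (datetime.now(timezone.utc), user, record_id, tuple(fields), not denied)
        )
        if denied:
            raise PermissionError(f"{user} may not read {denied}")
        record = self.records[record_id]
        return {f: record[f] for f in fields}


# Hypothetical data: a user who only needs the address-to-precinct mapping.
vault = Vault(
    records={"v1": {"precinct": "12-B", "phone": "555-0100"}},
    acl={"mapper": {"precinct"}},
)
print(vault.read("mapper", "v1", ["precinct"]))
```

The point of the design is that there is no way to get at `records` except through `read`, so the easy path and the audited path are the same path.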
Given that this data was collected for explicitly political purposes with the specific goal of shaping voting behavior - one of the few things in American life where privacy is considered sacrosanct - surely you don't need me to point out the potential for manipulation, exploitation, and intimidation that become available to bad actors in possession of this data.
Are you familiar with the concept of 'strict liability'? Do you have any policy reason why such a standard shouldn't apply in cases like this?
"In criminal law, strict liability is liability for which mens rea (Latin for "guilty mind") does not have to be proven in relation to one or more elements comprising the actus reus (Latin for "guilty act") although intention, recklessness or knowledge may be required in relation to other elements of the offense."
Proving recklessness is harder than you think.
If data is leaked because someone within a company with the appropriate level of access decides to sell out to the dark web, all the security in the world won't protect you. Should the CEO go to jail because an employee turned on the company?
Heartbleed - you could have had 100 security professionals on your team, and you still would have been vulnerable. Should every CEO on the planet go to jail?
Security people do make mistakes and leave keys in places they shouldn't, genuinely by accident. Who is going to jail for this error? If you think you are sending the security personnel to jail, well, we're going to have an exodus of people willing to call themselves security personnel, because no one is paid enough to risk jail for a job.
So, it's not a matter of defending ineptitude, it's a matter of recognizing that the problem is complex, and unless you can have clear boundaries of what is punishable and what is not, you're going to have a bad time enforcing anything that makes a difference. As a security person, I'm sure you know a policy without adequate enforcement is absolutely useless.
tl;dr penal incentives function like a sword of Damocles. As long as we're debating whether and what size of sword to hang from a thread, Damocles has no reason to worry.
So far I've seen people disagree with simply throwing them in jail and people who want more data (was it a screw-up? even security people make mistakes. Did they not have appropriate resources? etc).
I haven't seen anyone say this is OK in any way, shape or form. I see many reasonable discussions.
I'm calling for action - strong user-centric privacy protections with strict liability and significant personal and organization penalties for negligence, similar to the French model.
How's that been working out for you? This isn't a new problem. Where are those educational, training, and social awareness resources? What budgets have been allocated to them? What mechanisms put in place to monitor the effectiveness of the deployment? How many more years of theoretical discussions about ideal solutions should we have before acting, notwithstanding the possibility of error? If your cautious incrementalist approach is so great (and heaven knows I've spent many years thinking and advocating within that framework) why does the problem keep getting worse? How long and to what extent are you willing to wait for this informed public to manifest and (somehow) overcome all the countervailing forces that have economic and political interests in quite different outcomes?
And why, I ask myself, did you respond to my positive proposal about "strong user-centric privacy protections with strict liability and significant personal and organization penalties for negligence" by ignoring it and instead knocking down a straw man of 'criminalization' that I took care to avoid?
You don't want to be the person responsible for taking or advocating for a decision that might work out poorly, fine. But reiterating the reasons for your hesitancy achieves nothing.
Just because there is no information about that in the article, doesn't mean they weren't in place.
> If your cautious incrementalist approach is so great (and heaven knows I've spent many years thinking and advocating within that framework) why does the problem keep getting worse?
How do we know it's getting worse? I work with a LOT of non-technical people, and they are very good at detecting spam emails, not clicking on the fake bluescreen popups, etc., using just their intuition and general awareness. They DO pay attention whenever articles about viruses and hacking and whatnot hit the front page.
>You don't want to be the person responsible for taking or advocating for a decision that might work out poorly, fine. But reiterating the reasons for your hesitancy achieves nothing.
Would you consider flipping it then? Let's also put software developers who introduce security bugs in jail. Oh, but software is so so complex!! A million different pieces working together, and I didn't write all that other code, so how could __I__ possibly be held responsible?! Well, people to people interactions are complex too, and putting a process in place where every person is supposed to follow a protocol is hard too.
Now, with that said, I find this position slightly juxtaposed to the position you appeared to hold on the privacy of citizens when the Snowden leaks happened.
Why would political preferences be more sacrosanct than other preferences or private predilections people hold...
It would be great to define all the data-types a citizen can hold a position on and determine those which the government / entities can gain access to, and those which a citizen can expect privacy with...
And have that as a simple checklist as opposed to hidden in lengthy language of laws?
Being a Euro I personally favor very rigorous privacy protections, and think you should be able to know who has data on you, get detailed copies of it in some accessible format, and request its deletion. Public institutions that do have a custodial data function should be subject to increasing levels of accountability and their powers should not be unlimited.
Now, since the US doesn't currently promulgate such strict data-gathering and retention standards in the public or private sector as I would like, it's a strategic reality that well-resourced actors like foreign governments can vacuum that up for their own ends, whether nefarious or merely curious. So I'm OK with the NSA collecting such data insofar as it seems irrational for the government to put itself at a disadvantage relative to everyone else in the private sector, in the same way that it would be irrational for police officers to have fewer powers than regular people, as opposed to greater responsibility in the exercise of those powers.
In short, if all that data on people can be legally bought or acquired, it'd be pretty stupid for the USA/NSA to be the only entity that didn't have a copy.
I do heartily agree that data aggregation in both public and private sectors is way, way out of control, and I also agree that a checklist approach would be far preferable to yet more books of rules. I have some radical (but inchoate) technical approaches to this problem in mind, if you want to get in touch via gmail.
This isn't a very good example. A shortage of nurses will directly correlate to poor patient care and possible death. But a shortage of security experts? Who knows. I worked at an insurance company that left an access database open to the internet FOR YEARS. We ran analytics when I found it and it was never served from our web server.
So since it didn't get into the public does that mean they were responsible for their security? If the answer is "no" then how would you ever measure these unknowns?
Security is a major problem in tech. It's very difficult, it's nuanced and it's vast. Security covers so much that it would be difficult for one or maybe even a handful of security experts to fully cover all aspects of an app, depending on your scope.
Beyond that, though, mistakes happen. People will screw up. Even security people can screw something up. Throwing someone in jail for a screw-up reminds me of the war on drugs; it's not going to stop someone from making a mistake or simply not realizing an unknown unknown.
Which is why it's so important to hold companies that screw it up accountable. That's the only way to get it to change. Forget about everything else, accountability will force new rules for data storage and protection. Without accountability, nothing will change.
Sure but everyone on HN suggests accountability but never defines what they mean by it except for the few who think someone should just be thrown in jail.
So, what do you suggest for accountability?
At that point it would be my theory that consolidation around best practices, software, security audits... would become the norm. It would raise the cost of a company taking on the responsibility itself so much that they'd rely on others to reduce the cost through volume. It would probably start looking a lot like PCI and credit card co. requirements. The big difference here being that there isn't an industry body responsible, but the government, which would always be political and probably not have enough teeth.
Forty years ago we didn't have this issue because there wasn't so much data for them to try to get their grubby greedy hands on. They don't need our data (ANY OF IT)!
We need laws to protect us from them. Much more and better laws. The fact that the constitution does protect us from gov. is a good argument that our gov. should be active in protecting us from corps. Anyway that is my conjecture.
Where personal data and privacy is concerned, I'd rather err on the side of caution, than the world we live in now.
Like, why is the organization set up so that 1 programmer can make a catastrophic mistake? The CEO is responsible for that.
If the system is set up so that one general can launch a nuclear warhead, then the system is broken.
If the system is set up so that one politician can kill people without a trial, then the system is broken.
If the system is set up so that one nurse can release data on 1 million patients, then the system is broken.
It's not "what happens when a careless programmer does X." but rather "why do we have a system where a careless programmer can do X."
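One common way to build a system where a single careless person can't do X is a two-person rule: large exports only run once someone other than the requester has signed off. A rough Python sketch, where the `TwoPersonExport` class, the user names, and the 1000-row cutoff are all assumptions for illustration:

```python
class TwoPersonExport:
    """Bulk export that runs only after a second person signs off."""

    def __init__(self, requester, row_count, bulk_threshold=1000):
        self.requester = requester
        self.row_count = row_count
        self.bulk_threshold = bulk_threshold  # hypothetical cutoff for "bulk"
        self.approvers = set()

    def approve(self, approver):
        # The requester can never count as their own second pair of eyes.
        if approver == self.requester:
            raise ValueError("requester cannot approve their own export")
        self.approvers.add(approver)

    def can_run(self):
        if self.row_count < self.bulk_threshold:
            return True  # small exports need no sign-off
        return len(self.approvers) >= 1


export = TwoPersonExport(requester="nurse_a", row_count=1_000_000)
print(export.can_run())  # False: nobody else has signed off yet
export.approve("supervisor_b")
print(export.can_run())  # True
```

This doesn't stop two people colluding, but it does turn "one mistake" into "two independent mistakes", which is the property the comments above are asking for.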
There are many (48) different state laws that define what PII is and how organizations (commercial and governmental) are to handle data breach notifications. If you want to see what a crazy patchwork map of laws this is, check out:
These only come into play if a certain minimum number of state residents have had their data compromised and if that data is of a certain class.
Typical classes are:
- Account info
- Financial info
- Health Info
- Health Insurance info
- Biometrics, etc.
And I'm not a lawyer, and we likely don't have all the facts, but at first glance the data released in this breach doesn't meet any of those classifications. It looks pretty much like the data you'd get out of a phone book (name, address, phone number) with a few data points like geocoding and their guess as to your religion and politics.
Which isn't to say that it's great, or that it's not a problem that this was all released, but it is pretty much public data.
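As a rough illustration of the two-part test described above (enough affected residents AND data in a covered class), here is a toy Python check. The class names mirror the list above; the field names and the `min_residents` value of 500 are invented for the example, and real statutes differ state by state, so treat this as a sketch of the logic only, not legal guidance.

```python
# Covered classes, taken from the list above; real statutes define these in far more detail.
PROTECTED_CLASSES = {
    "account_info",
    "financial_info",
    "health_info",
    "health_insurance_info",
    "biometrics",
}


def notification_required(exposed_fields, residents_affected, min_residents):
    """True only if both conditions hold: enough residents and a covered data class."""
    hits = PROTECTED_CLASSES & set(exposed_fields)
    return residents_affected >= min_residents and bool(hits)


# Hypothetical field names for the kind of data described in this breach.
leak_fields = ["name", "address", "phone", "modeled_religion", "modeled_party"]

print(notification_required(leak_fields, 5_000_000, min_residents=500))  # False
print(notification_required(leak_fields + ["health_info"], 5_000_000, min_residents=500))  # True
```

Under these assumed rules the phone-book-style fields never trigger notification no matter how many people are affected, which is the gap being pointed out.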
I know this is USA, but FYI in the EU, all personal data is protected.
I am not sure that jail time is really the thing here, but there are institutional problems if this is something that happens.
That's not how laws work. Laws can be whatever we write them to be. Losing medical and financial records was once not illegal too.
None of the info they had was private info.
If you don't want your info to be leaked then don't make it public.
The size and scope of the data are larger than most voter record data.
The data appear to include proprietary data from various sources who may not agree with the terms of disclosure here.
Even though the RNC is a private organization, it doesn't operate like a normal company. The partnerships that it makes with companies that it awards contracts to are largely relationship-driven, not actually driven by objective analysis of value propositions. Those decisions tend to be made at the COO/CoS (Chief of Staff) level.
There is no special purpose in these two groups of colluding Americans that grants them special rights to gather data on their non-associates any differently than any other member of the public.
What you can share lets companies store and make decisions based on data they shouldn't have and couldn't share as long as their inputs and outputs look clean.
I.e. Facebook or Google could help you intentionally run a race-biased campaign across all their assets as long as they don't tell you any specifics and include a little noise so you can't be sure of any one user's race. All thanks to what they can collect and use, but refuse to share.
They seem like they were doing reddit scraping/Facebook scraping.
Nothing illegal about that.
If the data had been published willingly by Deep Root Analytics, there would be no crime (according to current USA privacy rules) here at all, and no currently valid reason to jail anyone.
Also, can someone ask Troy Hunt whether he has or can get access to this data so he can let us all know if we're on it? (But will it even matter if they don't have an email address field?)
To your latter question, some states, like California, do include email addresses on their voter file, but the coverage tends to be poor.
You can probably assume that, if you're a registered voter in the United States, you're in this dataset, as are your age, gender, party affiliation if applicable, and race/ethnicity info if your state collects that, as well as modeled information projecting things like party support, race, age, and likelihood to turn out.
Effectively two pieces of information when separate, might not be classified, but if they are linked with a third or combined they become classified. Add more and it changes the classification further.
I wonder if there should be something similar for data aggregation companies. Like what we see with HIPAA.
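The "classification by aggregation" idea can be sketched in a few lines of Python: each field has a base level, and certain combinations bump the level of the whole record. The field names, levels, and rules here are all hypothetical, picked only to show the mechanism.

```python
# Hypothetical base sensitivity of each field on its own.
BASE_LEVEL = {"name": 0, "address": 0, "phone": 0, "dob": 1, "modeled_religion": 1}

# Hypothetical rules: field combinations that raise the level when all are present.
COMBINATION_RULES = [
    (frozenset({"name", "address"}), 1),
    (frozenset({"name", "dob"}), 2),
    (frozenset({"name", "address", "modeled_religion"}), 3),
]


def sensitivity(fields):
    """Level of a record: the highest of its fields' base levels and any matching combination."""
    present = set(fields)
    level = max((BASE_LEVEL.get(f, 0) for f in present), default=0)
    for combo, combo_level in COMBINATION_RULES:
        if combo <= present:  # every field in the combination is present
            level = max(level, combo_level)
    return level


print(sensitivity(["address"]))                              # 0
print(sensitivity(["name", "address"]))                      # 1
print(sensitivity(["name", "address", "modeled_religion"]))  # 3
```

The individual fields never change level; only the linkage does, which is why an aggregator could plausibly be held to a different standard than any of its individual sources.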
We just handed everyone, including the 5% of society who tend towards sociopathy, a nicely tagged, collated (and yet probably slightly inaccurate) list of minorities.
Hate women? You have a nice list which includes the names, addresses, and telephone numbers of all those women.
Hate Muslims? Boy do I have just the list for you. Blacks? Republicans? White Muslim men who live in the same neighborhood as you?
Let's omit the sociopaths for just a moment, and let's look at the ad networks. Can you picture how much more accurate a picture those companies have of you now? They no longer have to guess at your age, ethnicity or religion - they now know. What could go wrong when that list of "legally" collated data gets combined with the RNC leak, and is subsequently itself leaked?
So no. It's probably not illegal to compile these lists. It's probably not even illegal that it was released. But it was, for certain, a damned immoral thing to do, and there will be consequences.
Hate women? Look for boobs.
Hate Muslims? Look for brown skin.
Hate Republicans? Look for MAGA hats.
That these are inaccurate signals is irrelevant: haters gonna hate, and I really don't think they care whether the brown-skinned person they're harassing is actually a Muslim, they just want a target for their anger.
As for ad networks - they already have much more accurate models of age, ethnicity, and religion than the RNC has. There's a lot more money involved in targeting ads, and so they've put a lot more effort into it than political consultancies. Worrying about them is like closing the barn door after the horse is out.
It doesn't get much lower effort than downloading a list, popping it into Excel, and sorting on a column (I'm willing to bet that some hate groups are already doing this and will release very specific lists to their membership). And with phone numbers and a couple of bucks, you don't even have to leave your house to send them hate messages by the thousands.
Haters gonna hate - far too flippant a phrase to describe those who emotionally and physically assault their targets.
As for the ad models - the RNC release has a very specific DOB, location, gender, and phone number. Some of these the ad network could guess at, but this provides concrete data.
Look, I don't want to minimize the impact of hate or harassment on victims. It really is terrible, and should be challenged whenever possible. I do want to inject some realism into the discussion of the likely consequences of this breach. The people who would go out and harm other people because of their ethnicity or religion don't particularly care if they get the ethnicity or religion right. (I'm reminded of a time when I was carrying my wife's purse while she went shopping in a nearby store. I walk by a pickup truck, and the guy inside is loudly muttering "Fucking faggots" over and over again. And then I meet my wife in the parking lot, give her a kiss, hand her back her handbag, and he laughs a big "Ha-ha!" of relief and drives off.)
And the people who would do mass harassment over the phone have much easier ways to get this data, like e-mailing their state voter registry and asking for it.
I think your concept of sociopathy qua serial killer is uninformed and stereotypical. Most sociopaths are not stabby weirdos blinded by hatred, they're just self-centred people with different levels of emotional affect/susceptibility than the general population.
It seems not to have occurred to you that (depending on record quality) a database like this would be a great resource with which to find and recruit sociopaths of various stripes.
Someone with the resources to perform ethnic cleansing could easily pony up the few thousand dollars that is required to buy this data in the first place:
Not many years ago, there used to be a book distributed far and wide with the names, addresses, and phone numbers of virtually everyone that lived in your city. This was accepted as normal and routine, and you had to specifically opt out of being included.
You could walk around and find some of them, but that would take a significant amount of time and effort.
Phones were also not capable of interrupting you while you were out of the home, nor were they capable of receiving short messages on your dime.
There is other information besides what is on the voter database in this disclosure, it appears, but the voter data itself is mostly not a secret and can be trivially accessed by any citizen. They just have to promise not to break the law around what they do with it.
"State", "Juriscode", "Jurisname", "CountyFIPS", "MCD", "CNTY", "Town", "Ward", "Precinct", "Ballotbox", "PrecinctName", "NamePrefix", "FirstName", "MiddleName", "LastName", "NameSuffix", "Sex", "BirthYear", "BirthMonth", "BirthDay", "OfficialParty", "StateCalcParty", "RNCCalcParty", "StateVoterID", "JurisdictionVoterID", "LastActiveDate", "RegistrationDate", "VoterStatus", "SelfReportedDemographic", "ModeledEthnicity", "ModeledReligion", "ModeledEthnicGroup", "RegistrationAddr1", "RegistrationAddr2", "RegHouseNum", "RegHouseSfx", "RegStPrefix", "RegStName", "RegStType", "RegstPost", "RegUnitType", "RegUnitNumber", "RegCity", "RegSta", "RegZip5", "RegZip4", "RegLatitude", "RegLongitude", "RegGeocodeLevel", "ChangeOfAddress", "COADate", "COAType", "MailingAddr1", "MailingAddr2", "MailHouseNum", "MailHouseSfx", "MailStPrefix", "MailStName", "MailStType", "MailStPost", "MailUnitType", "MailUnitNumber", "MailCity", "MailSta", "MailZip5", "MailZip4", "MailSortCodeRoute", "MailDeliveryPt", "MailDeliveryPtChkDigit", "MailLineOfTravel", "MailLineOfTravelOrder", "MailDPVStatus", "MADR_LastCleanse", "MADR_LastCOA", "AreaCode", "TelephoneNUm", "TelSourceCode", "TelMatchLevel", "TelReliability", "FTC_DoNotCall"
You could, in the next 10 minutes, go purchase a national voter file with all this info on 190 million plus voters.
Just Google "national voter file purchase".
Those are pre-aggregated, and honestly, really cheap (Usually ~2k).
You can also aggregate it yourself for less if you want.
(not disagreeing, rather I'm seeking information)
> RNC_RegID, State, 2012ObamaVoter_DRA_12_16, 2012RomneyVoter_DRA_12_16, 2016ClintonVoter_DRA_12_16, 2016TrumpVoter_DRA_12_16, AmericaFirstForeignPolicy_agree_DRA_12_16 AmericaFirstForeignPolicy_disagree_DRA_12_16 AutoCompaniesShipJobsOverseas_agree_DRA_12_16 AutoCompaniesShipJobsOverseas_disagree_DRA_12_16 CorpReputs_AmericanMakers_DRA_12_16, CorpReputs_DailyLives_DRA_12_16, CorpReputs_Egalitarians_DRA_12_16, CorpReputs_EnviroConscious_DRA_12_16, CorpReputs_OpportunitySeekers_DRA_12_16, CorpReputs_STEMSupporters_DRA_12_16, CorpReputs_SupplyChainers_DRA_12_16, CorpReputs_Unifers_DRA_12_16, DemLeadersStandUpToTrump_DRA_12_16, DemLeadersWorkWithTrump_DRA_12_16, DParty_DRA_12_16, FinancialServicesHarmful_agree_DRA_12_16 FinancialServicesHarmful_disagree_DRA_12_16 FinServicesCompany_Dreamers_DRA_12_16 FinServicesCompany_RiskMitigators_DRA_12_16 FossilFuelsImportantForUSEnergySecurity_DRA_12_16 FossilFuelsNeedToMoveAwayFrom_DRA_12_16, InvestInfrastructure_agree_DRA_12_16, InvestInfrastructure_disagree_DRA_12_16, LowerTaxes_agree_DRA_12_16, LowerTaxes_disagree_DRA_12_16, NonReluctantDJTVoter_DRA_12_16, NonReluctantHRCVoter_DRA_12_16, PharmaCompsDoGreatDamage_agree_DRA_12_16, PharmaCompsDoGreatDamage_disagree_DRA_12_16, ReformGovtRegulations_agree_DRA_12_16, ReformGovtRegulations_disagree_DRA_12_16, ReluctantDJT_Above.5_DRA_12_16, ReluctantHRCVoter_DRA_12_16, RepealObamacare_agree_DRA_12_16, RepealObamacare_disagree_DRA_12_16 RParty_DRA_12_16, StopIllegalImmigration_agree_DRA_12_16, StopIllegalImmigration_disagree_DRA_12_16, TrumpStandUpToDems_DRA_12_16, TrumpWorkWithDems_DRA_12_16, USAFinancialSituation_Optimistic_DRA_12_16, USAFinancialSituation_Pessimistic_DRA_12
Neal Stephenson wrote a book called Interface which predicted a form of tech-enabled micro-targeted politics over 20 years ago. It was disturbing at the time; it's almost considered business-as-usual now.
I believe American democracy would benefit from including the study of such techniques in our educational curriculum. When I was in school, we studied advertising techniques to help us be skeptical. We need the same for targeted political messages now.
Of course education is great, but look at the vast financial and operational asymmetries between even the most informed individual and well-resourced corporate actors like political parties. I have a super-strong political immune system but being politically engaged and navigating social media is exhausting. For the sake of objectivity I have to systematically expose myself to opinions I find disagreeable lest I retreat into a bubble and be surrounded by confirmation bias, but continuous exposure to countervailing political ideologies is intellectually and morally tiring, given the intense polarization and visceral rhetoric that prevails in today's political discourse.
Despite not liking programming, I've been seriously thinking about building a virtual assistant that I can train to pre-emptively tag people using my peculiar ideological criteria so that I can avoid or at least prepare for certain interactions that I know are going to be psychically difficult. By my value calculus, tuning out of politics is irresponsible at best and suicidal at worst; only communicating with people whose values you share exposes you to confirmation bias, and inevitably exposes one to manipulation; observation of and argumentation with antagonists is psychically expensive and potentially dangerous.
So much as I agree with you on education, it's not something we can just put on the to-do list and wait a generation to benefit from. And that would be true even if we had a well-functioning educational sector rather than one that fails a large number of children and adults by leaving them only semi-literate and -numerate. People who can't read or reckon well are poorly positioned to identify fallacious political discourse.
For policy wonks and activists such as myself, discourse, persuasion, and marketing are distractions from the real work of getting things done.
Firstly, because people vote their identity. Period. Almost no one votes on the facts, the issues, the policy, the platforms, whatever. There are no undecideds, no independents. Cite "Democracy for Realists".
Secondly, victory is achieved by mobilizing your supporters. You bring the heat, whoever is sitting in the chair will see the light.
The only distinction is if a voter is willing or unwilling to bother casting a ballot.
Why is U.S. voter registration made public at the individual name/address level?
Why do the states publish their voter registrations in the first place?
Why should private campaign operations (or anyone else) have access to this data?
Shouldn't voters' privacy be protected by the states?
Nothing that would harm the 2-party system ever changes in the US, and nothing ever will.
In the US, you register with the party, no? So the party you register with has your information, and they can do what you said:
> go to their houses, send them things, and know for sure they're only hitting up people in their party, so as not to mobilize the other side.
You can register to vote and self-identify as 'Dem' 'Republican' etc while getting your driver's license.
Aside from the public voter file update, the Democratic Party doesn't get any special notification if you pick them, and you don't need to apply or get accepted in any way.
My guess is that it's public to avoid unfairly preferencing the major parties, so that they give at least lip service to the idea that "anyone can run for office". The U.S. is pretty sensitive about seeming like it has a fair and open political system, even if other aspects of the system mean that in practice a third party doesn't have a snowball's chance in hell.
(Eligibility which will vary down to the smallest political divisions)
The real danger of data like this, in my opinion, is illegal usage for voter fraud.
Find people who are likely to vote against you and likely to have poor voter registration documents, and remove them from the polls so they can't vote.
Find people who aren't likely to vote at all and vote on their behalf. In-person, the only verification required is name & address. By mail, the only requirement is a signature, which can be obtained from receipts (I assume this is available on black hat markets).
Leaving this S3 bucket as public-read allows for deniable coordination with illegal actors. I can't imagine they did this on purpose but that could be an explanation.
I don't know if it's possible, but I hope the FBI / Mueller team is able to get access logs.
The loss here is all the very expensive extra modeling and demographic work that isn't included on those files. But having that doesn't massively alter the mechanics of the voter fraud effort you're describing.
I agree that it doesn't change the fundamental mechanics, or enable otherwise impossible attacks.
An easy way is to see it first hand, by working as a poll judge or inspector.
I've yet to work a poll (next election) but have gotten to know the system here pretty well through the SF elections commission.
Our system is very far from perfect. Many counties do not even audit ballots after every election (let alone use only paper ballots). Epollbook software can be all over the place. Voter verification at polling places is often quite minimal. The penalty for forging a signature on a mail-in ballot in CA is only $1000 (I was in the room when the state assembly committee voted not to raise the fine to keep up with inflation).
I don't mean to be alarmist - like I said, I don't think these things took place, at least en masse - but it'd be quite naive to suggest there aren't vulnerabilities.
I still encourage you to work or observe poll sites on election day. Soup to nuts. If you work it, you'll get training, see how the Australian Ballot is supposed to work. It requires many hands, eyeballs, proper accounting.
I'm not so worried about identity theft for in-person voting. Just doesn't (didn't) seem to happen much on the west coast.
I vigorously opposed closing our poll sites in favor of all-mail postal balloting (WA state). With ballot scanners and electronic adjudication of ballots (changing records in the database per "voter intent"), it's roughly equivalent to electronic voting machines, with some new vulnerabilities added (e.g. tabulating ballots as they arrive, effectively a pre-count).
As various members of the election verification network (EVN) determined, auditing elections is infeasible, impractical, and does little or nothing to increase confidence or certainty.
The gold standard for our form of elections, which I continue to advocate, is the Australian Ballot. In place of auditing, use physical chain of custody. (As you likely know, election administration is not banking, where they have double entry bookkeeping.)
To truly fix our election integrity woes, we need to do two things.
First, replace our first past the post (FPTP) with a more robust voting system. Like approval voting (for executive races) and proportional representation.
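For anyone unfamiliar with approval voting: each voter marks every candidate they find acceptable, and the candidate with the most approvals wins. A minimal sketch of the tally in Python; the candidate names and the `approval_winner` helper are made up for illustration, and real rules for handling ties vary by jurisdiction:

```python
from collections import Counter

def approval_winner(ballots):
    """Tally approval ballots.

    Each ballot is a collection of the candidates that voter approves of.
    Returns (winners, tally); winners is a sorted list so ties are visible.
    """
    tally = Counter()
    for ballot in ballots:
        # set() ensures a name repeated on one ballot is counted only once
        tally.update(set(ballot))
    if not tally:
        return [], tally
    top = max(tally.values())
    winners = sorted(c for c, n in tally.items() if n == top)
    return winners, tally

# Hypothetical ballots: four voters, three candidates
ballots = [{"A", "B"}, {"B"}, {"B", "C"}, {"A"}]
winners, tally = approval_winner(ballots)
```

Unlike FPTP, a voter who approves of both a minor-party candidate and a major-party one does not split the vote, which is the property being argued for here.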
Second, adopt universal voter registration, with automatic updates. Were our government to use any one of the number of existing demographic databases (Facebook, Seisint, ChoicePoint, NSA, etc.) then we'd know in near real-time who was eligible to vote. And save huge money doing it.
I'm not sure what you mean regarding the EVN. They provide auditing services, don't they? Also recommend auditing paper ballots here:
> Conduct post-election audits before certification of final results
> Without voter-verified paper ballots, effective audits are impossible.
From their top ten list: http://editions.lib.umn.edu/electionacademy/2016/09/08/evns-...
My impression is that SF actually uses a ballot designed for chain of custody accounting, but doesn't use it whatsoever in practice because of the effort involved. I may be wrong on this. But "many hands, eyeballs, proper accounting" is unfortunately not available for our elections in most areas.
Happy to chat more about this - email is in my profile!
Don't make the mistake of thinking that atrocities couldn't possibly happen here just because you're used to thinking of them as something that only happens in other places.
Unless the company involved is sued to bankruptcy and the people involved are prosecuted, sending a strong message to companies dealing with user data, nothing will change. But that's unlikely to happen as this company is backed by the RNC.
While we're on the topic of collecting personal data of people, there's a simple solution: just don't collect it unless it's absolutely necessary. Stop asking me to broadcast my address in my newsletter. Stop asking me to submit my billing address when I make payments online. Stop asking me for my mobile number when I visit a fast food restaurant. Most of the companies that collect this data are not competent enough to keep it secure. The reason companies ask for an address to broadcast in users' newsletters is some anti-spam act which does not prevent the spammers from doing their job. I imagine it's also a requirement for companies to collect a billing address for certain types of online payments. Change the law to remove this poorly thought-out legislation.
More generally, we need regulations on how user data is used by companies. They should not be allowed to store user data indefinitely. If a user closes an account with a company, retain the data for a short period (e.g. 1 year) and then delete the data automatically. Companies should not be allowed to build shadow profiles of users.
The other thought experiment is to adopt the opposite point of view that privacy is overrated. You can adopt various worldviews from 'everything is grey' to 'this is good, that is bad', but punishing someone into caring about the same things you care about is a pretty terrible approach. If you can't convince someone, it doesn't mean they're stupid, it could also mean you just aren't that good at communicating, or that perhaps what you think is important isn't all that important.
>Unless the company involved is sued to bankruptcy and the people involved are prosecuted, sending a strong message to companies dealing with user data, nothing will change.
Or we can give them tools that make it easier to secure data. I've always found that if you make it easy for someone, they almost always end up doing the right thing. As it stands the security products/services domain is a complicated maze where you have to be an expert to evaluate how various products work internally and which services, if any, are worth purchasing.
Gresham's law, availability heuristics, optimism bias, distribution of cognitive skills, various aspects of game theory, and more, strongly suggest this.
Examples: global warming, pollution risks, resource depletion, moral and morale hazard, just off the top of my head.
Some recent discussion (from myself and others) on this G+ thread:
I've discussed Gresham's Law dynamics numerous times at my subreddit/blog. See particularly:
I've been meaning to write up a bit expanding market price dynamics beyond the set of goods that Adam Smith defined: labour, capital, commodities, rents, and (indirectly) interest.
In particular, the question of risk pricing, which is treated almost wholly as a financial question rather than an economic one.
The question of pricing under duress is a key one -- the Backward-'S' bending supply curve is a curious economic anomaly:
Also the behaviour of natural resource stocks under supplier pressure -- the price will fall to the lowest levels possible, and supplied volume will increase, if possible, for a number of highly perverse reasons. The collapse of oil prices following the East Texas oilfield discovery, from ~$1/bbl to first $0.13/bbl, then $0.02/bbl, before wellhead production was seized at force of arms by the Oklahoma and Texas national guard, and Texas rangers, comes to mind.
Want the name, age, gender, home address, mailing address, party of registration, and voter history from every registered voter in North Carolina? Here is the "leak" on Amazon S3. http://dl.ncsbe.gov/index.html?prefix=data/
Except, by leak, I mean, link I got from my state board of elections' homepage.
People getting angry when "government transparency" is supposedly such a good thing no one questions? Go figure...
Do security firms have special permission to do this? Because as a private citizen, I am pretty sure I would go to jail if I tried this.
I.e., if a respected company downloads the data, reviews what horrible things it has and reports it to proper authorities (and gets legal advice before that on how best to do it), then they're very likely to be treated as not having done anything bad;
If I did the same, contacted them asking to fix the vulnerability "or else", and then downloaded the data and published an angry video rant on YouTube, that might land me in trouble, as (expected) intent matters a lot for prosecuting crimes.
You could make the argument that since the information was not protected in any way that you were allowed to download it, but try explaining that to a 65 year old judge who doesn't even comprehend the basic structure of the internet.
source: work at place with large call center. Avoiding DNC fines is one of our top priorities.
secondary source: S.O. works at place with call center for a large bank. Same thing.
Also anecdotally: if you file a complaint with the FTC about an unknown number that keeps calling you back without giving you a chance to "opt out" (this is most scammer numbers), they usually respond to the ticket within 2-3 days. (Another funny thing -- they use Zendesk.) I stopped receiving the calls after filing the report.
As to why they take it so seriously? My guess is it's easy money for them. Kind of like traffic tickets for cops.
The parties take it seriously because they don't want to lose a vote by pissing someone off.
Hypothetically, could one deliberately leak a trove of modelling data with some fake voters inserted, and then monitor the mailbox associated with that fake voter and sue any organization you don't like that sends campaign flyers for using the data without permission?
Genuinely curious: can you really have 198 million rows in a spreadsheet?
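For what it's worth, Excel caps a worksheet at 1,048,576 rows, so a 198-million-row file would not open in a single sheet. Files that size are usually streamed row by row instead. A minimal sketch in Python, assuming a plain CSV with a header row; the function names and the `voters.csv` path are hypothetical, not anything from the leaked files:

```python
import csv

EXCEL_MAX_ROWS = 1_048_576  # per-worksheet row limit in modern Excel (.xlsx)

def count_data_rows(path, encoding="utf-8"):
    """Count data rows in a CSV by streaming it, without loading it into memory."""
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (no error if the file is empty)
        return sum(1 for _ in reader)

def fits_in_one_sheet(n_data_rows):
    """True if the data rows plus one header row fit in a single Excel sheet."""
    return n_data_rows + 1 <= EXCEL_MAX_ROWS

# Usage (hypothetical path):
# n = count_data_rows("voters.csv")
# print(n, fits_in_one_sheet(n))
```

Using `csv.reader` rather than counting newlines matters because quoted fields can contain line breaks.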
>Each file, formatted as a comma separated value (.csv), lists an internal, 32-character alphanumeric “RNC ID”—such as, for example, 530C2598-6EF4-4A56-9A7X-2FCA466FX2E2—used to uniquely identify every potential voter in the database.
It would be good to see him make this a clear case of responsibility. Also, someone on the RNC side needs to get fired, too. I'm not sure who, but errors this big demand it.
The people involved with the decision to start working with Deep Root are mostly not with the RNC anymore. Even if they were, that's simply not how the industry works.
Can you post some names, links? Thanks
Looking for research purposes.
Please re-read what I wrote and consider that I might have the best intentions to explore a discussion with HN. Note also that I discussed the nature of securing the data and not its compilation.
Note that IMO several European countries forfeit some freedoms that I consider valuable and critical for democracy. IMO it is your right to observe the world as it is and note what you observe. Yes, I understand that this right conflicts with privacy and find it an extremely unfortunate consequence, especially given the emerging power of AI. However I think that it's also possible to find people culpable for evil intent of a compilation, though difficult to prove.
So yes it went Godwin quickly, but it is the painful truth.
Note, this is an invitation to enlighten me, not a firm belief on my part.
Yeah yeah, that's not what you are about, but citing your own comfy idealism to avoid dealing with unpleasant realities is exactly the mechanism that political bad actors seek to exploit.
That file will tell me when you voted, if it is a primary which party ballot you pulled, and if you voted in person, absentee, or early in person.
> (1) ... (v)
> Any election official in the State, including any local election official, may obtain immediate electronic access to the information contained in the computerized list.
> (3) Technological security of computerized list
> The appropriate State or local official shall provide adequate technological security measures to prevent the unauthorized access to the computerized list established under this section
Rest assured the government has much better data on everyone already.
Needless to say, our countries have different ideas about which freedoms can and cannot be sacrificed.
Using public data to guess at someone's race isn't, out of context, illegal. It is in some situations, but not this one.
If we didn't have freedom of speech, we'd be just like Russia. The opposition is demonized at best, snuffed out of existence at worst.
>The mere thought of compiling such information here would lead to some hefty jail time these days.
No, not in the United States it wouldn't. Not only is it not illegal, it's standard practice by marketing companies, data brokers, and others.
I would personally be very disturbed and angry if data like that about me existed, let alone leaked to the internet. No matter who has it, that data should not exist. Do people in the US really not feel that way?
Not exactly "my country", but it's implied.
I try to pick my sides based on what seems right, not based on who else picks that side.
(I do not have a side in this particular fight.)
I look at the picker of sides and their motives to determine the consequences of picking a side. I tend to favor protection of people as a whole.
Taken literally, this suggests that you can't work out the consequences without looking at the supporters. If that's what you mean, I have to say it seems weird to me.
Taken less literally: if you think the consequences of one side winning include "people get harmed", you can just say that. You can point at the consequences instead of the supporters.
Plus bad actors are quite good at sounding very good in the abstract while showing their true colors in their actions.
Imagine if there existed similar databases then. Would you hire someone if you knew they were firmly pro-segregation at age 20-something in 1950-something?
Which nation, exactly?
> Imagine if there existed similar databases then. Would you hire someone if you knew they were firmly pro-segregation at age 20-something in 1950-something?
Probably not, no. I would certainly probe very carefully to see whether they had abandoned such a stance, since many aspects of personality and political attitude are formed early in life. If you go back and look at documentary footage from around the time schools were integrated, in the 50s, you can see many youngish people protesting that and waving placards with swastikas and so forth on them (this less than 2 decades after the Nazis were defeated in WW2). A lot of those people held on to those ideas and passed them on to their kids.
- Is this a personal email address?
- Please enter 10 more email addresses that you use (i.e. are associated with this email address) so that we can "narrow down the search results."
Really. How is whether or not this is a personal email address meaningful in searching hacked data for the email address? I'm literally providing you with text to search against a database. You don't need other information.
Also, see this, which may or may not be associated with this thing: https://twitter.com/zackwhittaker/status/876830107230449664
My private collection has a bit over 5B records atm (and presumably largely the same data) but I would never advertise a service with that number as I know for a fact that various databases contain millions of generated entries.
The results for queries like "email@example.com" seem to indicate that you have large amounts of bullshit data included in that figure.
> Along with home addresses, birthdates, and phone numbers, the records include advanced sentiment analyses used by political groups to predict where individual voters fall on hot-button issues such as gun ownership, stem cell research, and the right to abortion, as well as suspected religious affiliation and ethnicity.
In some Western countries, the mere existence of such a database would be illegal. Especially relating to religion and ethnicity.
In the US, it is leaked online.
So, if I look at your post history and make a guess about what religion and ethnicity you might have, and write that down, am I in possession of an illegal database if I lived in e.g. France?
I did take a quick look, and it seems that I have a good idea about your full name, face, hair color, political views and party affiliation, your employer, etc. All this data (together with demographics of others) would also allow me to make some guess about your religious preferences. All of that might be wrong, but it might also be right, so let's assume for the sake of argument that I write my guesses down and happen to have an entry that contains your full name, ethnicity and religion, just like the leaked data has.
Should this be considered a crime?
Does this change depending on how many people I look at?
Does this change depending on how accurate my guesses are?
It's not a "leak" when the data is entirely public already...
However, it isn't a "leak" in the traditional colloquial sense that someone stole the data and released it to the public. It's just a security leak.
It is mostly publicly available data, but not always easily accessible (states have varying requirements and methods of acquisition). Firms go through quite a bit to get aggregate files in all 50 states. For them to be put up with no protection is jarring. But not surprising given other recent disclosures.
I only wish we had access to the files to do some queries across them!
down now LOL! (Mirror available soon)
No. Records released include "names, dates of birth, home addresses, phone numbers, and voter registration details, as well as data described as “modeled” voter ethnicities and religions." 
Dates of Birth, Phone Numbers, Email Addresses are not public information.  
I already mentioned the modeling parts are not public.
The data was amassed from a variety of sources—from the banned subreddit r/fatpeoplehate to American Crossroads, the super PAC co-founded by former White House strategist Karl Rove.
Even if no laws were broken, it may be news to those who are unaware that carveouts to PII disclosure laws have been made and their PII (along with polling data indicating preferences on lots of topics) has been leaked.
Also, "politicians are at the very best so indifferent to citizen-unit privacy concerns that they can't be arsed to hire competent admins" may not exactly be news, but additional data points illustrative of that fact are.
 What can be done with that? Well, at the very least, a PI/con-artist/other party with a motivation to approach you cold would certainly love this sort of profile for anyone they're interested in. Think about it for a minute.
And why/how did they leak?
Fortunately, they don't hold much personal data, but given that they're looking to raise $$$, the fact that they had a security breach is interesting. Especially if they haven't disclosed it.