Hacker News
An Electronic Voting Firm Exposes 1.8M Chicagoans (upguard.com)
173 points by mcone on Aug 18, 2017 | 73 comments



Source blog post (and free of CNN's obnoxious autoplay video): https://www.upguard.com/breaches/cloud-leak-chicago-voters

As soon as I read the headline, I immediately thought "AWS misconfiguration". A few recent massive government-data breaches (by contractors) have fallen into that category:

June 2017: http://gizmodo.com/gop-data-firm-accidentally-leaks-personal...

May 2017: http://gizmodo.com/top-defense-contractor-left-sensitive-pen...

Note that all of these breach reports (including this Chicago one) come from Upguard, which seems to have a method for scanning/crawling public S3 buckets.


Amazon just launched a service to help scan, categorize, and protect data https://aws.amazon.com/macie/


This looks like it only works for buckets you own. Upguard is scanning everyone.


Don't you have to pay for that?


One click away: https://aws.amazon.com/macie/pricing/

"No charge for the first 1 GB processed by the content classification engine After first GB, $5 per GB processed by the content classification engine"


Wow that seems pricey. $5 just to look over a Gig of data with a fancy algorithm?

...data that's already on their servers, to boot!
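
For a sense of scale, here is a rough sketch of what the quoted pricing works out to. It assumes only the two figures quoted above (first 1 GB free, then $5 per GB processed); the function name and defaults are my own, and current pricing may differ.

```python
def macie_classification_cost(gb_processed: float,
                              free_gb: float = 1.0,
                              usd_per_gb: float = 5.0) -> float:
    """Estimate the content-classification charge in USD.

    Assumes the pricing quoted in the thread: the first `free_gb`
    gigabytes are free, every gigabyte after that costs `usd_per_gb`.
    """
    billable_gb = max(0.0, gb_processed - free_gb)
    return billable_gb * usd_per_gb


if __name__ == "__main__":
    for size in (0.5, 1, 101, 1025):
        print(f"{size} GB -> ${macie_classification_cost(size):,.2f}")
```

Under those assumptions a 1 TB bucket (1,024 GB) would cost a bit over $5,000 to classify once.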


Thanks! We've updated the link from http://money.cnn.com/2017/08/17/technology/business/chicago-..., which points to this.


Same with Nice Systems' leak of Verizon customers' data:

https://www.engadget.com/2017/07/12/verizon-partner-exposes-...


I made a quick-and-dirty tool for doing this: https://github.com/sa7mon/S3Scanner

I'll probably spend some time this weekend making things look better and improving the documentation. I made it mostly as a PoC
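
For anyone curious about the basic idea, here is a minimal sketch of how a bucket check can work. It is not taken from S3Scanner or from whatever Upguard runs; it only assumes the usual S3 behavior where an unauthenticated request to a bucket's URL returns 200 if listing is public, 403 if the bucket exists but access is denied, and 404 if there is no such bucket. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def bucket_url(name: str) -> str:
    """Build the virtual-hosted-style URL for a bucket name."""
    return f"https://{name}.s3.amazonaws.com/"


def classify_bucket_status(status_code: int) -> str:
    """Map an HTTP status from an anonymous bucket request to a verdict."""
    if status_code == 200:
        return "public: anonymous listing allowed"
    if status_code == 403:
        return "exists, but access denied"
    if status_code == 404:
        return "no such bucket"
    return "unknown"


def probe(name: str, timeout: float = 5.0) -> str:
    """Send one anonymous GET and classify the result (needs network)."""
    try:
        with urllib.request.urlopen(bucket_url(name), timeout=timeout) as resp:
            return classify_bucket_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still available.
        return classify_bucket_status(err.code)
```

Only probe buckets you own or are authorized to test.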


Isn't this considered public data anyways? Illinois (and I believe every other US state) requires that certain voter data be publicly accessible. To access it in bulk, you'll have to pay a small fee, but anyone can get this.

A misconfigured AWS instance is always an issue. I'm not trying to downplay that. Only that this data being released to the public isn't anything new - the public already had access to it.

https://www.elections.il.gov/votinginformation/computerizedv...


No. The Chicago Tribune [0] reported on the type of data exposed:

> The files included names, addresses, dates of birth, the last four digits of many voters' Social Security numbers, driver's license and state ID numbers for the 1.8 million who are registered to vote in Chicago.

[0] http://www.chicagotribune.com/news/local/politics/ct-chicago...


Well, there's an unpleasant reminder of why knowledge-based authentication should never be based on something immutable.

How many services do all of us use that accept name/birthdate/SSN as identification? How many other services, like phone companies, claim not to but would still yield for someone who sounded earnest and knew all of that?

And what can the leak victims possibly do? Two-factor authentication is great where you can get it, but it's not universal, and none of this information can be refreshed.


So the question becomes: what types of data are available through legal channels?


According to Forbes [0]:

- Name

- Street address

- Party affiliation

- Elections in which you did (or did not) vote

- Phone number

- Email address

[0] https://www.forbes.com/sites/metabrown/2015/12/28/voter-data...


Last four of social is so abused it shouldn't count, and date of birth is in nearly every company's loyalty database. That leaves drivers license and state ID number as the leaked data. I'm honestly not sure how important or secure those are.


Illinois is one of the states where driver's license numbers are computed from all the other information: http://www.highprogrammer.com/alan/numbers/dl_us_shared.html
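
As I read the linked page, the first block of an Illinois number is the Soundex code of the last name; the other blocks (first name, middle initial, birth date, sex) come from lookup tables that I won't try to reproduce here. Below is a minimal sketch of standard American Soundex only, not the full license-number calculation. The function name is my own.

```python
# Consonant groups used by standard American Soundex.
_SOUNDEX_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for a name."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate same-coded consonants
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code counts again
    return (result + "000")[:4]


if __name__ == "__main__":
    for surname in ("Robert", "Ashcraft", "Pfister"):
        print(surname, soundex(surname))
```

The point stands either way: if every input is something a leak already contains, the number itself adds no secrecy.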


wow. I had no idea about this, but it correctly calculated my DL number.


> Last four of social is so abused it shouldn't count

Yet it does. Almost every single business/government service in America uses DoB + last 4 SSN to identify you. The two together make fraud trivial.


Exactly. Every leak already has it. Every company already has it. The fact that fraud is trivial is already true, and this leak really adds little to it.


Can I have the last four numbers of your social security number?


-


OK, now would you be so kind as to pretend that we've been leaked your birth year and state of birth?

Before you answer, you may want to poke your answers into this site and have a look at the outcome: https://www.ssn-check.org/lookup/

Caveat: This tracks your issue date, not truly your birthdate. In the past couple of decades many/most babies get registered at birth, but if I stick my own (birth) data in there I actually get the wrong answer, because when I was born issuance wasn't automatic yet. But that will work for a lot of people.


They are a requirement if you were wanting to fraudulently open a bank account in someone else's name.


Need a whole SSN for that no?


Both the first three and the middle two have a pretty clear rhyme and reason to them which would likely make getting them right a not-so-difficult task after a bit of homework.

https://www.ssa.gov/history/ssn/geocard.html

Anecdotally, while bored in math class we figured out that 10 or so of the guys in the class had one of two numbers for their first three.
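
To put rough numbers on that, here is a small sketch. It assumes the well-known area/group/serial layout (3 + 2 + 4 digits) and deliberately ignores never-issued values, so the counts are naive upper bounds. The names are hypothetical, and the area-number pattern only applies to numbers issued before the SSA switched to randomized assignment in 2011.

```python
from typing import NamedTuple


class SsnParts(NamedTuple):
    area: str    # first three digits
    group: str   # middle two digits
    serial: str  # last four digits


def split_ssn(ssn: str) -> SsnParts:
    """Split an SSN such as '123-45-6789' into its three fields."""
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("an SSN has exactly nine digits")
    return SsnParts(digits[:3], digits[3:5], digits[5:])


def naive_candidates(candidate_areas: int = 1000,
                     candidate_groups: int = 100) -> int:
    """Upper bound on full SSNs consistent with a known last four."""
    return candidate_areas * candidate_groups


if __name__ == "__main__":
    print(split_ssn("123-45-6789"))
    print("last four only:", naive_candidates())
    print("area narrowed to two values:", naive_candidates(candidate_areas=2))
```

With the last four known and the area narrowed to two values, as in the math-class anecdote, at most 200 candidates remain before the group number is even considered.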


fuck. I live in Chicago.


Voter registration data is available for purchase, but only by registered political committees and can't be used for commercial purposes. This also doesn't include a lot of the breached data like partial SS#'s and drivers license #'s. As a Chicagoan I'm not too happy about this breach, and there has been surprisingly little coverage of it locally.


Yes. At one of my first jobs out of school I worked with a Stanford professor, Doug Rivers (@pollingpoint), who had millions of users' voting records that he obtained from the government for 'research' purposes. That data (your name, address, what party you are in, et al.) is TOTALLY public and passed around legally to other research centers and government agencies. He had me match address information with other databases.


[flagged]


This is false and defamatory. I (Chris Vickery) have never ransomed any data. I have protected the private data of hundreds of millions. Post some evidence or retract your comment.


I never said "you" (assuming it's really you, new account and all) ransomed the "data" specifically, but I do know of two instances where you threatened companies to go to their customers and/or the FTC unless they met your specific demands.


Did those specific demands pretty much total up to "fix the problem in a non-braindead way"?


jsjohnst- I've posted on my twitter account (@vickerysec) to verify that this is indeed me. Now, please explain the two situations you refer to. I vehemently deny the accusation and would love to know the origin of those false claims.


Ransoms the owners? That's a big claim. What proof do you have or are you just trolling him?


So, what, now somehow a group of people impacted by this potential identity theft vector will need to rally together under some keen prosecutor to personally sue? Why aren't the vendors auto-summoned to court by the government when these breaches occur?</rhetoricalQuestion>

Hooray for the free market .. ?


> Hooray for the free market

The free market says they don't care. I've had my identity stolen from a data breach. Could I have sued? Yes. Could I have led a class action lawsuit? Yes.

Did I? No. Why? I'm fine now and just like billions of other humans, I'm lazy and simply just don't care enough.


Recently, I got an email from AWS notifying me that one of my S3 buckets was publicly accessible (intentionally, for a static site). They really try to make sure that people can't screw this up.


Yes, and not only that: they have changed the UI so that it explicitly asks you to confirm that you want to make the data public.


As both a Chicagoan and (obviously) an Illinois resident, this means my voter info has been exposed twice this year alone.

Amazon sent out warning emails for owners of misconfigured boxes about 60 days ago. Why didn't the firm in question take action? I am an engineer and literally had to do that same task at work at that time. Easy as 2 clicks.


The ticket wasn't a high enough priority. Or the PO didn't want to prioritize it in the sprint.


My guess is they were staffed with contractors to build it out and it was no one's responsibility to maintain after they left. I've seen that happen more than once.


"It was in the backlog!"


Tech debt "we'll get to it."


Slightly off-topic, but a great video on why Electronic Voting could be a bad idea: https://www.youtube.com/watch?v=w3_0x6oaDmI

I've wondered before why the UK doesn't have e-voting, and after watching, it sort of seems obvious. With traditional voting, results can easily be changed on a small scale, but that is very hard to do in a meaningful way. With e-voting, it's almost as much effort to change results on a small scale as on a large one, with far fewer people involved.

I particularly like the idea that the reason we use pencils is as a protection against somebody replacing pens with ones with invisible ink. Not sure if this is true though.


T-Mobile uses the last4 of the account holder's SSN as a phone support authentication string.

This is a trove.


And they're certainly not the only one. Last 4 of SSN is a very common authentication question.


Which is inane.

We have to get away from this idea of having "secret" numbers that, if simply discovered, can cause so much damage.

That includes credit card numbers, SSN, etc.


It's worth noting that SSN is far worse than a credit card number.

"Something you know" isn't a great standard as the entirety of auth, but it'll probably stay common for practical reasons. But "something you know and can never change if breached" is absolutely idiotic, and there are plenty of good alternatives already in existence.


"Improper use of this card and/or number by the number holder or any other person is punishable by fine, imprisonment or both."

We could start by exacting real consequences for those who abuse SSNs.


Currently it's between 48 months and 27 years (see federal sentencing guidelines) if caught. What sort of real consequences would you like to see? I don't think making the numbers above bigger would make that much of a difference.


Sorry, could you source that?

I just looked around and only found 42 U.S. Code § 408, which offers a maximum penalty of five years (higher for Social Security workers or medical professionals engaged in fraud).

Also, the vast majority of the text concerns misuse of an SSN to defraud or mislead the government, particularly by claiming benefits. (8) does read "discloses, uses, or compels the disclosure of the social security number of any person in violation of the laws of the United States", but at a quick look I only see prosecutions where that was tied to benefit fraud.

I don't think 5 years is an insufficient sentence, and I think the urge to raise sentences as a deterrent is usually counterproductive. But I do think there's room for progress here.

Most SSN abuse as identification appears to be prosecuted as simple identity theft, not SSN fraud. Adding the secondary charge specifically for SSN abuse might encourage thieves to rely on other, less permanent information like passwords.

More broadly, I'd rather see the government concede that SSNs have become a standard form of identification, and make the renewal process less heinous. Right now you have to show grievous hardship over an extended period, can't appeal a bad decision, and will still lose your credit history when the new one is issued. That's simply not a reasonable system for a number people are expected to give out so often.


Well, you have to also consider that many violations are by anonymous fraudsters who are generally outside of the reach of the U.S. government.

So, enforcement is a lot easier said than done.


What alternative do you have in mind?


Well, I haven't yet patented an alternative, but I think it's pretty clear that, in this climate of routine breaches, the old system of secret data is no longer viable.

But, if you're interested in building an alternative, then off the top of my head, I'd suggest that we've got the blockchain. We've all got omnipresent palm-sized computing devices. We've got 2FA schemes, and more. The tools are there for you to create a much more robust system than one that says "here are a handful of secret numbers. Don't let anyone else see them or else your life may be ruined".
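
Of the tools listed, 2FA is the easiest to make concrete. Below is a minimal sketch of HOTP/TOTP following RFC 4226 and RFC 6238, using only the Python standard library. It is an illustration of a secret that changes every 30 seconds rather than a production design; the function names and the default parameters (SHA-1, 30-second step, 6 digits) are my choices, and the shared secret here is the RFC test key, not something to reuse.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30,
         digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    return hotp(secret, int(unix_time) // step, digits)


if __name__ == "__main__":
    import time
    rfc_test_key = b"12345678901234567890"  # published test key only
    print(totp(rfc_test_key, time.time()))
```

A stolen code is useless half a minute later, and a compromised shared secret can be rotated, which is exactly what a birthdate or SSN cannot offer.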


To paraphrase a comment about virtual currencies: If you have a hard problem and try to solve it with a blockchain, you now have 2 hard problems.


How far-fetched would it be for this data to make its way into Cambridge Analytica-type targeting for future election advertising?

Putting on my tinfoil hat for a moment, I have this nagging feeling in my gut that these issues are a little too coincidental.

So how can we make sure all this data isn't used to tamper with voter rolls or uploaded to FB, etc. to create Custom Audiences based on voting history and district?


AFAIK, the latter use case is already done all the time. Much of the data is public and can be obtained legally. Political campaigns are most definitely using all the data they have on you, including public or legally obtainable information such as voter turnout history, registration, party affiliation, and street address, to do targeted advertising.


I can tell you that campaigns have much of this information already, often provided directly by the state as public information. The idea that campaigns don't already "create Custom Audiences based on voting history and district" is laughable at best. Communications are often targeted in exactly this way.

Here's Florida's relevant information:

http://dos.myflorida.com/elections/for-voters/voter-registra...

"Voter registration information is public record in Florida with a few exceptions. Information such as your social security number, driver’s license number, and the source of your voter registration application cannot be released or disclosed to the public under any circumstances. Your signature can be viewed, but not copied. Other information such as your name, address, date of birth, party affiliation, and when you voted is public information."


Most of this info is public, but not who you voted for, or your email address.


I don't see anything in this article that indicates that the Chicago database had these pieces of information either.


Cool, now let's match them up against death records and see how many of the dead really do vote in Chicago ;)


The data is for registrations, not who actually voted. There's probably quite a few people who are deceased or moved who are probably still in the registration db.


hahahaha


Is there any way for one to know if their info has been exposed? I had been registered to vote in Chicago ~6+ years ago but have since moved. Knowing Chicago, I'd bet I was still on the rolls (and probably having ballots cast for me ;)


Not sure why my original question has been down-voted. I think it's a legitimate issue – when things like this make the news, there's often an interest for potential victims to find out if they've been put at risk. Many companies go out of their way to protect their customers/users with offers of identity monitoring/credit monitoring, etc – will the city of Chicago do the same?


Is there any way to find out whether one's (personal) details were in the data that was exposed?


I wonder if Obama's info was leaked (mine almost certainly was :( )


[flagged]


Not a particularly useful comment but if you read the article it does say that there were admin credentials to the voting systems in the then publicly accessible data, so maybe.

It does, however, also say that there is no indication the data had been accessed previously, not that I believe that statement.


Where are you getting your 2.1 million figure? It was 913,000[0].

Combined with 699,000 votes for Clinton/Kaine from suburban Cook County[1]—whose voters seem to have been unaffected by this breach—you get the 1.6 million number reported by AP for the county[2].

[0] - https://chicagoelections.com/en/wdlevel3.asp?elec_code=4 - must manually select 'President & Vice President, U.S.'

[1] - http://results1116.cookcountyclerk.com/summary.aspx?eid=1108...

[2] - https://www.nytimes.com/elections/results/illinois?mcubz=0


I'm pretty sure this is a reference to the "vote early and vote often" phrase that is typically correlated with Chicago, and especially with the Democratic machine politics.

Source: Chicagoan. And I found this hilarious.


Alternate take: "It's not a big deal, most of them are dead."


Also excellent.

Someone else in this thread made a similar joke and has an all too literal response. I'm starting to think our irreverent sense of humor for our city's politics doesn't translate well.


This is EXACTLY the reason I don't vote


Choosing to abdicate your civic responsibility is certainly your right but this is easily the most bullshit rationale I've heard offered to support the choice. I'm guessing you also don't shop at Target or Amazon?



