The top comment on Reddit r/netsec's corresponding coverage has mirrors on Mega.co.nz for the files.
I couldn't find my own data in the set, and actually it seems like lots of entire area codes are missing.
Assuming `cut -c1-4 schat.csv | sort -u | wc -l` is the proper command, there are only 76 of 322 US area codes represented.
It appears there are two Canadian area codes represented in the database: 867 and 204. There are also 248 US area codes which are not represented in the database. Assuming a relatively uniform distribution of phone numbers in the US (which is not at all a safe assumption), the average US snapchat user has better odds of not being in the list than being in it. Sampling from the set of my snapchat friends who are not in my area code, 3 of 13 can be found in the database.
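For anyone who wants to redo the area code count without shell pipelines, here is a minimal Python sketch. It assumes each CSV row starts with a 10-digit phone number (last two digits masked, no leading country code) followed by the username. That column layout is a guess, not something confirmed about the released files, and `area_code_counts` is a made-up helper name.

```python
import csv
import io
from collections import Counter

def area_code_counts(rows):
    """Tally records per 3-digit area code, read from the first column."""
    counts = Counter()
    for row in rows:
        if not row:
            continue
        # Keep only digits so quoting or masking characters do not matter.
        digits = "".join(ch for ch in row[0] if ch.isdigit())
        if len(digits) >= 3:
            counts[digits[:3]] += 1
    return counts

# Tiny stand-in for the real file (assumed layout: number, username).
sample = io.StringIO(
    '"21255512XX","alice"\n'
    '"21255598XX","bob"\n'
    '"86755500XX","carol"\n'
)
counts = area_code_counts(csv.reader(sample))
print(len(counts), "distinct area codes")
for code, n in counts.most_common():
    print(code, n)
```

Swapping `sample` for an open file handle would give the per-area-code totals for a real dump, provided the layout assumption holds.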
If your phone number is in any of these states, you're not in the database: Alaska
Snapchat devs explained how to create the database on Dec 27th:
> Theoretically, if someone were able to upload a huge set of phone numbers, like every number in an area code, or every possible number in the U.S., they could create a database of the results and match usernames to phone numbers that way.
like other people are saying -- HN is news by committee. just try to post a joke on here one day & see how many rapid downvotes it gets if you ever need proof of how lame & uptight the HN crowd can be once they hit that 500 rep.
i posted a joke on a comment of someone going "This. A million times this." the other day cuz that is the most melodramatic, overused, annoying, lame textual meme... the joke was pretty innocuous & immediately got like 5 downvotes.
I don't mean to hate too hard cuz it is what it is, i dont care about my karma, & obv its good enough that i still read articles here but posters being annoying/hypocritical/oversensitive/humourless/feeling that they are the protectors of society... very common on HN. I think it is some facet of the nature of people thinking they are real sophisticated by reading HN for some odd reason (basing this on all the wannabe devs I know on social sites who make it a point to mention it constantly in posts like "my fav sites to procrastinate on"), whereas there are lots of higher-level programming communities out there who seem to have less of this self-seriousness.
that said, occasionally there is that nice moment of a few reasonable people chatting each other & introducing to tech they may have otherwise overlooked. I'm heavy into Clojure & the other day a Lisper on here pointed me to freenode to get community help rather than SO, for example. It's all percentages, I guess :-/
obvious logic, link to wiki for explanation of common term -- u my friend are due for about 5 upvotes in the HN chain of circular validation.
i understand it as a way for SNR sure, but mostly its just a way to fade the text of people you disagree with. it doesn't solve much for people who can skim, & it usually just makes me more curious about what got downvoted (some of the comments are very insightful but just pissed off those with normative opinion)
im just not into kool-aid drinking & the weird PG (harhar no pun intended) vibe of HN while downvoting & having this weird karma system smacks of lame culty passive aggression, which incidentally is much more offensive to me than actual aggression
also i wasn't gonna say anything but can't help from chuckling about it -- i am an audio DSP engineer so by nature I've probably done more work/research concerning SNR than 99% of people on the planet ever will lol.
i didnt want to come off as indignant tho, its just a funny coincidence. you never quite know who you are condescending to on here
All phone numbers in the North American Numbering Plan (NANP) are allocated first as area codes by region and then by prefix for smaller locations. Some states, like Wyoming, have a single area code (307) while metro areas, like Houston, can have multiple area codes (713, 281, 832, 346) all covering the same physical place. This is a holdover of calling party pays so that a caller will know where a given phone number "lives." This distinction has become less visible with the advent of local number portability and nationwide mobile roaming.
(The NANP covers Canada, the Northern Mariana Islands, the United States, and a large list of island nations in the North American area.)
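To make the area code / prefix split concrete, here is a small Python sketch that breaks a NANP number into its three parts. The names `NanpNumber` and `parse_nanp` are made up for illustration. The only rule it checks is the standard one that the area code and the prefix each start with a digit from 2 to 9; it says nothing about whether a given code is actually assigned.

```python
import re
from typing import NamedTuple, Optional

class NanpNumber(NamedTuple):
    area_code: str  # 3 digits, allocated by region
    prefix: str     # 3 digits, allocated within the area code
    line: str       # 4 digits

# Optional leading country code 1, then NXX-NXX-XXXX (N = 2-9, X = 0-9).
_NANP = re.compile(r"^1?([2-9]\d{2})([2-9]\d{2})(\d{4})$")

def parse_nanp(raw: str) -> Optional[NanpNumber]:
    """Return the parts of a NANP number, or None if the shape is wrong."""
    digits = re.sub(r"\D", "", raw)
    match = _NANP.match(digits)
    return NanpNumber(*match.groups()) if match else None

print(parse_nanp("(713) 555-0123"))   # Houston area code
print(parse_nanp("+1 307 555 0199"))  # Wyoming's single area code
print(parse_nanp("123-555-0100"))     # None: area codes cannot start with 1
```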
That doesn’t just apply to small countries. I was also unaware that mobile phone numbers could have area codes and I’m from Germany (where each mobile operator has one three digit code – but even that is becoming meaningless with legally required number portability from operator to operator).
Less visible to people, though the way GSM works means that every call and text you receive entails a lookup to your number's original provider. Prefix allocations to mobile networks are static and they are required to return a new route if you've ported. Which means if a company leaves the market, someone has to take on their allocations or even non-current customers will lose connectivity. Or at least that was true last time I checked, 4 or so years ago. Kind of crazy.
In many countries the caller pays for the cellular airtime. You have to know how much your call is going to cost before you dial it, so a new area code is needed. In the US, the cell phone user pays for the airtime even for calls they receive, so a new area code was never needed.
It's also worth noting that some states have partial coverage. For example, all the area codes in Massachusetts are missing from the list, with the exception of "857" and "617" (both Boston area codes) but the latter is incorrectly labeled as "Southern Michigan".
It's crazy how some personal info, once leaked, becomes public/underground data with no real way to repair the damage. (I'm thinking of the contrast with leaks of other info that has an expiry or can be revoked, like OAuth tokens.)
Possibly they shouldn't have pissed on the people who notified them of the vulnerability, and on the journalists who broke the story?
(aside from not being vulnerable to this in the first place, but that actually is a lot to ask. I still can't believe anyone relied on the Snapchat model of security more so than any other app, although from an ease of use, non-security perspective, sure, it's reasonable.)
Fuck, misclicked downvote (wanted to upvote). Really great how HN does not allow changing votes at least once. Sorry for that, but yeah good point.
Nowadays one or a few phone numbers are unique to you, which makes it linkable to other things. Linkability is something that breaks privacy, so if you don't want your full name to be known somewhere, it is important to be able to keep things separate. When your phone number goes public (e.g. resumé and snapchat), that anonymity is broken.
I think the point is that we expect our phone numbers to identify us in the real world but don't expect them to point to our handles online. If I use the same username on Snapchat that I do for something embarrassing like, say, my tumblr, someone who knows me IRL can look me up and find my more private online presence. This is just as real an issue as the reverse scenario in which Internet stalkers find out my phone number or something.
Now generally it's advisable to use a unique handle (or your real name) for a service so closely tied to a piece of real life identification like your phone number, but I don't think a lot of people do it.
Anyone else tried putting together some stats from the info?
name | areacode | count
Chicago Suburbs | 815 | 215953
Eastern Los Angeles | 909 | 215855
San Fernando Valley | 818 | 205544
Southern California | 951 | 200008
Los Angeles | 310 | 196183
Northern Chicago Suburbs | 847 | 195925
Denver-Boulder | 720 | 188285
Downtown Los Angeles | 323 | 168565
New York City | 347 | 166374
New York City | 917 | 165420
Fort Lauderdale | 954 | 153522
Northern New York | 315 | 147447
Buffalo | 716 | 144939
Southern Illinois | 618 | 144280
Boulder-Denver | 303 | 139265
Southern Michigan | 617 | 138821
Northeastern New York State | 518 | 138043
Champaign-Urbana | 217 | 135837
Oakland | 510 | 130531
Miami | 786 | 117906
Westchester County, NY | 914 | 116632
Western and Northern Colorado | 970 | 115378
San Francisco | 415 | 108883
Miami | 305 | 104415
Southeastern Colorado | 719 | 102932
Manhattan | 646 | 96646
Mountain View | 650 | 94430
Chicago | 312 | 70709
Southwest Connecticut | 203 | 60629
Bronx, Queens, Brooklyn | 718 | 51086
Boston | 857 | 41857
Central Arizona | 480 | 35631
South Carolina | 864 | 33034
Eastern Ohio | 330 | 32721
Arkansas | 870 | 28940
Idaho | 208 | 26827
Southeastern Virginia | 757 | 21170
Los Angeles | 213 | 13705
Southeastern Ohio | 740 | 11597
Eastern San Francisco | 209 | 11356
Seattle | 206 | 10623
Fort Lauderdale | 754 | 10131
Maine | 207 | 10126
Northern Louisiana | 318 | 9842
Indianapolis | 317 | 8151
Northwestern Arkansas | 479 | 7300
Manitoba | 204 | 7211
Minnesota | 320 | 7162
Southeastern Michigan incl. Ann Arbor | 734 | 7077
Eastern part of Southern New Jersey | 609 | 6952
Pennsylvania | 484 | 6314
Manhattan | 212 | 3970
Pennsylvania | 610 | 3930
Southern New York State | 607 | 3437
Central Florida | 321 | 3258
New York City | 929 | 2651
Florida | 863 | 2642
Southeastern California | 760 | 2523
Southwestern Wisconsin | 608 | 2217
Central Texas | 325 | 1542
Central Georgia | 478 | 1396
Western Central Alabama | 205 | 825
Eastern Kentucky | 606 | 565
DuPage County, Illinois | 331 | 512
Eastern part of central New Jersey | 732 | 507
South Dakota | 605 | 375
Knoxville, Tennessee | 865 | 263
Southwestern Connecticut | 475 | 253
Eastern Iowa | 319 | 198
Georgia | 470 | 163
Minneapolis | 612 | 103
San Fernando Valley, LA | 747 | 84
Canadian territories in the Arctic far north | 867 | 31
Washington DC | 202 | 3
Georgia | 762 | 2
Dallas | 469 | 1
I wonder where they were getting the numbers to search by. From how they described the vulnerability, I would have thought they would just iterate through all possible phone numbers. If they're doing that, it's strange that there's exactly 1 number for the Dallas area code.
we are kind of the media.. and reddit is too.. I also believe that they made a fatal error by not selling everything for $3bn then jumping aboard. To not have anything to do with the "soon to come security issues". I mean they could have mentioned it and downplay it as they did just recently. I don't think that the new owner would take security more serious than them.
For us it was really really good that he rejected the offer! Because otherwise we would see the trade market crash $3bn, guess who would have to pay loss.. we..
well, if he saw that coming, which I doubt, he would be a hero.
>I don't think that the new owner would take security more serious than them.
I don't know about that. Their dismissal was (at least framed as) "well that's a lot of data, so it's not going to happen!"
Actual excerpt from their blog, on the 27th: "Theoretically, if someone were able to upload a huge set of phone numbers, like every number in an area code, or every possible number in the U.S., they could create a database of the results and match usernames to phone numbers that way."
Unless it's changed recently, the phone number is user-supplied and I'm not sure if it's verified at all. They do claim that the phone number "will be stored as unique mathematical representations (or 'hashes')..." rather than plaintext, but I imagine if you know it's a non-salted phone number that's been hashed, it's pretty easy to brute force. But were they lying about hashing the phone numbers? I guess it doesn't matter if they hashed the phone numbers if they're going to expose an unlimited query API that can be brute-forced like this.
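To illustrate why an unsalted hash of a phone number gives little protection, here is a minimal Python sketch. It assumes plain SHA-256 over the digit string, which is purely an assumption since nothing here says which hash Snapchat uses, and `phone_hash` and `recover` are hypothetical helper names. It only searches the 10,000 line numbers under one known area code and prefix.

```python
import hashlib
from typing import Optional

def phone_hash(number: str) -> str:
    """Unsalted hash of a phone number (SHA-256 is an assumption)."""
    return hashlib.sha256(number.encode("ascii")).hexdigest()

def recover(target_hash: str, area_code: str, prefix: str) -> Optional[str]:
    """Try every 4-digit line number under one area code and prefix."""
    for line in range(10_000):
        candidate = f"{area_code}{prefix}{line:04d}"
        if phone_hash(candidate) == target_hash:
            return candidate
    return None

secret = "3075550142"  # made-up number
print(recover(phone_hash(secret), "307", "555"))  # prints 3075550142
```

A 10-digit number has at most 10^10 possible values, so the same loop over the whole space is still a small job by brute force standards.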
Yes. I signed up in-app today (with a fake phone number) and that's a direct quote from the app.
No idea on the downvotes. I guess because, like I said, telling users you hash the phone numbers doesn't matter if you're using them to search for an unhashed userid. But they're implying to users that their phone numbers are secure because they're hashed, when it really doesn't matter.
It's taking too much time to download each file even though they're only 40 MB. I wish they had put it up as a torrent in the first place.
Regarding the leak, yeah, that's what happens when you focus on the product but not on the security and reliability of your system. Snapchat, WhatsApp and many others have been hacked numerous times, and yet it still happens.
Torrent or not, I want to see if any of my less tech-savvy friends are on the list so I can warn them even though I don't use Snapchat myself. It's much easier to convince them there's a real problem if I can say "Look, I can get your phone number and username from the Internet just like that" rather than explaining theoretical reasons why they're vulnerable.
Hmm, I somewhat disagree. Private information is anything you don't want public. By protocol, it isn't strictly private. But a phone number is private/unknown until its known, which is how most of us prefer it.
For example, in implicit social code it is impolite to give away a friend's phone number without asking them first.
The gray area for that is sharing a business phone number of your friend that they share widely through business cards or their website. Though typically it ends up being an email introduction if you really care to connect someone with your friend.
What does snapchatdb hope to accomplish by allowing people to download the db?
Just showing and proving that you've hacked the database should be enough to get the company to respond. They're probably not hurting Snapchat as much as they are potentially damaging the people whose phone numbers and usernames are being downloaded.
>For now, we have censored the last two digits of the phone numbers in order to minimize spam and abuse. Feel free to contact us to ask for the uncensored database. Under certain circumstances, we may agree to release it.
At least they had the tact to omit the complete phone numbers, but agreeing to release them under certain conditions just seems malicious.
For those who haven't noticed that, they are censoring the last two digits of the phone numbers:
> For now, we have censored the last two digits of the phone numbers in order to minimize spam and abuse. Feel free to contact us to ask for the uncensored database. Under certain circumstances, we may agree to release it.
Among other things, it means that the cell phones of people who have predictable user names are now very easy to discover. I don't use SnapChat, but if I did I would be patio11 on it. My cell is fairly closely guarded. You could imagine some people with similar situations who'd be at higher risk of misuse, because of a higher public profile, higher perceived payoff for contacting, notoriety, or demographic interest to people with poor impulse control.
If you're not a hot underage teen girl, then prob not. But now it blows up the spot for where a lot of teens are hanging out. Now it's another step to getting to you if some creep wants to. Or take it to a grander scale, it creates a viable link of who a person and their digital mask is.
That's kind of the thing about privacy. It's kind of slipping away, but if you don't care that it is then it's prob cause it doesn't matter to you yet.
If you're playing anonymous, you surely would also use burner phones and not your main phone for that purpose. Same applies to identities, profiles and hardware, any IDs, network connection and so on.
Because if I did have an alias that was doing very shady things, I would make pretty sure it wasn't that easy to trace. When doing stuff like that you want a "shared nothing" approach, so that if they hack your systems, your primary system won't contain any information referring to the shady side and vice versa.
So the primary use for this database would be phishing, right? Or some attempt at building a reverse cell phone number lookup database, assuming people have reused usernames? My normal username was taken when I signed up for snapchat, but I suppose you could use this to get quite a few cell number -> instagram or twitter pairings?
> For now, we have censored the last two digits of the phone numbers in order to minimize spam and abuse. Feel free to contact us to ask for the uncensored database. Under certain circumstances, we may agree to release it.
Why not just release the usernames and leave out the phone numbers?
Yes, this is strange on all fronts. As far as I know names and land numbers are still published in phone books, and a phone number isn't generally a very interesting bit of information to have. And to the extent this information is sensitive, why be so eager to spread it (beyond being a teenager and getting a thrill)?
1. You can remove your phone number from phone books.
2. Cell numbers aren't published in those books, which this affects.
3. Land lines these days are somewhat separate from our lives. They're relatively easy to ignore. Getting phishing texts (say, faking our banks, since some -- including myself -- have some bank alerts texted to us) to our cellphones could be quite harmful. If you send a million texts pretending to be Chase, and say 50% of the numbers are legit cell phone numbers, and 20% of people have Chase accounts, and 0.1% of people fall for the phishing attempt, then you get 1/10,000 people getting phished. That's 100 people out of a million affected monetarily, and 500,000 people getting annoyed by the spam.
Obviously this is back of the envelope, but this is one reason it could matter.
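The same back-of-the-envelope estimate as a short Python sketch, using only the rates quoted above (all of them guesses from the comment, not measured figures):

```python
texts_sent = 1_000_000
valid_cell_rate = 0.50  # share of numbers that are real cell phones
chase_rate = 0.20       # share of recipients with a Chase account
fall_rate = 0.001       # share of those who fall for the phish (0.1%)

reached = texts_sent * valid_cell_rate
victims = reached * chase_rate * fall_rate

print(round(reached), "people receive the spam")
print(round(victims), "people get phished")
print("overall rate: 1 in", round(texts_sent / victims))
```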
edit: a comment thread below mentions that the bottom two digits are hidden at this moment but will be revealed for interested parties. That really smells like the numbers will be sold to spam/phishing operations.
NB: "For now, we have censored the last two digits of the phone numbers in order to minimize spam and abuse. Feel free to contact us to ask for the uncensored database. Under certain circumstances, we may agree to release it."
2FA is pretty useless if you have a good password and simply mind the https lock and domain when logging in. Also I wouldn't share anything with Google that is sensitive enough that it needs 2FA at all.
What I want to know is what kind of asshole it takes to do things like this?
Great, Snapchat isn't secure, and they probably didn't give a damn when notified of the vulnerability (not surprising, given their cavalier attitudes), but why expose their audience in order to prove a point?
You mean to say a company that encrypts users' messages in ECB mode with a fixed key hard-coded into the binary and which was publicly disclosed almost a year ago and hasn't been changed isn't responsible with user data?
How long until somebody releases an updated snapchat database linking pinterest profile pictures? I mean if you chose a very unique username, and went to http://pinterest.com/username, you'd be able to discover what they possibly look like. It doesn't end there, their email address is probably firstname.lastname@example.org too. simply googling the username results in connecting their twitter? facebook? myspace? linkedin? full name, more pictures, your friends, your interests, your likes. All in all, I would have to say, this can be potentially a far bigger loss of privacy than just your Snapchat account.
>>The company was too reluctant at patching the exploit until they knew it was too late
Did they give Snapchat enough time to fix this before releasing this data?
NOTE: I've heavily edited this comment because when I first read the website I thought snapchat ignored the people who found an exploit but re-reading, it's no longer clear to me that releasing this data is not pure malice.
Why would you donate to these people? Because they're hurting Snapchat users? What is wrong with the people posting in this thread like this is some kind of good thing? Real people can be hurt by this.
Um, I just want to say that I have _NO IDEA_ why my BTC address is on that list and I've never seen this git URL before in my life. That BTC address is my deposit address on BTC-e.com. This address has only ever received 2.25 BTC and this was purchased fair & square from coinbase.com with my hard-earned USD. I really do not know what in the world is going on or who put my BTC-e.com address on this alleged cryptolocker's known list. I have absolutely nothing to do with that software.
Pardon me while I go to BTC-e.com and have it generate a new address. I don't need to be getting mixed up in this.
I would have found it quite amusing/scary to suddenly see some huge balance on my account. BTC-e.com sends emails for any account activity and I haven't seen anything I didn't cause. Also, BTC-e.com is just too convenient not to use for now. It's the quickest way for me to get litecoin until coinbase.com supports it.
When i first read your post smtddr i got worried we had a collision!
I've found the quality of blockchain auditing in 2013 highly inaccurate. I recently brought attention on reddit to a case where someone 'chased' the SMP thief through a tumbler and found... the 96k wallet allegedly owned by btc-e.
It's a shame if a non-published address of yours has been tainted by someone's inaccurate blockchain analysis.
w-ll was talking about the original BTC address in my profile being on the known list for cryptolocker. The same address I linked to in my reply to her/him. When you say "we", who are you?
Also, that whole reddit thread about chasing the SMP stolen coins I thought was too hard to actually pull off. For example, I use coinbase to buy BTC, to send to BTC-e.com, to buy Litecoins and ultimately store them in the offline address that's in my HN profile. Can anyone show me the blockchain.info URLs that would prove my actions? If the SMP people changed coin-types, that's how it'd end up on BTC-e.com's wallet. In fact, maybe that same flawed logic is how my BTC-e.com address ended up in that list - capturing addresses that BTC-e.com uses for its customers or internal operations.
This whole incident reminds me of Reddit doxxing. This could have ended up much worse for me. I'm just glad I found out this way instead of the police requesting info from Google about my youtube account and gmail inbox then busting down my door in the middle of the night.
Looks like they are using WhoIsGuard to protect the domain whois information. The terms of WhoIsGuard include not violating the privacy of others:
> defame, abuse, harass, threaten or otherwise violate the legal rights (such as rights of privacy and publicity) of others;
I've sent WhoIsGuard an email. Hopefully they'll revoke service. Shame on the people that published this private information. They aren't hurting just Snapchat. Revealing personal information like this can cause real problems for people.
Actually in this case the absolute best thing would be for Snapchat, Inc. to go full court press against snapchatdb.info, as what is actually important here is to communicate both the "snapchat security is a lie" message, and "companies which flagrantly suck and then piss on those who report vulnerabilities responsibly will suffer" message, rather than the actual snapchat phone/username db. Streisand will help that more than "go to this site which is really slow and download a huge file which you can't easily use to find your own number or that of your friends" (without a minimum of "how to use a computer" skill).
The website clearly states that the last 2 digits of the phone numbers are censored. You're free to do what you think is right, but in this case you're the one who is trying to get somebody's private information published.
Hrm... it's New Year's Eve and people have taken off early, and I suspect that WhoIsGuard doesn't have round the clock support coverage. Disclaimer: pure speculation, but I think it's fair to say the timing was strategic.
I have a theory. Last week there was a big story about how Facebook was “dead and buried” because teens didn’t want to be on a service that their parents had moved into. Now, when it comes to security, the parents care a lot more than the kids. Could Snapchat be playing fast and loose with the security of their user data as a way of scaring away the grownups?
This would be a clever ploy but for one damning fact. A large share of Snapchat’s users are minor children. Could anyone, from the CEO of Snapchat to the perpetrators of SnapchatDB really think that risking the broadcasting of the phone numbers of 12-year-old girls and boys is a risk worth taking?
I HAVE a theory that a (not so)clever writer for Forbes is plugging his story by planting misguided theories everywhere UPON which I plan to plant my theories on his planted theories on snapchat CEO "rumor" theories.
I have a theory that Mark Zuckerberg, fearing the demise of Facebook, had his ninja assassins infiltrate SnapChat and compromise their security, hoping to drive teenagers back into the arms of Facebook.