
Have I Been Pwned? Data breach master list with API - fitzwatermellow
https://haveibeenpwned.com/
======
manigandham
This is a great (free) resource run by Troy Hunt, a well-respected Microsoft
MVP and security researcher. His blog is full of interesting info:
[http://www.troyhunt.com/](http://www.troyhunt.com/)

~~~
skwirl
At the risk of coming across like a shill, I loved his "Hack Yourself First"
course on Pluralsight. He does a great job of covering the basics of web
security that every web developer should know and he makes it interesting by
covering it from the attack angle first.

Someday I plan on getting around to subscribing for another month just to go
through his newer content.

~~~
michaelbuckbee
Troy has a good Web Security course up for free at:
[https://www.varonis.com/learn/web-security-fundamentals/](https://www.varonis.com/learn/web-security-fundamentals/)

------
jwcrux
It's important to note that in addition to the big breaches, this site also
collects database dumps from paste sites such as Pastebin.

Specifically, I've been feeding logs from my twitter bot, @dumpmon, into HIBP
to help build their collection of pastes. Troy is a fantastic researcher and
has been awesome to work with!

~~~
mikecb
I love dumpmon and regularly point it out to colleagues.

------
Mahn
Guess that explains why my throw away email was getting a ridiculous amount of
spam. Why do people send spam nowadays anyway, is that ever effective? Has a
sketchy, poorly worded spam email ever prompted someone to _actually_ buy
viagra from them?

~~~
griffinmb
If Spam Nation is accurate, spam is actually quite lucrative. The knock-off
medications they're peddling often work just as well as the real thing and at
a fraction of the cost.

~~~
jessaustin
_...work just as well as the real thing..._

There are a few ways to take this, but this reminds me most of the diet aids
that supposedly consisted of tapeworm eggs.

------
asdfaoeu
Maybe you should ask for credit card numbers and passwords to check if they
have been compromised.

~~~
breakingcups
Troy Hunt explained somewhere that he doesn't actually store the password
(hashes), nor does he want to.

See the FAQ as well:
[https://haveibeenpwned.com/FAQs](https://haveibeenpwned.com/FAQs)

~~~
natch
Assurances only go so far. Even if he's a perfectly ethical person, if his
service provider is compromised then all bets are off.

But, as some replies accurately pointed out, the email addresses are already
out there and are not that big a deal anyway.

~~~
mhurron
Everything he has in HIBP is taken from public releases. It's already out
there, there's no level of 'really out there now.'

~~~
kbenson
Yes, not making the email addresses easily searchable is the worst type of
security through obscurity. The type where it's only obscure to you the email
address owner, but not to those who would like to exploit you.

~~~
mhurron
He's not interested in becoming a search engine for finding exploited
accounts.

~~~
kbenson
I understand, and I believe I was agreeing with you. I was merely noting that
while making it searchable takes the difficulty from _slightly inconvenient_
to _easy_ for those who might want to exploit it, it also takes the difficulty
from _very hard_ to _easy_ for the average person who may want to defend
against it (by determining if they are at risk). I believe that's a net
positive.

~~~
manigandham
What do you mean it's not easily searchable?

You put your email in and it tells you if it was ever part of a data breach.
You can even subscribe for future notices in case it's found in a breach in
the future. What more do you need to search for?

~~~
kbenson
I was replying to the up-thread position[1] summarized as "providing a
searchable interface is dangerous because bad actors can use it against you."
The context of the thread[1] apparently makes my meaning (that it's okay that
it is searchable) somewhat hard to intuit.

1: The thread no longer appears to imply what I thought it was. I don't know
whether that's because some comments were edited or because I was just in a
state of mind which made me interpret them differently, but I don't think it
really matters. My statements were meant to be general for this type of
situation, and apply towards the merits of searchable vs non-searchable email
addresses, not to assert that this site did one or the other and apply
judgement because of that.

------
comboy
Would be great if I could just provide a hash of my e-mail address.
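
A minimal sketch of what hashing the address on the client side could look like, assuming (hypothetically) that a lookup service accepted a SHA-256 digest of the lowercased address. Nothing in this thread says HIBP offers such a lookup, and the function name is made up for illustration.

```python
import hashlib


def email_digest(address: str) -> str:
    """Return the SHA-256 hex digest of a trimmed, lowercased address."""
    normalized = address.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Only the digest would be sent, never the address itself.
    print(email_digest("  Me@Example.com "))
```

Email addresses are guessable, so a bare hash mostly hides the address from casual observers; anyone holding a list of candidate addresses can hash them and compare.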

~~~
Johnny555
Agreed - I use a lot of email aliases (i.e. myname+merchantname@gmail.com) and
I don't really want to upload a list of them to a website to see if any were
compromised, I'd feel much better about a hash.

Either that or allow a wildcard "myname+*@gmail.com" so I can check them all
at once, but that probably has privacy issues of its own.

~~~
peterwwillis
Just curious, why do you use a different e-mail account for every merchant?
All you need is different passwords, and then maybe label filters if you don't
want them ending up in your Inbox.

~~~
untog
That isn't an account - Gmail sends anything with a +whatever in it to the
address without the plus. So, me+mine@gmail.com goes to me@gmail.com.

It's useful for exactly the filtering you are describing - and I also use it
to track which sites sell my address to spammers (though it's such a simple
regex replacement that I'm sure some hide doing so)

~~~
logicallee
it's literally security by obscurity since gmail does not allow a genuine + in
emails, so:

>I also use it to track which sites sell my address to spammers (though it's
such a simple regex replacement that I'm sure some hide doing so)

...is Google dropping the ball by doing something insecure. The secure version
is to be able to assign a randomized alias, for example I should be able to
request an alias and get a high-entropy string like buffaloaerypyrite which is
associated with "me+fromDatingSite", which is this great group I heard for
meeting singles who are into knitting, my biggest hobby! Then I can give
DatingSite the address "buffaloaerypyrite@gmail.com" and have it go to my
inbox and be marked fromDatingSite, just as it does currently. But if escorts
start sending me mail to "me+fromDatingSite" I don't have to be able to rely
on the _security by obscurity_ that protects me from getting spam that is not
marked with that source, if any of them figure out how to remove the + so that
they can reach the inbox I read everything in.

It's a simple key-value lookup table and would close a security hole that
allows spammers to reach you. It is mandatory for Google to let me give out
"buffaloaerypyrite" instead of "me+fromDatingSite". Unfortunately, instead
they have a non-scalable solution that doesn't work.

This is why I'm faced with a choice, and can either use an unsecure version
and rely on security by obscurity, or use a real throwaway email. I use a real
throwaway email anyone can read (mailinator).

Because Google's engineers either don't understand how to secure their
solution, or believe that security by obscurity is sufficient.

I've never once in my entire life given out my gmail email address with a + in
it, even though I know exactly how this method works and how to use it.

Never once. I'm not stupid enough to use security by obscurity. But it's great
that they have it and are a single key-value lookup from having it work
properly. Any day one of them will realize the implications and implement the
solution, which can be prototyped in 7 minutes in any technical stack and be
fully pushed out within a day or two. It's trivial. I have faith they will
realize it sooner or later. You can do it, Google. Be better.

~~~
kpcyrd
> literally security by obscurity

You've said that a lot, but there is no obscurity here. Or security. If you
actually want to use this for serious spam filtering, you'd discard all emails
without a valid + suffix with a whitelist.
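
The whitelist idea could be sketched roughly like this. Gmail is not assumed to offer such a filter; the function name, the `ALLOWED_TAGS` set, and the sample addresses are all hypothetical.

```python
ALLOWED_TAGS = {"amazon", "ebay"}  # hypothetical tags the owner has handed out


def accept(recipient: str, allowed_tags=ALLOWED_TAGS) -> bool:
    """Accept mail only if the recipient carries a whitelisted +tag."""
    local, sep, _domain = recipient.rpartition("@")
    if not sep:
        return False  # not an address at all
    _user, plus, tag = local.partition("+")
    # Mail to the bare address (no "+") is rejected as well.
    return bool(plus) and tag.lower() in allowed_tags


if __name__ == "__main__":
    for rcpt in ("me+amazon@gmail.com", "me@gmail.com", "me+guess@gmail.com"):
        print(rcpt, accept(rcpt))
```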

~~~
logicallee
I simply don't think you're writing in earnest, since your proposal is quite
nuanced and does address my issue, proving that you exactly understand it.

But gmail does not support the filter you propose. ("discard all emails
without a valid + suffix with a whitelist.")

Also even if you were serious you would know that if I did manage to get gmail
to filter all email without a valid + suffix from a whitelist, I would then
have a choice: I could give out meaningful suffixes, like me+amazon, me+ebay,
me+facebook, or meaningless high-entropy suffixes.

But if I give out meaningful suffixes, then spammers can just try the email
address I give out, replacing the portion after + that identifies them, with
likely candidates instead. They are unlikely to succeed, but this is still
security by obscurity.

If we move to true security, picking high-entropy strings to white-list, that
means I now need to manually track the relationship between the strings and
the party they identify. Google should track that relationship, not me. It's a
single key-value store. That's it. Associate me+ebay with squirehamboard so
that I can give ebay "squirehamboard@gmail.com" and have it behave exactly as
though I gave ebay "me+ebay@gmail.com" today.
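
A rough sketch of the key-value lookup being described, under loud assumptions: the `AliasStore` class, the tiny word list, and the three-word format are purely illustrative, and nothing here reflects how Gmail actually works.

```python
import secrets

# Tiny stand-in word list; a real one would hold many thousands of words.
WORDS = ["squire", "ham", "board", "buffalo", "aery", "pyrite", "knit", "lamp"]


class AliasStore:
    """Maps opaque random aliases to the label the owner chose."""

    def __init__(self, words=WORDS, words_per_alias=3):
        self._words = words
        self._n = words_per_alias
        self._labels = {}  # alias -> label

    def create(self, label: str) -> str:
        """Generate an unused alias and associate it with the label."""
        while True:
            alias = "".join(secrets.choice(self._words) for _ in range(self._n))
            if alias not in self._labels:
                self._labels[alias] = label
                return alias

    def label_for(self, alias: str):
        """Return the label for an alias, or None if it was never issued."""
        return self._labels.get(alias)


if __name__ == "__main__":
    store = AliasStore()
    alias = store.create("me+ebay")
    print(alias, "->", store.label_for(alias))
```

The merchant only ever sees the alias, so there is no "+" part to strip and no way to derive the primary address from it.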

I think you're better than this. Be better. I have faith in you. You
understand the issue. If you were working for Google you could solve it within
7 minutes (proof of concept). It's a single key-value store.

I think you know very well the definition of security by obscurity. Be better
than this. Be better!

~~~
newjersey
How would I solve it without parking n number of aliases which are now forever
unavailable for anyone else to use?

You could do what I do and just create a separate google email address (mine
literally has the word spam in it) that feeds into your primary Gmail but goes
straight to all mail and is marked as read by default.

It is funny when some websites reject my email address because the word spam
in the address. Perhaps Google could make this process simpler by allowing
autogenerated email addresses that feed into the real primary email without
the user having to create a second email address. I think this would benefit
Google as the user data isn't scattered around in multiple Google accounts.

So for example I can request a new alias and Google would make one like
ezJVoWkJFBPvMm98xhOUsi7X52l06RNblLXhhCFse3nwdkpCW7VUkIO7zgJqQDd at gmail Dot
com and forward that to my real email address. If I ever respond to any email
that came to that address, make it look like my email address is that string
above.

Now since it is just a random 63 (or longest allowed by Google) char alias
that the user didn't choose, it is unlikely that it will be anything someone
wants as their email address.

Is this along the lines of what you were thinking?

~~~
logicallee
You think extremely similar to me!!! For the sites that reject mailinator and
its aliases, I give them a gmail address that literally has the word spam in
it, just as yours does. But I just check it for any registration I need to
confirm, I don't forward it to my main account. But it literally has the word
spam in it, just like your spam gmail :) Interestingly, I've never once had
any site reject that due to having the word spam in it - but maybe because
it's my second-tier possibility only for use on sites that reject mailinator.
I thought it would be likely that some sites would reject it for that reason,
but I just haven't had that issue.

What you've written:

>Perhaps Google could make this process simpler by allowing autogenerated
email addresses that feed into the real primary email without the user having
to create a second email address.

Is precisely what I suggest. Specifically, all of the current
me+comment@gmail.com (which feeds this email address to the me@gmail.com email
address, marking it with your comment) should be kept, it's just that instead
of giving Bob's Spam Factory the email address "me+bobspam@gmail.com" I first
request a high-entropy version of "me+bobspam", and get something similar to
your suggestion:

>ezJVoWkJFBPvMm98xhOUsi7X52l06RNblLXhhCFse3nwdkpCW7VUkIO7zgJqQDd@gmail.com

and give Bob's Spam Factory that address instead. Then Bob's Spam Factory
can't just take out the part after the + to get my primary email address.

However, you do not need 63 random characters in order not to worry about
taking possibilities from someone!!! (Not even close).

Given a dictionary of 100,000 words, 3 random words have 100,000 * 100,000 *
100,000 = 1,000,000,000,000,000 or 1 quadrillion possibilities. It's
vanishingly unlikely that anyone would ever pick the same strings. This is
what I meant by "entropy" which is a technical term. To see this in action,
consider that I came up with the "high-entropy" string squirehamboard for my
example. I haven't checked yet, but it is _extremely_ likely (99.99%) that I'm
the first person to have ever written the words squirehamboard one after the
other, and that if I hadn't written it, nobody would write it in the next
thousand years (except exhaustively listing all possibilities, but I mean
actually picking it or writing it by hand).
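
The arithmetic above can be checked in a few lines. This sketch assumes words are drawn uniformly and independently from the dictionary, which a person free-associating does not do; the function names are illustrative.

```python
import math


def combinations(dictionary_size: int, words: int) -> int:
    """Number of ordered word sequences, repetition allowed."""
    return dictionary_size ** words


def entropy_bits(dictionary_size: int, words: int) -> float:
    """Entropy in bits of a uniformly random sequence of words."""
    return words * math.log2(dictionary_size)


if __name__ == "__main__":
    print(combinations(100_000, 3))            # 1000000000000000
    print(round(entropy_bits(100_000, 3), 1))  # 49.8
```

So three words from a 100,000-word dictionary give 10^15 possibilities, or just under 50 bits.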

Let's check. Let's google squirehamboard:
[https://www.google.com/search?q=squirehamboard](https://www.google.com/search?q=squirehamboard)
- as you can see, nobody in the history of the Internet has written that
short phrase except me.

It's not the only example I had. I also listed in my original comment
buffaloaerypyrite - again, it is extremely likely that I'm the only person to
have ever written buffaloaerypyrite in the history of the Internet (I used a
high-entropy free association method to come up with these words), and that if
I hadn't written it, nobody would in the next thousand years. Let's check:
[https://www.google.com/search?q=buffaloaerypyrite](https://www.google.com/search?q=buffaloaerypyrite)

Again, I'm the only person to have ever used the string buffaloaerypyrite on
the entire Internet in any context. (Of course, some web server could decide
to exhaustively list all three-word possibilities, since my personal mental
dictionary that I free-associated through to get the terms Buffalo, Aery,
Pyrite, as well as Squire, Ham and Board, from does not have 100,000 words in
it - I know more like a few thousand words personally.)

It's possible for someone to list buffaloaerypyrite in a huge (multi-multi-
terabyte) collection of passwords. But I don't quite require Google's handles
to be as strong as passwords. They should just be long enough not to step on
anyone's toes who are making real, normal email addresses.

In practice 3 random words from a huge dictionary is more than enough. In
practice you will never have any conflict with any real email address anyone
chooses.

But yeah, you get the basic idea. You just go overkill on the length of
password that you think needs to be chosen :) 3 random words given by Google
are more than enough...

------
peterwwillis
This would be more useful for non-hacker types if their website's name had
"compromised" instead of "pwned" (and really, "pwnd" is the preferred
nomenclature)

~~~
gdulli
And less cringey for everyone.

------
shostack
Two questions as a non-security guy...

1. My two main email addresses were both flagged across the Bitcoin Security
Gmail dump, the Gawker Gnosis hit, and then two game DB breaches (D&D Online,
Heroes of Newerth). If I've changed my gmail passwords since then with a very
strong 1Password PW, is that pretty much all I should be doing at this point?

2. I was never notified by any of these places that my email and password had
been compromised. I'm particularly concerned by the two games. Do these
companies have any legal obligation to disclose what happened? In the case of
D&D Online they have "Dates of birth, Email addresses, IP addresses,
Passwords, Usernames, Website activity." So I'm curious what recourse I have.

I have to say that the irony in all of this is that my two personal email
accounts were both caught up in multiple lists, but none of my family members
(who are all considerably less tech-savvy than I am) show up at all.

~~~
schoen
> is that pretty much all I should be doing at this point?

Google also offers two-factor authentication; if you're willing to bear the
extra inconvenience, it's one of the biggest possible security wins.

[https://www.google.com/landing/2step/](https://www.google.com/landing/2step/)

~~~
tyre
To add to this, it isn't too bad of an inconvenience. You can remember a
device for 30 days, so you take an extra 20 seconds 12 times per year.

~~~
mkopinsky
I complete a Google 2FA much less than once a month. I think the 2FA cookie
must live for a month, but my login cookie lasts for much more than that.

On a new device, I do of course need the 2FA token.

------
elliottcarlson
Searching by domain name is great - I use a catch-all on a domain I own and
create new accounts for everything I sign up for - being able to see which
ones were compromised is pretty nice (and relieving to see only one, on a
throw away account).

------
mkane848
Is there really anything I can do other than update passwords regularly if my
email address has been compromised? Apparently I was hit by the Adobe and
Bitcoin forum dump. I think I create new passwords and change old account
passwords often enough and I think literally the only time I've gotten
"hacked" was an old XBOX Live account and Origin, and most accounts I care
about nowadays also feature 2-Factor Authentication.

So while I'd be more pleased if Adobe hadn't leaked my email, I'm not really
at risk continuing use of the email address as long as I'm on top of my
passwords, right?

~~~
OliverJones
When the Adobe leak hit, I was already using Keepass to generate random
passwords for each service I use.

So, I changed the password on the Adobe account. I've never had an adverse
event based on that leak.

Lots of people know your email address. It's your job to make sure your email
address, combined with a data breach, is useless for guessing a valuable
password. If
you use passwords like "mysecret-hackernews", "mysecret-github", and
"mysecret-paypal" then you're in serious jeopardy if your "mysecret-adobe"
gets out. But if you use a bunch of random gibberish for each password, you're
relatively safe, at least from that particular attack.
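
For illustration, here is roughly what "random gibberish" per service means in code. This is not how KeePass generates passwords; it is a sketch using Python's standard `secrets` module, and the length and alphabet are arbitrary choices.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 24) -> str:
    """Return a password of `length` characters chosen with a CSPRNG."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    # One independent password per service: leaking one says nothing
    # about the others.
    for service in ("hackernews", "github", "paypal"):
        print(service, random_password())
```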

~~~
mkane848
Sorry for the late reply, but thanks for the feedback. While I believe I have
a decently strong variety between my passwords, I'm looking at Keepass now and
it looks like the exact password manager tool I've wanted for a while now so
I'll be giving it a try.

------
socialjunkie
After reading recent coverage on this guy and his HIBP site, I think it's an
absolute disgrace and to be honest I can't seriously be the only one! He gets
away with having all these stolen database dumps which is illegal anyway but
gets to build a whole site housed around this stolen data and further more
benefits from it?! How is this guy any different to one of the people that
built an Ashley Madison email search site (ones that didn't display PII) or
another search site that allows for the searching through these data dumps?
Why is he above the law, the FBI etc have turned a blind eye to what this guy
is doing yet will happily see these other site owners prosecuted to the full
extent of the law! It's actually quite disgusting that there is one rule for
Troy Hunt but another rule for others. The fact of the matter is that in plain
simple terms what this guy is doing is actually against the law and illegal.
Someone needs to bring a class action lawsuit against this guy and get HIBP
shut down! Oh yeah and he claims that haveibeenpwned.org is nothing to do with
him as well, I say that is a load of BS!

~~~
sculper
How do you think he is violating the law?

He's storing email addresses that have _already_ been leaked, and providing a
free service for those who are concerned they may have been compromised.

------
jafingi
One of my e-mails was in three breaches: Adobe, Money Bookers, vBulletin, and
my Gmail was in only one breach: Boxee forums.

I have been receiving a lot of spam lately on my Gmail, and it's not listed
anywhere online. So it must have come through a breach somewhere.

------
brazzledazzle
Does anyone store fake unique email addresses to see if they've been
compromised? I assume you'd need to seed it with new ones occasionally or on
some kind of schedule.

~~~
chiph
Lots of people will create a single address for each company that wants one,
so that if it gets compromised it can be shut off without affecting any other
incoming emails. And identify the offending firm. I have created an email
account for a store that wanted an address while I was out in their parking
lot (the ones that had plausible future value to me).

So far, none have been sold to marketing lists. Which says either that I'm a
pretty good judge of firms, or that no one is interested in me. ;)

~~~
colejohnson66
If you use gmail, you can insert random periods or append `+word' to your
email address and it'll be delivered to the same address as one without
periods and the `+word' portion. So, an email to
`my.email.address+hackernews@gmail.com' will be delivered to
`myemailaddress@gmail.com'.
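
The mapping described here can be sketched as a small function. It relies only on the behavior stated in this comment (dots and a `+word` suffix are ignored for gmail.com); the function name is hypothetical.

```python
def canonical_gmail(address: str) -> str:
    """Collapse dots and any +suffix in the local part of a gmail.com address."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or domain.lower() != "gmail.com":
        return address  # leave other domains untouched
    base = local.partition("+")[0].replace(".", "")
    return f"{base.lower()}@gmail.com"


if __name__ == "__main__":
    print(canonical_gmail("my.email.address+hackernews@gmail.com"))
    # myemailaddress@gmail.com
```

The same few lines are all a list buyer needs to recover the base address, which is why the `+word` part works for filtering but not for hiding the real inbox.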

~~~
james-skemp
Sadly a number of sites get hung up on the '+', despite being valid. One thing
to keep in mind if relying upon this.

~~~
jessaustin
If they don't know how to email, they probably don't know how to security.

~~~
chiph
To be fair, checking that an email address is formatted correctly and valid is
a challenge. You can't just use a regex.

~~~
jessaustin
Modern browsers use this one [0]:

    
    
      /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/
    

That certainly accepts plus signs. I'm sure there are theoretically valid
addresses it wouldn't accept (IDNs spring to mind), but HTML5 is willing to be
"practically", rather than theoretically, correct. Anyway, there are email
validation modules in every popular language. A good technique, which seems
very common, is just to make the user type their email twice. Invalidity is
not the most important reason why an address could cause problems.

[0] [https://www.w3.org/TR/html-markup/input.email.html](https://www.w3.org/TR/html-markup/input.email.html)
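
As a quick check, the pattern quoted above (written with a plain ASCII apostrophe) can be tried in Python's `re` module instead of a browser. The function name is hypothetical, and this tests only the syntax, not whether the mailbox exists.

```python
import re

# Same character classes as the HTML5 pattern, as a Python raw string.
EMAIL_RE = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"
)


def looks_like_email(address: str) -> bool:
    """Return True if the address matches the HTML5-style pattern."""
    return EMAIL_RE.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in ("me+hackernews@gmail.com", "no-at-sign", "a b@example.com"):
        print(candidate, looks_like_email(candidate))
```

Plus signs pass, so a site that rejects `me+tag@gmail.com` is being stricter than the browser's own check.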

------
gk1
Great resource. Even more useful for checking family members' email addresses.
I use strong passwords, change them frequently, and use 2FA wherever
possible... My family (probably) doesn't, nor do they know to check a site
like this.

------
_RPM
Why should I trust this site with my email?

~~~
scholia
[http://www.troyhunt.com/2015/11/im-sorry-but-your-email-addr...](http://www.troyhunt.com/2015/11/im-sorry-but-your-email-address-is-not.html)

------
blazespin
Didn't they use to ask you to verify first before? Great way to snoop on who
else has logged on where.

------
MaulingMonkey
Finally a service that catches one of the 5+ pwnages I'm aware of!

------
zinxq
Wuh oh. All the Mailinator accounts I tried were pwned.

------
cstrat
test@test.com Pwned on 48 breached sites and found 650 pastes.

Ouch.

~~~
tajen
On the other hand, if you own that email you have access to most IT systems of
the world. Let me look up your account with my admin@example.com account ;)

------
hellbanner
This is uh, a great way to get email addresses.

~~~
STRiDEX
You can usually access all the same dumps they use at your favorite shady
website. The smaller ones are on pastebin too.

------
chrisseldo
RIP foo@bar.com

~~~
emehrkay
Surprised by how many people use my bs email mark@mark.com (my name is Mark)

~~~
tehwebguy
Carelessly did the same with kevin@kevin.com (my name is Kevin) until one day
kevin@kevin.com sent me an understandably angry reply!

~~~
emehrkay
Ha. How did he track you down? Mark(@mark.com) is probably furious with me.

~~~
tehwebguy
I'm sure the reply-to address forwarded to me while I was testing.

------
original_idea
Thanks Adobe.

