Each time a breach like this happens I want to download the file and check whether:
1. My emails are in the dataset, and
2. Any of my passwords are in that dataset.
I really just want the collection of passwords so that I can use it as a check against any of my current passwords.
[EDIT: I know about haveibeenpwned.com. I'm not asking for a service that I send an HTTP request to in order to determine whether a single username exists in the db. I want the db itself so I can chuck it into sqlite and check multiple records at once, quickly, for both usernames alone and passwords alone.
I also believe it's a bad idea to ask a third party to perform the check. Even if you trust that third party now, there is no way to ensure that trust in the future, i.e. it gets bought, breached or pwned itself, and the best-case scenario is that the record of your username lookup is available as "confirmed". Without visiting that site, no one would ever know if that record was a throwaway or not.]
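For what it's worth, the "chuck it into sqlite" approach could look roughly like the sketch below. It assumes you already have the leaked emails and passwords as plain Python iterables; the table layout and the function names (`build_db`, `find_matches`) are made up for illustration and are not taken from any real dump format.

```python
import sqlite3

def build_db(conn, emails, passwords):
    """Load leaked emails and passwords into two indexed tables."""
    conn.execute("CREATE TABLE IF NOT EXISTS emails (value TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS passwords (value TEXT PRIMARY KEY)")
    # Emails are lowercased so lookups are case-insensitive; passwords are kept verbatim.
    conn.executemany("INSERT OR IGNORE INTO emails VALUES (?)",
                     ((e.lower(),) for e in emails))
    conn.executemany("INSERT OR IGNORE INTO passwords VALUES (?)",
                     ((p,) for p in passwords))
    conn.commit()

def find_matches(conn, table, candidates):
    """Return the subset of candidates present in the given table, in one query."""
    if table not in ("emails", "passwords"):
        raise ValueError("unknown table: " + table)
    candidates = list(candidates)
    if not candidates:
        return set()
    # One placeholder per candidate; fine for a handful of personal records.
    placeholders = ",".join("?" * len(candidates))
    query = f"SELECT value FROM {table} WHERE value IN ({placeholders})"
    return {row[0] for row in conn.execute(query, candidates)}

# Example with an in-memory database and made-up data.
conn = sqlite3.connect(":memory:")
build_db(conn, ["A@example.com"], ["hunter2", "123456"])
leaked_passwords = find_matches(conn, "passwords", ["hunter2", "correct horse"])
leaked_emails = find_matches(conn, "emails", ["a@example.com", "b@example.com"])
```

Everything stays on your own machine, and usernames and passwords are checked independently, which is the point of the request above.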
If you use any of the better password managers this feature exists and runs automatically. If you don't want to go that route, then you can make use of https://haveibeenpwned.com/
I have a Gmail account which Google One shows has been leaked on the dark web, along with a username, but haveibeenpwned shows the email was not found in any data breach. How is that possible?
I agree - download all the passwords and don't single out what you're checking for someone else to see.
I don't know why we can't use this kind of thing for better privacy everywhere.
A similar example (outside the realm of passwords) would be checking for a software update. Instead of sending "I have software xyz version 1.2.3", just download a current list of software and check it locally against your software. It would probably be faster anyway to download a static dataset than to hit a remote database.
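A rough sketch of that local check, assuming the downloaded list is a simple mapping of package name to latest version and that versions are plain dotted integers (both are assumptions for illustration; `outdated` and `parse_version` are hypothetical helpers):

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def outdated(installed, manifest):
    """Compare installed versions against a locally downloaded manifest.

    Nothing about the local machine is sent anywhere; the manifest is the
    same static file for every client.
    """
    result = {}
    for name, have in installed.items():
        latest = manifest.get(name)
        if latest is not None and parse_version(latest) > parse_version(have):
            result[name] = (have, latest)
    return result

# Made-up data: the manifest lists more software than is installed locally.
installed = {"xyz": "1.2.3", "abc": "2.0.0"}
manifest = {"xyz": "1.10.0", "abc": "2.0.0", "other": "9.9"}
updates = outdated(installed, manifest)
```

The server only ever sees a request for the full list, so it learns nothing about which entries you care about.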
Services already exist that do this. Some password managers will check, but the popular service often talked about on here is https://haveibeenpwned.com/
The data breach announcement is a bit vague on the meaning of “login pairs”. The best practice for breach databases like https://haveibeenpwned.com/ is to keep login material (username, email, password, etc.) in a strongly hashed format. This still enables searching and comparing, but not extracting for later use. Why the database here appears to be plain text is totally unclear. Or maybe the passwords are hashed here too (which still exposes the email addresses)?
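To illustrate the "searchable but not extractable" idea, here is a minimal sketch that keeps only digests of leaked values. SHA-256 and the `HashedBreachIndex` class are my own assumptions for the example, not a description of how any particular breach database is actually built.

```python
import hashlib

def fingerprint(value: str) -> str:
    """Hex SHA-256 digest of a value (hash choice is an assumption for this sketch)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

class HashedBreachIndex:
    """Stores only digests, so membership can be tested without keeping plaintext."""

    def __init__(self, leaked_values):
        self._digests = {fingerprint(v) for v in leaked_values}

    def contains(self, candidate: str) -> bool:
        # Hash the candidate the same way and compare digests.
        return fingerprint(candidate) in self._digests

index = HashedBreachIndex(["hunter2", "123456"])
is_leaked = index.contains("hunter2")
is_safe = not index.contains("correct horse battery staple")
```

A plain fast hash like this still allows guessing of low-entropy values such as common passwords, so it only shows the lookup mechanics.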
The company I work for (stytch.com, we provide an authentication API) tracks breached passwords and, depending upon config, will invalidate passwords that have been leaked. Will be interesting to watch our logs over the coming weeks.
This reminds me of [0] where they maintain composite lists of frequently used passwords. Also in the repo is probably my favorite pull request ever [1].
I suppose it would require a good few domains and/or public mailboxes, but imagine creating n fake users for each real user. If anyone logs in to one of the fake accounts, all users are forced to change their password.
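A toy version of that decoy-account idea, just to make the mechanics concrete. The `CanaryRegistry` class, the decoy naming scheme and the single global reset flag are all hypothetical choices for this sketch; a real system would need actual mailboxes and credentials behind each decoy.

```python
import secrets

class CanaryRegistry:
    """Tracks n decoy accounts per real user and trips a global reset on decoy login."""

    def __init__(self, decoys_per_user=3):
        self.decoys_per_user = decoys_per_user
        self.real_users = set()
        self.decoys = set()
        self.force_reset = False

    def register(self, username):
        """Register a real user and create its decoy accounts."""
        self.real_users.add(username)
        added = 0
        while added < self.decoys_per_user:
            # Hypothetical naming scheme: real name plus a random suffix.
            name = f"{username}-{secrets.token_hex(4)}"
            if name not in self.decoys and name not in self.real_users:
                self.decoys.add(name)
                added += 1

    def record_login(self, username):
        """A login on any decoy means the credential store leaked: force resets for all."""
        if username in self.decoys:
            self.force_reset = True
        return self.force_reset

registry = CanaryRegistry(decoys_per_user=2)
registry.register("alice")
after_real_login = registry.record_login("alice")
after_decoy_login = registry.record_login(next(iter(registry.decoys)))
```

Since nobody legitimately knows a decoy's credentials, a single successful decoy login is a strong signal that the whole user table is out.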