How messed up is it that random 3rd parties collect and assemble all this information to begin with, leaks aside? Sounds to me like all this data fell into unscrupulous hands way before any hackers may have found it on the public internet.
Edit: Added main company link providing this API 
This specific API seems pretty innocuous. They're not doing black magic, it's just aggregating data that people willingly put out there about themselves.
I'm sure I'm out there in many datasets with stolen (or just "shared") information.
That has always been illegal in my country though (you can't even keep a record of people with pen and paper), and now with GDPR it would of course be illegal, with actual consequences, if it contains data about EU citizens.
And if they try, any American court will be very leery of setting the precedent that Brussels can tell Americans what to do in any sense, particularly with respect to data stored on their servers.
I imagine the company using them will want to recover the financial losses it incurs after getting reamed by whichever European Data Protection Authority decides to go after it - especially if the culprits did promote themselves as being GDPR compliant.
The point of the EU's strong data protection rules is to have accountability - and it will fall on someone along the chain that caused the mess. Companies can't be allowed to completely disregard how they collect and store data and then go "Oops, haha sorry about that!" when the shit inevitably hits the fan, and just continue their business as usual.
which, as an EU citizen, I can confirm it does
What is this URL: This is not a public API route, it is a proxy to a service called Clearbit to enrich professional emails with public company and person data from multiple public sources such as AngelList or LinkedIn (cf https://clearbit.com/enrichment and https://clearbit.com/our-data).
By using this route, you are using Clearbit with our credentials. Most importantly, we don’t store any data on our end when accessing this route. We don't own this data, it is stored on Clearbit servers.
Why are we using them: We entered into business with Clearbit to help our users get more insights on their webinar sign-ups from public data sources.
Are they GDPR ready: Clearbit is GDPR compliant (cf https://clearbit.com/gdpr). You can claim your data here: https://claim.clearbit.com/claim.
We took all the steps necessary with Clearbit to ensure our process was GDPR compliant. However, this information makes us second-guess it. Therefore, we are re-evaluating the compliance of this specific process and, in the meantime, we have deactivated this route.
At what point does this site show that I have given consent for you to process my data? Who are the third parties that have given you my data?
enrich.email also does the same, though.
There’s also https://fullcontact.com.
In the end they all just search and scrape social media profiles and gravatar.
Or often, as in this case, collected it and stored it carelessly so someone else could get in and use or share it at will.
20 years ago my first roommate worked for a company that did direct mail marketing, and they regularly engaged a "cleaning" firm that would scrub and update their mailing lists.
This company had huge databases of information on people (name/work/address/DOB/income/etc.); the marketers would send in their lists and the cleaners would clean them up.
Now, the cleaners would not give them any new data; they would only update records the marketers already owned.
Back then in Canada there were apparently only a small handful of well-known cleaners like this, who traded on their reputation for accuracy and dataset completeness.
- I see 67,569 mongodb results on shodan
- I searched for shodan API keys on a shady website known as "Github", found one that works, queried for all the mongodb databases
- Tried connecting on the first one, it works...
- What next?
edit: I guess it's a choose your own adventure game:
1) Delete all the data
2) Look up domains from IPs, find the owners' emails and email them
3) Make nice stats about all the data
4) Sell the data on the black market
5) Find users' emails, and tell them they are using a service that just doesn't care
6) Just make every ~third integer off by 1
7) Mine bitcoins on them
... So many ideas, so little time
Do any of us really trust our peers?
Personally I think jury duty should just be mandatory (with the only exceptions being extreme circumstances) because it being so easy to get out of creates a major bias in the set of people actually doing jury duty.
Comparable activity: let's ride around the neighborhood and see who hides a key under a rock. Then let ourselves in and look around.
Imagine if your Doctor left your personal medical records lying on the ground in front of their office...
If we're going to try to force physical analogies to represent the digital, I would say it's more like every store, house, and public building looks exactly the same and you don't know if the door is locked or what's inside it until you try to open it.
You have a person cruising for keys to a house and a person cruising for keys to databases. They own neither.
Accessing a URL is like walking around the neighborhood looking from the street. Trying different URLs is like peering into windows. Trying the door is like testing default credentials.
To make something a crime, you need to prove some kind of loss as well as intent. If there's no intent, it's likely a civil case to get restitution (e.g. you won't be arrested for breaking something in a store accidentally). Likewise, you could get arrested for intent without loss (e.g. you tried to break something in a store, but failed).
IMO, it should only be illegal if there's intent to harm. Likewise, if you accidentally harm something without intent (e.g. accidentally DOS a server), you should be expected to make restitution. Both need to be proven in a court of law, criminal if there's intent, civil if there isn't, and if there's no harm or intent, there's no crime.
Here's an analogy I think is closer: I ask you for all your personal documents and bank account numbers. You promptly give them to me with no questions asked. Was I accessing private property improperly? If you refused and demanded I identify myself as someone authorized to have access, and then I forcibly took them from you anyway, that would be criminal.
Web addresses aren't property, I think that's the wrong analogy. The web is made of requests and responses, and you don't know what response you'll get until you make the request.
It’s clear from this hack that the owners of the hacked site didn’t see emails as something worth securing (stored in plain text on a wide-open MongoDB server).
If you want to keep your email address private (you should), generate a new, random email address whenever you give yours out (the same way you use a password manager). If you have your own domain, you can use a catch-all/wildcard address, e.g. *@mydomain.com. If you use Gmail, you can use their plus support, e.g. Johnfirstname.lastname@example.org. If you use neither, or want more security, I’ve recently launched https://idbloc.co, which aims to help deal with this.
Non-secrets, like addresses and SSNs, should be rendered harmless rather than squandering resources trying to keep them secret.
Though if you run a full domain you can use emails as one-use affairs. Most don't though and, really, what's the point? It only saves you from journalists, not from a motivated attacker.
or, if you use a service that lets you generate aliases, like gmail's "+", or a service like mailinator.
The problem with email addresses as an attack vector is that they are sometimes used as usernames, and therefore contain more information than is strictly required (for the purpose of a username). Leaking the "real" email address not only leads to spam, but allows a more dedicated attacker to use that email address as a starting point on a different site, or to hack the email account altogether.
And with sites increasingly blocking disposable email addresses like mailinator, or disallowing email aliases, the problem can only get worse.
I can confirm the source of every email breach that contains one of my addresses.
p0wn3r $cat testerm
p0wn3r $cat testerm | sed 's/+.*@/@/'
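For anyone who doesn't read sed: the one-liner above just deletes everything from the "+" up to the "@". A rough Python equivalent, as a sketch only (the function name and the sample addresses are made up for illustration):

```python
import re


def strip_plus_tag(address: str) -> str:
    """Drop everything from the first '+' up to the '@', like sed 's/+.*@/@/'."""
    return re.sub(r"\+.*@", "@", address, count=1)


# Hypothetical sample addresses; an address without a '+' is left unchanged.
for addr in ["jane+marriott@example.com", "jane@example.com"]:
    print(strip_plus_tag(addr))
```

Which is the point: anyone holding a leaked list can recover the base address from a "+" alias with one line.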
Does Gmail allow sending from the + addresses? There's quite an issue if somebody contacts you on that address but you reply without the alias.
Can you explain this claim?
If you generate your single-use email addresses wisely, then you should know that the one you gave to - say - Marriott should only ever receive emails from Marriott.
If - say - Marriott gets hacked and that particular one of your many different email addresses leaks, then:
a) you'll find out that address is burned just as soon as anyone other than Marriott uses it (you immediately generate a new one, give it to Marriott, and stop accepting any mail at all on the old one)
b) if anyone other than Marriott uses it, you know immediately that that message can't be legit.
ADD: and of course, you aren't really going to stick +BankCo after your real name to generate an email to use with BankCo. You're going to give them something generated like you'd generate a password - "ol48eILm@example.com" or similar - so if anyone finds out that particular email address for you it doesn't tell them with whom you use it.
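Generating such an address is about as hard as generating a password. A minimal sketch, assuming you control a domain with catch-all or alias support; the function name, alphabet, and default length are my own choices, not anything prescribed above:

```python
import secrets
import string

# Lowercase letters and digits only, so the local part survives picky validators.
ALPHABET = string.ascii_lowercase + string.digits


def random_alias(domain: str, length: int = 12) -> str:
    """Return a random, service-agnostic address such as 'k3v9q0zt1m2x@example.com'."""
    local = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{local}@{domain}"


print(random_alias("example.com"))
```

You would still need to record which alias went to which company, e.g. in the same password manager entry.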
- Phone up the tax office and find out about all your finances, and make adjustments to your tax percentages and affairs.
- Phone your medical provider, and very possibly socially engineer them into revealing medical information. I could book an appointment with your doctor for example and impersonate you on the phone appointment.
- Call all your utility providers and cancel your contracts without you knowing about it.
- I highly suspect I could call your phone provider and take over your phone number, and in some cases use this to take over your email.
Whilst an SSN should be treated as a non-secret, the reality is that it is treated as a secret and is often the only line of defense when dealing with companies.
The problem is that using a government-issued ID is easy since everyone has one. That's the wrong use for something like an SSN, but you're right, it's what's done in practice.
We should be moving away from that. Government issued ID should merely be the equivalent of a user name, with any real use requiring additional factors of authentication (password, security key, etc). Unfortunately, most of these other factors are also easily accessible (mother's maiden name, date of birth, etc).
Ideally, we'd have something like:
- number issued at birth (like present system), frozen until individual activates it
- individual sets password when unfreezing
- all accesses must be explicitly allowed by the user
- user can grant/revoke/audit access, and access is denied by default
- no private data is stored in the account
- companies that use the account for authentication are required to delete user data when the user requests it, and these systems are audited to ensure this happens
However, that's not the case. We should be fighting to change that. Having something like an email address or identification number become public knowledge shouldn't matter one bit...
I like the idea, but I'd like it better if my existing email provider that I already trust gave me unlimited aliases (long random ones are fine, and I'd even be willing to pay, say $0.10 each). "+" suffixes are a start but they still leak the original email address.
Why have 2, not 1? Well, one of them has to be globally unique on the site, and the other has to be hard to guess. Two different requirements.
Here's a much more in-depth article from Troy Hunt, the same security guy running HIBP quoted in this story: https://www.troyhunt.com/im-sorry-but-your-email-address-is-...
Right now if you know someone's personal email, you can be pretty sure they use it everywhere. It's not a good practice, but it's hard to understand why until you've been the target of wtf-level social engineering attacks where someone got into one of your many accounts starting with nothing but an email address they don't even control.
Of course, there are some issues with the OP's email forwarding service:
1. You have to be someone concerned enough about security to want to generate email addresses, yet you have to be cool piping them through a third party. Ouch.
2. Niche product. I didn't realize how bad social engineering could be until I started hosting bitcoin services. I've had people break in to my AWS accounts twice (with bogus information) after Amazon told me they made a note on my account and that it should now be impossible. As we speak, I cannot get into nor cancel an AWS account that someone SE'd that's charging my CC every month even though it has my CC on file. I have to issue a charge-back. It's hilarious. Amazon thinks an email is more authenticating than a CC that's been on file for years before the attacker changed the email. They literally don't even have a customer support process for this situation. 99% of people even on HN have no clue how pwned they'd be if they're ever a target.
3. It's more of a feature on an existing product than a standalone service. I'd expect players like 1Password to implement it themselves.
You have to jump through a few hoops for gmail to accept your emails. The documentation is there.
If you’re already running a VPS for other reasons, then this is free. If you’re not, then VPSs are extremely cheap anyway.
If all Twitter accounts had a physical address attached, scams, crimes, etc. would increase.
Emails are an ID on most sites, that’s it. Just because you provide the email as an ID for the account you want to authenticate against doesn’t mean it’s a factor of the authentication.
The auto-generated email addresses seen in the screenshot look very long. There are sites that restrict email addresses to 30 characters, 40 characters, 50 characters, etc. A quick count on one of those on your site showed the length to be more than 50 characters. So this may not work unless you provide a way to specify the length of each generated email address.
Now for the suggestions: the pricing on the Plus tier seems quite high. In fact, one could pay less than $4 a month (the Plus pricing) and get a Runbox mail account with 100 aliases and a Fastmail account with 600 aliases. So the only value in using your service at that tier seems to be the auto-generation part. The pricing for the unlimited tier is also much higher than the average prices of any paid privacy respecting email service. With the services mentioned above, one wouldn’t get tied into an @users.idbloc.com address either if one were to use their own domain.
>Triggering important security processes (eg reset password, social engineering attacks) are trivial once you know someone’s email address.
Sure, but that alone isn't an issue unless you have weak passwords for your email accounts.
>If you want to keep your email address private (you should), generate a new, random email address whenever you give yours out (the same way you use a password manager)
If you're already using a password manager how's that going to provide any additional security? No one is going to crack your 128 bit entropy passwords even if they have your email.
I can also whitelist some of these private addresses, so that my spam filter can't hide e-mails from services I depend on, no matter who sends them (it can be automated no-reply address, or some support person's address, etc.), so it's not really whitelistable based on the sender. I can't whitelist my public addresses.
It has plenty of benefits with regards to security.
It's basically a secret passphrase you give to a company that it can use to reliably contact you, and that you can use to verify that you're talking with someone you've given this secret to. It's not perfect, but much better than public addresses.
- I have a simple custom web UI for managing mailboxes and aliases in the PostgreSQL database.
- I have a postfix/dovecot setup with virtual mailboxes. I avoid DB/postfix interaction by simply dumping the DB into static postfix config files (this is much lighter on resources than having postfix connect to the database, and much less complicated). I don't use a catch-all address.
- I use POP3 to get mails from my VPS, to avoid storing archive there, and to avoid having to deal with IMAP in my mail client and filtering scripts
- I use neomutt as a mail client. It has some nice options, like reverse_alias (which I use to display names of my private aliases)
- Neomutt can read configuration from the stdout of some command. I use this to generate aliases and other config options, so that when I reply to something delivered to a private alias, the reply is sent from that alias too. Using this I can have a single Inbox.
- I filter e-mail using a PHP script (it parses the e-mail, checks the alias database, and decides whether to pass the message to bogofilter, Junk, or Inbox)
Aside from adding aliases to the database, there's no other manual work.
You can probably avoid the database by having some scheme, where address is servicename.[substr(sha256(servicename+secret), 10)]@domain.com and filter/whitelist/highlight on the hash validity.
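A sketch of that database-free scheme, under some assumptions of mine: I read the substr as "the first 10 hex characters", the secret and domain are placeholders, and I kept the plain sha256(servicename+secret) from the comment (HMAC-SHA256 would be the more conventional keyed construction).

```python
import hashlib
import hmac

SECRET = "change-me"   # placeholder; use a long random value
DOMAIN = "domain.com"  # placeholder domain from the comment above
TAG_LEN = 10           # assumption: keep the first 10 hex characters


def tag_for(service: str, secret: str = SECRET) -> str:
    """First TAG_LEN hex chars of sha256(servicename + secret)."""
    digest = hashlib.sha256((service.lower() + secret).encode("utf-8")).hexdigest()
    return digest[:TAG_LEN]


def make_alias(service: str, domain: str = DOMAIN) -> str:
    """Build servicename.<tag>@domain for a given service name."""
    service = service.lower()
    return f"{service}.{tag_for(service)}@{domain}"


def is_valid_alias(address: str) -> bool:
    """True if the tag in the local part matches the service name before it."""
    local, at, _domain = address.lower().rpartition("@")
    service, dot, tag = local.rpartition(".")
    if not at or not dot or not service:
        return False
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(tag.encode("utf-8"), tag_for(service).encode("utf-8"))


alias = make_alias("marriott")
print(alias, is_valid_alias(alias))
```

The filter script then only needs the secret, not a table of aliases: anything whose tag does not verify can go straight to Junk.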
Of course you may also do this for privacy reasons as well, when your reddit account is leaked do you want that linked to your LinkedIn account?
all the ones in my password manager are
>Are you sure your accounts can’t be broken into with social engineering?
And who's to say that you can't also use social engineering to recover the real email? eg. "I forgot which email I used, but it's on my domain, example.com". This is even more plausible if you use any of the "tricks" that the GP mentioned, like putting random characters, since it further lends credence to your "forgot my email" story.
>Is your password manager infallible?
Considering you have to store all those different emails somewhere, I don't see how this mitigates the threat.
>It’s defence in depth.
Sure, it's defense in depth, but the gains in security are so marginal.
>Of course you may also do this for privacy reasons as well
Sure, I'd buy that aspect, but catchall domains or dot/plus tricks aren't going to cut it because they're relatively easy to deanonymize. Not to mention all the data you leak on those platforms just by using them (location references, friend lists, interests, etc.)
Because an email address is essentially a username.
I've done this before. You will get a shitload of spam. I recommend suffixing the user of all addresses with something you want to go to your inbox.
E.g. I use *.email@example.com (r for "real"). Anything that doesn't end in .firstname.lastname@example.org never hits my main inbox.
Having the "validating substring" in front of the @ will actually make it easier when you have to use the email address in vocal communication: many humans are unaccustomed to encountering more than one dot after the at.
I use a subdomain. *@abc.example.com goes to my CatchAll folder. email@example.com go to their respective users. There are no equivalent names between @abc.example.com and @example.com so if anyone gets cute and tries stripping out subdomains, the messages are rejected. Also, subdomains are common enough that no one thinks anything of it.
As a nice bonus, the subdomain can also be directed elsewhere. I've aimed it at various "we automatically file your e-mail" type scripts and services before, just to try them out.
0 - Spammers absolutely try to get cute and drop the subdomain. For example, in the Dropbox leak, firstname.lastname@example.org and email@example.com of mine were leaked. I see tons of spam attempts to firstname.lastname@example.org and email@example.com daily.
Yours is a good solution.
Isn't it elementary for a bad agent to scrub those? Your actual e-mail address (firstname.lastname@example.org) is too visible.
Even if you have my Johnemail@example.com address it still won't help you with logging in on other sites, as the address used on another site is still unknown.
so you can do `firstname.lastname@example.org`
then if you get any email to `email@example.com` you can filter it
Of course, they can start stripping dots in gmail addresses too, but then they'd have to be targeting gmail specifically or they'd break for most services where dots do matter.
Sadly, "+" doesn't work. Too many web email validators fail when given a "+" character. Gmail has made some of that a little better, but I still bump into it too much.
I generally use "_" or "." for wildcard start as any email validator that hangs on those will get smashed immediately.
Rather than trying to fight windmills, it's better to use unique emails for sensitive/high-risk purposes. That way they cannot be used as a primary key.
Company policy here is to never register any company account under a generic <firstname.lastname@example.org> email but use a partly randomized email for each provider <email@example.com>. That way no one can use one leaked email to even identify other services we use (think reset password/login dialogs leaking information about an email being registered or not).
Nice! I just started working on the same thing, and it would be great to be able to drop the project… except trusting a new service with your e-mail is hard. (It’s even hard to trust established names like Google or ProtonMail.) If only every service supported encrypting e-mail so providers couldn’t read it. :(
So, you need to share your email address to be able to communicate properly, but you also need to hoard it because it is part of how you identify yourself. Aliases are one workaround but are not universally accepted as identifiers.
But maybe firstname.lastname@example.org
I have two Gmail addresses. A public and a private. For anything that is public facing, that exposes the email or that I don't trust as much, I use the public email. For everything important, I use the private one (and sometimes with the + method).
My private address is set to forward to my public. I only respond and send email from the public email.
- ask to connect, perhaps with a short message as to why (maybe limit to 100 text characters)
- once approved, communication is encrypted (PGP keys are exchanged behind the scenes)
I think that would drastically limit the amount of spam, phishing, and intercepted messages, which is good for everyone. Unfortunately, I think it's also quite unlikely to hit the mainstream.
So, I just looked to see, and there is already a service doing this: https://www.abine.com/. The email forwarding is the easy part. They also do "burner" credit card numbers. I don't know that domain to know how you do that, but that seems good. They also do phone numbers. Neat idea.
Email is not a factor in authentication, it’s a user identifier. There is a huge difference between the two. Requiring users to also provide their first name is also not a factor.
Email is just a worse version of the same factor in a login. They are both just “something you know” only email is even worse because it’s something everyone who knows you knows.
Just to drive the point home, if a website asks you to set three passwords that it asks you for on each login, that’s not 3-factor or even 2-factor authentication. They are all just part of the “something you know” factor.
The first time I set up a top-level catch-all, I had the first dictionary-style attack within a day or so. I've been using a catch-all mail-only subdomain for well over a decade, and have had no such attack.
That doesn't make any sense. Just create multiple email addresses, one for talking with people and a separate one for signing up for services.
The one problem is there are companies like Facebook where even if you sign up with a secret email address, they will send password reset tokens to the other public-facing email addresses you have on your profile.
Personally I have a bunch of spare email addresses from google, yahoo, and others that I created many years ago before they demanded a phone number when signing up. I just use those for website logins.
email@example.com -> (some unique symbol to denote hash, say +)(hash)@gmail.com
It wouldn't be complicated for them to implement, and give everyone a few such aliases.
Forget about email addresses, the network should scale globally from the user's public key that can be stored on any number of nodes as the user id with the data.
The fact is that email is often the keys to the kingdom, if compromised, or the baited hook to be used in phishing, with globally significant consequences to date.
Why am I not surprised...
MongoDB shares some of the blame here. Software needs to be secure by default.
I believe more of the blame should be put onto markets that provide images with insecure settings as MongoDB doesn't bind to the public interface by default and hasn't done so for years.
PS: And yeah I remember that famous password bypass bug from 2012, but that's all.
I'm not sure if MySQL being open with no password was the fault of MySQL or of the various distribution packagers, who have a lot of say in how it's ultimately configured, but it wasn't a good look.
That's a pretty dumb way to validate an email address. You can just end the connection after a successful RCPT TO and no one needs to be the wiser. The method causes a false validation in the case of an intermediate mail relay that accepts everything but that is such a bad idea that no one does that.
#> swaks --quit-after RCPT --TO firstname.lastname@example.org
=== Trying gmail-smtp-in.l.google.com:25...
=== Connected to gmail-smtp-in.l.google.com.
<- 220 mx.google.com ESMTP a199si3828494itd.133 - gsmtp
-> EHLO example.com
<- 250-mx.google.com at your service, [xx.xx.xx.xx]
<- 250-SIZE 157286400
<- 250 SMTPUTF8
-> MAIL FROM:<email@example.com>
<- 250 2.1.0 OK a199si3828494itd.133 - gsmtp
-> RCPT TO:<firstname.lastname@example.org>
<** 550-5.1.1 The email account that you tried to reach does not exist. Please try
<** 550-5.1.1 double-checking the recipient's email address for typos or
<** 550-5.1.1 unnecessary spaces. Learn more at
<** 550 5.1.1 https://support.google.com/mail/?p=NoSuchUser a199si3828494itd.133 - gsmtp
<- 221 2.0.0 closing connection a199si3828494itd.133 - gsmtp
=== Connection closed with remote host.
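The same quit-after-RCPT probe can be sketched with Python's standard smtplib. This is only an illustration: the MX host has to be looked up separately (the stdlib has no MX resolver), the HELO name and sender are placeholders, and many servers accept every recipient or greylist, so a 250 proves little.

```python
import smtplib


def classify(code: int) -> str:
    """Map an SMTP reply code for RCPT TO onto a coarse verdict."""
    if 200 <= code < 300:
        return "accepted"
    if 500 <= code < 600:
        return "rejected"
    return "unknown"  # 4xx and anything else: temporary failure, greylisting, ...


def probe(mx_host: str, sender: str, recipient: str,
          helo_name: str = "example.com", timeout: float = 10.0) -> str:
    """EHLO, MAIL FROM, RCPT TO, then QUIT without ever sending DATA."""
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        smtp.ehlo(helo_name)
        smtp.mail(sender)
        code, _message = smtp.rcpt(recipient)
    # Leaving the 'with' block sends QUIT, so no message is ever transmitted.
    return classify(code)
```

In the transcript above, the 550 reply would come back as "rejected".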
Can additional OAuth scopes be requested and do the third parties request the contacts permission, then harvest the organizations' contacts, and is there no setting in the admin menu to prevent either the OAuth grant or the contact access?
EDIT: Replies have corrected my misunderstanding, thanks! Troy was saying that his own personal email was also in the dataset.
Hunt says some of his own information is included in the Verifications.io exposure. "The main takeaway for me is that this is just another case where someone has my data, and hundreds of millions of other people’s data, and I’ve absolutely no idea how they got it," Hunt says. "I’d never heard of the company until now and I certainly can’t ever recall consenting to their use of my data. Of course, it’s entirely possible that buried in some other service’s terms and conditions it says they’re allowed to pass my data around in this fashion, but that’s not really consistent with my expectations of how my data should be used."
If someone gains unauthorized access to a system you own, you assume it's compromised and take it down, even if logs don't show anything. If you leave a system unsecured on the internet, assume it has been compromised and take appropriate measures.
In this case, it is more likely than not that this guy was not the first to discover the database. In the absence of proof, assume it has been compromised before.
Is it worth it?
I don't know. It's part of the patchwork of steps I take, better than many people but by no means failsafe.
It took me all of 10 minutes to find a convenient JSON endpoint with incrementing IDs that didn't disallow cross-account pulls. It wasn't a public MongoDB endpoint like the above, but we did get a pretty sweet discount rate for reporting it to them and, you know, not abusing some other customer lists with 300M+ emails.
Now my data is in another database. Sigh.
An interesting situation nonetheless.
The first is an awesome resource, and they publish password leaks as well, and they have a secure way to check if your password has been in a leak (my password manager, Bitwarden, integrates with them).
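Assuming the "secure way" meant here is HIBP's Pwned Passwords range lookup (k-anonymity), this is roughly how it works: only the first five hex characters of the password's SHA-1 leave your machine, and the match is done locally. A sketch using only the standard library; the helper names are mine.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> tuple[str, str]:
    """Return (first 5 hex chars, remaining 35) of the uppercase SHA-1 digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(body: str, suffix: str) -> int:
    """Find our suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the range endpoint with the prefix only, then match locally."""
    prefix, suffix = split_sha1(password)
    with urllib.request.urlopen(RANGE_URL + prefix, timeout=10) as resp:
        return count_in_range(resp.read().decode("utf-8"), suffix)
```

The service only ever sees the 5-character prefix, which is shared by many unrelated hashes, so it cannot tell which password you were checking.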