800M Email Addresses Leaked Online by Email Verification Service (securitydiscovery.com)
501 points by tlrobinson on Mar 7, 2019 | 211 comments

> Some of data was much more detailed than just the email address and included personally identifiable information (PII) ... ‘Emailrecords’ was structured to include zip / phone / address / gender / email / user IP / DOB

How messed up is it that random 3rd parties collect and assemble all this information to begin with, leaks aside? Sounds to me like all this data fell into unscrupulous hands way before any hackers may have found it on the public internet.

I've seen a demo where given my email address they could pull up my name, address, profession, age, and I forget what else. Yes, it's already in unscrupulous hands.

I've seen that too. This API, for example [0], called "email enrichment", gave me back my full name based on my email address.

Edit: Added main company link providing this API [1]

[0] https://app.livestorm.co/api/v1/utils/email-enrichment/?emai...

[1] https://docs.enrich.email/

My data is empty, even though I'm not that paranoid about privacy (but I do take some care).

This specific API seems pretty innocuous. They're not doing black magic, it's just aggregating data that people willingly put out there about themselves.

I'm sure I'm out there in many datasets with stolen (or just "shared") information.

> This specific API seems pretty innocuous. They're not doing black magic, it's just aggregating data that people willingly put out there about themselves.

That has always been illegal in my country though (you can't even keep a record of people with pen and paper), and now with GDPR would of course be illegal with actual consequences if it contains data about EU citizens.

Regardless, they're not based in Europe. Unless they do business in the EU, they can't touch them. This is due to the fact that extradition requires something to be a crime in both countries, so unless the EU has assets to seize, nothing they can do.

And if they try, any American court will be very leery of setting the precedent that Brussels can tell Americans what to do in any sense, particularly with respect to data stored on their servers.

Perhaps not directly, but if they're processing the information of EU data subjects on behalf of another company that does business in the EU, then that company will have to justify using this service, which is clearly not GDPR compliant.

I imagine the company using them will want to recover financial losses they incur after getting reamed by whatever european Data Protection Authority decides to go after them - especially if the culprits did promote themselves as being GDPR compliant.

The point of the EU's strong data protection rules is to have accountability - and it will fall on someone along the chain that caused the mess. Companies can't be allowed to completely disregard how they collect and store data and then go "Oops, haha sorry about that!" when the shit inevitably hits the fan, and just continue their business as usual.

>now with GDPR would of course be illegal with actual consequences if it contains data about EU citizens.

which, as an EU citizen, I can confirm it does

Hi, Gilles here, CEO of Livestorm. Thank you for surfacing this.

What is this URL: This is not a public API route, it is a proxy to a service called Clearbit to enrich professional emails with public company and person data from multiple public sources such as AngelList or LinkedIn (cf https://clearbit.com/enrichment and https://clearbit.com/our-data).

By using this route, you are using Clearbit with our credentials. Most importantly, we don’t store any data on our end when accessing this route. We don't own this data, it is stored on Clearbit servers.

Why are we using them: We entered into business with Clearbit to help our users get more insights on their webinar sign ups from public data sources.

Are they GDPR ready: Clearbit is GDPR compliant (cf https://clearbit.com/gdpr). You can claim your data here: https://claim.clearbit.com/claim.

We took all the steps necessary with Clearbit to ensure our process was GDPR compliant. However, this information makes us second-guess it. Therefore, we are re-evaluating the compliance of this specific process and in the meantime, we have deactivated this route.

Hang on a minute.

Your GDPR compliance page is absolute crap. Ditto your Privacy policy. https://clearbit.com/privacy

For starters:

At what point does this site show that I have given consent for you to process my data? Who are the third parties that have given you my data?

Not only that, but their "delete my data" button says "we've deleted your data" but it's still there when I revisit.

That's Clearbit's page. Not Livestorm's. Again, this URL is a proxy to Clearbit's API :)

Actually (after contacting their support because of wrong data that was returned for my contact), they seem to be using https://clearbit.com/ instead as backend.

enrich.email also does the same, though.

There’s also https://fullcontact.com.

In the end they all just search and scrape social media profiles and gravatar.

What surprised me is that when you go to https://clearbit.com/ at "Understand your customers" it actually showed our company logo. Probably based on IP address because I'm at the office right now and cleared my cookies.

I checked mine and it has only my full name. But "indexedAt" gives the current date and time, interesting :)

Mine has information harvested (provided by?) from about.me. (I had a page there about my dancing activities) So this API tells anyone where and when they can meet me for a dance. (9 years ago, that is) :D

I am just waking up, is there a link to check if your email was part of this?

Mine only has whatever Gravatar had.

FullContact? Although its "Enhanced Contact API" has been discontinued.


Equifax and every other latest fintech startup that believes their amazing data-slurping ML/AI idea is going to change the world


Some companies in this space: Clearbit, FullContact

people were so afraid of BIGTECHs getting their hands on all your personal data. When in reality, a 13 year old can just ask around on some forums and get it.

The 13 year old can get it because BIGTECH collected it and shared it carelessly, no?

> collected it and shared it carelessly

Or often, as in this case, collected it and stored it carelessly so someone else could get in and use or share it at will.

30 years ago you could already buy almost all of this + estimated income and education on 90% of americans. It's how credit card companies know who to send mail to.

I was going to say this is nothing new..

20 years ago my first roommate worked for a company that did direct mail marketing, and they regularly engaged a "cleaning" firm that would scrub and update their mailing lists.

This company had huge databases of information on people (name/work/address/DOB/income/etc) and they would send them data and they would clean it up.

Now they would not give them any new data, they could only update records they already owned.

Back then in Canada there were apparently only a small handful of well-known cleaners like this who traded on their reputation for accuracy and dataset completeness..

Does anyone have this dump, or at least a way to check exactly which of my data was in it? HIBP doesn't tell you, unfortunately, even though it definitely could, since I have verified my email address with them.

Yeh, I assume this is a GDPR violation in the EU (I mean the business model seems like it violates GDPR, let alone the actual data breach).

For fun:

- I see 67,569 mongodb results on shodan

- I searched for shodan API keys on a shady website known as "Github", found one that works, queried for all the mongodb databases

- Tried connecting on the first one, it works...

- What next?

edit: I guess it's a choose your own adventure game:

1) Delete all the data

2) Look up domains from IPs, find owners' emails and email them

3) Make nice stats about all the data

4) Sell the data on the black market

5) Find users' emails, and tell them they are using a service that just doesn't care

6) Just make every ~third integer off by 1

7) Mine bitcoins on them

... So many ideas, so little time

As much of a game as it may seem, a lot of these could put you in prison.

Depending on your country, just loading that first one up to see if it worked could put you in prison.

In the United States, simply accessing a publicly accessible URL can put you in prison, assuming the government doesn't lose the appeal on the technicality of a choice of venue, which they won't do a second time... https://www.eff.org/press/releases/appeals-court-overturns-a...

He was convicted by a jury of his 'peers'.

My biggest problem with the justice system.

Do any of us really trust our peers?

Not just peers, peers who didn't get out of jury duty.

Personally I think jury duty should just be mandatory (with the only exceptions being extreme circumstances) because it being so easy to get out of creates a major bias in the set of people actually doing jury duty.

Who would you prefer to be on the jury?

That's a much harder question and one I don't claim to have an answer to. It doesn't change the fact that I don't trust my peers. I don't trust your average American to be informed enough and educated enough to make decisions about my life. Jury of peers might as well be trial by mob.

Well, it's better than a trial by your accuser.

As well it should.

Comparable activity: let's ride around the neighborhood and see who hides a key under a rock. Then let ourselves in and look around.

Physical analogies for security like the one you just made are useless because they don't reflect how computers and the internet work. You can't just ping a rock and ask it if it's got a key. The rock isn't expected to exist in a world where some script kiddie in China is going to ask it if it has a key every 5min. The analogy simply falls flat on its face once you try to use it to reason about anything because existing as an internet connected device is far different than existing on private (but easily accessible) property.

If you'll refer to what I was replying to, I wasn't talking about physical security being analogous to digital security. I was saying it is illegal to attempt to open and access property that you do not own.

"Let's ride around the neighborhood and see who left all their belongings in their front yard" would be a closer example.

Imagine if your Doctor left your personal medical records lying on the ground in front of their office...

That's an absurd analogy. How are you supposed to know if a completely unauthenticated URL is meant to be private or not?

If we're going to try to force physical analogies to represent the digital, I would say it's more like every store, house, and public building looks exactly the same and you don't know if the door is locked or what's inside it until you try to open it.

I wholeheartedly disagree, despite the down votes in this community. The act of attempting to open it is what is illegal, in both situations.

You have a person cruising for keys to a house and a person cruising for keys to databases. They own neither.

But if a database is unsecured, how do you know it's not meant to be publicly accessible?

Your point is good of course, but I think the question has already been answered in society. If you aren't explicitly allowed into someone else's property, you aren't allowed.

I'm allowed to look at property from the street. There's nothing stopping me from walking around the neighborhood looking into everyone's property. I can even walk up to the house in most cases, and peering into windows is usually legal as well. I would even argue that trying the doors is also legal, though entering is probably illegal in most areas.

Accessing a URL is like walking around the neighborhood looking from the street. Trying different URLs is like peering into windows. Trying the door is like testing default credentials.

To make something a crime, you need to prove some kind of loss as well as intent. If there's no intent, it's likely a civil case to get restitution (e.g. you won't be arrested for breaking something in a store accidentally). Likewise, you could get arrested for intent without loss (e.g. tried to break something in a store, but failed).

IMO, it should only be illegal if there's intent to harm. Likewise, if you accidentally harm something without intent (e.g. accidentally DOS a server), you should be expected to make restitution. Both need to be proven in a court of law, criminal if there's intent, civil if there isn't, and if there's no harm or intent, there's no crime.

It's not analogous because in internet protocols the server has to actively send the "property" in response to a specific request to access it. If a request is made and a response freely given with no further authentication required then it is a stretch to compare this to coming onto someone's property without invitation. Perhaps if one is accessing unauthenticated services on a private network this analogy would sort of work, but the assumption is that if you run a service that responds to random requests on the public internet you intended the responses to be public. It is more like an owner hiring someone to stand on their property and invite people in then complaining that they gave their employee poor directions and faulting the public for trespass.

I wouldn't try to use physical property as an analogy. You can't even see what properties there are until you enter them. If the door wasn't locked you won't know until you're already inside.

Here's an analogy I think is closer: I ask you for all your personal documents and bank account numbers. You promptly give them to me with no questions asked. Was I accessing private property improperly? If you refused and demanded I identify myself as someone authorized to have access, and then I forcibly took them from you anyway, that would be criminal.

Web addresses aren't property, I think that's the wrong analogy. The web is made of requests and responses, and you don't know what response you'll get until you make the request.

I will go with "Let it be, hoping someone does something about it someday somehow"

The canonical approach seems to be "encrypt the data in place, ask for Bitcoin"

boring. Configuring them all as a replica of each other is more fun

Generate records in them containing "Help, I'm trapped in the MongoDB development building and food is running out!" every couple of months :)

I'm actually curious why this wasn't done yet, or maybe I'm missing something

I strongly believe that we should be keeping our email addresses as secure as our passwords. It’s a really important attack vector as it’s often the starting point for any targeted attack, and although it’s not usually considered as a factor, it is the 2nd factor required for most logins (email and password). Triggering important security processes (eg reset password, social engineering attacks) are trivial once you know someone’s email address.

It’s clear from this hack that the owners of the hacked site didn’t see emails as something worth securing (stored in plain text on a wide open mongo server)

If you want to keep your email address private (you should), generate a new, random email address whenever you give yours out (the same way you use a password manager). If you have your own domain you can use a catch all/wildcard address, eg. *@mydomain.com, if you use gmail you can use their plus support, e.g. John+uniqueidentifier@gmail.com, if you use neither or want more security I’ve recently launched https://idbloc.co which aims to help deal with this.

Completely disagree. Security resources are scarce and should be allocated toward protecting secrets.

Non-secrets, like addresses and SSNs, should be rendered harmless rather than squandering resources trying to keep them secret.

Yeah. Email addresses aren't secrets. If any one server or inbox gets popped you're fucked.

Though if you run a full domain you can use emails as one-use affairs. Most don't though and, really, what's the point? It only saves you from journalists, not from a motivated attacker.

> if you run a full domain you can use emails as one-use affairs

or, if you use a service that lets you generate aliases, like gmail's "+", or a service like mailinator.

The problem is that the attack vector of email addresses is they are sometimes used as a username, and therefore contain more information than what is strictly required (for the purpose of a username). Leaking the "real" email address not only leads to spam, but allows a more dedicated attacker to use that email address as a starting point on a different site, or hack the email address altogether.

And with sites increasingly blocking disposable email addresses like mailinator, or disallowing email aliases, the problem can only get worse.

Having your own domain is cheap. Generating an email that’s a function of the target website is trivial. I’ve done it for 20 years.

I can confirm the source of every email breach that contains one of my addresses.

Yep, I do the same. I typically use <sitename>@sites.<domain>.<whatever> for website logins (stored in my password manager, so I don't need to think about it to login), so that if my password ever leaks, I know where it was leaked from.
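A minimal Python sketch of the per-site address scheme described above. The `sites.example.com` domain and the `address_for` / `site_of` helpers are made up for illustration, and a catch-all on that subdomain is assumed.

```python
import re
from typing import Optional

# Hypothetical catch-all subdomain; substitute your own.
ALIAS_DOMAIN = "sites.example.com"

def address_for(site: str) -> str:
    """Derive a login address from a site name, e.g. 'Marriott' -> 'marriott@sites.example.com'."""
    # Lowercase and collapse anything that is not a letter or digit into '-'.
    local = re.sub(r"[^a-z0-9]+", "-", site.lower()).strip("-")
    if not local:
        raise ValueError("site name has no usable characters")
    return f"{local}@{ALIAS_DOMAIN}"

def site_of(address: str) -> Optional[str]:
    """Given an address found in a dump, return the site label it was issued to, if any."""
    local, _, domain = address.rpartition("@")
    if domain.lower() != ALIAS_DOMAIN or not local:
        return None
    return local.lower()
```

Because the address is a pure function of the site name, nothing has to be stored to generate it, and `site_of` is all that is needed to trace a leak back to its source. The flip side is that anyone who sees one address can guess the others.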

sadly using elite hacking skills this is easily circumvented

    p0wn3r $cat  testerm
    p0wn3r $cat  testerm | sed 's/+.*@/@/'

and so if you set up your email filters right, you can find out who is doing this sort of "hacking" to get your real email address. What you do afterwards is up to you.

By definition, if someone is offloading your data to a 3rd party and they sanitise the addrs, then you can't tell.
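For anyone who does not read sed, the one-liner above does roughly the following. This is only a Python sketch of the same idea; `strip_plus_tag` is a hypothetical helper name, not something from a real library.

```python
def strip_plus_tag(address: str) -> str:
    """Drop everything between the first '+' and the '@', as the sed expression does."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # no '@' at all: not an address, leave it alone
    base = local.split("+", 1)[0]
    if not base:
        return address  # local part starts with '+': nothing sensible to strip to
    return f"{base}@{domain}"
```

Note that an address with no "+" in it passes through unchanged, so a randomly generated alias or a catch-all address on your own domain gives a list cleaner nothing to strip.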

The "+" is not limited to Gmail, it's standard. The problem is many services with fancy mail validation don't accept it.

Does Gmail allow sending from the + addresses? There's quite an issue if somebody contacts you on that address but you reply without the alias.

They do allow it, but it’s a pain to set up for each + address, especially on iOS.

I have not found a way to send email from Gmail, using either the web interface or their SMTP server, from a custom username (left side of @ symbol). I have a custom domain using Google Apps, but to send mail I use a third party SMTP server to customize the username portion of the From field.

I've never had an issue using a different email for support, I always mention that I own the domain or email suffix and they can verify that if they want to (though nobody has so far).

I'd forgotten about the gmail trick. You're certainly right about that. Though I will say one thing: I've not been hiding my email this past decade and—as far as I know—it has not bit me.

> you can use emails as one-use affairs [..] what's the point? It only saves you from journalists, not from a motivated attacker

Can you explain this claim?

If you generate your single-use email addresses wisely, then you should know that the one you gave to - say - Marriott should only ever receive emails from Marriott.

If - say - Marriott gets hacked and that particular one of your many different email addresses leaks, then:

a) you'll find out that address is burned just as soon as anyone other than Marriott uses it (you immediately generate a new one, give it to Marriott, and stop accepting any mail at all on the old one)


b) if anyone other than Marriott uses it, you know immediately that that message can't be legit.

c) chances are that all your incoming phishing traffic will arrive at mismatched addresses. Makes it even harder to fall for it in that moment of mental blackout.

But if someone phishes you on you+BankCo@example.com you're probably more likely to imagine it's legit. Swings and roundabouts.

No-one would know that you+BankCo@example.com is an email address you can even be reached on, unless it leaks out from BankCo.

ADD: and of course, you aren't really going to stick +BankCo after your real name to generate an email to use with BankCo. You're going to give them something generated like you'd generate a password - "ol48eILm@example.com" or similar - so if anyone finds out that particular email address for you it doesn't tell them with whom you use it.
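Generating an address "like you'd generate a password" could look something like this in Python. It is a sketch only: `random_alias` is a hypothetical helper, and the lowercase-plus-digits alphabet is an assumption made because many mail systems treat the local part case-insensitively.

```python
import secrets
import string

# Assumption: stick to lowercase letters and digits so the alias survives
# systems that fold case or reject unusual characters.
ALPHABET = string.ascii_lowercase + string.digits

def random_alias(domain: str, length: int = 12) -> str:
    """Return an unguessable address on `domain`, drawn from a CSPRNG."""
    local = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{local}@{domain}"
```

You still need a catch-all or an alias table on the receiving side, plus a record of which alias went to which company, since the address itself no longer tells you.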

But that's a big if because phishers will always aim for the weakest individuals in the flock.

You can do a lot with a SSN in Finland. The first 6 digits are your date of birth followed by 4 random characters. If I had your SSN (and address) I could:

- Phone up the tax office and find out about all your finances, and make adjustments to your tax percentages and affairs.

- Phone your medical provider, and very possibly socially engineer them into revealing medical information. I could book an appointment with your doctor for example and impersonate you on the phone appointment.

- Call all your utility providers and cancel contract without you knowing about it.

- I highly suspect I could call your phone provider and take over your phone number, and in some cases use this to take over your email.

Whilst an SSN should be treated as a non-secret, the reality is that it is treated as a secret and is often the only line of defense when dealing with companies.

And it shouldn't be.

The problem is that using a government issued ID is easy since everyone has one. That's the wrong use for something like a SSN, but you're right, it's what's done in practice.

We should be moving away from that. Government issued ID should merely be the equivalent of a user name, with any real use requiring additional factors of authentication (password, security key, etc). Unfortunately, most of these other factors are also easily accessible (mother's maiden name, date of birth, etc).

Ideally, we'd have something like:

- number issued at birth (like present system), frozen until individual activates it

- individual sets password when unfreezing

- all accesses must be explicitly allowed by the user

- user can grant/revoke/audit access, and access is denied by default

- no private data is stored in the account

- companies that use the account for authentication are required to delete user data when the user requests it, and these systems are audited to ensure this happens

However, that's not the case. We should be fighting to change that. Having something like an email address or identification number become public knowledge shouldn't matter one bit...

I agree with most of what you said, but it's even more important to keep access to your email secure than it is to keep the address private, so why should I trust your relatively unknown service to be the gatekeeper to all my accounts?

I like the idea, but I'd like it better if my existing email provider that I already trust gave me unlimited aliases (long random ones are fine, and I'd even be willing to pay, say $0.10 each). "+" suffixes are a start but they still leak the original email address.

"+" suffixes are generally a waste of time. Most actors will strip everything between + and @ to avoid exposure like this.

Too many sites disallow the + thing. It's not a great solution.

Sendmail and postfix both allow you to configure that. I have my mail set up so '+', '.', and '_' all work the same. I've never seen any site deny the '.' or '_' character.

If you run your own mail server, you can simply catch-all. The + trick is for the rest.

There are other privacy respecting email providers who give aliases. Posteo and Mailbox.org provide three aliases to begin with. Additional aliases on Posteo are EUR 0.1 per month. Runbox provides 100 aliases per account, and more can be purchased later. Fastmail provides about 600 aliases per account.

Practically, Runbox gives you many more aliases. You can use their own set of domains for each alias, which is not small. Can't tell the exact number now but around 30 or so. Really handy. There is another problem though. Once you've used many aliases and domains, it gets more and more difficult to maintain. You have to at least vaguely remember which ones you have used already. I wish they could introduce a better way of managing aliases.

Is there any technical reason for charging per alias? It’s just an extra $0.00000001 database entry per, no?

It might be to slow alias namespace depletion, and put a cost on potential squatting.

Makes sense on their domain, but if I’m pointing my $1.95/year .info domain at them, I don’t understand the limits.

Fastmail has unlimited aliases, they aren't trivial to set up, but easy enough that I've set up ~10.

That’s not true. Fastmail provides about 600 aliases per account, not unlimited. [1] That’s quite high, but it’s also easy to run out of if one hands out aliases on a per-site basis.

[1]: https://www.fastmail.com/help/account/limits.html

It also supports a catch-all alias though.

Ah! Thanks for the correction.

Keeping emails private defeats their purpose as communication address and can lead to a false sense of security instead of properly controlling passwords and multi-factor authentication.

Having a "username" that you login with which is different than the email address would solve the same issue.

Yes because that's the same as requiring two passwords. It's even more secure to get three or five. But using the email address serves a different purpose, people rarely forget them.

Not really, in the event of a leak, two hashed passwords leak less information than a hashed password and an email address.

Why have 2 not 1. Well one of them has to be globally unique on the site, the other has to be hard to guess. Two different requirements.

You can say that for sufficiently high values of hard, "hard to guess" implies "globally unique".

It is also a built in way to do password reset, and they are automatically globally unique, by definition.

While that is of course true enough, I go to a fair amount of trouble to remove my name, age, address, etc, from public databroker sites because I value my privacy. I don't want my email going around either. Any information leakage can be used to harm me.

Doesn't that just identify you to the databroker as someone with enough money to care? Now you're just on a much shorter list of high-value targets.

Emails are designed for communication, but we also use them for authentication and targeted marketing. For those use cases it should be private.

They're a public endpoint and are open to receiving from any other address. It's no different than your physical address. You don't use them for authentication or marketing, they're just an identifier. Keeping them private accomplishes nothing.

Here's a much more in-depth article from Troy Hunt, the same security guy running HIBP quoted in this story: https://www.troyhunt.com/im-sorry-but-your-email-address-is-...

One major issue with email address reuse is social engineering. It's a lot harder to attack someone's account if you can't even provide the email address.

Right now if you know someone's personal email, you can be pretty sure they use it everywhere. It's not a good practice, but it's hard to understand why until you've been the target of wtf-level social engineering attacks where someone got into one of your many accounts starting with nothing but an email address they don't even control.

Of course, there are some issues with the OP's email forwarding service:

1. You have to be someone concerned enough about security to want to generate email addresses, yet you have to be cool piping them through a third party. Ouch.

2. Niche product. I didn't realize how bad social engineering could be until I started hosting bitcoin services. I've had people break in to my AWS accounts twice (with bogus information) after Amazon told me they made a note on my account and that it should now be impossible. As we speak, I cannot get into nor cancel an AWS account that someone SE'd that's charging my CC every month even though it has my CC on file. I have to issue a charge-back. It's hilarious. Amazon thinks an email is more authenticating than a CC that's been on file for years before the attacker changed the email. They literally don't even have a customer support process for this situation. 99% of people even on HN have no clue how pwned they'd be if they're ever a target.

3. It's more of a feature on an existing product than a standalone service. I'd expect players like 1Password to implement it themselves.

It’s not too hard to set up your own VPS with Debian/OpenBSD and do the email forwarding/server yourself. The key is that it’s then literally a matter of a bash script to generate and accept one-time email addresses.

You have to jump through a few hoops for gmail to accept your emails. The documentation is there.

If you’re already running a VPS for other reasons, then this is free. If you’re not, then VPSs are extremely cheap anyway.

But you do keep your address private or as private as possible. You give it out only as needed.

If all twitter accounts had a physical address scams would increase, crimes, etc.

This convoluted idea is only getting purchase here because the idea of effective penalties for mishandling/misuse appears outlandish.

Emails are not used for authentication. Access to an email account can be used for authentication via the “reset password” approach.

Emails are an ID on most sites, that’s it. Just because you provide the email as an ID for the account you want to authenticate against doesn’t mean it’s a factor of the authentication.

Your service looks very interesting, but I have one concern and some suggestions.

The auto-generated email addresses seen in the screenshot look very long. There are sites that restrict email addresses to 30 characters, 40 characters, 50 characters, etc. A quick count on one of those on your site showed the length to be more than 50 characters. So this may not work unless you provide a way to specify the length of each generated email address.

Now for the suggestions: the pricing on the Plus tier seems quite high. In fact, one could pay less than $4 a month (the Plus pricing) and get a Runbox mail account with 100 aliases and a Fastmail account with 600 aliases. So the only value in using your service at that tier seems to be the auto-generation part. The pricing for the unlimited tier is also much higher than the average prices of any paid privacy respecting email service. With the services mentioned above, one wouldn’t get tied into an @users.idbloc.com address either if one were to use their own domain.

I disagree.

>Triggering important security processes (eg reset password, social engineering attacks) are trivial once you know someone’s email address.

Sure, but that alone isn't an issue unless you have weak passwords for your email accounts.

>If you want to keep your email address private (you should), generate a new, random email address whenever you give yours out (the same way you use a password manager)

If you're already using a password manager how's that going to provide any additional security? No one is going to crack your 128 bit entropy passwords even if they have your email.
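To put a number on "128 bit entropy": a uniformly random password of n symbols from an alphabet of size k carries n * log2(k) bits. A small Python sketch of that arithmetic follows; the 62-symbol alphabet and the helper names are assumptions chosen for illustration.

```python
import math
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols (assumed alphabet)

def length_for_bits(bits: int, alphabet_size: int) -> int:
    """Smallest length n such that n * log2(alphabet_size) >= bits."""
    return math.ceil(bits / math.log2(alphabet_size))

def random_password(bits: int = 128) -> str:
    """Draw a password with at least `bits` bits of entropy from a CSPRNG."""
    n = length_for_bits(bits, len(ALPHABET))
    return "".join(secrets.choice(ALPHABET) for _ in range(n))
```

With letters and digits only, 22 characters is enough to reach 128 bits, which is roughly what a password manager's default generator produces.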

It helps avoid most phishing emails. I have a DB with a list of randomly generated aliases and services I've provided them to. My e-mail client is set up so that these individualized per-service aliases are highlighted (and renamed to the name of the service I provided the secret alias to) in the message list, so I instantly know when I receive e-mail from a service, and when someone is trying to spoof it by sending e-mails in the name of the service to my public address.

I can also whitelist some of these private addresses, so that my spam filter can't hide e-mails from services I depend on, no matter who sends them (it can be automated no-reply address, or some support person's address, etc.), so it's not really whitelistable based on the sender. I can't whitelist my public addresses.

It has plenty of benefits with regards to security.

It's basically a secret passphrase you give to a company that it can use to reliably contact you, and that you can use to verify that you're talking with someone you've given this secret to. It's not perfect, but much better than public addresses.

Your system looks useful and at the same time seems to involve some manual work. Please consider writing one or more blog posts about your setup. It could be very helpful to others.

I doubt it would be very useful. It's very niche.

- I have a simple custom web UI for managing mailboxes and aliases in the PostgreSQL database.

- I have a postfix/dovecot setup with virtual mailboxes. I avoid DB/postfix interaction by simply dumping the DB into static postfix config files (this is much lighter on resources than having postfix connect to the database, and much less complicated). I don't use a catchall address.

- I use POP3 to get mails from my VPS, to avoid storing archive there, and to avoid having to deal with IMAP in my mail client and filtering scripts

- I use neomutt as a mail client. It has some nice options, like reverse_alias (which I use to display names of my private aliases)

- Neomutt can read configuration from a stdout of some command. I use this to generate aliases and other config options, that make it so that when I reply to something delivered to a private alias, the reply will be sent from that alias too. Using this I can have a single Inbox.

- I filter e-mail using a PHP script (it parses the e-mail, checks the alias database, and decides whether to pass the message to bogofilter, Junk, or Inbox)

Aside from adding aliases to the database, there's no other manual work.

You can probably avoid the database by having some scheme, where address is servicename.[substr(sha256(servicename+secret), 10)]@domain.com and filter/whitelist/highlight on the hash validity.
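A minimal sketch of that database-free scheme, under some assumptions: the secret, the domain and the function names are made up for illustration, and HMAC-SHA256 is swapped in for the plain sha256(servicename+secret) concatenation, keeping the first 10 hex characters as the tag.

```python
import hashlib
import hmac

SECRET = b"change-me"   # hypothetical secret, known only to your mail filter
DOMAIN = "domain.com"   # placeholder domain from the comment above
TAG_LEN = 10            # number of hex characters kept from the digest


def tag_for(service: str) -> str:
    """Derive the short validity tag for a service name."""
    digest = hmac.new(SECRET, service.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:TAG_LEN]


def make_alias(service: str) -> str:
    """Build servicename.tag@domain for a new signup."""
    service = service.lower()
    return f"{service}.{tag_for(service)}@{DOMAIN}"


def is_valid_alias(address: str) -> bool:
    """Check that the tag in an incoming recipient address matches its service name."""
    local, sep, domain = address.lower().rpartition("@")
    if not sep or domain != DOMAIN:
        return False
    service, dot, tag = local.rpartition(".")
    if not dot or not service:
        return False
    expected = tag_for(service)
    # Constant-time comparison on bytes, so odd input cannot raise or leak timing.
    return hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8"))
```

A filter could then whitelist or highlight anything for which is_valid_alias() returns True and treat the rest as sent to a guessed or public address.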

Totally agree that generally speaking no one is breaking high complexity passwords. Why break the password when you can just convince someone to change it for you?

Easiest way is just to put up a fake service and they’ll give you their real password (same one they use everywhere). I can do this in a day and get thousands of real passwords and emails if I wanted to.

Are you sure all your passwords are secure? Are you sure your accounts can’t be broken into with social engineering? Is your password manager infallible? It’s defence in depth.

Of course you may also do this for privacy reasons as well, when your reddit account is leaked do you want that linked to your LinkedIn account?

>Are you sure all your passwords are secure?

all the ones in my password manager are

>Are you sure your accounts can’t be broken into with social engineering?

And who's to say that you can't also use social engineering to recover the real email? eg. "I forgot which email I used, but it's on my domain, example.com". This is even more plausible if you use any of the "tricks" that the GP mentioned, like putting random characters, since it further lends credence to your "forgot my email" story.

>Is your password manager infallible?

Considering you have to store all those different emails somewhere, I don't see how this mitigates the threat.

>It’s defence in depth.

Sure, it's defense in depth, but the gains in security are so marginal.

>Of course you may also do this for privacy reasons as well

Sure, I'd buy that aspect, but catchall domains or dot/plus tricks aren't going to cut it because they're relatively easy to deanonymize. Not to mention all the data you leak on those platforms just by using them (location references, friend lists, interests, etc.)

So are you saying that all recommendations regarding non disclosure of usernames are useless?

Because an email address is essentially a username.

> If you have your own domain you can use a catch all/wildcard address, eg. @mydomain.com

I've done this before. You will get a shitload of spam. I recommend suffixing the user of all addresses with something you want to go to your inbox.

E.g. I use *.r@mydomain.com (r for "real"). Anything that doesn't end in .r@mydomain.com never hits my main inbox.
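The suffix rule above boils down to a one-line check on the recipient. A rough sketch, assuming the filter sees the envelope recipient as a string; the function name and defaults are only for illustration:

```python
def goes_to_inbox(recipient: str, domain: str = "mydomain.com", marker: str = ".r") -> bool:
    """Return True only for addresses of the form <something>.r@mydomain.com."""
    local, sep, host = recipient.lower().rpartition("@")
    if not sep or host != domain:
        return False
    # Require a non-empty part before the marker, so ".r@mydomain.com" is rejected.
    return local.endswith(marker) and len(local) > len(marker)
```

Anything for which this returns False would be routed to a junk or catch-all folder instead of the main inbox.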

Been using a wildcard for one of my email domains for years (facebook@..., linkedin@... etc) and never once have I had issues with spam. Everything for services I don't care about goes through SpamAssassin on the mail server and items that pass get dumped to Gmail, where only "ham" messages will get forwarded through to my real inbox.

I think they were using their catch-all for non-transactional/newsletter emails. I have a catch-all for websites and then another that I give to people to email me on. This lets me see emails sent to me by a person easily while having separate email addresses for each site.

Do you have catchall directly on the second level domain (@mydomain.com) or do you only catchall a specific subdomain (@mail.mydomain.com)? I'd expect a very big difference in the volume of blue sky spam between those two. If GP sbov has the catchall directly on mydomain.com, then restricting the value space to *.r will make a meaningful difference.

Having the "validating substring" in front of the @ will actually make it easier when you have to use the email address in vocal communication: many humans are unaccustomed to encountering more than one dot after the at.

Yep, directly on the second level domain. I do not use a mail subdomain.

> I recommend suffixing the user of all addresses with something you want to go to your inbox.

I use a subdomain. *@abc.example.com goes to my CatchAll folder. usernames@example.com go to their respective users. There are no equivalent names between @abc.example.com and @example.com so if anyone gets cute[0] and tries stripping out subdomains, the messages are rejected. Also, subdomains are common enough that no one thinks anything of it.

As a nice bonus, the subdomain can also be directed elsewhere. I've aimed it at various "we automatically file your e-mail" type scripts and services before, just to try them out.

0 - Spammers absolutely try to get cute and drop the subdomain. For example, in the Dropbox leak, dropbox@abc.example.com and db@abc.example.com of mine were leaked. I see tons of spam attempts to dropbox@example.com and db@example.com daily.

That's a good solution. For 15 years or so I just used the user.site@domain.com approach. But in the last 4 or 5 years, the spammers have gotten smarter and now are stripping off the site and just emailing user@domain.com.

Yours is a good solution.

Since we’re sharing anecdotes: I do this presently and get very little spam to random addresses. I’ve been on this scheme for about a decade now. All spam is to a specific address, which I’d given out.

I use a wildcard on my domain. Spam filtering is _really_ good these days. It truly is not a problem. I get tons of spam, and it's all filtered.

You don't use wildcard addresses for this, you use sub/plus-addresses. That way the spam problem is completely avoided and multiple users per domain will work normally.

I'd use something other than plus, because Gmail uses plus, so spammers may be aware that they can change the part after the plus to anything.


Isn't it elementary for a bad agent to scrub those? Your actual e-mail address (john@gmail.com) is too visible.

Sure, but from a security/authentication standpoint that won't be an issue.

Even if you have my John+ajf@example.com address, it still won't help you with logging in on other sites, as the address used on another site is still unknown.

Yes, it is; however, it could stop an automated attack where the bad agent isn’t targeting you specifically or is not a particularly knowledgeable agent. It also might help you find out where a leak or spam is coming from (eg if you get a reset password email sent to John+facebooksecretlogin@gmail.com, you know your Facebook email address is known to an attacker). So it’s better than nothing.

Obviously this only works if all emails to john@gmail.com and john+unrecognisedidentifier@gmail.com are rejected without notifying the recipient. This was how Yahoo's scheme worked (I never actually used it, not sure if it's still active). If gmail's doesn't, then I agree it's worthless.

gmail also ignores dots

so you can do `j.o.h.n+ycombinator.com@gmail.com`

then if you get any email to `j.o.h.n@gmail.com` you can filter it

Of course, they can start stripping dots in gmail addresses too, but then they'd have to be targeting gmail specifically or they'd break for most services where dots do matter.
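To show how little work the scrubbing is, here is a rough sketch of what a list cleaner might do, relying only on the behaviour described in this thread (Gmail ignoring dots and anything after a plus). The function name is made up, and other domains are deliberately left untouched because dots can matter there.

```python
def canonical_gmail(address: str) -> str:
    """Collapse a gmail.com address to its base form; leave other domains alone."""
    local, sep, domain = address.rpartition("@")
    if not sep or domain.lower() != "gmail.com":
        return address
    base = local.split("+", 1)[0]   # drop the +suffix, if any
    base = base.replace(".", "")    # drop the dots
    return f"{base.lower()}@gmail.com"
```

So the dot and plus tricks help you see where mail came from, but they don't hide the underlying address from anyone who bothers to write these few lines.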

Pretty easy reason why you shouldn't; under GDPR you don't have consent to use this email (though this reasoning is a bit shakier) and scrubbing the suffix runs afoul of spam laws in my country (if you're given an email for contact, you use THAT email, not anything else).

> John+uniqueidentifier@gmail.com

Sadly, "+" doesn't work. Too many web email validators fail when given a "+" character. Gmail has made some of that a little better, but I still bump into it too much.

I generally use "_" or "." for wildcard start as any email validator that hangs on those will get smashed immediately.

It is unreasonable to assume anyone could ensure their email is kept private the moment they share it with at least one other entity.

Rather than trying to fight windmills, it's better to use unique emails for sensitive/high-risk purposes. That way they cannot be used as primary key.

Company policy here is to never register any company account under a generic <admin@company.tld> email but use a partly randomized email for each provider <provider-xyz-123@company.tld>. That way no one can use one leaked email to even identify other services we use (think reset password/login dialogs leaking information about an email being registered or not).
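A small sketch of how such partly randomized addresses could be generated. The provider-suffix@domain layout follows the example above; the suffix length, alphabet and function name are assumptions for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def provider_address(provider: str, domain: str = "company.tld", suffix_len: int = 6) -> str:
    """Return e.g. provider-x7k2q9@company.tld with an unguessable random suffix."""
    # secrets (not random) so the suffix can't be predicted from other addresses.
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_len))
    return f"{provider.lower()}-{suffix}@{domain}"
```

The generated address still has to be recorded somewhere (a password manager entry works), since it cannot be re-derived later.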

> I’ve recently launched https://idbloc.co which aims to help deal with this

Nice! I just started working on the same thing, and it would be great to be able to drop the project… except trusting a new service with your e-mail is hard. (It’s even hard to trust established names like Google or ProtonMail.) If only every service supported encrypting e-mail so providers couldn’t read it. :(

I've been working on a similar project since last year as well. (I slacked a little bit -- grad school and stuff). Too bad the SMTP protocol is limited. I was thinking of ways such a service can be designed in a provably secure way (where we don't get to actually read the messages, just apply rules based on headers only). Any thoughts?

(Actually, replace “trusting a new service” with “trusting 3+ new services” – it’s probably hosted on AWS or something, and https://idbloc.co/privacy mentions that mail is forwarded with MailGun.)

The irony lies in the fact that email addresses were originally designed as communication channels (like snail mail), but have since been co-opted into serving as a unique identifier (like SSN).

So, you need to share your email address to be able to communicate properly, but you also need to hoard it because it is part of how you identify yourself. Aliases are one workaround but are not universally accepted as identifiers.

Many of us do not use our email address for any part of our identity. I believe it’s a poor choice to do this or allow/enable others to, as well.

Tell that to the companies who use it as a login.

A catch-all address is just asking for a tsunami of spam.

But maybe something*@example.com

I have two Gmail addresses. A public and a private. For anything that is public facing, that exposes the email or that I don't trust as much, I use the public email. For everything important, I use the private one (and sometimes with the + method).

My private address is set to forward to my public. I only respond and send email from the public email.

BTW this is how you market a product. Intimately knowing the space. Providing insightful context on the importance of the problem in a public forum. Offering the free solutions, which are usually manual and cumbersome. Then, finally, offering your service as an easier alternative. Well done.

I think we need to completely reimagine email, though a proper fix is likely pretty easy. I think social networks have done a pretty good job at the basic idea:

- ask to connect, perhaps with a short message as to why (maybe limit to 100 text characters)

- once approved, communication is encrypted (PGP keys are exchanged behind the scenes)

I think that would drastically limit the amount of spam, phishing, and intercepted messages, which is good for everyone. Unfortunately, I think it's also quite unlikely to hit the mainstream.

I wonder if an email masking service would make sense. You register with it, and give it your real address. You then can generate a new public email anytime you like. Control how it behaves, etc. [/me hits up google]

So, I just looked to see, and there is already a service doing this: https://www.abine.com/. The email forwarding is the easy part. They also do "burner" credit card numbers. I don't know that domain to know how you do that, but that seems good. They also do phone numbers. Neat idea.

There's also http://gishpuppy.com which I've been using for about 14 years now.

Until Gishpuppy itself gets hacked...

There's also Burner Mail https://burnermail.io

I have also seen https://lttrfeed.com/ with a focus on newsletters.

>is the 2nd factor required

Email is not a factor in authentication, it’s a user identifier. There is a huge difference between the two. Requiring users to also provide their first name is also not a factor.

Email is just a worse version of the same factor in a login. They are both just “something you know” only email is even worse because it’s something everyone who knows you knows.

Just to drive the point home, if a website asks you to set three passwords that it asks you for on each login, that’s not 3-factor or even 2-factor authentication. They are all just part of the “something you know” factor.

I highly suggest using a catch-all on a subdomain, not a top level domain. Top level domains get dictionary-style spam/phishing storms (eg alice@, bob@, charlie@, etc), but those don't seem to occur with subdomains. You can also create just an MX record for a subdomain, which helps 'hide' it to some extent.

The first time I set up a top level catch-all, I had the first dictionary-style attack within a day or so. I've been using a catch-all mail-only subdomain for well over a decade, and have had no such attack.

> I strongly believe that we should be keeping our email addresses as secure as our passwords.

That doesn't make any sense. Just create multiple email addresses, one for talking with people and a separate one for signing up for services.

The one problem is there are companies like Facebook where even if you sign up with a secret email address, they will send password reset tokens to the other public-facing email addresses you have on your profile.

idbloc looks good. Maybe consider a one-time price option too. Or for people on the free plan, consider allowing them to buy a one-time top-up pack of email "credits" if they use up their free 100.

Personally I have a bunch of spare email addresses from google, yahoo, and others that I created many years ago before they demanded a phone number when signing up. I just use those for website logins.

There is also theothermail.com, which generates throwaway email addresses that forward emails to your original inbox, so you don't need to use your personal email address to sign up for a newsletter.

We created the service theothermail.com to generate throwaway email addresses that forward emails to your original inbox, so you don't need to use your personal email address to sign up for a newsletter.

Thanks for the tip about gmail's "+" support. I'll start using it. I don't think it does anything to keep your email address private, though. Recovering the original valid email address from "John+uniqueidentifier@gmail.com" is trivial.

This is like saying that you should keep your public street address private... you are thinking about it backward. What needs to be done is to stop using your email address as a login or, worse, as a password recovery option.

Google (and others) should support hashing/obfuscating the unique part, for example

john@gmail.com -> (some unique symbol to denote hash, say +)(hash)@gmail.com

It wouldn't be complicated for them to implement, and give everyone a few such aliases.

While not of any value for 95% of users, for techies a great workaround is to have your own domain and use a distinct email for each service that all go to one catchall account.

Eg: netflix@accounts.MY-DOMAIN.com

Public-Private cryptography is the answer.

Forget about email addresses, the network should scale globally from the user's public key that can be stored on any number of nodes as the user id with the data.

Someone pointed out the other day just remove everything after the plus and you have the real email.

"+" is unfortunately rejected by far too many validators.

Cool. So I just have to wait till your backend gets compromised and starts sniffing my emails mid-forward? Or is it already set up to do that?

Email is inherently weak and should not be used for anything sensitive. It’s weak by design.

"Should not be" and "is not, ever" are, unfortunately, two wildly different world states.

The fact is that email is often the keys to the kingdom, if compromised, or the baited hook to be used in phishing. The results have had globally significant consequences to date.

The linked article is down. Here is a Wired article discussing the same leak: https://www.wired.com/story/email-marketing-company-809-mill...

Unsecured MongoDB in the default configuration.

Why am I not surprised...

> in the default configuration

MongoDB shares some of the blame here. Software needs to be secure by default.

Absolutely. If software needs configuration before it is secure, then the default behavior should be to throw an error while starting up, with a link to the documentation for configuration. The default should NEVER be to just start up unsecured.
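As a rough illustration of the "refuse to start" idea, and not how MongoDB or any other real server does it: a startup check that raises when the service would listen on a non-loopback address without authentication. All names and the docs URL below are hypothetical, and this variant is a bit more lenient than the rule stated above, since it still allows an unauthenticated loopback-only start.

```python
import ipaddress


class InsecureConfigError(RuntimeError):
    """Raised when the server would start reachable from outside without auth."""


def is_loopback(bind_address: str) -> bool:
    if bind_address == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_address).is_loopback
    except ValueError:
        # Not an IP literal: treat unknown hostnames as externally reachable.
        return False


def check_startup(bind_address: str, auth_enabled: bool) -> None:
    """Fail fast instead of starting up unsecured."""
    if not auth_enabled and not is_loopback(bind_address):
        raise InsecureConfigError(
            f"refusing to listen on {bind_address} without authentication; "
            "see https://docs.example.invalid/security (hypothetical link)"
        )
```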

MongoDB listens to localhost by default and provides a few warnings on startup if you don't have authentication enabled. See:


I believe more of the blame should be put onto markets that provide images with insecure settings as MongoDB doesn't bind to the public interface by default and hasn't done so for years.

Which is true for most popular systems, which I appreciate. I need to consciously decide to put something on an externally accessible port, which reminds me that it's time to make sure everything is secure (TLS, authentication, user privileges, etc).

Redis and MySQL have started to lock down with random passwords now when installed, which is nice.

Did MySQL ever let you connect to an instance remotely without a password? I first used it 16-17 years ago and don't remember anything like that.

PS: And yeah I remember that famous password bypass bug from 2012, but that's all.

MySQL's "security model" used to be that it only bound to, but beyond that there were no passwords. I may be recalling incorrectly, but root did have access via `%` on most systems I used.

I'm not sure if the reason MySQL was open with no password is the fault of MySQL or of the various distribution packagers (they have a lot of say in how it's ultimately configured), but it wasn't a good look.

I really can't recall if or when it changed, but for many years you just wouldn't be able to connect to MySQL as root unless you set the password. Also, if someone has access to your localhost, it almost always means you're already in trouble, no matter how well the database is configured.

its Web scale

For other confused readers: no emails were leaked, only email addresses.

Here's a cached version as it's down at the moment. https://web.archive.org/web/20190307231618/https://securityd...

>They do this by literally sending the people an email. If it does not bounce, the email is validated.

That's a pretty dumb way to validate an email address. You can just end the connection after a successful RCPT TO and no one needs to be the wiser. The method causes a false validation in the case of an intermediate mail relay that accepts everything but that is such a bad idea that no one does that.

Most major email providers will block your IP from doing this a lot

Catch-all domains cannot be validated this way. The only (wrong) way to verify them is to send a real mail and see if it bounces.

Actually a lot of email addresses cannot be validated this way since most ESPs including gmail have adopted an 'accept-all' approach to incoming email. So getting the email accepted by the server is no indication that the address exists, or that the email will actually be delivered to that address even if it exists.

>... including gmail ...

  #> swaks  --quit-after RCPT --TO kdskr3j2@gmail.com                
  === Trying gmail-smtp-in.l.google.com:25...
  === Connected to gmail-smtp-in.l.google.com.
  <-  220 mx.google.com ESMTP a199si3828494itd.133 - gsmtp
   -> EHLO example.com
  <-  250-mx.google.com at your service, [xx.xx.xx.xx]
  <-  250-SIZE 157286400
  <-  250-8BITMIME
  <-  250-STARTTLS
  <-  250-PIPELINING
  <-  250-CHUNKING
  <-  250 SMTPUTF8
   -> MAIL FROM:<bob@example.com>
  <-  250 2.1.0 OK a199si3828494itd.133 - gsmtp
   -> RCPT TO:<kdskr3j2@gmail.com>
  <** 550-5.1.1 The email account that you tried to reach does not exist. Please try
  <** 550-5.1.1 double-checking the recipient's email address for typos or
  <** 550-5.1.1 unnecessary spaces. Learn more at
  <** 550 5.1.1  https://support.google.com/mail/?p=NoSuchUser a199si3828494itd.133 - gsmtp
   -> QUIT
  <-  221 2.0.0 closing connection a199si3828494itd.133 - gsmtp
  === Connection closed with remote host.
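The same probe as the swaks transcript above can be sketched with Python's standard smtplib. The function name is made up, and it takes an already-connected smtplib.SMTP-like object so it can be exercised without touching a real mail server; no DATA is ever sent, so no message is delivered.

```python
import smtplib


def probe_recipient(smtp, sender: str, recipient: str) -> bool:
    """Return True if the server accepts RCPT TO for recipient, then disconnect."""
    try:
        smtp.ehlo_or_helo_if_needed()
        code, _ = smtp.mail(sender)
        if code != 250:
            return False
        code, _ = smtp.rcpt(recipient)
        # 250 = accepted, 251 = accepted and will be forwarded; 550 means no such user.
        return code in (250, 251)
    finally:
        try:
            smtp.quit()
        except smtplib.SMTPServerDisconnected:
            pass  # the server already hung up


# Hypothetical usage against a real MX host (needs outbound port 25):
# probe_recipient(smtplib.SMTP("gmail-smtp-in.l.google.com", 25),
#                 "bob@example.com", "kdskr3j2@gmail.com")
```

As the comments above point out, a True result is inconclusive on accept-all or catch-all servers, and providers may block IPs that do this in bulk.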

I wish Google would build an OAuth flow that would provide an obfuscated and unique email address to services that need an email address for logins or any communication at all.

I’d just be happy if Google made it harder for third parties implementing “login with google” to get access to your entire org’s user list (when the org uses gsuite.)

Can you elaborate? I thought "login with Google" meant a pretty useless token (granting access to that user's basic profile info).

Can additional OAuth scopes be requested and do the third parties request the contacts permission, then harvest the organizations' contacts, and is there no setting in the admin menu to prevent either the OAuth grant or the contact access?

Lots of SaaS companies have a “sign up with google” button which gives them access to the employee directory assuming your org uses google suite. It’s unfortunate yesteryear’s shit growth hack for consumer apps is reappearing in a worse way.

[WAS: If I'm reading the Wired article[1] correctly, the haveibeenpwned.com database of email addresses has itself been leaked?]

EDIT: Replies have corrected my misunderstanding, thanks! Troy was saying that his own personal email was also in the dataset.

Hunt says some of his own information is included in the Verifications.io exposure. "The main takeaway for me is that this is just another case where someone has my data, and hundreds of millions of other people’s data, and I’ve absolutely no idea how they got it," Hunt says. "I’d never heard of the company until now and I certainly can’t ever recall consenting to their use of my data. Of course, it’s entirely possible that buried in some other service’s terms and conditions it says they’re allowed to pass my data around in this fashion, but that’s not really consistent with my expectations of how my data should be used."

[1] https://www.wired.com/story/email-marketing-company-809-mill...

No, he's saying the leaked database has an entry for him. Not that his own database of leaks has been breached.

Ah, got it. Like, Troy is a real person who has his own email address too :)

No, he's adding this email dump to his DB. And I don't exactly understand why, I mean is there any evidence that this mongodb was actually harvested? Other than by security researchers.

That's not how security works.

If someone gains unauthorized access to a system you own, you assume it's compromised and take it down, even if logs don't show anything. If you leave a system unsecured on the internet, assume it has been compromised and take appropriate measures.

In this case, it is more likely than not that this guy was not the first to discover the database. In the absence of proof, assume it has been compromised before.

I do what others have said - I have domains and use pre-defined (DreamHost, no catch-all) addresses, different for most sites. I also define a bunch of generic ones I can reach for on the go. I have about 150 now.

Is it worth it?

I don't know. It's part of the patchwork of steps I take, better than many people but by no means failsafe.

Why do you not use catch all? I do and if someone somehow should get the main mail address I'd just change it.

Company I work for was using an email verification service for the first time. We have a lot of brands, so I was being my white hat self and checking it out before we risked importing our largest subscriber brands.

It took me all of 10 minutes to find a convenient JSON endpoint with incrementing IDs that didn't disallow cross-account pulls. It wasn't a public MongoDB endpoint like the above, but we did get a pretty sweet discount rate for reporting it to them and, you know, not abusing some other customer lists with 300M+ emails.

So you still went with them after exposing their incompetence?

„They fixed the bug and promised the data would be secure now.“ /s

This is why I sign up with a different email address for every site. You need your own domain to do that properly, but Gmail users can use (for example) "email+hackernews@gmail.com" to end up with a unique address. Of course it would be simple to get your real address out of that, but I doubt anyone but a spear phisher would bother.

The fact that the company shut down their website is a little bit sketchy.... I wonder if it was a legit operation.

> .. Troy Hunt is adding the Verifications.io data to his service.

Now my data is in another database. Sigh.

Time to advise your friends to be extra cautious against phishing attempts.

I honestly don't see how this could have gone in a positive direction, even if it was stated somewhere deep in the TOS that they might use the data.

An interesting situation nonetheless.

Unsecured MongoDB as the default configuration. No wonder it was leaked.

I have more than that in my spam folder

HN, Hug of Death?

There needs to be a wall of shame

Fair warning on the second link: it's Tumblr and redirects https -> http (probably intentionally, given the content), and I didn't see an obvious way to search.

The first is an awesome resource, and they publish password leaks as well, and they have a secure way to check if your password has been in a leak (my password manager, Bitwarden, integrates with them).
