
How to get gmail.com banned (2011) - makmanalp
http://mailinator.blogspot.com/2011/05/how-to-get-gmailcom-banned-not-that-i.html
======
zinxq
Wow. Do my daily HN scan for the day and find an article you wrote ~4 years ago
at #1.

I hadn't read that in many years, and what fun to do a re-read.

Thanks Internet - don't stop being you.

~~~
dice
I was sad to see that the link to the domain generator was broken. The new one
on the home page is a div that's generated server-side.

I hope you don't mind that I wrote a quick one-liner to see if you're still
detecting bots...

    
    
        @bobmail.info
        @zippymail.info
        @thisisnotmyrealemail.com
        @spamhereplease.com
        @safetymail.info
        @suremail.info
        @mailinator2.com
        @spamherelots.com
        @mailinator2.com
        @spamhereplease.com
        @spamherelots.com
        @spamherelots.com
        @mailinator.net
        @mailinator.net
        @mailinator2.com
        @mailinator.net
        @mailinator.net
        @mailinator2.com
        @mailinator.net
        ...
    
    

Yup :)

I didn't see any "evil" insertions, though...

~~~
nathanm412
I tried something similar and got a similar pattern. After a few changes, I
just started seeing mailinator and mailinator2 over and over.

------
CWuestefeld
A few years back we came into work one morning to find that some bot was
scanning our site so hard that it seemed the lights nearly dimmed. Some
detective work suggests that it was a service performed on behalf of a
competitor, to get our price list (bear in mind that our catalog has a few
hundred thousand products).

We were really annoyed that rather than just ask us, they had launched what
amounted to a DDOS attack. So we thought about how we might exact vengeance...

After a few hours we figured out a pattern to the rogue requests that allowed
us to filter them, despite their efforts at stealth (like, they cycle through
a list of various user agent strings to make it look like there are multiple
different users). We toyed with the idea of, rather than outright banning
them, making our pages sensitive to their presence, so that when we detected
them, we'd display a false price, defeating their whole operation.

We finally just decided to take the high road, temporarily banning any rogue
IP addresses we detected (we couldn't make it permanent because many of the
requests came from the Amazon cloud, from which we also receive some
legitimate requests)

EDIT: you wouldn't think that requests for a few hundred thousand products
would amount to a DDOS, but the bot was rather poorly written and grossly
inefficient in the way it walked through the list.
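
For anyone wanting to try the "temporary ban" approach, here is a minimal sketch of a ban list whose entries expire on their own. The class name, the one-hour default, and the injectable clock are all assumptions made for illustration; the comment above does not say how the real system was built.

```python
import time


class TemporaryBanList:
    """Tracks banned IPs and lets each ban lapse after a fixed duration."""

    def __init__(self, ban_seconds=3600, clock=time.monotonic):
        self.ban_seconds = ban_seconds  # hypothetical default: one hour
        self.clock = clock              # injectable so tests can fake time
        self._expires_at = {}           # ip -> clock value at which the ban ends

    def ban(self, ip):
        """Ban (or re-ban) an IP starting now."""
        self._expires_at[ip] = self.clock() + self.ban_seconds

    def is_banned(self, ip):
        """Return True while the ban is active; forget it once it has expired."""
        expires = self._expires_at.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self._expires_at[ip]
            return False
        return True
```

Because bans lapse automatically, a shared address (such as a cloud IP that also carries legitimate traffic) is only blocked for a bounded window.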

~~~
adrianpike
I built a system called caltrops that did almost exactly that. As a given
session's requests grew more and more suspicious, their data would skew from
reality further and further. A real user on the line would notice immediately
(and the more real-looking the user interactions, the more it would reduce
suspicion), but competitors scraping our data would get pretty deliciously
bunk data.
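
A rough sketch of the general idea, not of how caltrops itself worked (its internals are not described here): distort a value by an amount that grows with a suspicion score between 0 and 1. The function name, the `max_skew` parameter, and the scoring scale are hypothetical.

```python
import random


def skewed_price(true_price, suspicion, max_skew=0.5, rng=random):
    """Return true_price distorted by up to +/- (max_skew * suspicion).

    suspicion is clamped to [0, 1]; 0 means a trusted session, which
    always sees the real price.
    """
    suspicion = min(max(suspicion, 0.0), 1.0)
    if suspicion == 0.0:
        return true_price
    # Pick a random factor in [1 - max_skew*suspicion, 1 + max_skew*suspicion].
    factor = 1.0 + rng.uniform(-1.0, 1.0) * max_skew * suspicion
    return round(true_price * factor, 2)
```

With the assumed defaults, a mildly suspicious session (0.1) sees prices within 5% of the truth, while a fully suspicious one can be off by as much as 50%.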

~~~
motocycle
To deal with similar problems, Kickstarter built a pretty useful tool called
rack-attack:
[https://github.com/kickstarter/rack-attack](https://github.com/kickstarter/rack-attack)

------
zer00eyz
"Thousands of people use Mailinator everyday, so clearly, its a useful tool
that many sites accept"

How many of you would have an outright revolt on your hands from your QA/QE
folks if you banned mailinator? I think everyplace I worked would experience
this same issue if we did this.

~~~
nkassis
You could use + in the first part of the email, such as
youremail+blahblah@example.com, to create throwaways. Most sites consider those
to be different email addresses than youremail@example.com for account
purposes, but email services that respect the RFC will treat them as the same.

~~~
paulmd
Many sites won't accept email addresses with + in them, because many devs have
extremely wrongheaded ideas about validation.

I used to have a first.m.last@university.edu address and that one was touch-
and-go as well due to the fact that the mailbox had two .'s in it. I actually
had to file a support request to get Amazon Student to accept it, even. Nobody
from a university with that scheme ever registered before?

For the record, the gold standard for email validation is "send a confirmation
link and see if they click it". Don't try and get fancy.

One other trick is that Gmail ignores .'s in addresses entirely.
first.last@gmail.com is the same as firstlast or f.irstlast.
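
To make the dot and + tricks concrete, here is a small sketch that collapses Gmail variants to one canonical form. It only special-cases gmail.com, which is an assumption; other providers may or may not treat dots and + tags this way, so their addresses are left alone. The function name is hypothetical.

```python
def canonical_gmail(address):
    """Collapse Gmail aliases (dots, +tag) to a single canonical address.

    Addresses on other domains are returned with only the domain lowercased.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    domain = domain.lower()
    if domain == "gmail.com":
        # Drop everything from the first '+', then remove the ignored dots.
        local = local.split("+", 1)[0].replace(".", "").lower()
    return f"{local}@{domain}"
```

This is only useful for spotting duplicate sign-ups; it says nothing about whether an address is deliverable.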

~~~
GauntletWizard
My favorite is e-mails with three dots in them. Which is actually not a valid
address - the RFC specifies that you must have a valid textual character
between dots[1]. However, because of poor decisions by Japanese telecoms, a
substantial chunk of their users have 'e-mails' associated with their mobile
phones with three dots, breaking goddamn every sensible validation script.

[1]
[https://tools.ietf.org/html/rfc2822#section-3.2.4](https://tools.ietf.org/html/rfc2822#section-3.2.4)

~~~
ori_b
There's only one sensible validation script: "Send an email with a confirm
link".
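
A minimal sketch of what the confirm link could carry, assuming a signed, expiring token rather than a database row. The key, the token layout, and the function names are all made up for illustration; actually sending the email is left out.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: a server-side secret


def _sign(email, expires_at, key):
    msg = f"{email}|{expires_at}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def make_token(email, expires_at, key=SECRET_KEY):
    """Build a token of the form '<expiry-unix-seconds>.<hex signature>'."""
    return f"{int(expires_at)}.{_sign(email, int(expires_at), key)}"


def check_token(email, token, now=None, key=SECRET_KEY):
    """Return True only if the token was signed for this email and is unexpired."""
    now = time.time() if now is None else now
    expires_str, _, sig = token.partition(".")
    if not expires_str.isdigit():
        return False
    expires_at = int(expires_str)
    expected = _sign(email, expires_at, key)
    # Compare as bytes so non-ASCII input cannot raise inside compare_digest.
    return hmac.compare_digest(sig.encode(), expected.encode()) and now < expires_at
```

The address counts as validated only when someone follows a link containing a token that `check_token` accepts, which is the whole point: no regex is involved.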

~~~
GauntletWizard
Sadly, if you're sending e-mail sanely, your mail provider likely validates
recipients, and will be annoyed at you if you send them recipients they think
are bogus.

~~~
belovedeagle
The relevant rfc (on mobile; don't remember which) specifically states that
intermediate servers must not validate mailboxes (local parts). And honestly
the domain should be "validated" by the server doing an mx lookup; let dns
handle it.

------
8ig8
One way to get around a domain blacklist is to point your own domain to
Mailinator. Heck, since last year you can even get your own private
Mailinator...

[http://mailinator.blogspot.com/2014/10/mailinator-launches-p...](http://mailinator.blogspot.com/2014/10/mailinator-launches-private-domains_25.html)

------
jessaustin
This reminds me of the sites that discouraged hotlinking by examining Referer
and then sending Goatse.

~~~
kpcyrd
[http://ascii.textfiles.com/archives/1011](http://ascii.textfiles.com/archives/1011)

~~~
jessaustin
I'm glad I was brave enough to click. b^)

------
scoj
That was a fun read.

It took me a bit to get my head around the use cases. It's sometimes amazing
how many different ways you can twist a simple (complex really) thing like
email into a product/idea.

------
brobinson
This is great, but it seems like he got rid of the separate page now (the link
in the article 404s) and the text is just inline again.

------
thetruthseeker1
I love mailinator!

However, tricking site scrapers may not work perfectly if the scrapers
maintain a "whitelist" of websites. Say I am scraping mailinator.com for
domain names: if I see gmail.com or yahoo.com, I might just not put them in my
database because they are in my whitelist.
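
As a sketch of what such a scraper-side filter might look like (the names and the two-entry list are purely illustrative, taken from the examples above):

```python
WELL_KNOWN_DOMAINS = frozenset({"gmail.com", "yahoo.com"})  # illustrative only


def filter_scraped(domains, allowlist=WELL_KNOWN_DOMAINS):
    """Drop scraped entries whose domain is on the scraper's own allowlist.

    Entries may carry a leading '@', as in the scraped output earlier
    in the thread; the original strings are returned unchanged.
    """
    return [d for d in domains if d.lower().lstrip("@") not in allowlist]
```

A planted gmail.com entry would be silently skipped by a filter like this, so the trick only catches scrapers that store whatever they see.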

------
fapjacks
I've used Mailinator for years and it's always interesting to read what this
dude has to say.

------
codexon
Mailinator seems to have added some other anti-scraping detection.

Unfortunately it does not work very well as I was not scraping mailinator, but
still somehow got IP banned. Fortunately my ip has changed. But they
definitely have some strange and overzealous method now.

------
simi_
Here's a list of disposable email domains if you'd really like to block them:
[https://github.com/lavab/disposable](https://github.com/lavab/disposable)

I would go one step further and look for {spam_words} in
"username+{text}@{googledomain}.com", where spam_words can be "junk", "spam",
etc. This is like a _very_ narrow edge case, but still might catch something.
Again, if you're into that kind of thing; I'm quite skeptical that it brings
any value.
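
A minimal sketch of that check, under the assumption that "googledomain" just means gmail.com and that the word list is the two examples given. It inspects only the tag after the "+", so a spam word inside the mailbox name itself does not trigger it. All names here are hypothetical.

```python
SPAM_WORDS = ("junk", "spam")    # the two examples from the comment
GOOGLE_DOMAINS = ("gmail.com",)  # assumption: extend as needed


def has_spammy_tag(address):
    """Return True if the +tag of a Google address contains a spam word."""
    local, sep, domain = address.rpartition("@")
    if not sep or domain.lower() not in GOOGLE_DOMAINS:
        return False
    _, plus, tag = local.partition("+")
    if not plus:
        return False  # no +tag at all, nothing to inspect
    tag = tag.lower()
    return any(word in tag for word in SPAM_WORDS)
```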

~~~
octo_t
Until you have the German guy, Joseph Unker, with junker89@gmail, and your
validation prevents him from signing up :)

~~~
simi_
That's not at all what I meant. Gmail redirects emails to
"youremail+whateveryouwanthere@gmail.com" to "youremail@gmail.com". Some
people filter emails this way, hence my suggestion to check for this edge
case.

[http://gmailblog.blogspot.de/2008/03/2-hidden-ways-to-get-mo...](http://gmailblog.blogspot.de/2008/03/2-hidden-ways-to-get-more-from-your.html)

PS: The downvoters are downright silly on this website lately.

------
w8rbt
Great story! Before deciding to use blacklists or lockouts (on anything), know
that it can and will be used against you.

------
serve_yay
So much fun to read. Great post, thank you.

------
belovedeagle
Most of the comments here about '+' parts are rendered completely irrelevant
by single-user domains. Not to mention "one email ~= one person" schemes.

------
botbot
Why not encode the domain strings into an image?

OCR requires a lot more programming effort compared to a text-based content
scraper

~~~
tempestn
FTA: "Could I make it harder to scrape? Well, I could, but wouldn't really
slow anyone down much."

I think that's the basic idea. He could spend his time making it harder to
scrape, like the bar across the steering wheel. Some people would be deterred,
others wouldn't, and time would be wasted all around.

------
tegansnyder
I'm not sure his method would prevent a headless scraper like CasperJS or
PhantomJS from doing the dirty work, but nice technique nonetheless.

~~~
geofft
At least at the time of writing, if you had enough foresight and engineering
time to set something like that up, you had enough foresight and engineering
time to not make your system treat email addresses as meaningful identities.

~~~
nirvdrum
Perhaps I'm missing something, but an extremely high percentage of the sites I
have accounts on use my email address for authentication. Those that don't
often suffer from username squatting. Maybe most sites are just doing it
wrong, but what's the prevailing alternative?

~~~
geofft
Your email address isn't your _identity_. It's a name associated with your
identity, but the identity itself is your account. Or put another way, not all
valid email addresses are valid identities for these websites.

If the website is doing things right, they have other means (like a CAPTCHA at
the least, or phone verification, or you buying an item from them) before
deciding that an email address really is an identity.

~~~
nirvdrum
I guess I still fail to see the distinction. CAPTCHAs really only keep out
bots . . . they do nothing for keeping out Mailinator abuse. Throwaway phone
numbers are easily obtainable. They might not be as cheap as Mailinator, but
the point is Mailinator made it faster and cheaper for people. Buying an item
doesn't really work out when the expectation is you offer a free trial and
that's where the bulk of abuse occurs.

I realize this was a non-comprehensive list and I'm not trying to just attack
it. I think I agree with the core assessment around what constitutes an
identity. But short of some really draconian methods, I think you're basically
trading off one insufficient method for another. And at that point, you may as
well focus on making things easy for people, which typically means just
working with email verification.

FWIW, when faced with Mailinator abuse I resorted to requiring a credit card
number to sign up for a trial of my SaaS product. The abuse stopped
immediately. But there were other impacts to the business as a result. I still
debate the wisdom of it and how much of this should have been foresight. As a
bootstrapped company, dealing with abuse was just a resource drain and forced
me to focus my efforts on dealing with a segment of the population that was
never going to give me money. Suffice to say, it was all very disheartening.

Anyway, thanks for sharing your thoughts on the matter.

