

Ask HN: On crawling email ids - amolgupta

People use the format abc[at]companyname[dot]com to keep crawlers from harvesting their email ids when posting on public forums. How effective is that? Can't the crawlers just change their regex to parse these patterns as well?
======
27182818284
Yes, and more to the point, it is trivially easy. A Stanford class had you
handle the [at] case in an early assignment on writing a spam bot a few years
ago.
[http://www.google.com/recaptcha/mailhide/apikey](http://www.google.com/recaptcha/mailhide/apikey)
might be more effective, but the best thing is just to have a great spam
filter and accept that someone will guess or find your email rather than
trying to hide it.
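
To illustrate how little work the [at]/[dot] form takes to undo, here is a
minimal sketch in Python. The names OBFUSCATED, DOT and deobfuscate are made up
for this example, and the pattern only assumes square brackets or parentheses
around "at" and "dot"; it is not taken from any real crawler.

```python
import re

# Hypothetical pattern: "user[at]domain[dot]tld", allowing optional spaces
# and either square brackets or parentheses around "at" / "dot".
OBFUSCATED = re.compile(
    r"([\w.+-]+)\s*[\[(]\s*at\s*[\])]\s*"
    r"([\w-]+(?:\s*[\[(]\s*dot\s*[\])]\s*[\w-]+)+)",
    re.IGNORECASE,
)
# Matches one bracketed "dot" together with any surrounding whitespace.
DOT = re.compile(r"\s*[\[(]\s*dot\s*[\])]\s*", re.IGNORECASE)


def deobfuscate(text):
    """Return plain addresses recovered from bracketed at/dot forms."""
    return [
        f"{m.group(1)}@{DOT.sub('.', m.group(2))}"
        for m in OBFUSCATED.finditer(text)
    ]


print(deobfuscate("mail abc[at]companyname[dot]com please"))
```

The same function also handles spaced variants such as
"abc (at) example (dot) co (dot) uk".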

Remember that an email address can spread by means other than crawlers, too.
Sign up to a grocery store discount card with it? It is in some for-sale
database somewhere.

------
wglb
Pure speculation here: crawlers go for bulk and likely don't care if they pick
up garbage. There may be another level of email harvesting that goes to the
level that you suggest, but given how many conventions like the one you show
are in use, such code would have to cover lots of them. The return might be
very low.

~~~
stevekemp
They really don't care about accuracy at all.

For example I regularly receive email delivery attempts to message-ids - which
obviously look like email addresses but are not.
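
A rough sketch of why that happens, assuming the harvester uses a loose
"something@host.tld" regex: a Message-ID has the same shape, so it matches.
The regex, the find_addresses helper and the sample header value are all
invented for illustration.

```python
import re

# Assumed naive harvesting pattern: anything shaped like local@host.tld.
NAIVE_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_addresses(text):
    """Return every substring that the naive pattern takes for an address."""
    return NAIVE_EMAIL.findall(text)


# Made-up Message-ID header; it is not a mailbox, but it still matches.
header = "Message-ID: <20130412.1234.abcdef@mail.example.com>"
print(find_addresses(header))
```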

------
jrs235
I would assume some crawlers are configured to find and scrape email addresses
using that (now) "de facto" form and similar ones. With that said, I recently
saw someone use a form similar to:

Email me at abc shift+2 key companyname period com

which, until that becomes more common, might offer better protection from
scrapers.
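
As a sketch of what covering "similar ones" could look like, a scraper might
keep a table of rewrite rules and add one per convention it runs into. The
RULES table, the EMAIL pattern and the harvest function below are hypothetical,
and the rules are deliberately crude.

```python
import re

# Hypothetical rewrite rules; each maps one obfuscation convention to a symbol.
RULES = [
    (re.compile(r"\s*[\[({]\s*at\s*[\])}]\s*", re.I), "@"),   # [at], (at), {at}
    (re.compile(r"\s*[\[({]\s*dot\s*[\])}]\s*", re.I), "."),  # [dot], (dot), {dot}
    (re.compile(r"\s+shift\+2(?:\s+key)?\s+", re.I), "@"),    # "shift+2 key"
    (re.compile(r"\s+(?:period|dot)\s+", re.I), "."),         # spelled-out dot
]
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def harvest(text):
    """Apply every rewrite rule in order, then extract address-like strings."""
    for pattern, symbol in RULES:
        text = pattern.sub(symbol, text)
    return EMAIL.findall(text)


print(harvest("Email me at abc shift+2 key companyname period com"))
```

Rules this loose also rewrite ordinary prose containing " dot " or " period ",
so the output will include garbage, and each new convention needs its own rule.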

------
HaseebR7
Just take a screenshot of the text, like this:

[http://i.imgur.com/NF2nqbO.png](http://i.imgur.com/NF2nqbO.png)

