

Ask HN: How to block contact form spam? - damoncali

I've got a low-traffic website that has a very typical contact form on the home page. Very rarely, it gets a legit lead. Often, it gets spammed.

Not being a particularly important form, the pragmatic thing to do would be to remove it. But I took it as a challenge to see if I could block the spam.

I've failed miserably.

I've tried several variations of hidden fields and JavaScript character insertion in an attempt to trick bots, and have come to the conclusion that either the bots are just better than I am, or they're human and I have no hope.

So what are the typical contact form spam methods? How do you stop them?

Is there a trick to doing this that I'm just not seeing?

EDIT: by the way, captchas for users are not an acceptable solution. It must be transparent to an actual person filling out the form.
======
gus_massa
I think that it is easier to filter the messages before reading them, for
example with the Gmail spam filter or something like SpamBayes (
<http://spambayes.sourceforge.net/> ).

You can help the filter a little with a static "captcha", like 'Write the word
"orange".' or 'Calculate: 2+2=' . Or make a special filter rule to test the
"captcha". (Real captchas are more difficult to configure and more annoying to
real users.)
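
A rough sketch of what checking such a static "captcha" could look like on the
server. Python is an assumption here (the thread doesn't say what the site
runs), and the function name and default word are purely illustrative:

```python
def passes_static_challenge(answer: str, expected: str = "orange") -> bool:
    """Return True if the submitted answer matches the fixed challenge word."""
    # Normalise whitespace and case so "Orange " typed by a real person still passes.
    return answer.strip().lower() == expected.lower()
```

Anything that fails the check could be dropped outright or just flagged for
the mail filter rule mentioned above.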

~~~
metachris
Another technique is implementing a honeypot field that is hidden with
JavaScript. It's supposed to stay empty, but bots often fill it out anyway.

I recommend evaluating Akismet[1], the blog spam detection API from
Automattic. It worked very well for me.

[1] <http://akismet.com/>

~~~
damoncali
I tried the javascript method - didn't work at all. Not even once.

------
traskjd
I don't have an answer, but here's a write up about some approaches that I
took earlier in the year but failed with:

[http://www.mindscapehq.com/blog/index.php/2011/04/03/a-faile...](http://www.mindscapehq.com/blog/index.php/2011/04/03/a-failed-attempt-at-stopping-spam-bots/)

~~~
damoncali
I'm getting a 500 on that page (and the root, for that matter).

------
pdenya
I use

<input type="text" name="email" style="display: none;" value="" />

and verify on the backend that it's blank. If it's not, I don't send the
email. It doesn't block 100%, but very few get through.
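
A minimal sketch of that backend check, assuming a Python handler that
receives the form as a dict. The `message` field and the `send_email`
callback are hypothetical; only the hidden `email` input comes from the
snippet above:

```python
def is_probably_bot(form: dict) -> bool:
    """Treat any non-blank value in the hidden honeypot field as a bot."""
    # "email" is the name of the hidden input; a real visitor never sees it,
    # so it should arrive empty or be missing entirely.
    return bool(form.get("email", "").strip())


def handle_contact(form: dict, send_email) -> bool:
    """Send the message only when the honeypot is blank. Returns True if sent."""
    if is_probably_bot(form):
        return False  # drop the submission without sending anything
    send_email(form.get("message", ""))
    return True
```

Because the honeypot is named `email`, the real address field (if any) would
need a different name.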

------
petervandijck
1\. Remove the contact form.

2\. Put a Gmail email address (in plain text) instead.

Gmail's spam filtering is better, and lowering the bar may actually get you
more qualified leads.

~~~
damoncali
Yes, that would be the easy thing. Is there no clever way to stop bots at
least at the form?

~~~
epc
No.

Detailed version: there are too many types of bots seeking too many different
paths into web sites for it to be worth your time writing code to block each
currently known bot method.

They're hitting the form for multiple reasons:

\- probing for potential XSS/CSRF exploits

\- probing for places to publish spam content

\- probing for potential SQL injection exploits

\- probing for exploits no one can foresee at the present time

You could try:

\- the hidden field route #1 - if the field is filled in, assume it's a bot
and reject the content with a 4xx, maybe a 400 or 410

\- the hidden field route #2 - assuming the form is dynamically generated,
fill a hidden field with a unique key, like a uuid derived from the current
time and IP address. If you get a form submission with a duplicate key to one
you've already received then reject it as a bot.

\- if a preponderance of bots are originating from specific networks, ban the
networks using mod_security or your preferred method
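
Hidden field route #2 could be sketched roughly like this. It assumes Python
and a single process keeping keys in memory; the `FormKeys` class and its
method names are made up for illustration, and rejecting keys that were never
issued is an extra assumption on top of the duplicate check:

```python
import time
import uuid


class FormKeys:
    """In-memory tracker for one-time hidden-field keys (hypothetical helper)."""

    def __init__(self):
        self.issued = set()  # keys handed out with a rendered form
        self.used = set()    # keys already seen on a submission

    def issue(self, ip: str) -> str:
        # Derive a UUID from the current time and the client IP.
        key = str(uuid.uuid5(uuid.NAMESPACE_URL, f"{time.time_ns()}-{ip}"))
        self.issued.add(key)
        return key

    def accept(self, key: str) -> bool:
        # Reject keys that were never issued (assumption) or were already used.
        if key not in self.issued or key in self.used:
            return False
        self.used.add(key)
        return True
```

Call `issue()` when rendering the form and put the result in the hidden
field; call `accept()` on submission and respond with a 4xx when it returns
`False`. A real deployment would need shared storage and some expiry for old
keys.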

Do you echo back the content of the feedback form? If so, stop doing this as
the bots may be checking to see if what they submitted was published on your
site and marking your site as either good for publishing spam content, or
possibly exploitable through CSRF/XSS, or both.

Or you could just publish an email address.

~~~
damoncali
_Or you could just publish an email address._

This is what I've done where it matters. I just have this one site where it
doesn't really matter, so I'm trying to do this more as an exercise than
anything. It's humbling to see how ineffective I have been at stopping it.

~~~
epc
I've been fighting spam bots of one form or another since 1994. I think it's
actually easier now to use a plain email address on a web site as a contact
method than it was ten years ago…almost everyone gets that you can click on an
address (assuming it’s linked) and kick off an email.

Certainly it’s worth learning various techniques, since there are going to be
many forms that you need to have on your site(s) which can get hit by bots.
But if the entire purpose is to allow inbound contacts, and you don’t need to
drag along any transactional or demographic information with the contact, a
plain email address should suffice. Whatever email address you use will get
added to every spammer’s contact list, so make sure it’s a generic address that
goes into a separate mailbox.

