Ask HN: How to block contact form spam?
3 points by priv_acy on Nov 13, 2011 | 11 comments
I've got a low traffic website that has a very typical contact form on the home page. Very rarely, it gets a legit lead. Often, it gets spammed.

Not being a particularly important form, the pragmatic thing to do would be to remove it. But I took it as a challenge to see if I could block the spam.

I've failed miserably.

I've tried several variations of hidden fields and JavaScript character insertion in an attempt to trick bots, and I've come to the conclusion that either the bots are just better than I am, or they're human and I have no hope.

So what are the typical contact form spam methods? How do you stop them?

Is there a trick to doing this that I'm just not seeing?

EDIT: by the way, captchas for users are not an acceptable solution. It must be transparent to an actual person filling out the form.



I think that it is easier to filter the messages before reading them, for example with the Gmail spam filter or something like SpamBayes ( http://spambayes.sourceforge.net/ ).

You can help the filter a little with a static "captcha", like 'Write the word "orange".' or 'Calculate: 2+2='. Or make a special filter rule to test the "captcha". (Real captchas are more difficult to configure and annoying to real users.)
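
A rough sketch of what checking such a static "captcha" on the server might look like, in Python. The function name and the default expected answer are made up for illustration; the comparison ignores case and surrounding whitespace so a real person typing " Orange " still passes.

```python
def passes_static_challenge(answer: str, expected: str = "orange") -> bool:
    """Return True when the visitor typed the expected fixed answer.

    Hypothetical helper: 'expected' is whatever word or number the
    form asks for (e.g. "orange", or "4" for 'Calculate: 2+2=').
    """
    # Compare case-insensitively and ignore stray whitespace.
    return answer.strip().lower() == expected.strip().lower()
```

Messages that fail the check could be dropped outright or just tagged so the mail filter treats them as likely spam.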


Another technique is implementing a honeypot field that is hidden with JavaScript. It's supposed to stay empty, but bots often fill it out anyway.

I recommend evaluating Akismet[1], the blog spam detection API from Automattic. It worked very well for me.

[1] http://akismet.com/


I tried the javascript method - didn't work at all. Not even once.


I don't have an answer, but here's a write up about some approaches that I took earlier in the year but failed with:

http://www.mindscapehq.com/blog/index.php/2011/04/03/a-faile...


I'm getting a 500 on that page (and the root, for that matter).


I use

<input type="text" name="email" style="display: none;" value="" />

and verify on the backend that it's blank; if it's not, I don't send the email. It doesn't block 100%, but very few get through.
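
For anyone who wants the backend half spelled out, here is a minimal Python sketch of that check. It assumes the submitted form arrives as a plain dict of field names to strings; the function name is hypothetical, and the honeypot is assumed to be the hidden field named "email" from the snippet above (so the real email field would need a different name).

```python
def is_probably_bot(form: dict, honeypot_field: str = "email") -> bool:
    """Return True when the hidden honeypot field has been filled in.

    Assumption: 'form' maps field names to submitted string values.
    A real visitor never sees the hidden field, so it should arrive
    empty (or not at all); anything else suggests an automated fill.
    """
    value = form.get(honeypot_field, "")
    # Treat whitespace-only values as empty.
    return value.strip() != ""
```

Usage would be something like `if is_probably_bot(form): skip sending the email`, while still returning a normal-looking response to the client.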


1. Remove the contact form.

2. Put a Gmail email address (in plain text) instead.

Gmail's spam filtering is better, and lowering the bar may actually get you more qualified leads.


Yes, that would be the easy thing. Is there no clever way to stop bots at least at the form?


No.

Detailed version: there are too many types of bots seeking too many different paths into web sites for it to be worth your time writing code to block each currently known bot method.

They're hitting the form for multiple reasons:

- probing for potential XSS/CSRF exploits

- probing for places to publish spam content

- probing for potential SQL injection exploits

- probing for exploits no one can foresee at the present time

You could try:

- the hidden field route #1 - if the field is filled in, assume it's a bot and reject the content with a 4xx, maybe a 400 or 410

- the hidden field route #2 - assuming the form is dynamically generated, fill a hidden field with a unique key, like a uuid derived from the current time and IP address. If you get a form submission with a duplicate key to one you've already received then reject it as a bot.

- if a preponderance of bots are originating from specific networks, ban the networks using mod_security or your preferred method
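
Hidden field route #2 could look roughly like this in Python. This is only a sketch under assumptions: the class and method names are invented, keys live in memory (a real site would need shared storage and expiry), and it also rejects keys that were never issued, which goes slightly beyond the duplicate check described above.

```python
import itertools
import time
import uuid


class FormKeyRegistry:
    """Hypothetical in-memory tracker for one-time form keys."""

    def __init__(self) -> None:
        self._issued: set[str] = set()
        self._used: set[str] = set()
        # Counter guards against two keys issued in the same clock tick.
        self._counter = itertools.count()

    def issue(self, ip: str) -> str:
        """Create a key derived from the client IP and current time."""
        seed = f"{ip}|{time.time_ns()}|{next(self._counter)}"
        key = str(uuid.uuid5(uuid.NAMESPACE_URL, seed))
        self._issued.add(key)
        return key

    def accept(self, key: str) -> bool:
        """Return True only the first time an issued key is submitted."""
        if key not in self._issued or key in self._used:
            return False  # unknown or replayed key: treat as a bot
        self._used.add(key)
        return True
```

The page handler would call `issue()` when rendering the form and put the result in a hidden field; the submit handler would call `accept()` and reject the post when it returns False.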

Do you echo back the content of the feedback form? If so, stop doing this as the bots may be checking to see if what they submitted was published on your site and marking your site as either good for publishing spam content, or possibly exploitable through CSRF/XSS, or both.

Or you could just publish an email address.


> Or you could just publish an email address.

This is what I've done where it matters. I just have this one site where it doesn't really matter, so I'm trying to do this more as an exercise than anything. It's humbling to see how ineffective I have been at stopping it.


I've been fighting spam bots of one form or another since 1994. I think it's actually easier now to use a plain email address on a web site as a contact method than it was ten years ago…almost everyone gets that you can click on an address (assuming it’s linked) and kick off an email.

Certainly it’s worth learning various techniques, since there are going to be many forms that you need to have on your site(s) which can get hit by bots. But if the entire purpose is to allow inbound contacts, and you don’t need to drag along any transactional or demographic information with the contact, a plain email address should suffice. Whatever email address you use will get added to every spammer’s contact list, so make sure it’s a generic address that goes into a separate mailbox.



