Ask HN: reCaptcha isn't stopping spam, what should I do?
6 points by _ummc on Aug 28, 2013 | 22 comments
Hey guys, I recently built a small posting site and, to test it, put it up on a previously used URL that I had. Within a few hours I had dozens of spam posts. So I installed reCaptcha, tested it (to make sure it was working), and then cleaned all the spam out.

But even with reCaptcha installed, I am still getting dozens of spam posts a day. Is there a better solution? I have looked into Akismet and Mollom, but I do not want to subject post content to third parties. Is there a better captcha system?

I also have included a form field hidden from the user to catch bots but I do not think it is very effective.

Thanks for the help




One time-tested approach for me is to very precisely measure how long the fastest human could take to fill out a specific form, in ms (this must be done per form, and must consider browser autofilling). Then, include an encrypted timestamp as a hidden field value on said form and check that:

1) The form was recently submitted (i.e. you cannot submit forms from two hours ago). Combined with another approach I take, this also catches bots that try to resubmit already cracked form instances. This is a bigger issue than you might give it credit for. Often they will crack a form instance by hand and then programmatically submit variations of the fields they want to spam. Crack once by hand, submit spam 10000 times automatically thereafter.

2) That the delta between receiving the form submission and when it was generated is greater than how long the fastest human would take.

It has a throttling effect on spamming (if nothing else), in addition to preventing most programmatic spam. It is also nice that it does not depend on client-side JavaScript that can be tampered with. Used in combination with some other approaches, it all but removes the need for captchas on several sites of mine that serve millions of users a year (contact me if you are interested).
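A minimal server-side sketch of the two checks above, in Python. One swap to flag: it signs the timestamp with an HMAC rather than encrypting it, which is enough to stop a bot from forging or editing the value. `SECRET_KEY`, `MIN_FILL_SECONDS`, `MAX_AGE_SECONDS` and both function names are hypothetical, and the thresholds would need to be measured per form as described.

```python
import hashlib
import hmac
import time

# Hypothetical values: measure the minimum per form, as described above.
SECRET_KEY = b"replace-with-a-long-random-secret"
MIN_FILL_SECONDS = 1.0      # fastest plausible human for this form
MAX_AGE_SECONDS = 2 * 3600  # reject forms rendered more than two hours ago


def _sign(issued):
    """Hex HMAC-SHA256 of the timestamp string."""
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def make_form_token(now=None):
    """Return 'timestamp:signature' to embed as a hidden field value."""
    issued = f"{(time.time() if now is None else now):.3f}"
    return f"{issued}:{_sign(issued)}"


def check_form_token(token, now=None):
    """Return True only if the token is authentic, recent, and not too fast."""
    now = time.time() if now is None else now
    issued, sep, sig = token.partition(":")
    if not sep:
        return False  # field missing or malformed
    if not hmac.compare_digest(sig.encode(), _sign(issued).encode()):
        return False  # timestamp was forged or tampered with
    try:
        age = now - float(issued)
    except ValueError:
        return False
    # Check 1: not stale. Check 2: slower than the fastest human.
    return MIN_FILL_SECONDS <= age <= MAX_AGE_SECONDS
```

Note that this sketch alone still accepts the same token many times inside the allowed window. Catching resubmitted form instances would need something extra, such as remembering tokens that have already been used, which is not shown here.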


I was thinking about this, but a person could honestly submit the form in less than a second, since all you have to do is type a comment (think of someone posting 'lol'). I have had users complain when they can't post fast enough (generally when two or more are in a lively discussion), and I always end up turning off the "wait between posts" check for forums like vB, xF, phpBB, etc.

Do bots submit instantly? Would just inserting a timestamp with javascript fix it?


It has worked for me with forms that take less than 1000ms. Simple one-field forms typically have a reasonable minimum length requirement on my platforms (e.g. 32 characters). Things that fit in less than that length are typically things that should be tags (e.g. funny, insightful, etc.). When I try this 32-character minimum myself, just typing gibberish takes longer than the minimum time needed to detect bots.


I see... that makes sense... I will have to check with some of my user base and see how they feel about this


That's a good approach and it just works.


I've had a lot of success by including a field called email in my form, hiding it with css so humans cannot see it, and whenever a submission/login request/post is received, I merely check to see if the email field has been set.

If so, I know it wasn't a human that submitted the post - they could not see it - whilst automated spam tools seemingly cannot resist entering something in that field. I emailed suspect postings to myself with IP address info to add them to an IP block-list later, but showed them a fake "success" page or a "your post has been selected for moderation" page. The latter turned out to be more effective as it resulted in fewer repeated attempts.

I removed the annoying reCaptcha code altogether as a test, and never had to reinstate it. Real users hated the reCaptcha thing anyhow.
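A rough Python sketch of that hidden-field check, assuming the submitted form arrives as a dict of strings. The function names and the response strings are made up for illustration; logging the IP and emailing the suspect post are left as a comment.

```python
def is_probable_bot(form):
    """Treat any non-blank value in the hidden 'email' field as a bot signal."""
    return bool(form.get("email", "").strip())


def handle_post(form):
    """Return the message to show; flagged posts are never stored."""
    if is_probable_bot(form):
        # A real handler would record the IP address and post body here.
        # Showing a moderation notice instead of an error gives the bot
        # no hint that it was caught.
        return "Your post has been selected for moderation."
    # Store the post here.
    return "Posted."
```

A human who never sees the field submits it empty (or not at all), so `handle_post({"comment": "hi", "email": ""})` is treated as a normal post.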


Users hating the captcha has been my biggest fear with this. I will definitely look into the hidden email field as well. That seems more user friendly. Do you hide the element with CSS or some other way?


This is actually pretty smart. Love the idea. I think I am going to give this a shot soon.


Is the spam very specifically targeted at your site? If not, just implement your own very simple captcha system and see if they can handle that.

Spammers generally use captcha solving APIs which map to humans in low-wage countries. They pay ~$2/1000 solved captchas (a few years ago, not sure what it's like now.)

If you're not a specific target, changing your captcha might be enough to no longer easily be a victim of such a service without changes to the spammer software.


I am not sure how to tell if it is targeting the site. I will look into creating another captcha to see if that will help.


Perhaps as an experiment try "enter the first character of your post/comment". That would kill any remote human captcha solvers (since they don't actually know the post content), and likely require some rewriting of the spamming software, assuming that is automated.

Of course, this is not a long term perfect solution against motivated adversaries, but it's a way to see how the current spammers work.
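The "first character" experiment could be checked on the server along these lines. This is only a sketch: the function name is hypothetical, and ignoring case and surrounding whitespace is my assumption about what would be least annoying for real users.

```python
def first_char_matches(post_text, answer):
    """True if `answer` is the first non-space character of the post."""
    text = post_text.strip()
    guess = answer.strip()
    if not text or len(guess) != 1:
        return False
    # Compare case-insensitively so "h" is accepted for "Hello".
    return text[0].casefold() == guess.casefold()
```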

Another: Add a keydown handler to your message-textarea and log (to a hidden form field) how many key presses are being used per post. If the spam software is setting the content field programmatically, you then know how to detect them.


That is a great idea... much simpler than I had imagined. I will implement this tomorrow, based on the results from today.


They're both good ideas, but bear in mind that the keydown detection may trigger under other conditions (eg I use a plugin to let me edit text fields in external vim. People pasting quotes/urls could also be odd keystroke numbers)


I could just check for 1 or greater. I wouldn't have to check for an equal amount
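For the server side of the keydown idea, a sketch of the "1 or greater" check might look like this. It assumes the page's keydown handler writes its running count into a hidden field, here given the hypothetical name `keycount`; the client-side handler itself is not shown.

```python
def looks_typed(form, min_keys=1):
    """True if the hidden key-press counter reports at least `min_keys`."""
    raw = form.get("keycount", "")
    try:
        count = int(raw)
    except (TypeError, ValueError):
        return False  # field missing or not a number
    return count >= min_keys
```

A threshold of one avoids comparing the count to the post length, so pasted quotes or URLs pass as long as at least one key was pressed. A post written entirely in an external editor could still report zero, so it may be safer to treat a failure as a signal for moderation rather than an outright rejection.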


Hi Prefinem,

Include a small piece of Javascript that sets a hidden field in your form to some password when the page loads.

Then, when the form is posted, verify on the server that that field has been set to the password.

This should get rid of most of your spam. The reason it works is that you are being hit with spam from a program called "XRumer", which doesn't emulate the JavaScript of the pages it interacts with. It simply brute-forces the forms on every single page it finds, and solves their captchas too. However, even the simplest JavaScript will stop it from working.
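Here is one way the server half of this could look in Python. It is a sketch with assumptions: instead of a single fixed password it generates a random value per page view (which the caller would have to keep in the session), and the field id `js_check` and all function names are invented for the example.

```python
import hmac
import secrets

FIELD_NAME = "js_check"  # hypothetical id/name of the hidden input


def new_js_token():
    """Random value to store server-side (e.g. in the session) per page view."""
    return secrets.token_hex(16)


def render_js_snippet(token):
    """Return a <script> tag that fills in the hidden field on page load."""
    return (
        "<script>"
        "window.addEventListener('load', function () {"
        f"document.getElementById('{FIELD_NAME}').value = '{token}';"
        "});"
        "</script>"
    )


def js_token_ok(form, expected_token):
    """True if the posted hidden field matches the value the script set."""
    submitted = form.get(FIELD_NAME, "")
    return hmac.compare_digest(submitted.encode(), expected_token.encode())
```

The page would contain an empty `<input type="hidden" id="js_check" name="js_check">` plus the snippet. A client that never runs the script posts the field empty and fails `js_token_ok`.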


Thanks nekitamo! I will try this after a change I will do tomorrow morning.


You can set the form's action using JavaScript (e.g. http://stackoverflow.com/questions/2701041/how-to-set-form-a...). Spam robots don't emulate JavaScript. Users with JavaScript disabled will see the message "This action requires Javascript".


The problem is you would lose the ability for users to post this way. Seems strange to cut out a percentage (albeit probably very small) of users to stop spam


Would you be willing to share the link?

Also, was there any significant reduction in the spam posts after installing reCaptcha?


There was a reduction. Like I said, I had several dozen in a few hours, and then it went down to about several dozen in a day...

The link is http://wtpaf.com

FYI, I have cleared out everything just a few minutes ago while working on the site.


The captcha seems to be working fine. Normally, you don't see bots bypassing it that easily. Can you record the user-agent for each submission?
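Once the user-agent is stored with each submission, a quick tally can show whether the spam comes from a handful of clients. A small sketch, assuming each logged submission is a dict with a hypothetical `user_agent` key:

```python
from collections import Counter


def tally_user_agents(submissions):
    """Return (user_agent, count) pairs, most frequent first."""
    counts = Counter(s.get("user_agent") or "(none)" for s in submissions)
    return counts.most_common()
```

Submissions with a missing or empty user-agent are grouped under "(none)", which is often a telling bucket on its own.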


I can... I will start doing that and let you know what I see



