

Ask HN: reCaptcha isn't stopping spam, what should I do? - Prefinem

Hey guys, I recently built a small posting site and put it up on a previously used URL that I had, to test it. Within a few hours I had dozens of spam posts. So I installed reCaptcha, tested it to make sure it was working, and then cleaned all the spam out.

But even with reCaptcha installed, I am still getting dozens of spam posts a day. Is there a better solution? I have looked into Akismet and Mollom, but I do not want to subject post content to third parties. Is there a better captcha system?

I also have included a form field hidden from the user to catch bots, but I do not think it is very effective.

Thanks for the help
======
hashtree
One time-tested approach for me is to very precisely measure how long the
fastest human could fill out a specific form in ms (must be done per form, and
must consider browser autofilling). Then, include an encrypted timestamp value
as a hidden field value on said form and check that:

1) The form was generated recently (i.e. you cannot submit forms from two
hours ago). Done well with another approach I take, this also catches bots
that try to resubmit already cracked form instances. This is a bigger issue
than you might give it credit for. Often they will crack a form instance by
hand and then programmatically submit variations of the fields they care to
spam in. Crack once by hand, submit spam 10000 times automatically
thereafter.

2) That the delta between receiving the form submission and when it was
generated is greater than how long the fastest human would take.

It has a throttling effect on spamming (if nothing else), in addition to
preventing most programmatic spam. It is also nice that it does not depend on
client-side javascript that can be tampered with. Used in combination with
some other approaches, I have several sites that serve millions of users a
year that all but remove the need for captchas (contact me if you are
interested).
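
A minimal sketch of the two checks above, under some assumptions: the comment
describes an encrypted timestamp, while this version swaps in an HMAC-signed
one (readable by the client, but not forgeable without the server secret).
The names SECRET_KEY, MIN_FILL_SECONDS, MAX_AGE_SECONDS, issue_token and
check_token are made up for illustration, and the thresholds are placeholders
that would need measuring per form.

```python
import hashlib
import hmac
import time

# Assumptions for illustration only: pick your own secret and thresholds.
SECRET_KEY = b"replace-with-a-random-server-side-secret"
MIN_FILL_SECONDS = 1.0       # assumed fastest human time for this form
MAX_AGE_SECONDS = 2 * 3600   # forms older than two hours are rejected


def _sign(issued_at: str) -> str:
    """HMAC-SHA256 signature over the timestamp string."""
    return hmac.new(SECRET_KEY, issued_at.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None) -> str:
    """Return 'timestamp:signature' to embed as a hidden form field."""
    ts = time.time() if now is None else now
    issued_at = f"{ts:.3f}"
    return f"{issued_at}:{_sign(issued_at)}"


def check_token(token: str, now=None) -> bool:
    """True only if the token is authentic, not too fresh, and not stale."""
    now = time.time() if now is None else now
    issued_at, sep, sig = token.partition(":")
    if not sep:
        return False
    # Constant-time comparison; reject anything we did not sign ourselves.
    if not hmac.compare_digest(sig.encode(), _sign(issued_at).encode()):
        return False
    age = now - float(issued_at)
    return MIN_FILL_SECONDS <= age <= MAX_AGE_SECONDS
```

This sketch does not stop the same valid token from being replayed within the
two-hour window; that would need the server to remember which tokens it has
already accepted.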

~~~
Prefinem
I was thinking about this, but the form submission could honestly take less
than a second by a person since all you have to do is type a comment.
(Thinking of someone posting 'lol') to code. I have had users complain when
they can't post fast enough (generally when two or more are in a vivid
discussion) and always end up turning off the "wait between posts" check for
forums like vB, xF, phpBB, etc..

Do bots submit instantly? Would just inserting a timestamp with javascript fix
it?

~~~
hashtree
It has worked for me with forms that take less than 1000ms. For simple one
field forms, they typically have a reasonable length requirement on my
platforms (e.g. 32 characters). Things that fit in less than that length are
typically things that should be tags (e.g. funny, insightful, etc). Trying
this 32 char minimum myself just typing gibberish gets me over the minimum
needed time to detect bots.

~~~
Prefinem
I see... that makes sense... I will have to check with some of my user base
and see how they feel about this

------
notlisted
I've had a lot of success by including a field called email in my form, hiding
it with css so humans cannot see it, and whenever a submission/login
request/post is received, I merely check to see if the email field has been
set.

If so, I know it wasn't a human that submitted the post - they could not see
it - whilst automated spam tools seemingly cannot resist entering something in
that field. I emailed suspect postings to myself with IP address info to add
them to an IP block-list later, but showed them a fake "success" page or a
"your post has been selected for moderation" page. The latter turned out to be
more effective as it resulted in fewer repeated attempts.

I removed the annoying reCaptcha code altogether as a test, and never had to
reinstate it. Real users hated the reCaptcha thing anyhow.
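
The server side of the hidden-field check described above might look roughly
like this. It is only a sketch: `form` is assumed to be a plain dict of
submitted fields, and the function names and response strings are
placeholders, not anything from a particular framework.

```python
def is_probable_bot(form: dict) -> bool:
    """True when the hidden 'email' honeypot field came back filled in."""
    return bool(form.get("email", "").strip())


def handle_post(form: dict) -> str:
    """Choose a response; flagged posts get a moderation notice, not an error."""
    if is_probable_bot(form):
        # Do not store the post; optionally log the IP for a block-list.
        return "Your post has been selected for moderation."
    return "Thanks, your post is live."
```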

~~~
Prefinem
Users hating the captcha has been my biggest fear with this. I will
definitely look into the hidden email field as well. That seems more user
friendly. Do you hide the element with css or some other way?

------
computer
Is the spam very specifically targeted at your site? If not, just implement
your own very simple captcha system and see if they can handle that.

Spammers generally use captcha solving APIs which map to humans in low-wage
countries. They pay ~$2/1000 solved captchas (a few years ago, not sure what
it's like now.)

If you're not a specific target, changing your captcha might be enough to no
longer easily be a victim of such a service without changes to the spammer
software.

~~~
Prefinem
I am not sure how to tell if it is targeting the site. I will look into creating
another captcha to see if that will help

~~~
computer
Perhaps as an experiment try "enter the first character of your post/comment".
That would kill any remote human captcha solvers (since they don't actually
know the post content), and likely require some rewriting of the spamming
software, assuming that is automated.

Of course, this is not a long term perfect solution against motivated
adversaries, but it's a way to see how the current spammers work.
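
For the "first character" experiment, the server check could be as small as
this sketch. The function name and the choice to ignore case and surrounding
whitespace are assumptions on my part.

```python
def first_char_matches(content: str, answer: str) -> bool:
    """True if `answer` is the first non-space character of the post."""
    text = content.strip()
    reply = answer.strip()
    if not text or len(reply) != 1:
        return False
    # Case-insensitive so "h" is accepted for a post starting with "Hello".
    return text[0].casefold() == reply.casefold()
```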

Another: Add a keydown handler to your message-textarea and log (to a hidden
form field) how many key presses are being used per post. If the spam software
is setting the content field programmatically, you then know how to detect
them.
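
On the server, the keypress counter could be read with something like the
sketch below. The hidden field name `key_count` is hypothetical, and the
client-side keydown handler that fills it is not shown. Keep in mind that a
human who pastes text with the mouse would also report zero key presses, so
this is better used for logging than for hard rejection.

```python
def typed_by_keyboard(form: dict) -> bool:
    """True if the hidden key counter reports at least one key press."""
    try:
        key_count = int(form.get("key_count", ""))
    except (TypeError, ValueError):
        # Missing or non-numeric counter: the script never ran or was bypassed.
        return False
    return key_count > 0
```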

~~~
Prefinem
That is a great idea... much simpler than I had imagined. I will implement
this tomorrow to base off results from today

------
nekitamo
Hi Prefinem,

Include a small piece of Javascript that sets a hidden field in your form to
some password when the page loads.

Then, when the form is posted, verify on the server that that field has been
set to the password.

This should get rid of most of your spam. The reason it works is that you are
being hit with spam from a program called "Xrumer", which doesn't emulate the
Javascript of the pages it interacts with. It simply brute-forces the forms on
every single page it finds, and solves their captchas too. However, even the
simplest Javascript will stop it from working.
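
The server-side half of that check could look like the sketch below. The
field name `js_token` and the value in EXPECTED_VALUE are placeholders; the
page's script is assumed to copy the same value into the hidden field on
load.

```python
import hmac

# Hypothetical value that the page's script writes into the hidden field.
EXPECTED_VALUE = "set-by-javascript"


def passed_js_check(form: dict) -> bool:
    """True if the hidden field holds the value the script should have set."""
    submitted = form.get("js_token", "")
    # compare_digest avoids leaking the expected value through timing.
    return hmac.compare_digest(submitted.encode(), EXPECTED_VALUE.encode())
```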

~~~
Prefinem
Thanks nekitamo! I will try this after a change I will do tomorrow morning.

------
Misiek
You can change the action of the form using Javascript (e.g.
[http://stackoverflow.com/questions/2701041/how-to-set-form-a...](http://stackoverflow.com/questions/2701041/how-to-set-form-action-through-javascript)).
Spam robots don't emulate Javascript. Users with no Javascript will see the
message "This action requires Javascript".

~~~
Prefinem
The problem is you would lose the ability for users to post this way. Seems
strange to cut out a percentage (albeit probably very small) of users to stop
spam

------
roadg33k
Would you be willing to share the link?

Also, was there any significant reduction in the spam posts after installing
reCaptcha?

~~~
Prefinem
There was a reduction. Like I said, I had several dozen in a few hours, and then
it went to about several dozen in a day...

The link is [http://wtpaf.com](http://wtpaf.com)

FYI, I have cleared out everything just a few minutes ago while working on the
site.

~~~
roadg33k
Captcha seems to be working fine. Normally, you don't see bots bypassing it
that easily. Can you record the user-agent for each submission?
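
Once the user-agent is stored with each submission, a quick tally can show
whether the spam comes from a handful of clients. A rough sketch, assuming
each submission is a dict with a `user_agent` key (a made-up name):

```python
from collections import Counter


def tally_user_agents(submissions: list) -> Counter:
    """Count submissions per user-agent; absent or empty ones are grouped."""
    return Counter(s.get("user_agent") or "(missing)" for s in submissions)
```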

~~~
Prefinem
I can... I will start doing that and let you know what I see

