
A negative captcha - capex
https://github.com/subwindow/negative-captcha
======
meowface
This is probably good to avoid general low-hanging-fruit spam, but anyone who
spends even a second manually configuring something to spam your exact site
will be able to get past this with extreme ease.

So for most sites, it'd probably be a good idea to put things like this in
place but then to also be ready to roll out reCAPTCHA at any given moment, in
the event a serious spam campaign begins.

~~~
jtheory
Yes, it'd be easy to get around this for a single site.

But the people writing bots _really do not_ want to have to tweak their bot
for any single site. And most of the people _using_ bots cannot write their
own.

I've been using a very simple negative captcha for many years with great
success. I just don't get bot spam, and before I put it in place I got lots of
bot spam.

There are some gotchas, though -- the main one, which this project doesn't seem
to know about, is form-fillers like Google Chrome, RoboForm, etc. -- i.e.,
bots that you _want_ to be able to use your form.

My first version of a negative captcha tried to be sneaky -- I called the
field "name" or something like that, and called the real name field "name2".
This was a disaster; the form-fillers all put values into the name field, and
suddenly tons of users (especially Google Chrome users) were unable to submit
comments to me... and so quite probably a lot of them were unable to find a
way to even _tell_ me it was broken. Problems like this are a very good
reminder to _always_ use helpful, kind error messages even when you think
you've just caught a spammer or some other nasty. You may have caught an
actual customer.

My current version names the field something obviously NOT a standard field
name, and gives it a label which is also not a standard field name (this was
important), and I even clear the field with JavaScript on form submission just
in case. The name of the field is hard-coded, as are all of the other field
names on the page.
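A minimal sketch of the server-side half of a check like this, assuming the submitted form has already been parsed into a plain object. The field name `zz_leave_blank`, the function name and the message text are made up for illustration; the comment doesn't say what the real ones are.

```javascript
// Hypothetical honeypot field name: deliberately NOT something an
// autofill tool would recognise as "name", "email", "address", etc.
const HONEYPOT_FIELD = "zz_leave_blank";

// Returns a result object instead of throwing, so the caller can show
// a kind, helpful message if a real person tripped the check.
function checkHoneypot(form) {
  const value = String(form[HONEYPOT_FIELD] ?? "").trim();
  if (value === "") {
    return { ok: true, message: "" };
  }
  return {
    ok: false,
    message:
      "Please leave the box marked 'leave this blank' empty and submit again.",
  };
}
```

The friendly message is the point: if a form-filler rather than a bot filled the field, the user at least learns what to undo.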

I have been ready to roll out more clever versions, but the years go by, and
there's still no need yet.

Unless you have a site that's a big target for bot spammers, I highly
recommend you just roll your own, and leave it extremely simple unless the
spam returns (someday it will, presumably, but it has been at least 6-7 years
and I'm still on the so-simple-it's-silly version).

This library could work with some tweaks and simplification, but currently I'd
worry about form-fillers in real users' browsers. It hashes all of the real
field names and makes the honeypot fields look valid. So a user who is
accustomed to having their name/address filled in will first find that
function is broken -- next they'll find themselves accused of being a bot when
they submit the form (after tediously filling it out by hand).

~~~
meowface
It depends on the kind of spam you're fighting against.

I speak from the perspective of someone who often creates websites that
various groups of people, for whatever reason, would very much like to cause
chaos on. I use "spam" to mean both "advertising spam" as well as "distributed
flooding" (sometimes in the form of so-called "shitposting", and/or just
random text and images).

Heavy spamming can be an effective form of DoSing if it's not limited or
controlled well enough. There are many people out there who take great
personal pleasure in disrupting or reducing the quality of a service.

If you want to protect against typical pharma spam, this will be good. If you
want to protect against the kind of thing I described above, then only a
service like Cloudflare can help, and even then it can only help you so much.

(It's not easy to bypass Cloudflare from a straight bandwidth DDoS
perspective, but it's not too hard if you just want to get through its
anti-bot filtering to post spam. In such a case you can enable reCAPTCHA from
the Cloudflare security panel, though.)

~~~
stevekemp
To give a perspective on numbers I host/maintain blogspam.net and we've
blocked 1 million spam comments in just over two weeks.

That's a hell of a lot of spam.

------
eli
I'd urge real caution in using a honeypot field with a meaningful name.
Various crappy browser plugins will helpfully auto-fill a field named "name"
even if it's hidden. Also consider how the field will look to screen readers.

~~~
kijin
<div style="position: absolute; left: -2000px;"> will definitely cause
problems with screen readers.

These problems can be mitigated to some extent if you have multiple tests and
allow clients to fail some of them. That's the approach I used in my latest
project. There are approx. 10 tests. Some use CSS and JS. Some use random
tokens and hashes. Some are based on assumptions about the target demographic,
such as their location, preferred language, and browser capabilities. (e.g. If
you're accessing a mobile site with an IE7 user agent, something's fishy.)

Fail any one of them, and your submission is accepted but flagged. Fail two,
and your submission is rejected. Fail two X times in Y minutes, and your IP is
banned for Z hours. (The fail2ban approach is useful when someone tries to
brute-force your hashed field names.)
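The escalation rule above could be sketched roughly like this. The values for X, Y and Z are placeholders (the comment doesn't give them), the in-memory maps stand in for whatever store a real site would use, and `judge` is a hypothetical name.

```javascript
// Placeholder thresholds; the comment leaves X, Y and Z unspecified.
const MAX_STRIKES = 3;               // X: rejected submissions allowed...
const WINDOW_MS = 10 * 60 * 1000;    // Y: ...within this many milliseconds
const BAN_MS = 2 * 60 * 60 * 1000;   // Z: how long the IP stays banned

const strikes = new Map();     // ip -> timestamps of rejected submissions
const bannedUntil = new Map(); // ip -> timestamp when the ban expires

// failedTests is how many of the individual checks this submission failed.
// now is passed in (milliseconds) so the logic is easy to test.
function judge(ip, failedTests, now) {
  if ((bannedUntil.get(ip) || 0) > now) return "banned";
  if (failedTests === 0) return "accept";
  if (failedTests === 1) return "flag"; // accepted, but marked for review

  // Two or more failures: reject, and record a strike for this IP.
  const recent = (strikes.get(ip) || []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  strikes.set(ip, recent);

  if (recent.length >= MAX_STRIKES) {
    bannedUntil.set(ip, now + BAN_MS);
    return "banned";
  }
  return "reject";
}
```

Because a single failure only flags the submission, someone with a misbehaving autofill plugin still gets through; it takes repeated double failures to trigger a ban.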

Occasionally, of course, you're going to run into a legitimate client who
fails two or more tests simultaneously, e.g. someone who uses a screen reader
with a crappy autofill plugin. But those cases should be exceedingly rare. In
any case the rules can be easily adjusted.

~~~
eli
That sounds too complicated and fragile for me to use.

Really, I was just getting at that you should probably label the field as "Do
not type anything in this box" so users with assistive devices don't get
trapped. I'd also provide a useful, non-cryptic error message for if the
captcha fails.

------
neotek
I've been using this technique with great success for a while now, and still
catch hundreds of fake registrations a day[1]. The few accounts that slip
through are fairly easy to detect algorithmically, and it seems that they're
mostly created by real people, usually in China or Russia, presumably for a
small fee.

[1] [http://i.imgur.com/Y6lEFGS.png](http://i.imgur.com/Y6lEFGS.png)

~~~
tantalor
More details please?

* What is your site?

* How do you classify spam accounts?

* Which anti-captcha did you use? How does it work?

* You only showed 1 day after the change. Was the drop in spam account registration permanent? I would expect it to immediately bounce back the next day once the adversary figured out what you changed.

~~~
neotek
The graph shows all the attempted spammer registrations I caught over the last
month, the count of '1' for today is because the clock had just ticked over
when I generated the image.

~~~
tantalor
Ah, so how many were you receiving before the change?

------
latchkey
I implemented this years ago for kink.com (nsfw). It really helped quite a bit
with the automated logins. It also shows a captcha after a number of
unsuccessful logins or if someone is clearly sharing an account. We called it
the 'cockblocker'. Heh.

~~~
Jhsto
Porn sites are prey to bruteforcers, who use thousands of proxies to scan for
working accounts. I know people who do this and I can say that no negative or
usual CAPTCHA stops them. What you would need is a lazy loaded CSRF field,
which breaks their bruteforcing applications.

~~~
latchkey
We just showed a captcha after enough failures. If we saw a massive spike of
failures, we'd just add their ip to our firewall and disappear off the net for
a period of time. This stops 99.99% of the attacks. For the rest of them, the
feeling was that if they wanted free porn badly enough, then it was free
marketing for the company.

~~~
Jhsto
I've understood that after the page shows a CAPTCHA, the bot either tries to
solve it with OCR or just rotates to the next proxy. By rotating
through, let's say, 7000 proxies, by the time it has gone through all of them
the first ones no longer get the CAPTCHA challenge.

One funny thing I remember being told is that if the main site seems too
hard to configure for, they just switch to the mobile version of the site, which
usually has less security.

For them it really is free porn as well as a hobby, as I've seen logfiles of
hundreds of subscribed accounts. Some people steal the credit card
information attached to them, while some just keep a giant library of porn
accounts on demand.

When I noticed this problem I coded a service which would have prevented all
of the attacks made with the tools they used. I contacted various porn and
filehosting sites, but none ever replied to me. It's a pity that the site
owners either do not care or can't address the problem.

~~~
latchkey
In the case of kink, we changed things up quite a bit. An account is
meaningless until you add a subscription or kinks (the micro-currency we
developed) to it and also your account never goes away. This is different than
99.99% of the sites out there which generally tie an account to a
subscription, when the subscription is up, so is the account. Those sites
generally just add/remove lines from a simple .htpasswd file and it is those
files which usually get stolen from badly configured servers and sent around
the forums. If someone shares an account on kink (different ip's/browser
agents over a period of time), they are all automatically cockblocked... by
changing the users password, thus making them do the password recovery dance.

It is no surprise nobody contacted you back... most sites are either run by
people who have no clue about tech (and thus wouldn't be able to do anything
with you) or they are smart and don't trust some random person emailing them
saying they can fix their security issues.

------
ajmurmann
I am afraid that password managers like 1Password will get caught in this. I
always use it not only to remember my password, but also to fill in sign up
forms with my data. I am sure it would fall for the honeypot and it would be
very hard for me to find out what went wrong. I am sure the experience would
be incredibly frustrating.

~~~
tantalor
How effective are these honeypot form fields? There are only so many ways to
make a form field invisible. Simple inspection of the DOM and CSS ought to be
enough to determine whether the field is visible and nearby the other fields.

~~~
Flavius
They are so effective that you'll get dozens of support tickets from the
1Password users that can't sign up or log in.

Although this is the best way to avoid spam, I cannot recommend it unless you
don't care about lost customers.

~~~
eridius
If they can't sign up, can they even send you support tickets?

The worst part about blocking legitimate sign-ups is you may be making it
impossible to even hear their problems.

------
lwf
I can imagine this might be problematic for people using assistive technology…

~~~
67726e
There are HTML attributes you can use to denote things like "ignore this
field" for screen readers and the like.

[http://www.w3.org/WAI/intro/aria](http://www.w3.org/WAI/intro/aria)

~~~
nyrina
And the bots know this information as well, and use it.

------
druska
This is a form of security through obscurity. A bot could easily (relative to
a positive captcha bot) be created to check if the form fields are visible.

~~~
xfs
Captcha itself is in no way a security measure.

~~~
brandynwhite
Totally agree. Bots vs humans is not an issue of security at all in a
cryptographic sense (which that phrase refers to). For this particular task
all we have are tricks that have practical value. It isn't even clear if the
problem is meaningful in an absolute sense, while cryptographic protocols can
be clearly defined and reasoned about.

Would it even be possible to solve this problem in a serious way? If you could
then would that mean strong AI is not possible? If not then why don't we
figure out something better like asking users to actually pay for things and
then we don't have to solve these philosophical quandaries. If it's too hard
for people to pay for things then let's focus on that problem instead. If you
don't want money and just want to rate limit then look into proof of work
puzzles.
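For the rate-limiting case, a hashcash-style puzzle could look roughly like the sketch below. It is only an illustration: the difficulty is a made-up number, the function names are hypothetical, and both halves are written for Node (a browser client would have to use its own SHA-256 implementation).

```javascript
const nodeCrypto = require("crypto");

// Made-up difficulty: the SHA-256 hex digest must start with this many zeros.
// Each extra hex digit makes solving about 16 times more expensive.
const DIFFICULTY = 3;

function digest(challenge, nonce) {
  return nodeCrypto
    .createHash("sha256")
    .update(challenge + ":" + nonce)
    .digest("hex");
}

// Client side: brute-force a nonce until the digest has the required prefix.
function solve(challenge) {
  const prefix = "0".repeat(DIFFICULTY);
  let nonce = 0;
  while (!digest(challenge, nonce).startsWith(prefix)) {
    nonce++;
  }
  return nonce;
}

// Server side: checking the submitted nonce costs a single hash.
function verify(challenge, nonce) {
  return digest(challenge, nonce).startsWith("0".repeat(DIFFICULTY));
}
```

The server would hand out a fresh `challenge` string with each form and accept each one only once; otherwise a solved puzzle could simply be replayed.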

------
geerlingguy
I wrote something like this as a module for Drupal a few years ago, and it
still works very well for most sites:
[https://drupal.org/project/honeypot](https://drupal.org/project/honeypot)

The honeypot (or honey trap) is somewhat effective against most bot software,
and the timestamp protection is actually pretty effective against most human
spammers. However, there are many sites that will require active spam
prevention due to their popularity, and thus targetability.
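One common reading of "timestamp protection" is rejecting forms that come back faster than a person could plausibly fill them in. The sketch below illustrates that idea only; it is not the Drupal module's code, and the secret, the five-second minimum and the function names are all assumptions.

```javascript
const nodeCrypto = require("crypto");

const SECRET = "change-me"; // hypothetical server-side secret
const MIN_FILL_MS = 5000;   // made-up minimum time to fill in the form

// Sign the timestamp so a client can't simply invent an older one.
function sign(ts) {
  return nodeCrypto.createHmac("sha256", SECRET).update(String(ts)).digest("hex");
}

// Put the returned token in a hidden field when rendering the form.
function issueToken(now) {
  return now + "." + sign(now);
}

// Returns true if the submission should fail the time check.
// A missing or tampered token counts as a failure.
// (A production check would compare MACs in constant time.)
function failsTimeCheck(token, now) {
  const [ts, mac] = String(token).split(".");
  if (!/^\d+$/.test(ts || "") || mac !== sign(ts)) return true;
  return now - Number(ts) < MIN_FILL_MS;
}
```

A check like this needs no JavaScript in the browser, which fits the point about not hurting accessibility.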

Also, regarding accessibility, there are many ways of implementing these spam
prevention techniques without impacting accessibility, even without using JS.

------
adamb_
I feel like having this knowledge be widespread is somewhat self-defeating:
if it were to catch on to any degree, bot makers would improve their bots'
ability to detect which is a "valid" form.

------
julianz
This seems like it's going to bugger up LastPass and its ilk completely. That
would be annoying to say the least.

~~~
gpvos
It's only necessary in the registration form, not in the login form.

------
callum85
Won't people who use automatic form-fillers be wrongly identified as bots?

~~~
pcowans
Yes, I've experienced this exact problem in the past.

------
ars
I do this all the time when installing off-the-shelf forum/blog software.

I'll modify the form to have a field not in the original. Bot authors assume
the forum is stock and don't customize it for mine, and that's enough to block
almost all spam.

------
Kequc
This feels very hacky and not like a long term solution to the problem. With
some effort on the part of spammers they could write bots that simply read the
field labels or placeholder text, check css attributes, etc. Then you have the
problem again and your code is bastardised all to heck for your effort.

Just use positive captcha for now until an actual solution comes along but
don't start screwing around with the output of your markup. That will cause
problems when you want to change or even just run tests on things.

Markup should be readable by machines.

~~~
StavrosK
You're dismissing it too quickly. I've used this method on my stock contact
forms and it has worked 100%. Nobody, _nobody_ bothers writing custom bots for
your thing, so, no matter which method you use, if it's not something common,
it will stop 100% of spammers, if you're a low-profit site for them.

If you're Craigslist, sure, it won't work, but you probably aren't. All my
clients' contact forms thank me for using (something like) this method.

~~~
tokenizerrr
> nobody bothers writing custom bots for your thing

This is not quite true, having written several custom bots myself. Sites with
any kind of social contact will become targeted at some point.

~~~
StavrosK
Later on in the sentence: "if you're a low-profit site for them".

Still, it's a very easy way to deter large amounts of spam, no matter how good
a target you are, so the ROI is almost always positive.

------
JulianMorrison
Stuff hidden far off to the side is going to be a nuisance for blind people
using screen-readers.

------
xenophanes
my blog just does basically this in js:

    
    
      var a = "cat";
      var b = "dog";
      // the server expects this exact value back in a form field
      var c = a + String(234234) + b;
    

then you have to submit the c value with the form. i get no spam. (a prior,
somewhat simpler version did get some spam). obviously this would do nothing if i was
google, but i'm a small blog and no one is going to customize a bot for me,
and apparently none of the bots want to run a full js interpreter.

there are lots of simple solutions to spam as long as no one cares about you.
and if they do care it's hard.

i do think making you do some math (maybe something serious that takes a
quarter second to run) to be able to submit comments is a good approach in
general though. basically humans have lots of spare CPU, but i'm not sure spam
bots do currently. and if posting in a lot of places required running some
math, it would rate limit spam bots.

------
krapp
I've been working on something which will automatically hash the name fields
in a form and generate all my forms with javascript. Haven't considered
adding hidden fields but that might be a good idea. And probably recaptcha
when all that inevitably fails.

Unfortunately I think spam detection is always going to be a moving target. A
lot of these techniques depend on spammers not wanting to go through the
trouble of tweaking something for your particular site, but if your site turns
out to be popular or if they have that sixth can of Red Bull and decide to get
clever there's not much you can do.

------
eksith
This is basically extending the encoded input field trick with another
(CSS)hidden field.

You can get rid of the (CSS)hidden field by using just the encoded field names
instead. That will prevent someone from just copying the HTML field and mass
submitting the same form with multiple IPs etc...

E.G. Encode all the input field names using the session ID and some salt
(maybe the URL of the page?). I've done something similar in PHP previously:

    
    
      public function encodedFieldName( $name ) {
      	return hash( 'ripemd160', $name . FIELD_KEY . $this->IP() );
      }
    

...Where FIELD_KEY is a pre-defined random string unique to the application or
you can set it to the user's session/cookie etc... And IP() is, well, just
getting the IP.

You can then retrieve the actual field name using something like...

    
    
      public function encodedFieldValue( $name, $fields = array() ) {	
      	$enc = $this->encodedFieldName( $name );
      	foreach ( $fields as $k => $v ) {
      		if ( $k === $enc ) {
      			return $v;
      		}
      	}
      	return NULL;
      }
    

And then you can use it as...

    
    
      $name = $this->encodedFieldValue( 'name', $_POST );
    

If you're really paranoid, you can add two extra hidden fields that hold a nonce
and some unique key (maybe using the session_id)

    
    
      $nonce	= hash( 'tiger160,4', $this->someRandomStr( 10 ) );
      $pk		= hash( 'tiger160,4', $nonce . session_id() );
      
      $nonce_name	= $this->encodedFieldName( 'nonce' );
      $pk_name	= $this->encodedFieldName( 'pk' );
    

Send PK and nonce to the user in the hidden fields...

    
    
      echo "<input type='hidden' name='{$nonce_name}' value='{$nonce}' />";
      echo "<input type='hidden' name='{$pk_name}' value='{$pk}' />";
    

...When checking form input recalculate the PK with the nonce to see if it
matches later.

    
    
      $nonce	= $this->encodedFieldValue( 'nonce',	$_POST );
      $pk		= $this->encodedFieldValue( 'pk',	$_POST );
    
      if ( $pk ===  hash( 'tiger160,4', $nonce . session_id() ) ) {
      	return true;
      }

~~~
subwindow
Perhaps I'm misunderstanding your post, but that's exactly what this does.

------
bostonaholic
One thing I have done which works really well is:

Validate the form and fail silently if any of the following are true:

1) first_name or last_name match the email regex

    
    
      - no one is named texx12508@buyrakes.com
    

2) first_name is the same as last_name

    
    
      - I'm ok with Chris Chris having trouble filling out my form.
    

For the "Contact Us" form I fail silently to decrease spam in our inbox.

I did this because I noticed 99.99% (made up number) of our spam matched these
rules.
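The two rules above are simple enough to sketch in a few lines. The email pattern here is a deliberately loose stand-in (the comment doesn't show the actual regex), the case-insensitive comparison is an added assumption, and `looksLikeSpam` is a hypothetical name.

```javascript
// Loose "looks like an email address" pattern; an assumption, not the
// regex the comment's author actually used.
const EMAIL_LIKE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function looksLikeSpam(firstName, lastName) {
  const first = String(firstName ?? "").trim();
  const last = String(lastName ?? "").trim();

  // Rule 1: either name field contains something shaped like an email.
  if (EMAIL_LIKE.test(first) || EMAIL_LIKE.test(last)) return true;

  // Rule 2: first and last name are identical (ignoring case).
  if (first !== "" && first.toLowerCase() === last.toLowerCase()) return true;

  return false;
}
```

When this returns true, the "fail silently" approach would show the normal thank-you page and simply not deliver the message.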

------
peterwwillis
There's some work published by some friends of mine called "Botnet Resistant
Coding" where they combine several techniques to defeat the average script
kiddie with a bot and force them to target someone else out of laziness and
ignorance. This seems very similar, but if you find their paper you can
incorporate more advanced techniques.

------
MattBearman
I have two forms on BugMuncher
([http://bugmuncher.com](http://bugmuncher.com)), neither have captchas, and
neither have ever received a single spam submission.

Other sites I run with much less traffic, and worse google rankings get a fair
bit of spam.

Anyone here got any ideas why this is? (not that I'm complaining)

------
Cogito
I think everyone is aware that we aren't going to be able to completely
eradicate spam. As bots become more intelligent our defence against them must
improve as well.

I like that this is a non-intrusive technique that seems to have met with some
success. It doesn't matter that the method is not perfect, nor that it is
possible for bots to engineer their way past. Innovation in the bot detection
space is the only way to keep up with the spammers.

I would be more interested to find out if the method has any impact on
accessibility, for example if screen readers are unable to use these forms. I
would guess that anything designed to be visible to bots will also be visible
to screen readers.

~~~
IanCal
> I would be more interested to find out if the method has any impact on
> accessibility, for example if screen readers are unable to use these forms.

Which could make using this illegal, depending on where you are (it would in
the UK).

------
manarth
Three critiques:

1. Browser auto-completion of forms will break, because that typically uses
the form name to identify the data that's expected, and this plugin hashes the
form name.

2. The page would not be cacheable, because the hash key changes
for each request.

3. The question of form-labels isn't addressed in the
plugin. Perhaps the "real fields" do have labels - at least they _should_ -
but the negative-captcha fields don't? That would give a clear signal on
fields to avoid. If they were to have labels, what label should be used? A
dictionary word? That would have usability/accessibility implications.

------
snowwrestler
The hidden form field has worked very well for us. We just gave it a
likely-sounding name like "address2" and set visibility:hidden in the site's main
style sheet. If the field has a value, we discard the entire form data.

Spam submissions dropped quite a bit when we implemented this.

The knock against this is that it's easily circumvented. True! But the value
is that it greatly reduces the "script kiddie" background noise. Not only is
that just nicer overall, but it also makes it easier to tell when someone is
purposefully targeting our site.

------
aleem
I have been using a similar technique with great effect:

    
    
      <input type="text" name="jscheck" value="fail" style="display:none" />
    
      <script> $('input[name="jscheck"]').val('pass'); </script>
    

On the server side you need only check if the jscheck=pass.

This comes with two caveats. If the spam bot supports JS, their submission
will pass. If a user browser has JS disabled, their submission will fail.

In production, this has worked really well for me.

------
LukeShu
I once implemented a registration page that asked for year of high-school
graduation (it was relevant to the site, and wasn't mandatory). After a couple
of days, someone noticed that there were a bunch of registrations with
something other than a year in that field (usually the name again), and I was
told to check out if the form was messing up. It turned out that my form
worked, but those were all spam-bots. It turned out to be a fairly effective
way to filter for spammers.
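A filter along these lines could be as small as the sketch below. The accepted range of years is a guess and the function name is made up; the comment only says that bots tended to put non-year text in the field.

```javascript
// raw is whatever arrived in the (optional) graduation-year field.
// currentYear is passed in so the check is easy to test.
function plausibleGraduationYear(raw, currentYear) {
  const value = String(raw ?? "").trim();
  if (value === "") return true;            // the field was not mandatory
  if (!/^\d{4}$/.test(value)) return false; // e.g. a name pasted in again
  const year = Number(value);
  // Made-up bounds: up to a century back, up to a decade ahead.
  return year >= currentYear - 100 && year <= currentYear + 10;
}
```

A registration where this returns false is a good candidate for the spam queue rather than an outright rejection, since a real person can mistype a year too.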

------
EGreg
Negative Captchas are only good against bots that hit many different sites and
don't specialize in yours. Which probably means they are fine for when you are
starting out.

However! If your site relies on people not creating lots of fake accounts and
spamming the site, you may want to require verification of a cellphone number.
After all, cellphone numbers are expensive to obtain, right?

Or are they? I heard Craigslist tried this and people somehow defeated it?

------
joeblau
I worked on something like this 4 years ago for a site I built called
[http://www.mftranslator.com](http://www.mftranslator.com). The concept that
my buddy and I created was a Nomadic Post. I should post the spec for it
online, but we essentially removed all bot spam from posting on our site
without having accounts or captchas and it never interrupted user flow.

------
blossoms
This is a cool idea but I am skeptical of how effective it is long-term. If I
am understanding the design correctly, this method is comparable to simply
changing the name attributes on input tags. It'll take a while for spammers to
catch on, though I believe this negative captcha's effectiveness will decrease
dramatically with adoption.

------
skrebbel
I have the completely unscientific idea that these kinds of negative captchas
have a property that's the exact opposite of security code: it works best when
you code it yourself.

Someone can code a bot that understands the exact tricks that this gem
performs. Make up different but similar tricks, and don't publish them.

~~~
e2e8
I think that strategy is called "security through obscurity."

~~~
icebraining
A captcha is not "security", so the term doesn't apply. It's just a nuisance
filter.

------
nekopa
I thought that this was going to be a security feature in that I have a piece of
software on my computer which I use to challenge a site to make sure that it
is what it says it is, or that the customer service rep I'm talking to is a
human...

------
tuxguy
Is anybody else having trouble opening this link ?

(i am trying from bangalore, india, behind an enterprise nat. i am unable to
open any github or facebook.com page, the page loads partially & never
completely finishes loading)

------
llogiq
Ned Batchelder shared a similar if not equal trick some time ago:
[http://nedbatchelder.com/text/stopbots.html](http://nedbatchelder.com/text/stopbots.html)

------
jobigoud
Obligatory SMBC reference:
[http://www.smbc-comics.com/?id=2999](http://www.smbc-comics.com/?id=2999)

------
Mizza
This is cool, I've been using similar hacks in django to prevent spambots
posting "herbal medicines."

------
cbsmith
This is a funny joke, but I can't imagine it is terribly effective.

~~~
paulgb
I did some experimentation with this sort of technique a few years ago. At
first I was amazed at how naive bots actually are, but it makes sense: they're
not going to spend the extra effort dealing with edge cases like this when
they can just widen their net by hitting more sites. I don't know if things
have changed but a few years ago this would have caught most bots.

~~~
cbsmith
Yeah, but the same argument was true of captchas. Once they caught on, it
wasn't long before bots adapted.

------
etherealG
the first rule of negative captcha is...

been doing this for years, works wonderfully.

------
neves
Nice idea. It just kills the autoform fill usability enhancement.

------
teach
Can bots not read the labels next to the form fields?

~~~
neotek
Unless your site is very heavily trafficked, bot coders aren't going to modify
their scripts just to deal with you.

~~~
mmahemoff
By the same obscurity-based argument, it's probably easier to just roll your
own Captcha than installing/integrating/depending on a third-party library.

_Please prove you're a human by typing "cat" below_

~~~
etherealG
that's true, but this form of captcha is more user friendly, as the user isn't
even aware it's there.

------
mikos02139
shouldn't this be an _inverse_ Captcha? (cf. negative)

