
Captchas Are Becoming Ridiculous - andrewmunsell
http://blog.andrewmunsell.com/post/28232343440/captchas-are-becoming-ridiculous
======
neotek
CAPTCHA seems relatively pointless at stopping spammers since there are dozens
of online services that use human labour to solve them for a dollar per
thousand.

In tests on my own sites I've found that introducing reCAPTCHA during the
registration process leads to a significant increase in people abandoning
their registration when they fail at recognising the text the first time,
without putting a significant dent in spammer registration at all. I've found
it far more effective to do things like randomising form field names (instead
of using names like 'username' and 'password') so the spammer has to scrape
the site to figure out which fields he needs to submit for each and every
account he registers, silently dumping registrations that don't use the
correct field names, and then applying various heuristics to successful
registrations to detect patterns common to spammers.

For instance, one particular spammer always seemed to use the same user agent
string and didn't ever trigger any of the AJAX calls on the page. It was
trivial to detect registrations coming from that one spammer and silently dump
new accounts when he created them.
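
A minimal sketch of the randomised field name idea described above, assuming the server derives the names from a per-form token with an HMAC. The secret, the `f_` prefix and the function names are hypothetical, chosen only for illustration; this is not the actual code behind the sites mentioned.

```python
import hashlib
import hmac
from typing import Dict, Optional

SERVER_SECRET = b"change-me"  # hypothetical secret; load it from config in practice
LOGICAL_FIELDS = ("username", "email", "password")


def field_names(form_token: str) -> Dict[str, str]:
    """Map each logical field to an opaque name derived from the per-form token."""
    names = {}
    for logical in LOGICAL_FIELDS:
        message = f"{form_token}:{logical}".encode()
        digest = hmac.new(SERVER_SECRET, message, hashlib.sha256).hexdigest()
        names[logical] = "f_" + digest[:16]
    return names


def parse_submission(form_token: str, posted: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the logical fields, or None if the post should be silently dropped."""
    expected = field_names(form_token)
    if any(name not in posted for name in expected.values()):
        return None  # wrong field names: treat as a scripted submission
    return {logical: posted[name] for logical, name in expected.items()}
```

Because the names depend on the token, a script has to fetch and parse a fresh form for every registration instead of replaying one fixed POST body.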

~~~
coderdude
Randomizing the field names is a great idea but as you said they would just
need to scrape the HTML each time they wanted to register. Have you considered
sprinkling in random bits of markup to throw off the people using regex and
other lazy parsing methods? That might make it a real pain to scrape your
forms depending on how the spammer parses your page.

~~~
neotek
I still have 'username', 'email', and 'password' fields in the form but I hide
those elements with CSS, which no scraper is going to bother parsing. When the
registration form is submitted the account is essentially hellbanned, they can
'activate' the account via the normal email confirmation process but anything
they post disappears into the ether.

I'm catching about 100 spam accounts a day with this technique[1] and the ones
that I miss are fairly easy to detect through analysis once they start using
their account.

[1] <http://i.imgur.com/kdp7Q.png>
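
A rough sketch of the hidden-field check described above, assuming the decoy inputs keep the plain names while the real inputs use other names. The function name and the dictionary shape are made up for illustration.

```python
from typing import Dict

# Decoy inputs that are present in the HTML but hidden with CSS.
HONEYPOT_FIELDS = ("username", "email", "password")


def is_probable_bot(posted: Dict[str, str]) -> bool:
    """True if any hidden decoy field came back with a non-empty value."""
    return any(posted.get(name, "").strip() for name in HONEYPOT_FIELDS)
```

A hit here would flag the account rather than reject the request, so the submitter gets no signal that anything went wrong.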

~~~
benmanns
What happens if someone uses something like LastPass, RoboForm, or any of the
other automatic form fillers to legitimately sign up for your website? I would
imagine that these would "guess" that username means username and email means
email, which may lead to false positives for real users.

------
streptomycin
It's easy to criticize captchas. That's why there are so many articles like
this, which we all knowingly nod along to as we read.

It's much harder to provide productive criticism that leads to an actual
improvement over the status quo. (Although it is easy to suggest bad
alternatives to captchas, which is why they appear in every comment thread
about captchas on Hacker News, including this one.)

~~~
tjpaxton
Take a look at our product at areyouahuman.com

We launched in January and are using games to make them easier for people.
Some of our early testing showed captchas can decrease signups by up to 25%
and we're able to recover almost all of that.

We also monitor how you play the game (like mouse movement) so we can ramp up
our security without having to make the task more difficult for people. Read
more here <http://areyouahuman.com/how-playthru-stops-the-bots>

~~~
hk__2
This is so stupid. Your CAPTCHA is unusable by blind people, and your audio
CAPTCHA is inaccessible (you have to see the <canvas> element to know where to
click to access the audio CAPTCHA).

Do you consider that blind people are bots?

~~~
bnr
Look at the "download MP3" link's URL, the audio captcha comes from
google.com/recaptcha.

I agree that the button itself should be easier to discover for blind people,
like an image link with a title tag.

~~~
hk__2
If you want to download or listen to the audio CAPTCHA, you have to click on a
<canvas/> element, which is totally unusable by blind people. It's like
saying “Look, we've made an elevator, but you have to climb some stairs to
take it”.

------
nekitamo
I wrote an OCR for the previous generation of recaptcha (the one that preceded
this one) that cracked it with a 92% success rate. The same day I finished the
OCR, I watched my 50 year old mom try to input recaptcha for the registration
of some website to view family photos. She eventually gave up, and that
website lost another potential user.

I remember shaking my head at the absurdity of it all. I'm glad I'm not the
only one.

------
ricardobeat
Isn't that a result of the massive deployment of reCaptcha? It appears that
all the easy words have already been solved with enough confidence, so there's
only garbled scans left. Add more books?

That said, there are plenty of alternative solutions with good success rates
(and far lower abandonment rates), like requiring the answer to a simple
question (not math), photo captchas, randomizing inputs, using javascript
techniques and honeypot forms. Captchas are so popular because they are easy
to implement.

~~~
sesqu
That's what I was thinking. Originally reCAPTCHA had the control word be a
scan as well, which meant that an attacker had to beat their OCR. Now that the
control word is computer generated, the system has devolved into a regular
CAPTCHA that further asks humans for recognition task work.

This version no longer advances OCR algorithms, but does provide cheap
exception handling. I don't know when or why the change was made, but
obsolescence is at the top of my mind. Either they've run out of unrecognized
words, or adversaries have beaten their OCR. Either way, it seems we're back
to 2005.

 _edit:_ That said, Luis von Ahn mentioned that Google is experimenting with
other image processing tasks, so there's hope yet.

------
brigade
One nice thing about recaptcha is that you know those cut off words don't
matter (badly OCR'd) and can just enter gibberish for them.

But it has gotten to the point that about half of the control words are
unintelligible by a human.

~~~
rotskoff
Sure, but you shouldn't undermine the mission of successfully recording words
that can't be identified. It's a pretty noble goal, actually.

~~~
slowpoke
I, for one, refuse to be used as the source of free labor by Google just
because I want to sign up for a website.

~~~
praxulus
That's ridiculous. reCAPTCHA isn't making you do more work, it's just making
work you have to do anyway useful. The benefit Google gets from the OCR work
is microscopic, and the world gets old books and New York Times articles.

~~~
slowpoke
See, I would have much less of a problem with this if Google actually told
this to everyone. Instead, they silently use you for free labor. That's
dishonest.

I'm all in for crowdsourcing if you tell me about it and nicely ask me to
participate. Not when I am forced to do it.

~~~
icebraining
From the landing page of reCaptcha: "reCAPTCHA IS A FREE ANTI-BOT SERVICE THAT
HELPS DIGITIZE BOOKS".

The "What is reCAPTCHA" explains the OCR process clearly and has the phrase
"Currently, we are helping to digitize old editions of the New York Times and
books from Google Books."

And if you click on the "HELP" link in a reCAPTCHA, it opens a small page with
the instructions, a paragraph explaining the OCR, and a link to Learn More.

How are they _not_ telling this to everyone?

~~~
tomjen3
Ask twenty regular users who have been forced to fill that crap out.

Nobody reads the about page. Put it on the front page (or the embedded widget,
if that is what the user sees) in clear large letters that are easy to see or
accept that we consider you scum.

~~~
icebraining
(They do put it on the front page, I already pointed that out)

So, assuming you don't consider HN scum (you're here, after all), can you
please explain to me how this is different from YCombinator using HN to
publicize their own companies? There's nothing in the front or signup pages
explaining that.

Frankly, I don't see why that is a problem. They're offering a free service in
exchange for having words manually OCRed. If you have a problem with it I
think you should take it to the site that's using reCaptcha, not with Google.

(By the way, I'm not affiliated with Google and I'm not even a heavy user of
their services anymore)

------
encoderer
It's funny that we are advancing the science of OCR and computer vision by
funneling grey/blackmarket dollars into the field to break captchas.

Makes me wonder what sort of social engineering opportunity this creates. What
other fields could be advanced in a similar way.

~~~
DanBC
"Before we accept your comment we ask you to fold this protein"?

------
snprbob86
And they are becoming less and less effective. Since they are so ubiquitous,
users don't think twice about completing them. Therefore, infected users are
going around unwittingly solving captchas served to them by their associated
bot nets.

~~~
andrewmunsell
Wow, I've never heard of that happening before. That would definitely be an
issue... Unfortunately I'm not sure how we would end up solving that issue--
there's always going to be the users that don't realize they are being
exploited, to solve captchas or otherwise.

------
grandalf
I find recaptcha so frustrating that I will always abandon the account unless
it's something that I really really care about.

If I have trouble with recaptcha, I can't imagine a non-programmer over 60
years old having any remote hope of figuring it out.

Also, the way it's typically implemented, if you solve the recaptcha once and
there is some other server side validation error, you have to fix that and
then solve a new recaptcha before proceeding. It is just so punishing I am
always at a loss when I encounter it.

------
festivusr
There was an interesting talk that mentioned CAPTCHA by one of its creators,
Luis von Ahn, at the AAAI-12 conference on AI and robotics this past week.

In ReCAPTCHA, the two-word CAPTCHA version, one of the two words is taken from
a scanned book. That (unknown) word was one that failed OCR for that book.

The other word is one that captcha already knows the answer to.

The assumption is if you get the known captcha correct, then you probably got
the other one correct as well (if it was possible to read it). The answer to
the unknown word supplements the OCR of the book.

The captchas are put in random order, and you only have to get one of them
right.
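
The pass/fail rule described above can be sketched as follows. This is only an illustration under stated assumptions: matching is exact and case-insensitive, and the function name is invented. How the real service compares answers is not stated in this thread.

```python
from typing import List, Tuple


def check_two_word(answer: str, control_word: str) -> Tuple[bool, List[str]]:
    """Pass if any typed word matches the control word.

    The remaining words are returned as candidate transcriptions
    of the unknown (scanned) word.
    """
    words = answer.lower().split()
    control = control_word.lower()
    if control not in words:
        return False, []
    guesses = [word for word in words if word != control]
    return True, guesses
```

Since the two words are shown in random order, the check looks for the control word anywhere in the answer rather than at a fixed position.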

Luis's thought was that people are wasting all this time doing captcha - why
not use that time to do something useful, like help digitize books.

As an aside, he's also one of the principal people behind duolingo, which is a
quite awesome language learning / human-assisted translation engine.

~~~
twelvechairs
Yeah. The actual problem as I see it is that people have been trained that you
have to get captchas "right", where with these recaptchas all you really need
is a reasonable guess because there is no 'right' (and nowhere does it say
that).

The assumption behind recaptcha was a novel one, but it seems pretty obvious
that the OCR is really just about as good as humans anyway - the 'difficult
words' that usually get served are most commonly either non-existent words
(printing/writing errors) or scanning/cropping errors.

------
antirez
There is no stupid "hey I've an idea!" comment in this thread, so I'll offer
one.

An alternative (probably already proposed?) could be the following: if you
have a large set of human-tagged images or videos, you could show these images
to users along with, say, eight sets of tags, and ask: which set of tags best
applies to the image above? (Of course only one set is really about the image;
the other sets are random.) That is three bits per image; do this a few times
and the probability of a computer guessing correctly at random is very low.
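
To put numbers on that: with eight equally likely tag sets a blind guess succeeds one time in eight per image, and independent rounds multiply. A small worked sketch (the function name is just for illustration, and it assumes the guesser has no information about the image at all):

```python
from fractions import Fraction


def random_guess_probability(tag_sets: int, rounds: int) -> Fraction:
    """Chance that uniform random guessing picks the right tag set every round."""
    return Fraction(1, tag_sets) ** rounds


for rounds in (1, 2, 4):
    print(rounds, random_guess_probability(8, rounds))
```

Four rounds already bring a blind guesser down to 1 in 4096, though a bot with even a weak image classifier would do better than this bound.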

Every time you show an image you may crop + rotate it a bit and apply a
filter, so that manually building a table is hard, but if you have a big set
of images like google could have maybe this is not needed.

~~~
badboy
At least the archlinux.org forum has the geekiest "Captcha"-system I came
across so far:

    
    
        What is the output of "date -u +%W$(uname)|sha256sum|sed 's/\W//g'"?
    

(This is not to prevent bots, but to prevent human spammers)
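
For anyone who would rather not paste shell into a terminal, here is a hedged Python sketch of what that pipeline appears to compute: the SHA-256 of the UTC week number followed by the kernel name and a newline, with the trailing " -" that `sha256sum` prints stripped off. The function name and parameters are invented for illustration, and it assumes `platform.system()` matches the output of `uname` on your machine.

```python
import hashlib
import platform
from datetime import datetime, timezone
from typing import Optional


def arch_captcha_answer(now: Optional[datetime] = None, kernel: Optional[str] = None) -> str:
    """Hex SHA-256 of '<week-of-year><kernel name>' plus the newline date emits."""
    now = now or datetime.now(timezone.utc)
    kernel = kernel or platform.system()
    # %W is the week of the year (Monday as first day), zero-padded.
    text = now.strftime("%W") + kernel + "\n"
    return hashlib.sha256(text.encode()).hexdigest()
```

The answer changes every week and differs per operating system, which is presumably the point.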

~~~
octopine

        What is the output of "rm -rf /"?

~~~
warmfuzzykitten
That's a good question! I think I know but I'm not going to try it.

------
john_flintstone
A forum I run in Ireland doesn't have a need for Captchas. The forum is very
regional and niche - no genuine foreign visitors - so we blocked every country
apart from Ireland and the UK from posting comments. Result - no spam.

This wouldn't work for many forums, but if it's local, you really don't need
to open it to the world.

~~~
icebraining
So, if I was an Irish guy travelling (or even an emigrant), I'd be blocked?
That seems unfortunate.

On my forum I just added a question about Portuguese history. Anyone who
understands the language can find the answer in a couple of minutes, but bots
really aren't that clever.

------
jmilkbal
The paranoid, tin foil hat wearing part of me has always put post Google
acquisition Recaptcha in the must-be-part-of-the-long-arm-of-Google-tracking
category of services encouraging me to do my very best to avoid allowing it to
run in my normal browser session.

------
zhoutong
I can't seem to understand the "Onightsl secretary" CAPTCHA. We all know that
the first word is extremely hard to identify, and according to the author
"secretary" wasn't the control word.

This means that reCAPTCHA knows the first word. Identified by OCR? Not
possible unless reCAPTCHA deliberately distorted the image. Identified by N
other people? Not possible to determine such a word with confidence either.

Or am I missing something?

~~~
zhoutong
Also it seems that there is exactly one distorted word and exactly one
"proper" word. I would assume the distorted word is the control word. I should
try a few reCAPTCHAs to see if this is a correct assumption.

EDIT: Confirmed. The "proper" word is taken directly from book scans and I can
type anything to pass the CAPTCHA. It seems that the control word is very
Google-style.

------
anonymous
He seems a bit ignorant on how reCaptcha works nowadays.

You are presented with two strings - a potential word scanned from a book and
a random mash of letters. You only need to enter the random letters correctly,
you can write whatever you want for the word from the book. Meaning, if one of
the two is obviously a bug in OCR software (or is non-latin characters you
can't type), just write your favourite curse word -- I go with captcha -- and
it will work.

As for the "unreadable mash of letters" problem, I could read them in all his
examples. Though I have seen some cases when they are really unintelligible.

If you must use reCaptcha on your website, do it like 4chan -- when you mess
up the captcha, you are automatically given a new image to try, without being
sent away or having to enter what you just typed again.

Which just means you'll only limit the rate at which bots register -- the
article mentions something like 10% success rate of current-generation bots
guessing reCaptcha.

~~~
iaskwhy
From the article: "It’s important to note the way reCAPTCHA works. Each user
(or bot) is presented with a control word, and a word unrecognized by OCR.
This control word is already known to Google (who runs reCAPTCHA). If you get
this first word right, it is assumed that you get the second word correct as
well. So, in reality, you only need to guess the key word correctly.

I decided to just guess the first word and hope “secretary” was the control.
It wasn’t."

How is that being ignorant?

~~~
morsch
It's ignorant because he then goes on to refresh recaptchas that are
"impossible" to solve where the OCR word was gibberish (cut-off, etc.) but the
key word was discernible.

------
dchichkov
I see a possible solution that may work in a short term:

Instead of displaying a static captcha, display a dynamic one, with letters
going through elastic transformations. Humans are pretty good with video
sequences. Computers are not.

As a side effect this may help push computer vision algorithms toward working
with videos, rather than static images :)

~~~
adrianN
Do OCR on every frame, perform majority voting on the result. You just made
the spammer's task easier.
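
A minimal sketch of that voting step, assuming some OCR engine has already produced one string per frame (the OCR itself is out of scope here, and the function name is made up). It votes per character position and assumes at least one reading is supplied.

```python
from collections import Counter
from typing import List


def majority_vote(frame_readings: List[str]) -> str:
    """Combine noisy per-frame OCR strings by voting on each character position."""
    # Keep only readings of the most common length so positions line up.
    length = Counter(len(r) for r in frame_readings).most_common(1)[0][0]
    aligned = [r for r in frame_readings if len(r) == length]
    # For each position, take the character seen most often across frames.
    return "".join(Counter(chars).most_common(1)[0][0] for chars in zip(*aligned))
```

Each extra frame is another independent sample of the same text, which is why animation can help the attacker rather than hurt.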

~~~
dchichkov
Here is an example of a problem that would be hard for a computer to solve:

<http://www.youtube.com/watch?v=4G4y79ZbaBs>

~~~
Raticide
You just diff each frame and keep the parts that don't change much. Very
simple to solve.

~~~
dchichkov
Just move letters slightly. And make them morph a bit. Would still be obvious
to a human, but a computer trying to average anything would fail miserably.

This is an unsolved computer vision problem.

------
jvrossb
I had just noticed that captchas were getting worse. I always thought sites
should just make the user perform an operation. Display a hard to read but
still legible "type the second letter of the third word". 'e' would be the
correct response. There's got to be a reason this isn't done already, does
someone know why?

~~~
Kilimanjaro
Bots would spam your site once every 26 tries.

~~~
jvrossb
I was going to say don't most modern sites limit multiple attempts in
succession? Then I realized that spammers have thousands of computers at their
disposal so the successive attempts would not come from the same IP. Shoot
this is a hard problem...

~~~
Devilboy
Spammers have thousands of machines.

------
learc83
I just thought it was me. There's hardly ever a captcha now that I can get on
the first try. I _loathe_ captchas.

------
ChuckMcM
They are ridiculous, but they are so because what they are trying to achieve
can no longer be easily achieved by solving the "visual acuity" problem.

Think of it as an opportunity to create something better. Personally I think
shared secret with physical device has longer legs here but it does have a
distribution/cost/re-authentication hump that is large. So far that has
prevented its adoption but as you can see captcha systems are becoming non-
functional.

~~~
andrewmunsell
I'd be curious to see the ability for a machine to solve picture based captcha
systems. For example, given a lineup of 10 pictures of pets, choose the three
that are cats. I've seen them before, just not widely implemented.

~~~
ChuckMcM
Well Google just talked about their code which identified kittens in Youtube
videos.
(<http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html?pagewanted=all>)

~~~
andrewmunsell
That's actually pretty interesting. With time the number of
computers/processors required to do these tasks will go down, but for now and
based on that experiment, it almost seems more efficient to use picture based
captchas.

~~~
ChuckMcM
I don't think you 'get it' Andrew :-) For the folks who bust captchas, 16,000
machines is chump change: they run botnets of hundreds of thousands of
machines, they dynamically buy EC2 instances, they make a lot of money.

That is the primary reason why I believe that people who use the term
'computationally infeasible' (you see that a lot in crypto papers) never
counted on the kinds of growth we've seen in computers coupled with the ease
with which these folks can steal computer power from clueless users.

My claim is that you need an independent engine of computation on your side
that can prove you are you with a high degree of confidence, and cannot be
corrupted economically by a third party. (so local programs on your PC or
Smart phone won't cut it)

~~~
Locke1689
Exponential CPU growth rates are factored into cryptographic protocols. As
long as growth rates don't become super-exponential they're safe.

------
sams99
Many months into this experiment I barely get 2 spam comments a month on my
blog. I totally respect reCAPTCHA, but demanding JavaScript, doing some
minimal tests and perhaps an amount of computation (1 sec on iPad) is the
FIRST thing we should be doing:
<http://samsaffron.com/archive/2011/10/04/Spam+bacon+sausage+and+blog+spam+a+JavaScript+approach>

~~~
mikeash
Almost any unique solution will work well for a small site, because it's not
worth the effort to program a bot specifically for your system.

For the longest time, my blog's comments were protected by a "captcha" that
simply asked the user to type the word "elbow" into a text box. The word never
changed and was not obscured in any way. Worked pretty well.

Such solutions do not scale to larger web sites, though.

~~~
sams99
sure they do, you randomize algorithms or add proof of work.

~~~
mseebach
But then it's not "such" a solution anymore.

------
apu
So back when I was a PhD student, we had an idea for a Captcha system that
would be better than the current system and all alternatives by quite a
margin, but we talked to some experts in the area, and they said that none of
this matters because captchas are normally broken by feeding them to mturkers
or people wanting to get into porn sites.

Is that no longer true? If so, I should look at reviving our old system.

~~~
stereo
What was your idea?

------
tshadwell
Though I agree those would be hard captchas if you tried to complete them
properly, it's not overly hard to guess which one is the control: the control
is always bold, very likely a nonsense word, and never cut off or partially
visible, because the control is generated by an algorithm rather than a
scanner. The captcha isn't really designed to be regenerated until both words
make sense.

------
josscrowcroft
Every time I read something like this I pine for never having the opportunity
to turn MotionCAPTCHA[0] into a real (secure) system. Everyone said it
couldn't be done, but I know it can!

[0] <http://www.josscrowcroft.com/demos/motioncaptcha/>

~~~
doc4t
Clever. Not knowing the details of how computers beat CAPTCHAs, can you explain
why it would be difficult to create a program to trace the path?

~~~
barik
I've done some research in this area with examining user input mechanics. This
particular example would be solvable using B-splines [1], as one possible
approach. The "path CAPTCHA" even provides some great computational hints, by
indicating the starting point as a circle!

[1] <http://mathworld.wolfram.com/B-Spline.html>

------
Fando
Your article's title should be: Captchas are sometimes so ridiculous, that
they take 1 second longer to solve because you need to refresh them which is
besides the point that is that we are due for a new way of thinking about
authenticating human presence on the web.

~~~
ballooney
Catchy.

------
brudgers
ReCaptcha is free from Google. I would hazard to guess that it is free because
the value of the data Google collects is greater than the cost of providing
the service. What I notice is that user convenience and blocking bots don't
directly enter into the value Google derives, they are only marketing points -
i.e. what matters to Google is that the perception of ReCaptcha is sufficiently
positive that a stream of relevant data is provided.

It would seem to me that having data on an individual's ability to solve
Captchas might provide some correlation with educational, economic, and
social characteristics which is potentially useful to advertisers.

I just can't really buy into the idea that Google is offering ReCaptcha as
charity.

~~~
StavrosK
You know they use it to digitize books, yes?

~~~
brudgers
Yes of course (and it is not a charity project, either). The fact that you are
linking ReCaptcha to a valuable data stream for Google seems to support my
general claim that Google is deriving value from the data stream.

And perhaps the collection of that data almost exclusively at times when a
single identity can be correlated with a single datum ( i.e. account creation
and management) is merely coincidence.

But surely you are not suggesting that Google is ignorant of this fortuitous
situation and its implications for their online advertising business.

~~~
StavrosK
No, I'm not. From the wording of your question I thought you didn't know they
digitized books, which is a revenue stream already. They might use it for
other data indeed.

~~~
brudgers
There wasn't a question in my post.

And the rate of blocking bots doesn't enter into Google's book digitization
process.

------
Tycho
I also have an idea. Since reCAPTCHA has so much traffic, why not just pair off
users and get them to verify each other as fellow humans? Not quite sure how
you'd do that but it is one approach that could let you break out of the
pattern recognition arms race.

------
Shenglong
I have no expertise in this area, but how about patterns? Since we're already
parsing so much text, why not take strings of legible text (with a minor
visual distortion), and strip a connecting word such as a preposition? The
user is then prompted to enter the connecting word. Synonyms could be matched.

For example: "I can't stand all ____ your lies" (of)

or: "This is all too ____ for me to handle!" (much)

Sure, some strings would be hard to solve, but I feel like it'd be improved
over time. Plus, captchas already take a few tries to solve anyway. I suppose
it may impose on foreigners though.

~~~
michaelt
<https://www.google.co.uk/search?q=%22I+cant+stand+all%22+%22your+lies%22>
<https://www.google.co.uk/search?q=%22This+is+all+too%22+%22for+me+to+handle%22>

Then pick the first search result where there is a single word between the
segments.

~~~
drinkzima
Wildcard operator pretty much solves this:
<https://www.google.co.uk/search?q=I+cant+stand+all+*+your+lies>

<https://www.google.com/search?sugexp=chrome,mod=11&sourceid=chrome&ie=UTF-8&q=%22This+is+all+too+*+for+me+to+handle%22>

------
joshfraser
My personal rant on this topic:
<http://www.onlineaspect.com/2010/07/02/why-you-should-never-use-a-captcha/>

------
nikole9696
I will say up front that I have no good solution. That said, as a human, I
hate captchas or anything else that makes me jump through hoops to do
something that should be simple. The industry is punishing people to avoid
bots (or whatever they are using them for in each instance), and yet it doesn't
even work to accomplish that goal. Eliminate it all. Find something new. I
wish I knew what to suggest. But it won't be anything at all like a captcha,
stupid game to play, security questions, or anything like that.

------
aneudecker
For an interesting alternative to traditional captcha check out Solve Media -
they monetize your captcha for you by making you input brand messages (that
are retained better than just seeing a banner ad). I thought it was an
interesting idea when I met the founder Max & team (although I was just
slightly skeptical of the level of security provided). They seem great though,
so it's worth checking out.

~~~
nacs
I have tried Solve Media's service and it's a good idea in theory but execution
is poor.

Problems with Solve's captchas:

* Requires Flash plugin

* Contains sound (not critical to solving a captcha but not great when your laptop suddenly plays ad audio in a library)

* Payout rates are _horrible_. You earn a few cents for every couple thousand video captchas solved. Successful solve rates are also poor

------
mmuro
I'm a big fan of _logic_ captchas. Not only are they way more accessible than
image captchas, but the frustration factor is not nearly as high.

~~~
huhtenberg
Do give an example.

~~~
mmuro
<http://textcaptcha.com/demo>

~~~
csense
Seems like it would be fairly easy to parse many of these questions
automatically. The only reason they work is probably that they're not widely
used enough for spammers to care about.

------
bherms
I do agree they are becoming ridiculous.. I often have a hard time figuring
them out. The real question is what the next step will be. Where do we go from
here? What experiments are out there for new captchas?

And, slightly related, one of my favorite lighthearted sites:
<http://www.captchacomics.com/>

------
WalterBright
Why is it that there's all this software that can decode horrifically mangled
text, and yet nothing to OCR a handwritten letter?

------
Kilimanjaro
Filters are coming to CSS so I guess it will be possible to show an image with
hidden numbers (or words) and let the user pick a color filter that applied to
the image will show the numbers and submit the color and the number to the
server for verification. Like color blindness tests.

How about that for a captcha?

~~~
cbr
What about that sounds hard to automate?

------
dinkumthinkum
Google's captchas actually give me a headache looking at them, there's just
too much curliness going on there.

------
lkbm
With ReCAPTCHA, you only have to get one of the two words right. I've never
had it be the illegible word (or my terrible guesses match the "correct"
answer), as one might expect given the one you need is necessarily the one for
which they have the answer.

------
ajasmin
People interested in capchas may also like this episode of the Hacker Medley
podcast <http://hackermedley.org/humans-only/>

It talks about capchas vs ocr technologies and capchas solving companies in
India.

~~~
ajasmin
s/capchas/captchas/

------
borplk
Given that any type of human-solvable captcha can be easily outsourced for
cheap, I think the next best alternative is some computationally expensive
operation per form submission.

------
anonymoushn
The known word is the one that is bent. This is the only word you have to
type. You can enter "dogman" or "foo" for the cut-off (scanned) word and it
will let you through.

------
cjdentra
Simply put, we need a better way. Some of these are just unintelligible and
require multiple attempts. Someone please come up with a model that makes
sense!

------
kingsley_20
Handwritten captchas? I'm sure they can be crowdsourced.

------
voltagex_
I've noticed a steep rise in the difficulty of reCaptcha captchas too. There
isn't anyone I can contact about it, is there?

------
jsilence
The captcha 'arms race' will continue until reCAPTCHA cannot distinguish any
more between the bot and a human.

Turing test passed. ;-)

------
level09
HTML5 should introduce something better than captchas, I wonder why it was not
made part of the forms elements ..

------
darkstalker
Several times I've gotten math symbols in captchas instead of words. How am I
supposed to write them?

~~~
sn6uv
Anything you write for them is accepted. In fact, for these captchas you only
need to get one word correct (the non-maths one). IIRC the other is scanned
from a book somewhere and the computer used to scan it couldn't figure out
what word it was.

------
Lisa2000
Neat demo last week on an alternative to word captchas -- cartoon captcha, e.g. drag
hat on top of head, put eyeglasses on face, from Mitsuo Okada of Osaka
University, at Founders Institute, he's at mitsuookada@gmail.com

------
chris_wot
I guess OCR software must be getting really good now!

~~~
andrewmunsell
With increased computing power, we are really able to do some amazing things.
Unfortunately this also means that spammers also have access to this increased
speed and ability to crack OCR based puzzles. But on the bright side, this
also allows us to digitize books without humans more accurately (which is one
of the primary purposes of reCAPTCHA)

------
pizza
Is there a way to report CAPTCHAs for illegibility?

~~~
rhizome
There's a recycle button that will generate a new one.

~~~
andrewmunsell
I guess the issue is, does Google know when someone recycles a captcha, and if
they do, are they doing anything about it? It's useless if the captcha isn't
flagged and is just given to another user.

~~~
yaroslavvb
The tricky thing is that there are hordes of automated captcha-breakers that
will recycle captchas until they get something that's easy to OCR

------
rorrr
I see two possible long-term solutions to the captcha problem

1) Ask user to do a relatively expensive computation. This can be done in the
background while the user is typing his post.

2) Request a small amount of money (10c) per comment. Good websites will
return the money to non-spam commenters and keep spammers' money. This
however requires working microtransactions.

~~~
dchichkov
1) Asking a computer to do a computation is not such a great idea. Low powered
devices (think iPads) running javascript would be at a great disadvantage to
highly efficient botnet clusters that spammers own.

2) Requesting a small amount of money may work. Alternatively requesting a
user to do some useful task (like Amazon Turk HIT) to get some funds.

~~~
emiliobumachar
Some computations just cannot be parallelized. (Yet. It'd be ironic if spammers
advanced the field of parallel computing.) The speed difference for single
processors remains, but that's a single order of magnitude, except in extreme
cases.

~~~
tripzilch
But you can always parallelize the spamming itself.

------
freditup
I feel like the author of this article is still slightly misunderstanding the
reCaptcha. Not to criticize him, but it's almost immediately clear which word
you are actually being tested on, because it has the same general 'look' to
it each time.

Take the first one: 'Secretary' is clearly out of some book. The other thing
is the real test. Now, reCaptcha never gives you real words as a test, so he
shouldn't be surprised that it isn't a real word.

The third captcha complained about is actually incredibly easy. The first
thing is clearly the word from a book, so you can just type a short bit of
nonsense there. The second thing is 'ndaaar'. It's pretty legible and easy to
enter. Other 'impossible' ones are pretty easy also.

Again, not trying to pick on the author, but hopefully someone will have an
easier time after reading this comment. And while Captchas are annoying, I
don't really feel they are impossible.

Edit: To the commenter below - I have no idea what green names mean here, so I
don't know how that influences people. What do they represent?

~~~
georgemcbay
I feel like you didn't read the article, because the author already addresses
what you said and had you read the article you'd realize you made incorrect
assumptions that, again, he already addressed.

~~~
freditup
I disagree: "The captchas were not only difficult for a computer to read, but
impossible for a human." He goes on to quote that computers can guess
captchas at 10%. My point is that if you understand captchas, you can get
them right almost all of the time. (Not talking about the audio ones here,
since the visual ones seemed to be the focus of the article.)

~~~
georgemcbay
I understand captchas and so does the original article writer (if you read the
article he clearly explains how reCAPTCHA works). While I would have agreed
with you a couple of years ago, the author's point that reCAPTCHA has reached
a point where it isn't working nearly as well is spot on. I myself fail on
them about 75% of the time now -- quickly approaching the failure rate of
computers.

For reCAPTCHA it used to be close to 100% success for me, but something has
certainly changed with them, I don't know if it was intentional or is the
result of a dwindling data set and as an end user, it doesn't matter, what
matters is that they are really difficult to solve now.

