I think the article misses the point of CAPTCHAs, and I think he's incorrect in his conclusion because of that. The point of CAPTCHAs isn't that they are unbreakable; the point is to raise the transaction cost of creating a new account above zero.
Spammers rely on massive numbers of accounts/contacts and get a very small return. Let's say you need to send out ten million e-mails to get a single response, and maybe you can get ten thousand e-mails out of an account before it's banned. If the cost of creating a new account is zero, or close to zero, then it doesn't matter if you have to create 10, 20, 100, or even 1,000 new accounts to get the e-mails out so you can get one response and make a 5 dollar commission. If you can automate the hell out of it, who cares?
On the other hand, if it costs you 5 cents to break every CAPTCHA, those costs really add up in the aggregate, while the transaction cost for a single legitimate use is so low that it is almost inconsequential.
So as with all security, it's not about making it "unbreakable", it's about raising the transaction cost high enough to make it less attractive or even unprofitable to circumvent it.
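To put numbers on that, here is a rough back-of-the-envelope sketch in Python using the figures from the comment above. The function names are made up for illustration, and the 5 cent price per solved CAPTCHA is the hypothetical from the comment, not a measured rate.

```python
# Figures taken from the comment above; all of them are hypothetical.
EMAILS_PER_RESPONSE = 10_000_000  # e-mails sent per paying response
EMAILS_PER_ACCOUNT = 10_000       # e-mails sent before an account is banned
COMMISSION_CENTS = 500            # 5 dollar commission per response
CAPTCHA_COST_CENTS = 5            # assumed price to break one CAPTCHA


def accounts_needed(emails_per_response: int, emails_per_account: int) -> int:
    """Accounts burned to earn one response, rounded up."""
    return -(-emails_per_response // emails_per_account)


def profit_cents(captcha_cost_cents: int) -> int:
    """Commission minus total CAPTCHA spend for one response."""
    accounts = accounts_needed(EMAILS_PER_RESPONSE, EMAILS_PER_ACCOUNT)
    return COMMISSION_CENTS - accounts * captcha_cost_cents


print(accounts_needed(EMAILS_PER_RESPONSE, EMAILS_PER_ACCOUNT))  # 1000
print(profit_cents(0))                   # 500: free accounts, spam pays
print(profit_cents(CAPTCHA_COST_CENTS))  # -4500: spam loses money
```

With these numbers the spammer spends 50 dollars on CAPTCHAs to earn 5, while a real user pays the 5 cent equivalent exactly once.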
"I think the article misses the point of CAPTCHAs, and I think he's incorrect in his conclusion because of that."
The site is actually written by a woman, Amy Hoy. I think it's a good sort of historical introduction to the issue for someone who doesn't know anything about it, but the article really doesn't seem to impart anything new to those who do.
I gather Amy Hoy is primarily a graphic designer who also does some dev work. I think most of the content on her site reflects this. As a person who has the same kind of split focus in a lot of my own work, I have sometimes found content on her site interesting.
This isn't a CAPTCHA flaw; this is about using humans to break systems designed only for humans.
So the solution is to differentiate humans from each other.
The only thing a real user and a spammer can be differentiated on is _time_.
A human labour captcha breaker won't want to spend more than x seconds per spam as they work on a volume basis, whereas a real user will only do the registration once, and since they want to use your service, they are more willing to spend a couple extra minutes to do so.
However, that annoys new users who don't really want to go through a huge ordeal just to register on your site.
So the solution is to allow users to register without any captcha but disable their privileges on the site that could be abused for spam until either:
a) they gain enough credibility (like karma on Hacker News, which unlocks features), or
b) they go through the actual signup form, which would take about 5 minutes to fill out (maybe identifying 50 photographs, copying a passage of text, etc.)
That would deter spammers, because it automatically limits them to x captchas to break per day.
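A minimal sketch of that gating rule in Python, assuming a hypothetical `Account` record and a made-up karma threshold; none of these names come from a real system.

```python
from dataclasses import dataclass

KARMA_THRESHOLD = 50  # hypothetical cutoff, tune per site


@dataclass
class Account:
    name: str
    karma: int = 0
    completed_long_signup: bool = False  # set after the 5-minute form


def can_use_spammable_features(account: Account) -> bool:
    """Unlock abusable features via (a) earned karma or (b) the long signup."""
    return account.karma >= KARMA_THRESHOLD or account.completed_long_signup


newcomer = Account("newcomer")
regular = Account("regular", karma=120)
patient = Account("patient", completed_long_signup=True)

print(can_use_spammable_features(newcomer))  # False
print(can_use_spammable_features(regular))   # True
print(can_use_spammable_features(patient))   # True
```

Registration itself stays instant; only the features a spammer would care about sit behind the check.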
I had a similar idea once. The key is that you need to add a cost to signing up on a site. If you make the cost be paid in time that will tend to deter normal users. The correct deterrent, I feel, is CPU cycles. Light browsers/users have plenty of them, a spammer needs all he can get.
The idea would be to pair a CAPTCHA test that a human can do well on with a compute-intensive task that really only a computer can do (like solving a big discrete log). An implementation would be to have the user type the specific values (which are given in CAPTCHA form) into an on-page JavaScript calculator. The calculator will then take some time to solve the equation (the time can be easily adjusted by changing the size of the discrete log).
If a spammer uses humans to bypass the image recognition task, they'll still have to spend CPU time to solve the equation. Likewise, cracking the CAPTCHA with OCR will not circumvent the computation. If too many successful spam logins are still occurring, the difficulty of the compute task and/or the OCR task can be adjusted.
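Here is a toy sketch of the compute half, written in Python rather than in-page JavaScript to keep it short. It assumes the server picks a secret exponent and the client brute-forces it; the modulus, the base, and the function names are all illustrative choices, and a real deployment would need far more care about parameter selection.

```python
import random

P = 1_000_003  # small prime modulus; a larger one makes the puzzle slower
G = 2          # base; assumed for illustration only


def make_challenge(rng: random.Random, p: int = P, g: int = G) -> int:
    """Server side: pick a secret exponent and publish h = g**secret mod p."""
    secret = rng.randrange(1, p - 1)
    return pow(g, secret, p)


def solve(h: int, p: int = P, g: int = G) -> int:
    """Client side: brute-force an x with g**x mod p == h (the slow part)."""
    value = 1
    for x in range(p):
        if value == h:
            return x
        value = (value * g) % p
    raise ValueError("no solution found")


def verify(h: int, x: int, p: int = P, g: int = G) -> bool:
    """Server side: a single modular exponentiation checks the answer."""
    return pow(g, x, p) == h


challenge = make_challenge(random.Random())
answer = solve(challenge)
print(verify(challenge, answer))  # True
```

The asymmetry is the point: solving costs up to `p` multiplications, while checking costs one `pow` call, so the server can raise `P` to charge more CPU per signup.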
The article does a fine job of explaining what we all know now about CAPTCHAs: "people farms" crack them.
However, the article closes out pointing to a solution that is a very hard set of problems to solve: behavioral patterns, Bayesian filtering, keywords.
I think the problem with captchas is that they are too complicated even for humans. It's frustrating as hell to have to retype that crap 20 times before you get logged in.
Why can't the whole solution get simplified? A captcha that asks a user some real-world question in clear text. Have one admin enter a set of questions, answers and links once a day from the top Google News stories, and then have the widget match the user's response against the stored answers. Include a 30 second wait time after each failed attempt to hinder the bots.
I mean, captchas get cracked left and right because they are just a matching system: all you need to do is analyze the letters and retype them. By making it a question from the news or Wikipedia, you'd improve the world by a) eliminating bot spam, b) teaching people some new information, and c) even promoting artificial intelligence research.
As an admin I'd see this go as this:
[Enter Question]
[Enter Answer]
[Enter Hint Url]
Hell, you could even monetize it by charging 1/100th of a penny per login attempt, or make a partnership deal with a newspaper to get paid in exchange for using their stories.
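A rough Python sketch of the widget described above, assuming the three admin fields map to a small record and that failed attempts are tracked per client. The class name, the `client_id` key, and the sample question are all made up for illustration.

```python
from dataclasses import dataclass, field

WAIT_SECONDS = 30  # wait time after each failed attempt, per the comment


@dataclass
class QuestionCaptcha:
    question: str   # [Enter Question]
    answer: str     # [Enter Answer]
    hint_url: str   # [Enter Hint Url]
    locked_until: dict = field(default_factory=dict)  # client id -> timestamp

    def check(self, client_id: str, response: str, now: float) -> bool:
        """Return True on a correct answer; lock the client out on a miss."""
        if now < self.locked_until.get(client_id, 0.0):
            return False  # still inside the wait window, do not even compare
        if response.strip().lower() == self.answer.strip().lower():
            return True
        self.locked_until[client_id] = now + WAIT_SECONDS
        return False


captcha = QuestionCaptcha(
    question="What is the capital of France?",  # placeholder question
    answer="Paris",
    hint_url="https://example.com/story",       # placeholder link
)
print(captcha.check("bot", "London", now=0.0))   # False, starts the wait
print(captcha.check("bot", "Paris", now=10.0))   # False, still locked out
print(captcha.check("bot", "paris", now=31.0))   # True
```

Passing `now` in explicitly keeps the sketch easy to test; a real widget would read the clock itself and key the lockout on something harder to rotate than a client-supplied id.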
I think the best deterrent against spammers is to make your service unattractive for them. This is quite hard if you are creating a webmail platform, but luckily most of us aren't. Hide down-modded comments, surround outgoing emails with descriptive templates, etc.