Awesome, let's hand the problem over to the spammers and it will be solved within 1 year.
It's funny you quoted that part. I can't tell you how many "next-gen" captchas I've broken trivially with the object recognition software our company has developed (http://demo.pittpatt.com). Granted, our software is useless against this type of Captcha, but many aren't.
Captchas have become a really interesting area of research for us because it's essentially the opposite of the problem we are trying to solve. What's really curious is that object recognition people seem keenly aware of the advancements in Captcha design, while Captcha designers seem blissfully ignorant of the advances in object recognition.
I've seen so many proposed "next-gen" captchas that I could break before I finished reading their powerpoint slides.
Gonna be adding newer, better, cooler ones soon.
"Imagine a beowulf cluster of spammers"
I'm relieved that the comments on this are mostly against it. I was sort of worried. A captcha is a very delicate balance between not pissing off the good users and keeping the bad users out.
And let's not forget what seems to be the most effective method of cracking a captcha, to just proxy it to an actual user who thinks they are verifying themselves for some other site (porn). This doesn't address that at all.
I see CAPTCHA as a band-aid, and I can understand why people turn to it: spam has reached epidemic proportions. That said, any form of CAPTCHA, whether the existing form or the 3-D form proposed in this article, suffers from two fundamental flaws:
1. It annoys users and makes it harder for people to contribute
2. Spammers will get around it eventually, and once one spammer gets around it, they all will
CAPTCHA shares these two flaws with copy protection schemes: both annoy users and are merely speed bumps to the undesirables (spammers and crackers).
I think the solution is threefold:
1. Make it easy to post so real humans contribute
2. Filter spam aggressively
3. Incorporate trust mechanisms
Making it easy so real humans contribute means removing obstacles (such as CAPTCHAs, registration, etc.) that get in the way of people posting. Every obstacle means fewer people will post. With community contributors at about 1% of visitors, there's a lot of room to grow by making it easier to contribute.
Spam filtering works because spam is fundamentally different from a valid post, and always will be. Bayesian schemes such as pg described long ago work well. Gmail, for example, does an amazing job of filtering spam. The few posts that get through are easily dealt with.
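To make the Bayesian idea concrete, here is a minimal sketch of a word-count naive Bayes filter. It is a plain multinomial naive Bayes with add-one smoothing, not the exact scheme pg described, and the class name, method names, and labels are all made up for illustration. It assumes at least one spam and one non-spam post have been trained before scoring.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy multinomial naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled post; label must be "spam" or "ham"."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Estimate P(spam | text); needs at least one post of each label."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the class prior, then add one log-likelihood per word.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                smoothed = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(smoothed / (total_words + len(vocab)))
            log_scores[label] = score
        # Normalize in a numerically safe way (subtract the max log score).
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

In use you would call train() on moderated posts as they come in and hold anything scoring above some threshold (say 0.9, also an assumption) for review rather than deleting it outright.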
Trust mechanisms take advantage of the fundamental weakness spammers have: they aren't members of the community. A simple trust mechanism: don't auto-link links from posters with fewer than 10 comments. Since most spam contains links of some sort and most comments don't, spam will be predominantly affected by this. Once a poster gets to 10 posts (or 10 karma), their comments retroactively get auto-linked. This is simple to implement, but reduces the impact of spammers significantly, since their spam isn't accessible unless someone goes to the trouble of copy-pasting it, which prevents accidental clicks by users. At the same time it doesn't punish new users, since their valid links will still be accessible and will be on equal footing once they grow into the community.
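A rough sketch of that rule, assuming the site renders comments to HTML at display time and knows each author's current post count. The function name, the URL pattern, and the threshold constant are illustrative, not taken from any real forum software.

```python
import html
import re

TRUST_THRESHOLD = 10  # posts (or karma) required before links become clickable

# Deliberately simple URL matcher; a real site would want something stricter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def render_comment(text, author_post_count):
    """Return HTML for a comment, auto-linking URLs only for trusted authors."""
    escaped = html.escape(text, quote=False)
    if author_post_count < TRUST_THRESHOLD:
        # The URL stays visible as plain text, so readers can still copy-paste it.
        return escaped
    return URL_PATTERN.sub(
        lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), escaped
    )
```

Because the decision is made when the comment is displayed rather than when it is stored, old comments become linked automatically once the author crosses the threshold, which gives the retroactive behaviour for free.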
Lisa: What have you done with my report?
Bart: I've hidden it. To find it you'll need to decipher a series of clues, each more fiendish than...
Lisa: Got it!
This is actually not a meaningful way to attack current CAPTCHAs, so now that I think about it... this 3D CAPTCHA would probably be less secure than the current ones that rely on OCR.
So even if you could get your hands on the 3D source file used for rendering, generating all possible images is impossible.
The numbers don't refer to brute force since the answer changes on each try.
If you guessed the same answer 'ABC' each time, it would take (assuming a perfectly random distribution) 15,600 tries on average before you got it right.
I don't know about you, but after getting 15 thousand failed requests in a row from the same IP, I'd assume they were a bot ;)
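For what it's worth, one way to arrive at 15,600 is three distinct letters drawn from a 26-letter alphabet (26 × 25 × 24). That derivation is an assumption on my part, since the figure isn't explained here. Under that assumption, a quick check of the numbers:

```python
from math import perm  # Python 3.8+

ALPHABET_SIZE = 26  # assumption: uppercase letters only
ANSWER_LENGTH = 3   # assumption: three distinct letters per answer

# Ordered selections of 3 distinct letters: 26 * 25 * 24 = 15600
possible_answers = perm(ALPHABET_SIZE, ANSWER_LENGTH)


def chance_within(tries, answers=possible_answers):
    """Probability that a fixed guess matches at least once in `tries` attempts,
    when the correct answer is redrawn uniformly at random each time."""
    p = 1 / answers
    return 1 - (1 - p) ** tries


# With a fresh random answer per attempt, the number of tries until the first
# hit is geometric, so the mean is 1/p, i.e. the number of possible answers.
expected_tries = possible_answers

print(possible_answers, expected_tries, round(chance_within(15600), 3))
```

So 15,600 is an average, not a guarantee: after that many tries the bot has only a bit better than even odds of having got in once.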
Given an object under different lighting and vantage points, the captcha breaker can build a similar object and automatically generate a database of silhouettes from a sparsely sampled set of vantage points. Then, given a captcha image, he can search the database for an approximate silhouette match, then iteratively improve the vantage point by matching the silhouettes of nearby views. Since the vantage point and the labeled object entirely determine the captcha answer, this approach may be good enough to break the captcha.
A more dynamic scene would be more challenging for this approach, but it would also be more difficult for the server to come up with human-solvable scenes.
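Here is a toy sketch of that coarse-then-refine silhouette search. Everything in it is a simplifying assumption: the "object" is a point cloud, the camera only rotates about the vertical axis, the projection is orthographic, and silhouettes are compared by overlap (intersection over union). A real attack would have to deal with perspective, lighting, and occlusion.

```python
import math


def render_silhouette(points, angle_deg, cell=1.0):
    """Orthographic silhouette of a point cloud rotated about the y-axis."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cells = set()
    for x, y, z in points:
        x_rot = x * cos_a + z * sin_a  # rotate, then drop the depth coordinate
        cells.add((round(x_rot / cell), round(y / cell)))
    return frozenset(cells)


def iou(a, b):
    """Overlap score between two silhouettes: 1.0 means identical."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_database(points, step_deg=30):
    """Pre-render silhouettes at a sparse set of viewing angles."""
    return {ang: render_silhouette(points, ang) for ang in range(0, 360, step_deg)}


def estimate_view(points, database, target, step_deg=30, min_step=1.0):
    """Coarse database lookup followed by local refinement of the angle."""
    best_angle = max(database, key=lambda ang: iou(database[ang], target))
    best_score = iou(database[best_angle], target)
    step = step_deg / 2
    while step >= min_step:
        improved = False
        for candidate in (best_angle - step, best_angle + step):
            score = iou(render_silhouette(points, candidate), target)
            if score > best_score:  # only ever accept strict improvements
                best_angle, best_score, improved = candidate, score, True
        if not improved:
            step /= 2  # nothing better nearby, so look at closer views
    return best_angle % 360, best_score


# Hypothetical stand-in for a hand-built model: an L-shaped block of points.
model = [(x, y, z) for x in range(-4, 5) for y in range(6) for z in range(-2, 3)
         if not (x > 0 and y > 2)]
database = build_database(model)
captcha_view = render_silhouette(model, 47)  # pretend this came from the captcha
print(estimate_view(model, database, captcha_view))
```

The refinement is a plain hill climb, so it can settle on a local optimum or a mirror-image view; that is tolerable for an attacker who only needs to be right some of the time.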
However, in a 3D context there is no way a computer can infer what an object would look like from a different vantage point, since not even a human can do this.
For example, looking at a CRT and an LCD head on would give you the same image, with no information about the depth of the monitor. Multiple viewpoints would help the computer figure out the full three-dimensional object, but then object recognition comes into play: which object is which?
This system works with humans because we have good 3D object recognition and a huge database of experience against which to compare it, all of which is calculated in an instant.
Replicating that behaviour in a computer is still a long way away.
The captcha-breaking computer has no need to infer what an object would look like from another view if someone has already manually reproduced the library of models; in that case the problem reduces to identifying which models from the library are in the picture and what angle they are being viewed from. Although the problem is no doubt difficult, the silhouette strategy I described is similar to other published object recognition approaches known to work, e.g.:
And the approach doesn't need to work perfectly: the captcha breaker is only interested in improving his chances of guessing correctly. If an automated approach only guesses correctly even 20% of the time, the captcha is effectively broken.
In the case where two images of objects are very similar -- like your CRT vs. LCD example -- even a human would have difficulty differentiating. By definition that makes these objects bad for the captcha, so the captcha author would either leave them out of the library of objects, or he would need to make the captcha more tolerant of human error, which makes things easier for the captcha-breaker.
Luckily, unlike text, which follows a very constrained set of rules (e.g. an X will always be two criss-crossed lines), 3D objects have no such constraint: two images of the same object can look entirely different. A simple example is the chair, which comes in all varieties of shapes but is still easily identifiable to a human.
So this would automatically require human input to identify the object; you can't create a program that would 'learn' new objects, at least not yet.
Also, the silhouette strategy can only be applied when a shape remains relatively constant; moving the camera a little to the left would render a completely new silhouette.
Add to that the fact that the bot would still need to be told how to answer arbitrary questions like 'How many legs does the chair that the man is sitting on have?'
So much human input is required just to identify /one/ object in the captcha, and once that object has been compromised it is trivial to switch in another one (which is impossible in a text captcha, because there are only 26+10 characters that the whole world knows). That makes this a damn effective captcha.
The difficulty of the image library captcha depends on the size of its library, while the difficulty of the 3D captcha depends on the fact that it's much easier for a computer to go from a 3D model to a 2D image rather than the other way around.
It would, however, be very difficult for anyone to correctly guess ;)