

3-D CAPTCHA: A way to fix the broken CAPTCHAs? - iamelgringo
http://spamfizzle.com/CAPTCHA.aspx

======
adnam
> Object recognition is a completely unsolved computer vision problem.

Awesome, let's hand the problem over to the spammers and it will be solved
within 1 year.

~~~
lbrandy
> > Object recognition is a completely unsolved computer vision problem.

It's funny you quoted that part. I can't tell you how many "next-gen" captchas
I've broken trivially with the object recognition software our company has
developed (<http://demo.pittpatt.com>). Granted, our software is useless
against this type of Captcha, but many aren't.

Captchas have become a really interesting area of research for us because it's
essentially the opposite of the problem we are trying to solve. What's really
curious about this is object recognition people seem keenly aware of the
advancements in Captcha design and Captcha designers seem blissfully ignorant
of the advances in object recognition.

I've seen so many proposed "next-gen" captchas that I could break before I
finished reading their powerpoint slides.

~~~
boredguy8
<http://img.timeinc.net/time/time100/2007/images/time100landingimage.jpg>
- misses Obama. (It's good software, was just funny to me.)

~~~
lbrandy
Yea, it does. I just re-ran that image on our newest models and it does find
Obama in that image, but at low confidence. See above about how our website
demos are out of date.

Gonna be adding newer, better, cooler ones soon.

~~~
boredguy8
Sweet, good work! Being able to get the top google results for "faces" might
not mean much in reality, but those are some of the first things people will
throw at the demo.

------
sratner
I have a captcha idea: let's make users watch a 90-minute movie and summarize
the plot in 6 words. Oh, and serve ads over it while they are watching, just
to recover the bandwidth costs.

~~~
silentbicycle
Or maybe just four? :) <http://www.fwfr.com/>

------
dominik
I harbor a strong distaste for CAPTCHAs. At best, they're annoying. At worst,
they're infuriating, such as when I simply can't see the letters or when I
stumble into a cross-site hosted CAPTCHA that requires JavaScript -- which I
have disabled -- and then loses the post I was going to make when I press the
back button to try again with JavaScript enabled.

I see CAPTCHA as a bandaid and I can understand why people turn to it: spam
has reached epidemic proportions. That said, any form of CAPTCHA, either the
existing form or the proposed 3-D form of this article, suffers from two
fundamental flaws:

1. It annoys users and makes it harder for people to contribute
2. Spammers will get around it eventually, and once one spammer gets around
it, they all will

CAPTCHA bears similarities to copy protection schemes in these two flaws: both
annoy users and are merely road bumps for the undesirables (spammers and
crackers).

I think the solution is threefold:

1. Make it easy to post so real humans contribute
2. Filter spam aggressively
3. Incorporate trust mechanisms

Making it easy so real humans contribute removes obstacles (such as CAPTCHAs,
registration, etc.) that get in the way of people posting. Every obstacle
means fewer people will post. With community contributors at about 1% of
visitors, there's a lot of room to grow by making it easier to contribute.

Spam filtering works because spam is fundamentally different from a valid
post, and always will be. Bayesian schemes such as pg described long ago work
well. Gmail, for example, does an amazing job of filtering spam. The few posts
that get through are easily dealt with.
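A minimal sketch of the kind of Bayesian filter meant here, in Python. It is a generic multinomial naive Bayes classifier with add-one smoothing, not necessarily the exact scheme pg described. The class name `NaiveBayesFilter`, the whitespace tokenizer, and the training strings are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude tokenizer (an assumption): lowercase, split on whitespace."""
    return text.lower().split()


class NaiveBayesFilter:
    """Hypothetical multinomial naive Bayes spam filter with add-one smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record one labeled post."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Posterior probability that `text` is spam, given the training data."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.LABELS:
            total_words = sum(self.word_counts[label].values())
            # Smoothed prior, so an untrained class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long posts.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("object recognition is hard", "ham")
print(spam_filter.spam_probability("buy cheap pills"))
```

A real deployment would need a much larger corpus and a better tokenizer; the point is only that the scoring itself is a few lines.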

Trust mechanisms take advantage of the fundamental weakness spammers have:
they aren't members of the community. A simple trust mechanism: don't
auto-link links from posters with fewer than 10 comments. Since most spam
contains links of some sort and most comments don't, spam will be
predominantly affected by this. Once a poster gets to 10 posts (or 10 karma),
their comments retroactively get auto-linked. This is simple to implement,
but reduces the
impact of spammers significantly since their spam isn't accessible unless
someone goes to the trouble of copy-pasting it, preventing accidental clicks
by users. At the same time it doesn't punish new users, since their valid
links will still be accessible and will be on equal footing once they grow
into the community.
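Roughly what I have in mind, as a Python sketch. The function name `render_comment`, the URL regex, and the `rel="nofollow"` attribute are my own assumptions; only the 10-comment threshold comes from the description above. Because the decision is made at render time from the poster's current count, old comments become linked automatically once the poster crosses the threshold.

```python
import html
import re

MIN_COMMENTS_FOR_LINKS = 10  # threshold from the description above

# Deliberately simple URL matcher (an assumption, not a complete URL grammar).
URL_PATTERN = re.compile(r"https?://\S+")


def render_comment(text, author_comment_count):
    """Escape a comment for HTML and auto-link URLs only for established posters."""
    escaped = html.escape(text)
    if author_comment_count < MIN_COMMENTS_FOR_LINKS:
        # New poster: the URL stays visible as plain text, so nothing is hidden,
        # but nobody can click it by accident.
        return escaped
    return URL_PATTERN.sub(r'<a href="\g<0>" rel="nofollow">\g<0></a>', escaped)


print(render_comment("see http://example.com", author_comment_count=3))
print(render_comment("see http://example.com", author_comment_count=10))
```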

------
josefresco
Holy usability nightmare Batman!

~~~
Hexstream
I'd happily take this over mass obnoxious in-your-face spamming any day.

~~~
jacobbijani
You know, a lot of users don't even realize it's spam. That's _kind of_ the
idea. Some of them seem to enjoy it, even.

I'm relieved that the comments on this are mostly against it. I was sort of
worried. A captcha is a very delicate balance between not pissing off the good
users and keeping the bad users out.

And let's not forget what seems to be the most effective method of cracking a
captcha, to just proxy it to an actual user who thinks they are verifying
themselves for some other site (porn). This doesn't address that at all.

~~~
thwarted
Today I was told, via spam email subject lines, that Elton John died in a
rocket crash and James Brown died of a heart attack. Almost fell for it.

~~~
Hexstream
And I was told I was "caught naked in the shower". How humiliating it would be
if everyone learned I take my shower without clothes!

------
axod
Still fails to address the proxying of captchas off to be solved by
unsuspecting people:

* set up a porn/warez site
* proxy captchas from the target site to the porn/warez site
* let real users solve the captchas for you.

------
kimboslice
Hysterical: "A bot attempting to brute force a solution to the above example
will need to work its way through (26)(25)(24) = 15,600 possible combinations.
Asking for the identification of four unique features gives 358,800 possible
combinations while 5 unique features will render 7,893,600 possible
combinations"
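For reference, the quoted figures are just ordered selections of distinct letters from a 26-letter alphabet. A quick Python check, assuming one unique letter label per feature as in the quoted example (the function name `ordered_choices` is mine):

```python
import math

ALPHABET_SIZE = 26  # one letter label per feature, per the quoted example


def ordered_choices(labels, asked):
    """Number of ordered selections of `asked` distinct labels out of `labels`."""
    return math.perm(labels, asked)


for asked in (3, 4, 5):
    print(asked, ordered_choices(ALPHABET_SIZE, asked))
```

So the arithmetic in the quote checks out; whether those counts say anything about security is a separate question.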

~~~
zepolen
Why?

~~~
jrockway
These numbers don't mean much, but it's "hilarious" because you could simply
generate every image and compare them to the one on the site in about a second
of CPU time.

This is actually not a meaningful way to attack current CAPTCHAs, so now that
I think about it... this 3D CAPTCHA would probably be less secure than the
current ones that rely on OCR.

~~~
zepolen
The final image is rendered based on random variables /each time/ - Even just
moving the light source would result in an entirely (from a bitmap point of
view) different image.

So even if you could get your hands on the 3D source file used for rendering,
generating all possible images is impossible.

The numbers don't refer to brute force since the answer changes on each try.

If you used the same answer 'ABC' each time you'd take (assuming perfect
random distribution) 15,600 tries before getting it right.

I don't know about you, but after getting 15 thousand failed requests in a row
from the same IP, I'd assume they were a bot ;)

~~~
jcl
While moving the light source results in different pixels, the object
silhouettes don't change, and even internal edges will remain reasonably
consistent under different lighting.

Given an object under different lighting and vantage points, the captcha
breaker can build a similar object and automatically generate a database of
silhouettes from a sparsely sampled set of vantage points. Then, given a
captcha image, he can search the database for an approximate silhouette match,
then iteratively improve the vantage point by matching the silhouettes of
nearby views. Since the vantage point and the labeled object entirely
determine the captcha answer, this approach may be good enough to break the
captcha.
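To make the coarse-then-refine search concrete, here is a toy Python sketch. Everything in it is an assumption for illustration: objects are small sets of 3D points, the "camera" is an orthographic projection that only rotates about the vertical axis, and silhouettes are compared by intersection-over-union. A real captcha breaker would need proper rendering and a far more robust matcher.

```python
import math


def silhouette(points, angle_deg):
    """Orthographic silhouette of a 3D point set rotated about the vertical axis."""
    a = math.radians(angle_deg)
    return frozenset(
        (round(x * math.cos(a) + z * math.sin(a)), y) for x, y, z in points
    )


def overlap(a, b):
    """Intersection-over-union of two silhouettes (1.0 means identical)."""
    return len(a & b) / len(a | b)


def build_database(models, step=30):
    """Precompute silhouettes for a sparse set of vantage points."""
    return {
        (name, angle): silhouette(points, angle)
        for name, points in models.items()
        for angle in range(0, 360, step)
    }


def identify(target, models, database, refine_step=5, window=30):
    """Coarse database lookup, then refinement over nearby vantage points."""
    name, angle = max(database, key=lambda key: overlap(database[key], target))
    candidates = range(angle - window, angle + window + 1, refine_step)
    best = max(
        candidates, key=lambda a: overlap(silhouette(models[name], a), target)
    )
    return name, best % 360


# Hypothetical hand-built model library: a wide low slab and a tall thin post.
models = {
    "slab": [(x, y, z) for x in range(-4, 5) for y in range(3) for z in range(2)],
    "post": [(0, y, z) for y in range(9) for z in range(2)],
}
database = build_database(models)

# Pretend this silhouette was extracted from a captcha image.
captcha_silhouette = silhouette(models["slab"], 40)
print(identify(captcha_silhouette, models, database))
```

The recovered angle is only approximate, and symmetric objects make it ambiguous, but the object label is what matters for picking the answer.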

A more dynamic scene would be more challenging for this approach, but it would
also be more difficult for the server to come up with human-solvable scenes.

~~~
zepolen
You are describing object recognition, which even on just a 2D static image is
insanely hard to get working correctly (I have experience, as it was my
university project).

However, in a 3D context there is no way a computer can infer what an object
would look like from a different vantage point, since not even a human can do
this.

For example, looking at a CRT and an LCD head-on would give you the same
image, but would give you no information about the depth of the monitor.
Multiple viewpoints would help the computer figure out the full
three-dimensional object, but then again, object recognition comes into play:
which object is which?

This system works with humans because we have good 3D object recognition and a
huge database of experience against which to compare it, all of which is
calculated in an instant.

Replicating that behaviour in a computer is still a long way away.

~~~
jcl
I realize that the general problem of object recognition can be arbitrarily
difficult, but so is the general problem of text recognition: How can a
computer determine if a downward stroke is a one, or a lowercase L, or an
uppercase I? And yet the text-recognition captchas have been broken -- not
because the problem is easy but because captcha breakers have exploited
artifacts of individual captchas to get a correct answer a modest percentage
of the time. The 3D captcha (as the article author described it) is highly
constrained -- a small library of objects in static poses -- so it has
similarly exploitable artifacts.

The captcha-breaking computer has no need to infer what an object would look
like from another view if someone has already manually reproduced the library
of models; in that case the problem reduces to identifying which models from
the library are in the picture and what angle they are being viewed from.
Although the problem is no doubt difficult, the silhouette strategy I
described is similar to other published object recognition approaches known to
work, e.g.:

<http://citeseer.ist.psu.edu/bandlow98recognising.html>

And the approach doesn't need to work perfectly: the captcha breaker is only
interested in improving his chances of guessing correctly. If an automated
approach only guesses correctly even 20% of the time, the captcha is
effectively broken.
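A back-of-the-envelope Python check of why a 20% hit rate is enough, assuming independent attempts with a fixed per-attempt success probability (the function names are mine, and the blind-guess rate uses the 15,600 figure discussed earlier in the thread):

```python
def success_within(attempts, p):
    """Probability of at least one correct answer in `attempts` independent tries."""
    return 1 - (1 - p) ** attempts


def expected_attempts(p):
    """Mean number of tries until the first success (geometric distribution)."""
    return 1 / p


BLIND_GUESS = 1 / 15600  # guessing three distinct letters with no analysis
AUTOMATED = 0.20         # the 20% recognizer from the paragraph above

for label, p in (("blind guess", BLIND_GUESS), ("20% recognizer", AUTOMATED)):
    print(label, expected_attempts(p), success_within(10, p))
```

At 20% the bot needs five tries on average, which no sane rate limit will catch.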

In the case where two images of objects are very similar -- like your CRT vs.
LCD example -- even a human would have difficulty differentiating. By
definition that makes these objects bad for the captcha, so the captcha author
would either leave them out of the library of objects, or he would need to
make the captcha more tolerant of human error, which makes things easier for
the captcha-breaker.

~~~
zepolen
Agreed, however the article already acknowledges that approach (read near the
end about the flower).

Luckily, unlike text, which follows a very constrained set of rules (e.g. an X
will always be two lines criss-crossed), the same doesn't apply to 3D objects,
where you can have 2 images of the same object that look entirely different. A
simple example is the chair, which comes in all varieties of shapes but
is still easily identifiable to a human.

So this would automatically require human input with respect to identifying
the object; you can't create a program that would 'learn' new objects, at
least, not yet.

Also, the silhouette strategy can only be applied when a shape remains
relatively constant; moving the camera a little to the left would render a
completely new silhouette.

Add that the bot would still need to be told how to answer the arbitrary 'How
many legs does the chair that the man is sitting on have?' questions.

The fact that so much human input is required just to identify /one/ object in
the captcha, and the fact that once that object has been compromised it is
trivial to switch in another one (which is impossible in a text captcha because
there are only 26+10 characters that the whole world knows), mean that
this is a damn effective captcha.

------
jordyhoyt
Imagine what cracking this will do for the world of image processing! :)

------
bprater
Not sure how this is entirely different from using a library of images that
are tagged with common words. Is it a cat? dog? frog? giraffe?

~~~
jcl
The 3D captcha generates new images rather than using a library, so a captcha
breaker needs to use a more sophisticated approach than comparing the current
captcha image to previously seen images.

The difficulty of the image library captcha depends on the size of its
library, while the difficulty of the 3D captcha depends on the fact that it's
much easier for a computer to go from a 3D model to a 2D image than the
other way around.

------
jrockway
Considering we have image recognition algorithms that can control cars driving
on the road, I doubt this would be that difficult to break.

It would, however, be very difficult for anyone to correctly guess ;)

~~~
breck
_Considering we have image recognition algorithms that can control cars
driving on the road, I doubt this would be that difficult to break._

We do?

~~~
jrockway
<http://en.wikipedia.org/wiki/DARPA_Grand_Challenge>

------
dkasper
Seems like it definitely violates Krug's "Don't make me think" principle, but
if it's the best we've got then I guess it's ok for now.

