
A few years ago, or so I think, people went crazy talking about a replacement for captchas: show a range of images, and make the user pick the image described by a block of text.

How come nobody adopted that approach?

Because the math doesn't work. Most "next-gen" captchas fundamentally fail (by orders of magnitude) at least one of the many pillars that make captchas scale:

1. Is it trivial for a human to answer correctly? This affects growth.

2. Can humans do it quickly? This affects growth.

3. What is the random guess rate? It had better be abysmal.

4. How good is the “opposing” technology?

5. How is the guess rate of a sophisticated attacker, using said technology?

6. How much human input is required to create your captcha? Creating challenges had better be asymptotically cheaper than having humans solve them.

7. What are the cultural and accessibility issues?

8. The user may have a slow computer.

I remember suggestions of using computing power to slow down guess rates, probably related to Bitcoin. However, that doesn't work, since some users are content with slow hardware and the same puzzle that barely slows an attacker can stall them.
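For readers who haven't seen the idea, a hashcash-style puzzle is one way "using computing power" could look. This is only a sketch under my own assumptions: the function names, the `challenge:nonce` string format, and the difficulty value are made up for illustration, not taken from any real deployment.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def find_nonce(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce (hypothetical puzzle format)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash checks the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


nonce = find_nonce("example-session", 12)   # about 2**12 hashes on average
ok = verify("example-session", nonce, 12)
```

The work grows as 2^difficulty_bits for everyone alike, so a difficulty that hurts an attacker with fast hardware is far worse for a user on an old phone.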

"Anyone can invent a security system that he himself cannot break." - http://www.schneier.com/blog/archives/2011/04/schneiers_law....

Any CAPTCHA scheme that can be solved by enumeration of all possible answers is a failure, because there are cost-effective ways to hit a CAPTCHA over and over again, with cheap humans, and build the enumeration table. This is where the "pick the image with a cute thing in it" scheme falls down. In this case, once the enumeration of description -> image(s) is determined, you lose.
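To make the enumeration table concrete, here is a minimal sketch. `EnumeratingSolver`, `ask_human`, and the toy corpus are hypothetical names of mine; the only point is that each distinct description costs the attacker one human answer, and every repeat after that is free.

```python
from typing import Callable, Dict


class EnumeratingSolver:
    """Caches human answers so each distinct challenge is paid for once."""

    def __init__(self, ask_human: Callable[[str], str]) -> None:
        self.ask_human = ask_human           # hypothetical paid human solver
        self.table: Dict[str, str] = {}      # description -> image id
        self.human_calls = 0

    def solve(self, description: str) -> str:
        if description not in self.table:
            # First sighting: pay a human once and remember the answer.
            self.human_calls += 1
            self.table[description] = self.ask_human(description)
        return self.table[description]


# Toy corpus standing in for the site's real description -> image mapping.
CORPUS = {"a cute kitten": "img_17", "a red bicycle": "img_42"}

solver = EnumeratingSolver(ask_human=CORPUS.__getitem__)
answers = [solver.solve(d) for d in
           ["a cute kitten", "a red bicycle", "a cute kitten"]]
```

Three challenges are answered here with only two human calls; with a fixed corpus the per-challenge cost tends toward zero.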

Any scheme that involves humans somehow creating tags, labeling images, or writing text will generally be enumerable as well, because attackers can trivially out-manpower you.

Also, many CAPTCHA schemes use a model of the spammer in which the spammer isn't permitted to be clever. In the real world, if there is a pattern, the spammer is "allowed" to exploit it. There are 2^64 different ways to add two 32-bit numbers to each other, but that doesn't mean you can beat a spammer just by asking the user to do a simple addition. When I say "enumerate" I mean it in the computer science sense, not the literal sense: they can and will write something that parses the problem and solves it. So for my stupid "add two random 32-bit numbers" example, the CAPTCHA is actually easier for a computer than for a human.
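The addition example takes a few lines to break. The challenge wording below is an assumed format, since no real site is being quoted, but any fixed pattern can be parsed the same way.

```python
import re

# Hypothetical challenge wording: two decimal numbers joined by "+".
PATTERN = re.compile(r"(\d+)\s*\+\s*(\d+)")


def solve_addition_captcha(challenge: str) -> int:
    """Parse the two operands out of the challenge text and add them."""
    match = PATTERN.search(challenge)
    if match is None:
        raise ValueError("unrecognized challenge format")
    return int(match.group(1)) + int(match.group(2))


answer = solve_addition_captcha("What is 4294967295 + 1?")
```

No table of 2^64 entries is needed; the solver handles every instance because it exploits the structure of the question.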

CAPTCHAs are hard and getting steadily harder... at least, if you require them to work. Security theater is easy.

If you only have a limited collection of images to pick from, then bots could get decent scores by picking at random. A better approach might be to show N images and ask users to pick all the matching ones (i.e., 2^N possible choices).
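A quick sketch of the two guess rates, assuming a bot that guesses uniformly at random and a challenge with exactly one correct answer (one image, or one exact subset). The function names are mine.

```python
from fractions import Fraction


def single_pick_guess_rate(n_images: int) -> Fraction:
    """Chance of picking the one correct image out of n at random."""
    return Fraction(1, n_images)


def subset_pick_guess_rate(n_images: int) -> Fraction:
    """Chance of picking the one correct subset out of 2**n at random."""
    return Fraction(1, 2 ** n_images)


single = single_pick_guess_rate(9)   # Fraction(1, 9)
subset = subset_pick_guess_rate(9)   # Fraction(1, 512)
```

With a 3x3 grid the blind guess rate drops from 1/9 to 1/512, though this says nothing about an attacker who has already enumerated the images.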

What would the system use for its corpus of images and text descriptions? The corpus would have to be significantly larger than what any given attacker could manually identify. Once an attacker has manually identified an image+text combination, they could store the combination and use it to solve any future CAPTCHAs with the same image+text.
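For a rough sense of "significantly larger": if challenges were drawn uniformly at random from a corpus of M image+text pairs (an assumption of mine, as is the function name), the coupon collector result says an attacker needs about M * H_M solves on average to have seen every pair, where H_M is the M-th harmonic number.

```python
def expected_solves_to_cover(corpus_size: int) -> float:
    """Expected uniform random draws needed to see every item at least once."""
    harmonic = sum(1.0 / k for k in range(1, corpus_size + 1))
    return corpus_size * harmonic


small = expected_solves_to_cover(2)          # 2 * (1 + 1/2) = 3.0
large = expected_solves_to_cover(1_000_000)  # on the order of 14 million
```

Full coverage isn't even required; a table covering half the corpus already solves about half of all challenges.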

Mainly because, to quote Spolsky, Users don't read instructions.

If the captcha is ANYTHING other than immediately obvious, a significant number of users will not be able to pass it.
