

Google's reCAPTCHA briefly cracked - Mitt
http://www.h-online.com/security/news/item/Google-s-reCAPTCHA-briefly-cracked-1586689.html

======
jack-r-abbit
I always assumed that when given the option to hear the CAPTCHA it was going
to just read me the muddled words I see. Using just 58 different spoken words
doesn't seem anywhere near as secure as the text images... which I guess is
why it was cracked.

~~~
bri3d
The idea behind reCAPTCHA is that it occasionally throws in a word (or street
sign, or occasional fragment of total garbage) that it doesn't know. Hence,
it's not really possible to just text-to-speech the visual part of the captcha to
generate an audio one, as often the visual captcha contains an unknown.

However, if they're sampling radio programs to use as background noise,
reCAPTCHA could become a learning tool for radio programs as well.

A reliable algorithm for splitting a source stream into high-probability
individual-word portions would need to be devised - then, just like the visual
version, combine one known word and one unknown word, combine with the
background distortion, and this brand of attack is essentially defeated.
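
As a rough illustration of the scheme described above, here is a minimal Python sketch. It is not how reCAPTCHA works internally; the function names (`split_on_silence`, `build_challenge`, `grade`), the amplitude threshold, and the representation of audio as plain lists of floats are all assumptions made for the example. The splitter is a naive silence detector, which would likely struggle on real radio audio.

```python
import random

def split_on_silence(samples, threshold=0.1, min_gap=3):
    """Return (start, end) index pairs (end exclusive) for runs of loud samples.

    A run ends once `min_gap` consecutive samples fall below `threshold`.
    """
    segments = []
    start = None   # index where the current loud run began
    quiet = 0      # consecutive quiet samples seen inside the run
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        segments.append((start, len(samples) - quiet))
    return segments

def build_challenge(known_clip, unknown_clip, noise, rng):
    """Play the two clips in random order and add looping background noise."""
    clips = [("known", known_clip), ("unknown", unknown_clip)]
    rng.shuffle(clips)
    order = [label for label, _ in clips]
    audio = [s for _, clip in clips for s in clip]
    mixed = [s + noise[i % len(noise)] for i, s in enumerate(audio)]
    return order, mixed

def grade(order, answers, known_text):
    """Pass or fail on the known word only; return the guess for the unknown one."""
    known_idx = order.index("known")
    passed = answers[known_idx].strip().lower() == known_text
    unknown_guess = answers[1 - known_idx].strip().lower()
    return passed, unknown_guess

if __name__ == "__main__":
    stream = [0, 0, 0.5, 0.6, 0, 0, 0, 0.7, 0.8, 0.9, 0, 0]
    (a0, a1), (b0, b1) = split_on_silence(stream)
    order, mixed = build_challenge(stream[a0:a1], stream[b0:b1],
                                   [0.01, -0.01], random.Random())
    print(order, len(mixed))
```

Guesses returned for the unknown clip would presumably be collected across many users and accepted once enough of them agree, as with the visual version.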

I suspect that spoken-word digitization is at least a potential future plan,
although I also suspect the audio version of captchas is requested
infrequently enough that learning new words might take a long time.

~~~
ericabiz
> The idea behind reCAPTCHA is that it occasionally throws in a word (or
> street sign, or occasional fragment of total garbage) that it doesn't know.

Yes, but you don't have to input those. Try it for yourself and see. (I
_never_ input them.)

When you do input them, it helps Google digitize books, street numbers, etc.--
sort of like a "Mechanical Turk"--but it's completely optional. So you could
devise a text-to-speech algorithm, but only for the word Google's CAPTCHA
actually knows. (It's probably in the works as we speak.)

------
zobzu
I always find it amusing that stuff gets "cracked for a moment". You know, as
if the vulnerability wasn't there before, then popped up and boom fixed!

No. The reality is: the vulnerability has been _public_ for a brief moment.
And that's true for all the software in the world. Sometimes really good
exploits stay secret for years and years. Sometimes (often?) they get fixed by
a side effect before even going public, after a year or two.

So no. You're not secure.

------
trun
Very impressive stuff. For what it's worth, I played around with the new audio
reCAPTCHA a bit.

    
    
    - I heard 50+ distinct words, plus the numbers 1-9
    - Numbers are not written out anymore (e.g. "nine" does not work, it must be "9")
    - You must get 9 out of 10 words correct to complete the CAPTCHA
    

So they're still working with a very small dictionary of words, and with a
little practice it's not hard to nail these consistently (as a human). The
extra noise makes it a little more difficult to split the samples, but that's
a solvable problem. I bet they can re-break it with their existing program and
not too much additional effort to retrain the system.
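
To put rough numbers on the 9-out-of-10 rule, here is a small Python sketch. It assumes each word is recognized independently with the same per-word accuracy, which is a simplification, and the 1/60 figure for blind guessing is only an estimate based on "50+ words plus 9 digits". `pass_probability` is a name made up for this example.

```python
from math import comb

def pass_probability(p, words=10, required=9):
    """Chance of getting at least `required` of `words` right, each with accuracy p."""
    return sum(comb(words, k) * p**k * (1 - p)**(words - k)
               for k in range(required, words + 1))

if __name__ == "__main__":
    # 1/60 approximates blind guessing from a ~60-token dictionary (assumption).
    for p in (1 / 60, 0.8, 0.9, 0.95):
        print(f"per-word accuracy {p:.3f} -> pass rate {pass_probability(p):.3g}")
```

Under these assumptions, a solver that gets 90% of individual words right passes roughly 74% of challenges, while blind guessing essentially never does.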

------
mmuro
This is just another example that image CAPTCHAs are a weak spam deterrent
and that they are a great example of dark UI patterns.

The benefits of using this kind of system are shrouded in the user's cost.

~~~
jack-r-abbit
I won't argue that the benefits are at the user's cost since I pretty much
always have to request a new set of images as the ones I get tend to be so
distorted I can't make out a letter.

But how is _this_ (cracking _audio_ CAPTCHA) an example of how _image_ CAPTCHA
is a weak spam deterrent? I don't see the connection.

------
sixcorners
What happens if you train a neural network too much?

~~~
keeperofdakeys
There is always the tank story/myth <http://neil.fraser.name/writing/tank/>.
Of course, that is more about how a neural network might not be training
itself in the way you expect.
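
A toy Python sketch of that failure mode, with hypothetical data invented purely for illustration: a one-feature learner (a "decision stump", standing in for the neural network) picks whichever feature best fits the training photos. Because every training tank photo happens to be cloudy, it learns the weather instead of the tank. The feature names and rows are assumptions, not taken from the story.

```python
def stump_accuracy(rows, feature):
    """Accuracy of the rule 'predict tank exactly when the feature is 1'."""
    hits = sum(1 for features, label in rows if features[feature] == label)
    return hits / len(rows)

def train_stump(rows, feature_names):
    """Pick the single feature that best predicts the label on the training rows."""
    return max(feature_names, key=lambda name: stump_accuracy(rows, name))

# Each row is ({feature: 0 or 1}, label), where label 1 means "tank present".
# In training, weather lines up perfectly with the label; shape is slightly noisy.
train = [
    ({"cloudy": 1, "shape": 1}, 1),
    ({"cloudy": 1, "shape": 1}, 1),
    ({"cloudy": 1, "shape": 0}, 1),  # tank partly hidden
    ({"cloudy": 0, "shape": 0}, 0),
    ({"cloudy": 0, "shape": 0}, 0),
    ({"cloudy": 0, "shape": 0}, 0),
]
# In the field, weather no longer lines up with the label.
test = [
    ({"cloudy": 0, "shape": 1}, 1),
    ({"cloudy": 1, "shape": 0}, 0),
    ({"cloudy": 0, "shape": 1}, 1),
    ({"cloudy": 1, "shape": 0}, 0),
]

if __name__ == "__main__":
    chosen = train_stump(train, ["cloudy", "shape"])
    print("learned feature:", chosen)
    print("train accuracy:", stump_accuracy(train, chosen))
    print("test accuracy:", stump_accuracy(test, chosen))
```

The learner scores perfectly on the training rows and fails on the field rows, even though a feature that would have generalized was available.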

------
trebuch3t
I think it's impressive that an algorithm can crack the CAPTCHAs. I get them
wrong most of the time.

~~~
zobzu
We're getting to a point where programs sometimes guess it right more often
than the user does. I get reCAPTCHA wrong about 50% of the time, and I guess
I'm not the only one.

It's sad that I'm not exaggerating. I get the captcha on YT every day because
apparently my subnet attacks YT a lot.

Now, it's not Google's fault, but the whole captcha thing is getting more and
more broken.

~~~
jack-r-abbit
I don't usually get it wrong... but that is because I usually request new
images if I'm not pretty sure I have it right. Sometimes the distortion is
absolutely ridiculous.

