

Hacking and defeating Google's reCAPTCHA with a 99% accuracy - ahmadss
http://www.dc949.org/projects/stiltwalker/
"Note: In the hours before our presentation/release, Google pushed a new version of reCAPTCHA which fully nerfs our attack."
======
apendleton
One exciting thing about this: the entire model of reCaptcha (at least the
text ones; I assume the audio ones are similar) is to make people do useful
work when solving captchas by having them complete tasks that they consider
too hard for computers to do well (in the text reCaptcha case, OCR). If
someone writes software that can defeat the captcha, it does mean the security
model is broken, but it also means the state of OCR technology (or audio
recognition or whatever) has been advanced, and the digitization of books that
had previously required human intervention can now be accomplished by
automated means. In other words, spammers are incidentally creating the tools
to expand the scope of digital human knowledge. Win-win, really.

~~~
fuelfive
Unfortunately, this attack does nothing to advance the state of the art in OCR
(or audio recognition). It's basically the same story as every other CAPTCHA
attack to date: take advantage of some accidental statistical regularity in
the generation function. As soon as this kind of flaw is discovered, it only
takes a few hours for the generation code to be patched in such a way that
completely prevents this sort of attack from working.

~~~
fghh45sdfhr3
So either the code is easy to patch, or we DO advance. Win/Win?

~~~
xibernetik
Not really... Even if the code is difficult to patch, speech/audio recognition
doesn't advance much when an attacker figures out how to remove the (non-
random) noise added by a machine over the sound file. Actual speech
recognition relies on the ability to filter out background noise - which is a
lot more complex/random - added by surroundings, not a machine.

It's very difficult to generate some sort of noise via algorithm that a)
humans can filter out and b) can't be removed by some algorithm. As a result,
audio captchas are a huge vulnerability and the weakest link in almost any
captcha system, although you can't get rid of them by law.

Hypotheticals aside, the code was easy to patch - note the footnote: > In the
hours before our presentation/release, Google pushed a new version of
reCAPTCHA which fully nerfs our attack.

~~~
d2vid
Could one take real recorded noise and add that rather than noise generated
via algorithm? Wouldn't that force attackers to solve a real problem (removing
background noise from a speech sample)?

~~~
robryan
Yeah, you would think they could record thousands of hours of real world noise
then randomly use sections of it on each audio captcha.

~~~
A1kmm
If the attacker manages to obtain all the random noise, they could index every
window in the noise in a k-d-tree and perform an efficient nearest neighbour
search for the exact background from the CAPTCHA audio, and then simply
subtract the background, giving perfect segmentation in O(log(N)) asymptotic
average time complexity for N windows (at 64kHz and 2000 hours of audio,
N = 460,800,000,000, ln N ≈ 26.86).
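
A toy sketch of the lookup described above, in pure Python. It assumes the attacker already holds the full noise library and that the clip starts with a short stretch of noise only (no speech), so the first window matches the library exactly. The names `build_kdtree`, `nearest`, the window size `W`, and the synthetic "speech" are all made up for illustration; this is not Stiltwalker's method.

```python
import random

def build_kdtree(points, depth=0):
    """Build a k-d tree from (vector, payload) pairs as nested tuples."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Return (squared_distance, payload) of the stored vector closest to query."""
    if node is None:
        return best
    (vec, payload), axis, left, right = node
    d = sq_dist(vec, query)
    if best is None or d < best[0]:
        best = (d, payload)
    diff = query[axis] - vec[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    # Only descend into the far side if the splitting plane is closer
    # than the best match found so far.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best

random.seed(0)
W = 8  # window length in samples (hypothetical)
noise = [random.uniform(-1, 1) for _ in range(2000)]  # the "leaked" noise library
index = build_kdtree([(tuple(noise[i:i + W]), i)
                      for i in range(len(noise) - W + 1)])

# Synthetic clip: W samples of noise only, then "speech" mixed with noise.
speech = [0.0] * W + [0.5, -0.5, 0.25, -0.25] * 10
offset = 1234
captcha = [n + s for n, s in zip(noise[offset:offset + len(speech)], speech)]

# Look up the first window, then subtract the matching stretch of noise.
dist, found = nearest(index, tuple(captcha[:W]))
recovered = [c - n for c, n in zip(captcha, noise[found:found + len(speech)])]
```

Here `found` is the position of the clip's background within the library, and `recovered` is the clip with that background subtracted. Real audio would need windows that tolerate speech overlap and re-encoding, which this sketch ignores.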

------
drharris
After so much work, gotta love the footnote here: "Note: In the hours before
our presentation/release, Google pushed a new version of reCAPTCHA which fully
nerfs our attack."

~~~
barik
This isn't a huge problem for security people. Indeed, at most academic
security conferences, by the time you present your work it has usually already
been broken and/or countered.

The goal of good research (and one that differentiates researchers from
criminals) is to present a proof of concept and to _advance the state of
security_. The fact that security is a perpetual arms race is incidental.

Well, at least that's what security researchers tell themselves anyway to
avoid going mad. :)

~~~
JackC
What makes it funny is that the bulk of the article isn't "here's how we did
it and what we learned," but "here's what you need to do to get our code
running on Ubuntu." Then you get to the footnote: "PS: It's pointless to get
our code running on Ubuntu."

I spent the whole article wondering why they were so interested not only in
presenting a proof of concept, but in getting as many people as possible
actively breaking captchas. Then I got to the end and switched to wondering
whether the whole thing is an elaborate prank.

~~~
barik
Yes, I thought this particular link was perhaps not the best way to present
the work, mainly because the interesting part is actually this: "We
accomplished this with a combination of Machine Learning, hashing methods,
keyspace reduction tactics, and taking advantage of an overall limited number
of captchas. Specifically, Stiltwalker goes head to head against reCAPTCHA'S
audio captcha system and defeats all but a sliver of it's challenges."

On the other hand, it looks like they provide a corpus
(<http://www.dc949.org/projects/stiltwalker/stiltwalker-corpus.tar>)
[1.5 GB!] that you can still use to run the program.

------
omonra
This may be very interesting to crack, but who is responsible for Google
making their CAPTCHA almost impossible for humans to decipher now? I seriously
have to click 5 times before even seeing anything resembling letters I can
parse.

~~~
verroq
You only have to type one word correctly, and they make sure one word is
readable, so with some practice you can easily recognise which word is the
key word.

~~~
ohgodthecat3
He isn't speaking of reCAPTCHA but google's nearly impossible to read captchas
that take way too much time to decipher.

If you'd like to see it you can usually get it by putting in a bad password to
a gmail account too many times (though I don't know if that has other
consequences).

Edit: Here is an image example
<http://www.techian.com/wp-content/uploads/captcha.png>

~~~
sp332
I wonder what they do for non-Latin locales?

------
throwaway1979
Google's captcha system is horrid. I've mentioned this to people on the
accessibility team but to no avail. They used to have a wheelchair icon next
to the bloody scrambled text. I taught a computer class to seniors and it was
painful watching them deal with the account sign-up process (also, I thought
it was insulting asking a mobile senior to click on the wheelchair icon ...
to the designer ... FU!). Clicking on the wheelchair would give audio that
barely made any sense to me. The whole process was stupid.

Like many others, I can barely get through their captcha service. I'm actually
happy people circumvented it. Maybe someone will think it through this time
around.

~~~
tmh88j
I also thought the wheelchair icon was insulting. A person in a
wheelchair has problems with walking, not (necessarily) with their vision. How
about "Help" in text? Less confusion and possible anger.

~~~
excuse-me
Since the point of the audio version is not to be hit with lawsuits under the
ADA - perhaps it should just be a little icon of a lawyer?

~~~
PassTheAmmo
I imagine that whole sites could be designed using only your proposed lawyer
icon, possibly with some additional icon representing political correctness.

------
s_henry_paulson
Here's the Ars Technica article, which does a much better job of explaining
the system:

<http://arstechnica.com/security/2012/05/google-recaptcha-brought-to-its-knees/>

~~~
ahmadss
agreed, thanks for posting the ars link.

the youtube video in the OP does a great job of explaining the thinking
behind the attack, and even though it's an hour long, it's worth the watch.

~~~
seanp2k2
Video tutorials are awful if they're the only option. I read very quickly, and
videos are typically paced at the lowest "average" comprehension level, so
they're a painfully inefficient way to relay technical information.

Plain text wins again, at least for one-way technical communication.

~~~
eli
I tend to agree with you, but FYI you sound pretty arrogant when you say it
that way.

~~~
dunmalg
How about when I say I hate videos of what should be written text because
years of exposure to gunfire in the Army has impaired my hearing?

I will never understand why people think that the crappy audio channel of a
youtube video of some guy "umming" and gulping through a poorly prepared
speech, all recorded on a $5 microphone, is a suitable way of passing
information.

------
dutchbrit
I actually tried hacking reCaptcha via audio and the Google speech-to-text API
a few days ago. It didn't work, unfortunately. It really frustrates me at times
when I have to refresh reCaptcha 10 times to actually be able to read the damn
thing!!

~~~
robotmay
What problems did you run into when trying to use the speech API? It's quite
an interesting avenue for bypassing these in the future as speech recognition
gets better. Also highly entertaining when you use Google's own service
against them.

~~~
dutchbrit
Exactly my thought about using their own service! The thing is that the
background noise made the API not recognise the actual words (I think, since
you can't really "debug"), but maybe if you reduce the noise, it might be
possible. I wanted to give that a go, but haven't had a chance yet. I just
don't know of any ways to automatically reduce noise.

~~~
coldpie
I suspect you haven't gotten around to it, but did you study the noise at all?
According to Wikipedia <http://en.wikipedia.org/wiki/Voice_frequency>,
85-255 Hz is the usual range of the fundamental frequency of human speech. You
could try a band-pass filter with that pass band, and probably a pretty high
roll-off (this will cut off sounds like S and X, but I suspect voice
recognition software is often capable of dealing with those, given how often
they fail to be represented well in discrete audio).

Audacity can do this, and there are almost certainly automatable tools out
there. Worth a shot, anyway.
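
For anyone who wants to try the band-pass idea without Audacity, here is a minimal pure-Python sketch. It uses a single second-order (biquad) band-pass section with the well-known "Audio EQ Cookbook" coefficients, which is an assumption on my part and only one of many possible filter designs. The 8 kHz sample rate and the two test tones are made up for the demo, and a single biquad rolls off gently, so a steeper filter would need several sections in cascade.

```python
import math

def bandpass(samples, fs, low, high):
    """Second-order band-pass (unity gain at the centre frequency)."""
    f0 = math.sqrt(low * high)          # geometric centre of the pass band
    q = f0 / (high - low)               # quality factor from the bandwidth
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha              # b1 is zero for this filter type
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

fs = 8000                                # assumed sample rate in Hz
t = [n / fs for n in range(fs)]          # one second of sample times
low_tone = [math.sin(2 * math.pi * 150 * x) for x in t]    # inside 85-255 Hz
high_tone = [math.sin(2 * math.pi * 3000 * x) for x in t]  # well outside it

rms_in_band = rms(bandpass(low_tone, fs, 85, 255))
rms_out_band = rms(bandpass(high_tone, fs, 85, 255))
```

Comparing `rms_in_band` with `rms_out_band` shows how much of each tone survives the filter. Whether the result is still intelligible to a speech API is a separate question that this sketch doesn't answer.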

~~~
dutchbrit
I found Audacity when I was playing around, but I didn't get round to actually
trying it. Didn't know about the frequency so thanks!!!

~~~
goostavos
The Coursera Machine Learning course references an algorithm that can separate
speech from noise with a fairly high degree of accuracy. May be worth
looking into. Noise removal tools like Audacity tend to introduce "watery" or
"bubbly" effects when pushed hard.

------
danso
In systems that are less secured than Google, the audio captcha seems trivial
to break...I think I've seen one on court sites that read a combination of
numbers from 1 to 9 with some variance in the vocal speed. I'm not an audio
engineer but that seems fairly trivial to crack (though maybe their visual
captcha would be easier...I dunno, not an expert in OCR either).

It's a good lesson in a form of social engineering. Sites have to provide this
alternative access for the visually impaired...yet I bet the
resources/creativity put behind it is not at the same level as the kind put
into the captcha used by 99% of the userbase. Furthermore, the most important
client -- your boss -- is likely to not be blind him/herself, which eliminates
that extra critical layer of oversight.

------
Graphon
> Note: In the hours before our presentation/release, Google pushed a new
> version of reCAPTCHA which fully nerfs our attack.

I take it that "fully nerfs" means this defeat of recaptcha is no longer
useful?

------
roguecoder
Pretty sure I don't have 99% accuracy at solving reCAPTCHAs. Perhaps it's
become a CAPITCHA, Completely Automated Public Inverse Turing test to tell
Computers and Humans Apart...

------
joelthelion
What are we going to do once all CAPTCHAs are completely broken?

~~~
neotek
CAPTCHAs are ultimately already completely broken - they don't really slow
anyone down except ordinary users, because spammers are using human labour to
solve CAPTCHAs at the rate of one or two dollars per thousand solved.

------
verroq
Their testing and high success rates on the public Google reCaptcha test
probably tipped off a Google employee internally.

------
kedyr2
captchas are very annoying. it is surprising that they have lasted this long.
what would be the best alternatives?

~~~
TillE
Obscurity. If you're not a significant target, you can get by with simpler
methods.

Deciphering obfuscated text is one of very few verifiable tasks that humans
can do which computers cannot.

Alternatively, you can look at other methods of identity verification. If you
ask for $5 to create an account, you'll have few spam problems.

------
monsterix
Now here is some serious hacker's news! Was getting nagged by 'what NewEgg
can/can't do' posts of late.

------
wendsday
A company that relies on a bot that accesses others' resources to make money
and at the same time relies on reCAPTCHA to frustrate other bots from
accessing its own resources.

It may be easy to do today, but, going forward, how do we determine which bots
are "good" and which ones are "bad"?

Clearly, simply being a "bot" does not imply "bad" intent. If it did then we
should all be blocking search engine bots. Yet this is what reCAPTCHA does: it
blocks not based on intent, but based on the characteristic of being a "bot".

~~~
akincisor
It's easy. One bot reads a text file and only accesses the resources outlined
in the file. The other maliciously tries to eat up resources without
restraint. Good - Bad.

~~~
wendsday
Like I said, it's relatively easy today. But how about going forward?

There will be lots more bots and lots more usages. Things might not be so
simple. For example, some sites might exclude all search engine bots except a
chosen few despite the fact they all honor robots.txt and behave essentially
the same.

(BTW, if by "text file" you mean robots.txt, isn't that an _exclusion_ list?
You seem to be saying it's an inclusion list.)
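
To make the "exclude all but a chosen few" case concrete, here is a small sketch using Python's standard `urllib.robotparser`. The bot names `FavouredBot` and `OtherBot` and the `example.com` URL are hypothetical. The rules are written the usual exclusion-list way: an empty `Disallow:` excludes nothing for the favoured bot, while `Disallow: /` excludes everything for everyone else.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that shuts out every crawler except one.
rules = """\
User-agent: FavouredBot
Disallow:

User-agent: *
Disallow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

url = "http://example.com/page"
favoured_allowed = parser.can_fetch("FavouredBot", url)
other_allowed = parser.can_fetch("OtherBot", url)
```

Both bots here honour robots.txt in exactly the same way, yet `favoured_allowed` and `other_allowed` differ. The file only records the site owner's choice and says nothing about which bot is better behaved.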

