

The Algorithm That Can Crack CAPTCHA (and is changing AI) - urs2102
http://www.fastcolabs.com/3021474/open-company/the-algorithm-that-thinks-like-a-human?utm_source=facebook

======
fchollet
Being familiar with Dileep's past research, I'd say the algorithm used by
Vicarious here is very likely based on the Hierarchical Temporal Memory (HTM)
technology developed by Hawkins, Dileep and the folks at Numenta [1].

HTM is documented in a Numenta white paper [2] and in an older form in
Dileep's PhD thesis [3]. Numenta has also open-sourced an implementation of
HTM, NuPIC [4].

It is important to note that, while HTM is a very interesting approach to
machine learning, it has never been shown to beat (or even rival) the
state of the art on any problem.

I have been following Numenta/Grok's progress closely because they seem to me
to have the most potential for pushing this tech forward. Also I like their
open-source strategy compared to the 100% secret approach of Vicarious.
Vicarious has impressively bombastic PR (which you can see in action in the
above article) but basically nothing to show for it (until now, that is, if
this breaking of CAPTCHA turns out to be real and to be more reliable than
existing state-of-the-art CAPTCHA-breaking algorithms).

Long before they had any working technology, Vicarious founders had been
throwing around quotes such as: "[our goal is] in five years to make a human-
level vision system. Anything a human can recognize visually our algorithm
would recognize visually." (2011) [5]

And today, after creating a CAPTCHA-breaking program (which has already been
done in the past; I'd like to see a benchmark comparison), they're basically
claiming they've created a human brain.

They also claim they will have a strong AI by 2026 [6]. Bombastic much?

[1]
[https://groksolutions.com/landing-page.html](https://groksolutions.com/landing-page.html)

[2]
[http://numenta.org/resources/HTM_CorticalLearningAlgorithms....](http://numenta.org/resources/HTM_CorticalLearningAlgorithms.pdf)

[3]
[http://alpha.tmit.bme.hu/speech/docs/education/02_DileepThes...](http://alpha.tmit.bme.hu/speech/docs/education/02_DileepThesis.pdf)

[4] [http://numenta.org/nupic.html](http://numenta.org/nupic.html)

[5]
[http://blogs.wsj.com/venturecapital/2011/02/10/vicarious-sys...](http://blogs.wsj.com/venturecapital/2011/02/10/vicarious-systems-says-its-artificial-intelligence-is-the-real-deal/)

[6]
[http://www.artificialbrains.com/vicarious](http://www.artificialbrains.com/vicarious)

------
ColinWright
I point you here:

[https://news.ycombinator.com/item?id=6633515](https://news.ycombinator.com/item?id=6633515)

Quoting:

The popularity of this story evidences how universally hated are CAPTCHAs.
Here are some of the submissions:

[https://news.ycombinator.com/item?id=6625245](https://news.ycombinator.com/item?id=6625245)
(forbes.com)

[https://news.ycombinator.com/item?id=6625247](https://news.ycombinator.com/item?id=6625247)
(kurzweilai.net)

[https://news.ycombinator.com/item?id=6625351](https://news.ycombinator.com/item?id=6625351)
(technologyreview.com) <- Main discussion

[https://news.ycombinator.com/item?id=6626405](https://news.ycombinator.com/item?id=6626405)
(vimeo.com) <- video of process in action

[https://news.ycombinator.com/item?id=6627848](https://news.ycombinator.com/item?id=6627848)
(vicariousinc.tumblr.com)

[https://news.ycombinator.com/item?id=6628086](https://news.ycombinator.com/item?id=6628086)
(cbc.ca)

[https://news.ycombinator.com/item?id=6628092](https://news.ycombinator.com/item?id=6628092)
(wired.com)

[https://news.ycombinator.com/item?id=6629173](https://news.ycombinator.com/item?id=6629173)
(wired.com)

[https://news.ycombinator.com/item?id=6629559](https://news.ycombinator.com/item?id=6629559)
(newscientist.com)

[https://news.ycombinator.com/item?id=6629656](https://news.ycombinator.com/item?id=6629656)
(dailydot.com)

[https://news.ycombinator.com/item?id=6629708](https://news.ycombinator.com/item?id=6629708)
(mashable.com)

========

There are two other comments worth reading.

[https://news.ycombinator.com/item?id=6629173](https://news.ycombinator.com/item?id=6629173):

    
    
        That's a lot of fancy words to say that they overfit
        to their training data.
    
        That makes this sound like a very typical result in
        supervised machine learning (if it's a result at all).
        They have used an algorithm to learn a brittle heuristic
        that works in the cases it was trained to work on.

------
andrewcooke
anyone have any actual information on the algorithm? this is just fluff.

~~~
BrandonMarc
Agreed. The article gives a good primer on how CAPTCHAs work, how basic
_versions_ of them used to be easily solvable, and how / why a general-purpose
solution (which this is described to be) would be amazing ... and a bit of
background on how such a solution would be similar to human "processing"
anyway.

What the article doesn't describe is any detail of how it works. Then again
... the creators of such a revolutionary technology, if true, would want to
keep it protected in various ways, not to mention out of adversaries' hands
(adversaries = spammers, criminals, other companies, other researchers).

------
vezzy-fnord
Is this really some master algorithm that can crack all kinds of CAPTCHAs?

CAPTCHA is pretty much an umbrella term for a variety of anti-spam protection
and human verification procedures, many of them having nothing in common
beyond the name.

Thus "cracking CAPTCHA" is meaningless unless you specify what type(s).

You'd need textual analysis, sentiment analysis, expression evaluation, OCR,
full-scale image recognition, puzzle solving and a variety of other techniques
to even scratch the surface of most implementations. So it can't be one
algorithm. It'd have to be many.

Then in most cases it'd probably be simpler to figure out some client-side
exploit in the way CAPTCHAs handle and send data. Find a weak link, much like
DC949 did with reCAPTCHA.
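
To make the "it'd have to be many" point concrete, here is a rough sketch of what a solver would probably end up looking like: a dispatch table with one technique per CAPTCHA type. Everything in it is made up for illustration (the `SOLVERS` registry, the type names, the function names), and only the arithmetic case is actually implemented; the text and image solvers are placeholders, since those are the hard parts.

```python
import ast
import operator
from typing import Callable, Dict

# Hypothetical example: each CAPTCHA "type" needs its own technique.
# Supported operators for the expression-evaluation case only.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def _eval_node(node: ast.AST) -> int:
    """Evaluate a tiny arithmetic AST: integer literals with +, -, * only."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    raise ValueError("unsupported expression")


def solve_arithmetic(challenge: str) -> str:
    """Expression-evaluation CAPTCHA, e.g. '3 + 4 * 2'."""
    tree = ast.parse(challenge, mode="eval")
    return str(_eval_node(tree.body))


def solve_text_question(challenge: str) -> str:
    """Placeholder: a real solver would need textual analysis."""
    raise NotImplementedError("needs textual analysis")


def solve_image(challenge: str) -> str:
    """Placeholder: a real solver would need OCR / image recognition."""
    raise NotImplementedError("needs OCR / image recognition")


# Hypothetical registry mapping a CAPTCHA type to the technique that handles it.
SOLVERS: Dict[str, Callable[[str], str]] = {
    "arithmetic": solve_arithmetic,
    "text": solve_text_question,
    "image": solve_image,
}


def solve(kind: str, challenge: str) -> str:
    """Route a challenge to the solver registered for its type."""
    try:
        solver = SOLVERS[kind]
    except KeyError:
        raise ValueError(f"no solver for CAPTCHA type {kind!r}") from None
    return solver(challenge)
```

Calling `solve("arithmetic", "3 + 4 * 2")` goes through the one implemented branch; any other type either hits a placeholder or is rejected as unknown, which is roughly the situation a "general" CAPTCHA breaker is in.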

