
Using WolframAlpha to Hack Text CAPTCHA - joelvh
http://joelvanhorn.com/2010/11/10/using-wolframalpha-to-hack-text-captcha/
======
jluxenberg
Was curious about the "text captcha" service. It's a collection of questions
with MD5 sums of acceptable answers.

They provide an API, but I think this is a case of a project being a "service"
to keep the database of questions from being free. There's no technical reason
for this to be a service, and it's not a terribly complicated product that
would be difficult to scale. It's a static database!

Might be neat to create an open-source bank of these CAPTCHA questions. Maybe
I'll throw something together this weekend.

~~~
elliottcarlson
Wouldn't an open-source bank of CAPTCHA questions open the door for an
open-source bank of answers to these questions?

~~~
jluxenberg
No, that's the reason the answers are hashed. You can't get the answer from
the hash, since a hash is a one-way function. This is the same reason you
never store passwords in your database in plaintext, but rather hash them
first.
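
To make the hashed-answer scheme concrete, here is a minimal Python sketch of how a site might check a reply against a list of MD5 sums. The normalisation step (trim and lowercase), the function names and the sample question's hash list are assumptions for illustration; nothing in this thread confirms how the service actually normalises answers.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def is_correct(reply: str, accepted_hashes: set) -> bool:
    """Check a user's reply against the hashed acceptable answers.

    Assumption: answers are trimmed and lowercased before hashing.
    """
    normalised = reply.strip().lower()
    return md5_hex(normalised) in accepted_hashes


# Hypothetical entry for "What is seven hundred and forty four as a number?"
accepted = {md5_hex("744")}

print(is_correct(" 744 ", accepted))  # True
print(is_correct("745", accepted))    # False
```

The site only ever stores and compares digests, so the plaintext answer never has to ship alongside the question.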

~~~
elliottcarlson
That's not the issue - as soon as you make a list of questions available for
the world, all it takes is one spammer to create a matching list of answers
and they can go to town. By providing that list of questions as open source you
are making it easier for someone to create the counterpart answer database.

~~~
nodata
<http://xkcd.com/810/>

~~~
eru
<http://xkcdexplained.com/post/1395792545/constructive>

~~~
archgoon
Captchas are not restricted to reddit and social news sites, despite what your
link claims.

~~~
eru
Oh, that was just to make fun of xkcd. I don't think xkcd links are really
that good an addition to a discussion here.

------
notyourwork
This is a very interesting application of WolframAlpha, but it appears to be
purely luck when "success" was the result. Using things such as "2nd item in
a..." or "7th digit in..." works in a lot of cases, but let's talk about a few.

"2nd fruit in bear apple goat orange" would result in apple because it is
looking for second in a list and neglects context of fruit.

"7th digit in abc123def456ghi789 " would result in d when it should be 7.
Again, it is not understanding context, merely looking at the logical
construction.
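
The difference between picking by position and picking by position within a category can be sketched in a few lines of Python. The `FRUITS` set and the helper name `nth_matching` are hypothetical, chosen only to illustrate the two examples above; this is not how WolframAlpha works internally.

```python
def nth_matching(items, n, predicate):
    """Return the n-th item (1-based) that satisfies predicate, or None."""
    matches = [item for item in items if predicate(item)]
    return matches[n - 1] if 0 < n <= len(matches) else None


# Hypothetical category knowledge; a real solver would need a much larger list.
FRUITS = {"apple", "orange", "banana"}

words = "bear apple goat orange".split()
print(words[2 - 1])                                   # apple (position only)
print(nth_matching(words, 2, lambda w: w in FRUITS))  # orange (2nd fruit)

chars = "abc123def456ghi789"
print(chars[7 - 1])                         # d (7th character)
print(nth_matching(chars, 7, str.isdigit))  # 7 (7th digit)
```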

~~~
pygy_
_> "2nd fruit in bear apple goat orange"[1], "7th digit in
abc123def456ghi789"[2]_

It barfs on these ones, but not like you predict (it's actually worse). _"The
2nd colour in purple, belly, yellow, arm, white and blue"[3]_ gives back
yellow, though, so it's not that stupid.

[1]
[http://www.wolframalpha.com/input/?i=The+2nd+fruit+in+bear+a...](http://www.wolframalpha.com/input/?i=The+2nd+fruit+in+bear+apple+goat+orange)

[2]
[http://www.wolframalpha.com/input/?i=7th+digit+in+abc123def4...](http://www.wolframalpha.com/input/?i=7th+digit+in+abc123def456ghi789)

[3]
[http://www.wolframalpha.com/input/?i=The+2nd+colour+in+purpl...](http://www.wolframalpha.com/input/?i=The+2nd+colour+in+purple,+belly,+yellow,+arm,+white+and+blue+)

~~~
mquander
Oh, not that stupid, huh?

Query: _The 2nd colour in purple, belly, yellow, arm, white and blue_

Answer: _yellow_

Query: _The 3rd colour in purple, belly, yellow, arm, white and blue_

Answer: _yellow_

Query: _The 7th colour in purple, belly, yellow, arm, white and blue_

Answer: _yellow_

Query: _The bluest colour in purple, belly, yellow, arm, white and blue_

Answer: _yellow_

:-)

~~~
VeXocide
I wonder what its favorite colour is ;-)

~~~
MarcusAurelius
Try it:
[http://www.wolframalpha.com/input/?i=what+is+your+favorite+c...](http://www.wolframalpha.com/input/?i=what+is+your+favorite+color%3F)

~~~
moeffju
That's according to Python. Ask about the favorite color of Wolfram Alpha:

[http://www.wolframalpha.com/input/?i=what+is+your+favorite+c...](http://www.wolframalpha.com/input/?i=what+is+your+favorite+color%3F&a=*DPClash.MiscellaneousE.what+is+your+favorite+color-_*WhatIsYourFavoriteColorWolframAlpha-)

------
robtuley
Great discussion.

To clarify, there is not a precomputed DB of an enormous number of questions
although this would be possible to derive and does occur on a lower level for
caching performance purposes. The total count comes from
permutation/probability maths based on the question construction algorithm --
when you request a question it generates one which means I can extend the pool
quite easily without re-generating a monster cache table.

It is impressive how good Wolfram is at decoding logic. I'll have to think
about the question construction, but I can't make the questions too
confusing for a real person to solve. As someone mentioned, maybe more
abstract questions would be stronger, but the difficulty of course lies in
generating them. I certainly think logic questions are weaker than a decently
obscured/randomised image captcha, but they come with other advantages and
work in text-only contexts (e.g. IM-type challenges).

------
gojomo
Some ALMOSTs could be turned into SUCCESSes with a few postprocessing
rules-of-thumb, like:

\- the CAPTCHA usually wants a single word or number

\- the desired word is usually the rarer or later one
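
A rough Python sketch of those two rules of thumb follows. The `COMMON_WORDS` set is a hypothetical stand-in for a real word-frequency list, and treating "rarer" as "not in that set" is an assumption made to keep the example short.

```python
import re

# Hypothetical stand-in for a real word-frequency list.
COMMON_WORDS = {"the", "a", "an", "of", "in", "is", "and", "or", "colour", "color"}


def reduce_to_single_answer(raw: str):
    """Reduce a multi-word result to one word or number.

    Rule 1: the CAPTCHA usually wants a single token.
    Rule 2: prefer the rarer token, and among those the later one.
    """
    tokens = re.findall(r"[A-Za-z]+|\d+", raw.lower())
    if not tokens:
        return None
    rare = [t for t in tokens if t not in COMMON_WORDS]
    candidates = rare or tokens  # fall back to all tokens if none are rare
    return candidates[-1]


print(reduce_to_single_answer("the colour yellow"))  # yellow
print(reduce_to_single_answer("744"))                # 744
```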

~~~
elliottcarlson
It's still a game of guesswork. Generally if you fail the CAPTCHA you will be
offered a new one, and any good system should lock you out after a certain
number of failures.

~~~
jerf
Assume spammers are using botnets. The problem of locking them out is as hard
as detecting them in the first place.

------
codefisher
At the end of the day this is why I don't use CAPTCHAs on my site: they can
be broken by a bot that is smart enough. The better option is to use something
to analyse the contents of the spam to decide what it really is, and there are
some tools that are really good at doing this. Heck, I even found for one form
that banning 'http://' (and notifying users with JavaScript if they typed it)
stopped 100% of the spam I was getting.

------
phwd
I am not sure I understand what he is saying. All the "results" that he is
talking about are the interpreted inputs for the application to process.

So for "What is seven hundred and forty four as a number?", the interpreted
input is a NumberQ function taking the main part "seven hundred and forty four
as a number" and evaluating whether it is a number or not. The real result is
true.

The zoologist one has already been talked about. The rest other than the 7th
digit question are all false.

There are many different choices for the inputs, for example with the colour
question:

The 2nd colour in purple, yellow, arm, white and blue is?

There seems to be some popularity going on. The first choice as input is
yellow and the second choice is blue. As a further test, replacing yellow with
black leads to blue as the first choice. Then again, even if you were to use
the interpreted inputs, you would have to determine the syntax for Wolfram,
which last time I checked is not available and is basically a
guess-the-syntax game.

<http://webapps.stackexchange.com/q/1322/40>

If someone would care to enlighten me on how this could actually work I would
greatly appreciate it, otherwise this method does not seem like it will work.
Nice creativity though.

~~~
gridspy
Even a 1/10 hit rate is sufficient. You get 10 questions from the website,
return the results and one of them leads to your spam comment being posted.
You then repeat that process thousands of times.
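
The arithmetic behind this can be sketched as follows, assuming each attempt is independent and succeeds with the same probability (an idealisation; real questions vary in difficulty). The function names are illustrative only.

```python
def p_at_least_one_success(hit_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - hit_rate) ** attempts


def expected_attempts(hit_rate: float) -> float:
    """Average number of tries needed to get one success."""
    return 1.0 / hit_rate


print(round(p_at_least_one_success(0.1, 10), 3))  # 0.651
print(expected_attempts(0.1))                     # 10.0
```

So with a 1/10 hit rate, a batch of 10 questions yields one success on average, and about 65% of such batches contain at least one.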

------
elliottcarlson
After testing alternate versions of the three successes (to see if those were
based on luck), only two remain successful. Changing "The 2nd colour in
purple, yellow, arm, white and blue is?" to "The 2nd colour in purple, arm,
yellow, house, white and blue is?" causes the question to fail.

~~~
joelvh
Good point. That gives a good indicator as to how the algorithm works. Not
necessarily based on colors, but rather words in a list...? Maybe the
construction of Text CAPTCHA sentences needs to be chosen carefully when
thinking like an algorithm....

------
zith
If someone would like to try with some more questions and can't be bothered to
write a screen scraper for the demo page, I scraped a few questions a while
back:

<http://www.filedump.net/dumped/quesions1285428813.txt>

~~~
rndmcnlly0
Thanks for doing some extraction!

Using your file I've been building up a solver for these questions in Prolog
(using DCGs for parsing and simple predicates for the common sense facts and
answer calculation). It's nowhere near "done", but it does get 45% of the
questions now after only a few hours and 450 lines of code.

The trick to factoring someone else's generative space is to spot the
symmetries and build yourself a little domain-specific language for
explaining those symmetries to your program.

Here are some snippets:

    
    
        tomorrow(wednesday,thursday).
    
        food(butter).
    
        body_part(arm).
        plenty(arm).
        above_waist(arm).
    
        ordinal(2) --> lit('2nd').
    
        question( tomorrow -> Answer) -->
            p(['Tomorrow is ',token(Tomorrow),'. If this is true, what is today?']),
            { tomorrow(Answer,Tomorrow) }.
    
        question(count_2(P) -> Answer) -->
            p(['The list ',token_list(L),' contains how many ',pred(P),'?']),
            {include(P,L,Goods), length(Goods,Answer)}.
    
        pred_name('something each person has more than one of',plenty).
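
For readers who do not use Prolog, the first question rule above can be approximated in Python. This is a rough analogue and not rndmcnlly0's code: the regular expression, the `DAYS` list and the function name `solve_tomorrow` are assumptions made for illustration.

```python
import re

DAYS = ["monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"]

# Mirrors the question template used in the Prolog rule.
TOMORROW_PATTERN = re.compile(
    r"Tomorrow is (\w+)\. If this is true, what is today\?", re.IGNORECASE
)


def solve_tomorrow(question: str):
    """Answer 'Tomorrow is X... what is today?' or return None if no match."""
    match = TOMORROW_PATTERN.search(question)
    if match is None:
        return None
    day = match.group(1).lower()
    if day not in DAYS:
        return None
    # Today is the day before the stated "tomorrow", wrapping around the week.
    return DAYS[(DAYS.index(day) - 1) % 7]


print(solve_tomorrow("Tomorrow is Thursday. If this is true, what is today?"))
# wednesday
```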

------
liuliu
The idea: randomly pick one word in the input and make it the answer. It
seems from the demo page that most of the answers are already in the question.
Suppose 50% of questions have their answer in them, and suppose the average
length of a question is about 10 words: you then have a 5% chance of getting
it right. With a computer, that is quite an easy one.
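
That back-of-the-envelope estimate, and the guessing strategy itself, can be sketched in Python. The 50% share and the 10-word average are the figures supposed above, not measurements; the function names are hypothetical.

```python
import random


def estimated_success_rate(share_with_answer_inside: float, avg_words: int) -> float:
    """Chance that a uniformly random word from the question is the answer."""
    return share_with_answer_inside * (1.0 / avg_words)


def guess_random_word(question: str, rng: random.Random) -> str:
    """Pick one word of the question at random as the guessed answer."""
    words = question.rstrip("?.").split()
    return rng.choice(words)


print(estimated_success_rate(0.5, 10))  # 0.05

rng = random.Random(0)  # seeded only so that runs are repeatable
print(guess_random_word("The 2nd colour in purple, yellow, arm, white and blue is?", rng))
```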

------
aneth
I'm mostly impressed that someone found a use for WolframAlpha.

~~~
drewse
There are many applications...

[http://www.usereffect.com/topic/5-useful-uses-for-wolfram-al...](http://www.usereffect.com/topic/5-useful-uses-for-wolfram-alpha)

[http://mashable.com/2009/05/19/wolfram-alpha-better-than-goo...](http://mashable.com/2009/05/19/wolfram-alpha-better-than-google/)

<http://blog.wolframalpha.com/>

