

Breaking the Silk Road's Captcha - DavidChouinard
https://github.com/mieko/sr-captcha/blob/gh-pages/index.md

======
lstyls
This is a great writeup and super easy to follow along with. The figures are
really nice!

One observation: training a neural net to classify segmented characters is
probably overkill. The author observed that the font never changed, but never
ended up exploiting this fact. After the very effective preprocessing,
thresholding, etc., the characters are almost identical to the 'average'
representations the author generated!

I bet it would be enough simply to classify an unknown character by the letter
that it shows highest correlation with.
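
A minimal sketch of that idea, assuming each segmented character has already
been reduced to a fixed-size binary image and flattened into a list. The tiny
3x3 templates and the names `correlation`, `classify`, and `TEMPLATES` are
made up for illustration; they are not taken from the write-up.

```python
from math import sqrt

def correlation(a, b):
    """Pearson correlation between two equal-length pixel sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a blank (constant) image correlates with nothing
    return cov / sqrt(var_a * var_b)

def classify(glyph, templates):
    """Return the label whose template correlates most strongly with glyph."""
    return max(templates, key=lambda label: correlation(glyph, templates[label]))

# Hypothetical 3x3 "average" templates, flattened row by row (1 = ink).
TEMPLATES = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
    "T": [1, 1, 1,
          0, 1, 0,
          0, 1, 0],
}

# An "L" with one extra noisy pixel.
noisy_l = [1, 0, 0,
           1, 0, 1,
           1, 1, 1]
print(classify(noisy_l, TEMPLATES))
```

With real captcha glyphs the templates would be the per-letter average images,
and the same argmax-over-correlation rule would replace the neural net.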

~~~
malgorithms
Agreed!

Also, given the successful character extraction, and the knowledge that (1)
the font doesn't change, and (2) they're just translated and rotated, I think
performing those operations on the individual characters could've yielded a
near-perfect success rate. Simply try a whole bunch of shifts and rotations on a
given character until it matches a reference, almost exactly.
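
Roughly what that brute-force search could look like, as a toy sketch. It
assumes glyphs are sets of ink-pixel coordinates, rotates about the glyph's
centroid with nearest-pixel rounding, and scores overlap with the Jaccard
index. All names (`to_points`, `transform`, `best_alignment`), the angle range,
and the shift range are illustrative assumptions, not values from the write-up.

```python
from math import cos, sin, radians

def to_points(rows):
    """Convert rows of '#'/'.' text into a set of (x, y) ink coordinates."""
    return {(x, y) for y, row in enumerate(rows)
            for x, ch in enumerate(row) if ch == "#"}

def transform(points, angle_deg, dx, dy):
    """Rotate points about their centroid, round to pixels, then shift."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    c, s = cos(radians(angle_deg)), sin(radians(angle_deg))
    out = set()
    for x, y in points:
        rx = cx + (x - cx) * c - (y - cy) * s
        ry = cy + (x - cx) * s + (y - cy) * c
        out.add((round(rx) + dx, round(ry) + dy))
    return out

def jaccard(a, b):
    """Overlap score in [0, 1]; 1.0 means the two pixel sets are identical."""
    return len(a & b) / len(a | b)

def best_alignment(glyph, reference, angles=range(-20, 21, 5), max_shift=2):
    """Try every angle/shift combination; return (score, angle, dx, dy)."""
    best = (-1.0, 0, 0, 0)
    for angle in angles:
        for dx in range(-max_shift, max_shift + 1):
            for dy in range(-max_shift, max_shift + 1):
                score = jaccard(transform(glyph, angle, dx, dy), reference)
                if score > best[0]:
                    best = (score, angle, dx, dy)
    return best

# Hypothetical reference glyphs.
REF_L = to_points(["#....", "#....", "#....", "#....", "####."])
REF_T = to_points(["#####", "..#..", "..#..", "..#..", "..#.."])

# An unknown glyph: the "L" moved one pixel right and one pixel down.
GLYPH = to_points(["......", ".#....", ".#....", ".#....", ".#....", ".####."])

for label, ref in (("L", REF_L), ("T", REF_T)):
    print(label, best_alignment(GLYPH, ref)[0])
```

The glyph reaches a perfect score of 1.0 against the "L" reference and a lower
score against "T", so picking the reference with the best aligned score
classifies it. On real images the rotation would be done with an image library
rather than by rounding individual pixels.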

~~~
mieko
This is all true. The first thing I built was an awesome hammer, and I had a
large amount of nails to deal with.

In retrospect, I'm sure there were a good many more focused, surgical
solutions that would've saved some overhead.

------
patio11
FYI: Captchas are generally considered "broken" at between 1% and 10% rates of
success with automated approaches, because attackers can run hundreds of
thousands of requests, generally "for free" at the margin. There is no
practical difference in the amount of abuse suffered by a site with a 90%
captcha and a 9% captcha -- the first one just requires 10X as many HTTP
requests to abuse.

This is one of the unfortunate "math favors the bad guy" consequences in a lot
of anti-abuse filtering tasks. (Anti-spam research has similar problems, which
is why the main innovation wasn't making filters better but radically
increasing the cost of getting caught, via burning the reputation of the
offending IP. IP addresses are a lot more expensive to acquire in quantity
than packets.)
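
The request arithmetic above can be sketched in a few lines. This assumes
independent attempts, and it reads an "N% captcha" as one that stops N% of
automated attempts; the comment does not spell that reading out, so treat it
as an assumption. The helper name `expected_requests` is made up.

```python
def expected_requests(success_rate):
    """Mean number of attempts until the first success (geometric distribution)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1 / success_rate

# Assumption: an "N% captcha" stops N% of automated attempts.
strong = expected_requests(1 - 0.90)  # solver gets through 10% of the time
weak = expected_requests(1 - 0.09)    # solver gets through 91% of the time

print(round(strong, 1), round(weak, 1), round(strong / weak, 1))

# Even a solver that succeeds only 1% of the time needs ~100 requests per success.
print(round(expected_requests(0.01)))
```

Under that reading the stronger captcha costs the attacker about 10 requests
per success against about 1.1 for the weaker one, which is roughly the 10X
factor mentioned above and is cheap when requests are nearly free.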

------
mieko
Author here. Here's the corresponding proggit thread:
[http://www.reddit.com/r/programming/comments/2hisfk/breaking...](http://www.reddit.com/r/programming/comments/2hisfk/breaking_the_silk_roads_captcha/)

------
oftenwrong
That was a surprisingly simple and easy-to-follow write-up. I will have to try
some captcha-breaking for myself soon.

------
goldmouth
Very cool and well-written post.

I've created many similar programs to defeat captchas. I would classify this
as a medium-severity bug: you would still need to brute-force the passwords on
a terribly slow and intermittent connection.

------
praeivis
Silk Road used ReCaptcha long ago and it ended badly:
[http://krebsonsecurity.com/2014/09/dread-pirate-sunk-by-leak...](http://krebsonsecurity.com/2014/09/dread-pirate-sunk-by-leaky-captcha/)

~~~
bencoder
No, it didn't. I believe that image is just an example of "a captcha" for the
article.

------
magerleagues
I feel like a much smarter programmer after reading that.

------
usrname
And more + reddit captcha

[https://github.com/dawjan/Open_Me/tree/master/Captcha%20Crac...](https://github.com/dawjan/Open_Me/tree/master/Captcha%20Crack)

Also /.. is php tor

------
blueintegral
Could that last step be considered a kind of Levenshtein distance measurement?

------
krispyfi
The lesson? Include a developer API with your site, so people don't have to
undermine your security to use it.

------
_RPM
I believe that the Silk Road was built on the CodeIgniter framework for PHP.

------
ultramancool
Why wouldn't you just pay a captcha-breaking service to get a near-100%
success rate? Less noticeable for botting, and $10 will buy you around 10k
captchas on antigate or deathbycaptcha. Don't really need to log in and out
that much, so that'd probably be plenty.

~~~
gwern
What a question to ask on HN of all places.

~~~
ultramancool
Am I to assume people on HN have no desire to do something productive or useful
with their time?

