John Resig dissects the neural network javascript OCR captcha code (ejohn.org)
90 points by mark_h on Jan 24, 2009 | 14 comments

If I'm interpreting this correctly (which I may not be, since there is some ambiguity in the wording), Resig seems to have a bit of a misunderstanding of how neural networks work.

He says that an edge weight in the neural network means "At pixel 9x6 the letter A is 58% likely to be filling in that pixel." That isn't how neural networks work (although it is fairly close to how Bayesian belief nets work). In a neural network, the edge weights of the graph don't obviously correspond to anything: they are simply chosen (most commonly through the backpropagation training algorithm) to minimize the error between the desired output and the actual output on the training data.

The weights in a neural network are simply the coefficients for terms in an equation that, when plotted, produces a curve that's a good fit for a series of data points. Unfortunately, that explanation is not very sexy ;-) That's one of the big problems with neural networks - they're effectively just black boxes that incidentally produce pretty good answers. They are (in most of their standard incarnations) really just a fairly simple technique for regression analysis.
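To make the "coefficients in a curve fit" point concrete, here is a rough sketch in Python (not the JavaScript from the article, and not Resig's code). It trains a tiny network with one tanh hidden layer to fit samples of y = x^2 using plain gradient descent with backpropagation. The function name, the layer size, the learning rate, and the toy data are all my own assumptions for illustration.

```python
import math
import random


def train_tiny_network(points, hidden=3, epochs=2000, lr=0.05, seed=0):
    """Fit a 1-input, 1-output network with one tanh hidden layer.

    Hypothetical helper for illustration only; every hyperparameter here
    is an assumption, not something taken from the article.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias

    def predict(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2, h

    def mean_squared_error():
        return sum((predict(x)[0] - y) ** 2 for x, y in points) / len(points)

    initial_error = mean_squared_error()
    for _ in range(epochs):
        for x, y in points:
            out, h = predict(x)
            err = out - y  # gradient of 0.5 * (out - y)^2 with respect to out
            for j in range(hidden):
                # Backpropagate the error through the tanh unit before
                # touching w2[j], so the gradient uses the old weight.
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return (w1, b1, w2, b2), initial_error, mean_squared_error()


# Toy data: nine samples of y = x^2 on [-1, 1].
points = [(i / 4, (i / 4) ** 2) for i in range(-4, 5)]
weights, error_before, error_after = train_tiny_network(points)
print("error before:", error_before)
print("error after: ", error_after)
print("learned weights:", weights)
```

If you print the learned weights, none of them reads as "the probability that x is such-and-such". They are just whatever numbers happened to bring the error down, which is the black-box problem in miniature.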

Incidentally, I got to see Resig speak about jQuery yesterday, and he's clearly a very clever guy. I say this with all due respect, and fully expect that this is just a misunderstanding due to me misinterpreting the sentence I quoted in my second paragraph.

I think you can be a very clever guy and still have misunderstandings of how neural networks work. They aren't contradictory!

Heh, yeah, I'm almost certain that I'm just wrong in my interpretation - I may know JavaScript but I'm still a neural network newb. I've updated that paragraph to, hopefully, be a little more correct.


Off-topic, but: it would be handy if links in downmodded comments also appeared grey. Right now they really stand out.


Also offtopic, are .biz and .info domains used for anything that isn't a scam? Anyone seen a legitimate website hosted at .biz/.info?

Is it just me who has a mental filter of "Never click on .biz, .info, .name"?

off the top of my head, http://z80.info/ http://mta.info/ http://beesbuzz.biz/blog/

edit: I still think .info should have been .nfo and been restricted to ANSI art.

My personal website is a .name.

I assure you I'm not trying to sell anyone anything; I just couldn't get the .com.

I definitely agree about .biz though. Just looking at it evokes a scammy sensation. There are lots of good .info sites though... regular-expressions.info and magiccards.info come to mind.

My personal website was .info

Luckily I've since been able to get a decent .com (and .co.uk)


codecon.info. erica.biz.

As a very offtopic question, is there a central registry of all the mainstream-ish companies that spam or do other unscrupulous things? Just so I can make a point of avoiding giving them money ...

I have often considered creating such a directory, but there is a problem: what if a competitor tries to hurt a business by spamming in its name? Therefore I think such a directory is not a good idea, or at least there would have to be mechanisms to prevent that issue.

I used to have a domain which had its URL included in a completely unrelated spam. A blacklisting organisation (voluntary, I think) contacted my ISP to tell them I was going on their list unless I could explain what had happened. I spent a very long, depressing email conversation explaining to them exactly what database poisoning was, and why the spammers would include unrelated URLs in a spam email. So many ways to fail at that kind of endeavour...

(The URL was for my open source project. Clearly, they didn't even check the webpage before issuing their threat.)
