
This is great. I particularly like that they also automatically generated dirty versions for their training set, because that's exactly what I ended up doing for my dissertation project (a computer vision system [1] that automatically referees Scrabble boards). I also used dictionary analysis and the classifier's own confusion matrix to boost its accuracy.

If you're also interested in real-time OCR like this, I did a write-up [2] of the approach that worked well for my project. It only needed to recognize Scrabble fonts, but it could be extended to more fonts by using more training examples.

[1] http://brm.io/kwyjibo/

[2] http://brm.io/real-time-ocr/
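
For readers wondering how a dictionary and a confusion matrix can work together, here is a minimal sketch of one common way to combine them. It is not taken from the linked project: the function name `correct_word`, the `floor` parameter, and the toy probabilities are all assumptions made for illustration. Each same-length dictionary word is scored by how likely the classifier would have been to misread it as the observed string, and the best-scoring word wins.

```python
import math


def correct_word(observed, dictionary, confusion, floor=1e-6):
    """Pick the dictionary word most likely to have produced `observed`.

    confusion[true_letter][read_letter] is the estimated probability that
    the classifier outputs read_letter when shown true_letter. Pairs that
    are missing from the table get the small probability `floor`.
    """
    best_word, best_score = observed, -math.inf
    for candidate in dictionary:
        if len(candidate) != len(observed):
            continue  # this sketch only handles substitution errors
        # Sum of log-probabilities avoids underflow on long words.
        score = sum(
            math.log(confusion.get(true, {}).get(read, floor))
            for true, read in zip(candidate, observed)
        )
        if score > best_score:
            best_word, best_score = candidate, score
    return best_word


# Hypothetical confusion estimates, e.g. measured on a held-out set.
confusion = {
    "C": {"C": 0.90, "O": 0.10},
    "O": {"O": 0.85, "C": 0.05, "Q": 0.10},
    "Q": {"Q": 0.80, "O": 0.20},
    "A": {"A": 1.0},
    "T": {"T": 1.0},
}
dictionary = ["CAT", "OAT", "COT"]

corrected = correct_word("QAT", dictionary, confusion)
```

With these made-up numbers the misread "QAT" is mapped to "OAT", because the table says an O is sometimes read as a Q while a C never is. If no dictionary word has the right length, the observed string is returned unchanged.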




It seems your dissertation is behind a password-protected page [1]. It would be nice to see that too.

Can't get [1] https://www.dcs.shef.ac.uk/intranet/teaching/campus/projects...


Hmm, looks like they have. Well, here's another link to it: https://dl.dropboxusercontent.com/u/1672291/scrabble-referee...


Glad there's prior art on that. I had a small project where I iterated all the fonts on the system and used them to generate glyph training images. The next step was to dirty them up, but I never continued the project.

More generally, I really like the idea of generating controlled synthetic images and then messing them up for regularization.
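
A minimal sketch of the "dirty them up" step, assuming glyphs are already rendered as 2D lists of grayscale values in [0, 1] (font rendering is left out). The function name `dirty_glyph` and its parameters `max_shift`, `noise_sigma`, and `flip_prob` are hypothetical, and the three degradations (small shift, Gaussian noise, random pixel inversion) are just one plausible choice, not what any project mentioned here actually used.

```python
import random


def dirty_glyph(glyph, rng, max_shift=1, noise_sigma=0.1, flip_prob=0.02):
    """Return a degraded copy of a 2D grayscale glyph with values in [0, 1]."""
    height, width = len(glyph), len(glyph[0])
    # One random translation per generated variant.
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    dirty = []
    for y in range(height):
        row = []
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                value = glyph[src_y][src_x]
            else:
                value = 0.0  # pixels shifted in from outside are background
            value += rng.gauss(0.0, noise_sigma)  # sensor-style noise
            if rng.random() < flip_prob:
                value = 1.0 - value  # salt-and-pepper style inversion
            row.append(min(1.0, max(0.0, value)))  # clamp back to [0, 1]
        dirty.append(row)
    return dirty


# A toy 4x4 "glyph" standing in for a rendered character.
clean = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
]
rng = random.Random(0)  # seeded so the generated set is reproducible
variants = [dirty_glyph(clean, rng) for _ in range(5)]
```

Each clean glyph then yields several noisy training examples with the same label, which is where the regularizing effect comes from.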


Funny, I just read an article today proposing the same feature detection algorithm (the one you called 'grid merge'). Have you tried applying these techniques to scanned/photographed documents?


Could you link to it please?

I've not tried it on anything else, but I remember thinking that it has a lot of potential uses. Also I only used it on gray-scale features, but I'm sure it could make use of full RGB too. I'll have to try it some time!


"We also investigated hierarchical features where the image is overlaid with a grid of cell size c × c and pixels withins each cell are added up. This is same as downsampling the image and using the raw pixels in the downsampled image as features." (p. 3)

http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-15...
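
The grid features described in that quote can be sketched in a few lines. This is an illustrative version only, not code from the report or from the Scrabble project; the name `grid_sum_features` is made up, and it assumes the image dimensions are exact multiples of the cell size c.

```python
def grid_sum_features(image, c):
    """Sum pixel values inside each c x c cell of a 2D grayscale image.

    `image` is a list of equal-length rows. Cells are visited row by row,
    left to right, and one total per cell is returned.
    """
    height, width = len(image), len(image[0])
    if height % c or width % c:
        raise ValueError("image dimensions must be multiples of c")
    features = []
    for top in range(0, height, c):
        for left in range(0, width, c):
            cell_total = sum(
                image[y][x]
                for y in range(top, top + c)
                for x in range(left, left + c)
            )
            features.append(cell_total)
    return features


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
features = grid_sum_features(image, 2)  # [14, 22, 46, 54]
```

Dividing each total by c * c gives the cell mean, which is ordinary box-filter downsampling, so the two feature sets differ only by a constant factor.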


Sounds similar to one level of a pyramid:

https://en.wikipedia.org/wiki/Pyramid_(image_processing)


excellent project. as a scrabble player, i'm very interested - it would be a great way to run a blitz tournament, for instance.



