

Open Source OCR in JavaScript - aram
http://antimatter15.com/ocrad.js/demo.html

======
jfoster
I am a bit surprised at how low the accuracy seems to be. Does anyone know if
this is typical only of OCR done in JS, or of OCR in general? I am aware that
at least one or two implementations are extremely good (e.g. Google's), but
are those complete outliers?

~~~
SIGALRM
in my humble experience using OCR programs, there is always a considerable
amount of inaccuracy. no matter what font or font size I use, I always end up
either proofreading the scanned document or just typing it by hand. the letter
"O" is almost always translated by the OCR as a "0", or a zero is translated
as an "O". it can be pretty frustrating.
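
The O/0 mix-up described above can sometimes be reduced with a post-processing
pass over the OCR output. What follows is only a minimal sketch, not something
any of the engines mentioned in this thread is known to do: the function name
`fix_o_zero` and the majority-vote rule are assumptions made for illustration.
Each alphanumeric token is examined, its unambiguous digits and letters vote,
and the ambiguous characters are rewritten to match the winner.

```python
import re


def fix_o_zero(text):
    """Heuristically resolve O/0 confusion, one alphanumeric token at a time."""

    def fix_token(match):
        token = match.group(0)
        # Only unambiguous characters vote; "O", "o" and "0" abstain.
        digits = sum(c.isdigit() and c != "0" for c in token)
        letters = [c for c in token if c.isalpha() and c not in "Oo"]
        if digits > len(letters):
            # Mostly digits: any letter O is probably a zero.
            return token.replace("O", "0").replace("o", "0")
        if len(letters) > digits:
            # Mostly letters: a zero is probably an O, matching the token's case.
            replacement = "O" if all(c.isupper() for c in letters) else "o"
            return token.replace("0", replacement)
        return token  # tie or no evidence: leave the token untouched

    return re.sub(r"[A-Za-z0-9]+", fix_token, text)
```

With this rule, `fix_o_zero("2O14")` gives `"2014"` and `fix_o_zero("HELL0 w0rld")`
gives `"HELLO world"`. Tokens that legitimately mix letters and zeros (part
numbers, for example) would be damaged by it, so proofreading is still needed.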

~~~
Zergy
I used the ABBYY OCR engine to digitize printed documents (idk why they
couldn't just keep around the file used to print) and it was quite accurate.
At worst, one out of a couple hundred would have enough errors that
readability was a problem.

~~~
gfosco
Similar experience here, when building a mobile app that did OCR +
translation. As long as the source image was in decent shape, ABBYY did very
well. It's also incredibly expensive.

------
wdmeldon
I liked that the demo shows the program failing. Nice to see the capabilities
AND the limitations displayed front and center. Definitely impressive.

------
cheshire137
I kept giggling at its poor recognition. It's comically bad, but I think it's
a step in the right direction. It was very fast at incorrectly identifying
letters. If only it were very fast and mostly correct.

~~~
azakai
It is good at recognizing machine-generated text - hit the blue arrow - and
not that good at human-scribbled text with a mouse, I find.

I assume you were testing hand-written text?

------
systematical
Handwriting my name, Chris, was difficult for it to pick up. It kept thinking
my "C" was an "L" and putting spaces in between letters. It also determined my
"S" was an underscore. Still pretty cool. Thanks!

~~~
alistairjcbrown
Looks like the underscore character ("_") is used when the letter can't be
determined - so in fact it had no idea what your "S" was ^_^
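
If the underscore really is the "couldn't determine this letter" marker, as it
appears to be from the demo, it gives a rough way to score a result. A minimal
sketch, assuming the OCR output is already available as a plain string (the
function name `unrecognized_ratio` is made up for illustration, and a genuine
underscore in the source image would be miscounted as a failure):

```python
def unrecognized_ratio(ocr_text, marker="_"):
    """Return the fraction of non-whitespace characters equal to the marker."""
    glyphs = [c for c in ocr_text if not c.isspace()]
    if not glyphs:
        return 0.0  # nothing recognized at all, so nothing to count
    return sum(c == marker for c in glyphs) / len(glyphs)
```

For an output like `"L hri_"`, one of the five glyphs is the marker, so the
ratio is 0.2. Note that this only counts letters the engine gave up on; a "C"
misread as an "L" still looks like a success.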

------
alistairjcbrown
Interesting - I've had the Project Naptha
([http://projectnaptha.com/](http://projectnaptha.com/)) Chrome extension
installed without really looking under the hood. Turns out it has Ocrad.js and
Tesseract as two engine options - it uses them to automatically convert images
on the page to selectable text.

~~~
paulirish
Yup! And Naptha and Ocrad.js are both authored by antimatter15.

------
bignis
It demos well, but then I tried a simple test - a photograph of some text
([http://imgur.com/TCnGlZG](http://imgur.com/TCnGlZG)) - and Ocrad.js utterly
failed at it: almost all letters were incorrect.

------
fnordsensei
Love the idea of it. However, I threw some random Swedish at it, and it didn't
fare too well. [http://imgur.com/nZLtoj5](http://imgur.com/nZLtoj5) Kudos for
the effort though!

------
PhrosTT
I looked at this recently to try to pick some values off a high-res PNG of a
PDF. That was a little too ambitious for this library. It's probably good for
smaller images with a few words.

------
mlinksva
It'd be nice to be able to invoke this from within PDF.js.

------
walterbell
How does Ocrad compare to Abbyy in quality?

~~~
unhammer
[http://www.splitbrain.org/blog/2010-06/15-linux_ocr_software...](http://www.splitbrain.org/blog/2010-06/15-linux_ocr_software_comparison)
is a very simple comparison (linked from the post).

So with a small enough test set, ABBYY is infinitely better.

------
mrfusion
Would this be an easy way to get OCR into an iPhone app with PhoneGap?

Could it operate on a live video feed?

~~~
gry
It might be easy, but until iOS 8 is released, non-Safari JS still takes a
performance hit. [1] You may want to take a look at the Tesseract library and
Objective-C wrapper. [2]

[1] [http://9to5mac.com/2014/06/03/ios-8-webkit-changes-finally-a...](http://9to5mac.com/2014/06/03/ios-8-webkit-changes-finally-allow-all-apps-to-have-the-same-performance-as-safari/)
[2] [https://github.com/ldiqual/tesseract-ios](https://github.com/ldiqual/tesseract-ios)

edit: Looking closer at this lib, impressive. Might give it a go.

------
jeffehobbs
This is hot. Nice work.

