
Ocropus - Google Code - sarosh
http://code.google.com/p/ocropus/
======
andyjenn
I tried these out for converting image-only PDFs, but the installation kept
getting more complex and unstable. I made a time-is-money decision and
purchased Vividata, a *nix-based command-line utility, which is a shame as I
would have liked to get these working. Almost all the other products were
Windows-based - urrgh! If anyone has had a different experience, I'd be
pleased to hear from them...

------
mleonhard
I wonder if there's a market for an OCR web service?

------
Tichy
Does it work? The last time I tried a Google OCR tool it didn't work at all (I
forgot the name, something starting with t).

~~~
jm4
You're probably thinking of Tesseract.

I've actually been doing some work with various OCR tools the past few days.
Tesseract and Ocropus didn't work very well for me, but I don't think mine is
the intended use. I have a need to identify images that contain text (which
could be on top of a complex background) and most OCR applications are better
suited for reading scanned documents. So far GOCR has produced the best
results for me.

I found an article a couple weeks ago comparing a few OCR applications for
scanning documents that someone might find interesting:
<http://www.linux.com/feature/138511>

OCR quality is heavily dependent on font type, font size and contrast, but for
many typed documents Ocropus and Tesseract do a pretty decent job.

~~~
Tichy
In that article, Tesseract seems to work the best. Maybe I should give it
another try.

