>It provided us the coordinates of all the texts and all we had to do was look for texts similar to an Account number and IFSC from a cheque book. Using some regex it was easy to find closely matching strings
Could you explain what you mean by this? We are trying to read shopping receipts, but I have ZERO background in image processing, so I have been trying to figure out what to do. I have been trying to use the Google Vision API, though.
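For what it's worth, here is a minimal sketch of what the quoted regex approach might look like. It assumes the OCR step hands back a list of (text, bounding box) pairs; `find_fields`, the `annotations` shape, and the sample data are all hypothetical and not any particular API's response format. The IFSC pattern follows the published format (four letters, a zero, six alphanumerics); the 9 to 18 digit account-number range is an assumption. It does exact matching only, so OCR confusions such as O vs 0 would need extra handling.

```python
import re

# IFSC format: 4 letters (bank), a literal 0, then 6 alphanumerics (branch).
IFSC_RE = re.compile(r"\b[A-Z]{4}0[A-Z0-9]{6}\b")
# Assumption: account numbers are runs of 9 to 18 digits; adjust as needed.
ACCOUNT_RE = re.compile(r"\b\d{9,18}\b")


def find_fields(annotations):
    """Return the first IFSC-like and account-like strings with their boxes.

    `annotations` is a hypothetical list of (text, box) pairs standing in
    for whatever the OCR service actually returns.
    """
    found = {"ifsc": None, "account": None}
    for text, box in annotations:
        candidate = text.strip().upper()
        if found["ifsc"] is None:
            match = IFSC_RE.search(candidate)
            if match:
                found["ifsc"] = (match.group(), box)
                continue
        if found["account"] is None:
            match = ACCOUNT_RE.search(candidate)
            if match:
                found["account"] = (match.group(), box)
    return found


# Made-up OCR output: (text, (left, top, right, bottom))
sample = [
    ("Pay", (10, 10, 40, 20)),
    ("IFSC: HDFC0001234", (200, 12, 330, 24)),
    ("A/c No. 50100123456789", (20, 90, 210, 104)),
]
result = find_fields(sample)
print(result)
```

For receipts the same idea would apply with different patterns, e.g. one for prices and one for dates, and the bounding boxes tell you which label sits next to which value.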
>The one which worked best for us was a custom designed filter using Otsu’s Thresholding principle.
Is this where you pre-process the image to make it readable? How does one do it: are these specialized tools, or can I do this in Python (like http://www.scipy-lectures.org/packages/scikit-image/auto_exa...)?
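To make the question concrete, here is a rough sketch of plain textbook Otsu thresholding in pure Python. It is not the custom filter the quoted comment describes, just the underlying principle: pick the gray level that maximizes the between-class variance of "dark" and "light" pixels. The function names and the tiny 4x4 image are made up for illustration. In practice you would more likely call `threshold_otsu` from `skimage.filters`, or `cv2.threshold` with the `THRESH_OTSU` flag, rather than writing this by hand.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for an iterable of 8-bit grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # count of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(image, threshold):
    """Map a 2D list of grayscale rows to 0 (dark) or 255 (light)."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


# Toy image: a dark 2x2 blob (the "ink") on a light background.
image = [
    [200, 210, 205, 198],
    [202, 30, 35, 207],
    [199, 28, 40, 211],
    [204, 206, 203, 209],
]
t = otsu_threshold(p for row in image for p in row)
binary = binarize(image, t)
print(t, binary)
```

A single global threshold like this tends to struggle with uneven lighting, which is presumably why people end up designing custom or locally adaptive variants.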
> The Tesseract 4.00 neural network subsystem is integrated into Tesseract as a line recognizer.
The LSTM is used in layout analysis, not in character recognition.
If you want native and complete access to tesseract's API you can use tesserocr: https://github.com/sirfz/tesserocr
The best - and most expensive - solution is still Abbyy OCR. They provide an SDK that can be used locally.
A new local OCR solution is Anyline.io, but I have not used them yet.
How did you get Copyfish to play nice with Zhongwen/Perapera? I've tried it with Chrome and Firefox and nothing seems to get them to pick up on the OCR text.
It seems likely that Google is doing something similar.
I have no formal training in CV, so this is only my impression: recognition is relatively easy, and the hard part is the preprocessing needed to normalize images.
Once you have thresholded text boxes that are quite legible, you can train your CNNs and LSTMs to read text from images.
I use Abbyy with WINE. But a native Linux shell version of Abbyy is available: http://www.ocr4linux.com/en:start
In short: it's a Python script where you press one button and it takes a screenshot, crops the image, decodes it, and types the result in at 900+ rpm.
To see how it is in action without the OCR functions:
Check out this paper (2011) for a good summary of the pros and cons: https://research.google.com/pubs/pub36984.html
However, this does not mean that all functionality will be available from Python, especially when code generation is not enough.
The image stitching library for example hits an assertion failure when called from Python. Disabling the check appears to work, but then you get warnings about incorrect reference counts.