The problem is, regardless of the confidence number, you can scan and mark a document for grammatical errors.
In VLM/LLM-powered methods, the missing/misread data will be hallucinated, and you can't know whether something scanned correctly or not. I personally scan and OCR tons of personal documents, and I prefer "gibberish" to "hallucinations" because gibberish is easier to catch.
We had this problem before [0], on some Xerox scanners and copiers. Results will be disastrous. It's not a question of if, but when.
I've personally tried Gemini and OpenAI's models for OCR, and no, I won't keep using them.
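To illustrate the "gibberish is easier to catch" point above, here is a minimal sketch, not a real pipeline. It assumes a tiny hard-coded word list (`KNOWN_WORDS`) and a hypothetical helper `flag_suspect_tokens`; a real check would use a proper dictionary or spell checker. Classic OCR noise shows up as non-words and gets flagged, while a fluent but wrong substitution passes straight through.

```python
import re

# Tiny stand-in vocabulary (an assumption for this sketch);
# a real check would load a full dictionary.
KNOWN_WORDS = {"the", "invoice", "total", "is", "due", "on", "friday", "monday"}

def flag_suspect_tokens(text, vocabulary=KNOWN_WORDS):
    """Return lowercased tokens that are not in the vocabulary, in order."""
    suspects = []
    for token in re.findall(r"\S+", text):
        # Drop surrounding punctuation before the lookup.
        word = token.strip(".,;:!?\"'()").lower()
        if word and word not in vocabulary:
            suspects.append(word)
    return suspects

# Character-level OCR noise produces non-words, which are easy to flag.
print(flag_suspect_tokens("The invoice tota1 is due on Fr1day."))

# A plausible-looking substitution (Friday -> Monday) is all valid words,
# so nothing is flagged even though the content is wrong.
print(flag_suspect_tokens("The invoice total is due on Monday."))
```

The first call flags the two garbled tokens; the second returns an empty list, which is exactly why a hallucinated value is harder to spot than gibberish.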
[0]: https://www.theregister.com/2013/08/06/xerox_copier_flaw_mea...