we don’t think so. we’ve fine-tuned most of the SOTA language models available today on table datasets and documents with complex layouts, and while they do perform better, they’re still prone to the same hallucinations. these frontier models have already been trained on most of the internet at this point, including tons of publicly available documents.
They probably do this already. But the problem is more fundamental: a generative model has no internal process guarantees or guardrails to constrain its failure modes, so any real checking has to happen outside the model.
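To make the "guardrails have to live outside the model" point concrete, here's a minimal sketch (our own illustration, with a hypothetical `verify_extraction` helper, not anything these models ship with): a post-hoc check that rejects any extracted value that doesn't appear verbatim in the source document.

```python
import re

def verify_extraction(source_text: str, extracted_values: list[str]) -> list[str]:
    """Return the extracted values that cannot be found verbatim in the source.

    A generative extractor can emit plausible-looking numbers that never
    appeared in the document; this external check rejects them instead of
    trusting the model's output.
    """
    # collapse whitespace so line breaks in the source don't break matching
    normalized = re.sub(r"\s+", " ", source_text)
    return [v for v in extracted_values if v not in normalized]


# usage: extracted_values would come from the model's output (hypothetical data)
source = "Q3 revenue was $4.2M, up from $3.9M in Q2."
model_output = ["$4.2M", "$3.9M", "$4.9M"]  # last value is hallucinated
rejected = verify_extraction(source, model_output)
assert rejected == ["$4.9M"]  # the guardrail catches it, outside the model
```

Obviously a verbatim match is the crudest possible check, and it only works for extraction-style tasks where the answer must literally exist in the input, but that's exactly the point: the guarantee comes from the validation layer, not from the model.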