we don’t think so. we’ve fine-tuned most of the SOTA language models available today on table datasets and documents with complex layouts, and while they do perform better, they’re still prone to the same hallucinations. these frontier models have already been trained on most of the internet at this point, including tons of publicly available documents.
They probably do this already. But the problem is more fundamental: a generative model has no internal process guarantees or guardrails to constrain its failure modes, so any real checking has to happen outside the model.
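To make the "guardrails have to live outside the model" point concrete, here's a minimal sketch (our own illustration, with a hypothetical `verify_extraction` helper, not anything these models ship with): a post-hoc check that rejects any extracted value that doesn't appear verbatim in the source document.

```python
import re

def verify_extraction(source_text: str, extracted_values: list[str]) -> list[str]:
    """Return the extracted values that cannot be found verbatim in the source.

    A generative extractor can emit plausible-looking numbers that never
    appeared in the document; this external check rejects them instead of
    trusting the model's output.
    """
    # collapse whitespace so line breaks in the source don't break matching
    normalized = re.sub(r"\s+", " ", source_text)
    return [v for v in extracted_values if v not in normalized]


# usage: extracted_values would come from the model's output (hypothetical data)
source = "Q3 revenue was $4.2M, up from $3.9M in Q2."
model_output = ["$4.2M", "$3.9M", "$4.9M"]  # last value is hallucinated
rejected = verify_extraction(source, model_output)
assert rejected == ["$4.9M"]  # the guardrail catches it, outside the model
```

Obviously a verbatim match is the crudest possible check, and it only works for extraction-style tasks where the answer must literally exist in the input, but that's exactly the point: the guarantee comes from the validation layer, not from the model.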