Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Mistral OCR made multiple mistakes in extracting this [1] document. It is a two-page-long PDF in Arabic from the Saudi Central Bank. The following errors were observed:

- Referenced Vision 2030 as Vision 2.0. - Failed to extract the table; instead, it hallucinated and extracted the text in a different format. - Failed to extract the number and date of the circular.

I tested the same document with ChatGPT, Claude, Grok, and Gemini. Only Claude 3.7 extracted the complete document, while all others failed badly. You can read my analysis here [2].

1. https://rulebook.sama.gov.sa/sites/default/files/en_net_file... 2. https://shekhargulati.com/2025/03/05/claude-3-7-sonnet-is-go...



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: