
Hey, I wrote the Omni benchmark. I think you might be misreading our methodology. Order on the page does not matter in our accuracy scoring. In fact, we score accuracy only on JSON extraction, which is order independent.

We chose this method for the same reasons you highlight. Text-similarity measurements are very subject to bias and don't correlate well with accuracy. I covered the same concepts in the "The case against text-similarity"[1] section of our writeup.

[1] https://getomni.ai/ocr-benchmark
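
To illustrate why field-level JSON scoring is order independent, here is a minimal sketch. It is not the benchmark's actual scoring code: the `flatten` and `json_accuracy` helpers, the sample invoice fields, and the choice to compare list elements by position are all assumptions made for the example.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested JSON into a {path: leaf_value} mapping (hypothetical helper)."""
    items = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            items.update(flatten(child, path))
    elif isinstance(value, list):
        # Assumption: list elements are compared by position.
        for index, child in enumerate(value):
            items.update(flatten(child, f"{prefix}[{index}]"))
    else:
        items[prefix] = value
    return items


def json_accuracy(expected, predicted):
    """Fraction of expected leaf fields that the prediction reproduces exactly."""
    expected_fields = flatten(expected)
    predicted_fields = flatten(predicted)
    if not expected_fields:
        return 1.0
    correct = sum(
        1
        for path, value in expected_fields.items()
        if path in predicted_fields and predicted_fields[path] == value
    )
    return correct / len(expected_fields)


# Illustrative data only, not taken from the benchmark.
expected = {"invoice_number": "A-17", "total": 42.5, "vendor": {"name": "Acme"}}

# Same fields, emitted in a different order: the score is unaffected.
reordered = json.loads(
    '{"vendor": {"name": "Acme"}, "total": 42.5, "invoice_number": "A-17"}'
)

# One field extracted incorrectly: two of three fields match.
wrong_total = {"invoice_number": "A-17", "total": 45.2, "vendor": {"name": "Acme"}}

print(json_accuracy(expected, reordered))
print(json_accuracy(expected, wrong_total))
```

Because fields are matched by key path rather than by position in the text, reordering the extracted content does not change the score, whereas a string-similarity metric over the raw text would penalize it.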




I'll dig deeper into your code, but from scanning your post, it does look like you are addressing this. That's great.

If I do find anything, I'll share it with you for comments before I publish the post.



