A review and how-to guide for Microsoft Form Recognizer (crosstab.io)
34 points by ctk_brian on July 13, 2021 | 8 comments


I had no idea this tech existed.

IMO there are some very profitable business opportunities if this can be integrated into certain platforms for certain industries.

I was thinking about how to make something like this. Looks like Google, Amazon, and Microsoft all beat me to it lmao


The trick is in handling the failures, and the failure rate will be high. So it's a question of whether the application/customer would tolerate something significantly less than 100% (and presumably fix it themselves) and/or whether your solution would include the manual mop-up for when the automation fails (i.e. there need to be business processes to fill in the gaps, since it can't be completely automated end-to-end).
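One way to picture that mop-up step is a confidence gate: fields the extractor is sure about pass straight through, and the rest go to a person. This is only a sketch under assumptions. `ExtractedField`, `route_fields`, the 0.9 threshold, and the idea that each field comes back with a confidence score in [0, 1] are all made up for illustration and are not any particular service's API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical container for one extracted value."""
    name: str
    value: str
    confidence: float  # assumed to be a score in [0, 1]


def route_fields(fields, threshold=0.9):
    """Split fields into auto-accepted ones and ones needing manual review.

    The threshold is an arbitrary illustrative value; in practice it would
    be tuned against how many errors the customer can tolerate.
    """
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Example with made-up values: the second field has a low score.
fields = [
    ExtractedField("invoice_total", "1,204.50", 0.98),
    ExtractedField("due_date", "2021-08-O1", 0.61),
]
accepted, needs_review = route_fields(fields)
```

The size of the `needs_review` queue is what determines how much human labor the business process has to budget for.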

Also, if you have a specific industry/application to target, you should be able to achieve better results than these services at a lower operational cost. This is a very common business need with solutions dating back to the early PC days, and it doesn't look like they've come up with some unique solution. They've just bundled some rather tedious-to-develop functionality into convenient frameworks and services and marked it up for their trouble... but there's still quite a bit of assembly required.


See, to me, if you had client-side employees dedicated to correcting errors and then back-propagated that error correction to the OCR software, you'd get pretty damn close to 100%.


Form Recognizer's custom forms feature requires no more than 5 forms to get a reasonable model in most cases. The forms should all share the same layout, though. For example, 1040, W8, W9, etc. or any custom form with the same content and layout. Disclaimer: I am a PM on the Form Recognizer team. Happy to answer questions.


Ah, did I miss that caveat in the documentation somewhere?

What's the use case for that, though? If the documents are highly homogeneous, why would I need a service--let alone an AI service--to extract the data? I could just specify the locations of the fields a priori on a 1040 (for example).
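To make the "specify the locations a priori" idea concrete, here is a rough sketch of fixed-zone extraction. It assumes some OCR step has already produced words with page coordinates; `extract_by_template`, the template layout, and every coordinate below are invented for illustration and do not correspond to a real 1040.

```python
def extract_by_template(words, template):
    """Collect the OCR words that fall inside each predefined field box.

    words:    list of (text, x, y) tuples, where (x, y) is the word center.
    template: dict mapping field name -> (x_min, y_min, x_max, y_max).
    Words inside a box are joined left to right, so this sketch only
    handles single-line fields.
    """
    result = {}
    for field, (x0, y0, x1, y1) in template.items():
        inside = [(x, text) for text, x, y in words
                  if x0 <= x <= x1 and y0 <= y <= y1]
        inside.sort()  # order by x coordinate
        result[field] = " ".join(text for _, text in inside)
    return result


# Hypothetical field boxes and OCR output, in arbitrary page units.
template = {
    "name": (0, 0, 300, 50),
    "ssn": (400, 0, 600, 50),
}
words = [
    ("Doe", 120, 20),
    ("Jane", 40, 22),
    ("123-45-6789", 450, 25),
    ("Page", 500, 700),  # outside every box, so it is ignored
]
extracted = extract_by_template(words, template)
```

This only works if every scan lands in the same coordinate frame as the template, so shifted or skewed pages would need to be aligned first.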


> We expected Form Recognizer would reward the overhead of training and managing custom models with superior performance, but we were disappointed. In our testing, Form Recognizer's accuracy was poor and its response times were way too slow for synchronous use cases.

Ouch.


> To evaluate Form Recognizer, I split the data randomly into 26 training documents and 25 test documents.

Training on just 26 documents seems woefully inadequate. I'm not a data scientist and have only cursory exposure to ML, but I'm not surprised to see terrible results with such a small training set.
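For scale, the split quoted above (51 documents into 26 training and 25 test) looks roughly like this. It is a sketch, not the article's actual code; `split_documents`, the document IDs, and the seed are placeholders.

```python
import random


def split_documents(doc_ids, n_train, seed=0):
    """Shuffle document IDs reproducibly and split into train/test lists."""
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    shuffled = list(doc_ids)   # copy so the caller's list is not mutated
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 51 labeled documents.
doc_ids = [f"doc_{i:02d}" for i in range(51)]
train, test = split_documents(doc_ids, n_train=26)
```

With so few documents, a different seed could plausibly move the measured accuracy noticeably, which is one reason a small evaluation like this is hard to read too much into.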


Agreed, but ground truth labeling is a lot of work! The thing is, Form Recognizer has a hard limit of 500 total pages (not documents) in the training set.

I'm skeptical it's possible to achieve good performance with an unsupervised model with only 500 pages, unless those documents are very similar. In which case, why would you need a service like Form Recognizer at all?

From a product perspective, it just makes no sense to me.



