
ID Card Digitization and Information Extraction using Deep Learning - ole_gooner
https://nanonets.com/blog/id-card-digitization-deep-learning/
======
alexcnwy
This is a decent OCR / structured data extraction literature review but is
absolutely not "building an ID card reader from scratch with deep learning".

It's also very hand-wavy on the details of how to actually use graph
convolutional networks to extract structured ID card data. For example, what
"bounding box information" is used in your node representations? What is the
architecture of your biLSTM?

This reads much more like a promotion for your API than useful
information on how to build a system that extracts data from ID cards.

~~~
jumpskiphop
Sure, the blog might have missed out on the finer details of the different
architectures. We intended to give an overview of some of the techniques used
to build such information extraction models. We will definitely dive deeper
into one of the architectures/models in a second part of this blog.

------
rosswilson
The article has quite a nice introduction to deep learning concepts, but the
headline claim of building an ID card reader from scratch is little more than
"use our API".

~~~
dang
Ok, we've changed the title from "Building an ID card reader from scratch with
deep learning" to what the article's title says.

------
tjpnz
I got asked to spec out something for a nightclub ~10 years ago for extracting
text from driver's licenses and 18+ cards. I didn't go ahead with the job
(client wasn't paying much and the way he wanted to use it was ethically and
legally grey) but I did prototype something and recall getting good results
from the Python OCR libraries available at the time. What advantages would you
get from a deep learning approach compared with what was available back then?

~~~
jumpskiphop
The problem with extracting information is not limited to getting OCR
results. The bigger problem when building something like this is extracting
the fields and understanding the structure of the document automatically.
Using Python OCR libraries, you'd probably get text results for a driver's
license or a passport separately and process them with separate rules
written for each. With deep learning, a template-free solution seems possible:
one that figures out which ID it is and where the name, address, and relevant
numbers are, then puts them into a structure.
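
To make the "separate rules written for each" point concrete, here is a minimal sketch of the template-style approach, assuming the OCR step has already produced plain text. The document types, field labels, and regex patterns are all hypothetical and chosen only for illustration; real cards vary widely, which is exactly why this approach gets brittle.

```python
import re

# Hypothetical per-document rule sets. Every new ID layout needs its own entry,
# which is the maintenance burden a learned, template-free model tries to avoid.
RULES = {
    "drivers_license": {
        "name": re.compile(r"^NAME[:\s]+(.+)$", re.MULTILINE),
        "license_number": re.compile(r"^DL(?: NO)?[:\s]+([A-Z0-9]+)$", re.MULTILINE),
    },
    "passport": {
        "surname": re.compile(r"^SURNAME[:\s]+(.+)$", re.MULTILINE),
        "passport_number": re.compile(r"^PASSPORT NO[:\s]+([A-Z0-9]+)$", re.MULTILINE),
    },
}


def extract_fields(ocr_text, doc_type):
    """Apply the hand-written rules for one known document type to raw OCR text."""
    fields = {}
    for field, pattern in RULES[doc_type].items():
        match = pattern.search(ocr_text)
        # A field the rules cannot find is reported as None rather than guessed.
        fields[field] = match.group(1).strip() if match else None
    return fields


sample = "NAME: JANE DOE\nDL NO: X1234567\n"
print(extract_fields(sample, "drivers_license"))
```

Note that the caller has to know `doc_type` up front; the rules cannot tell a license from a passport on their own.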

~~~
headbansown
In the case of U.S. driver's licenses, there are standards for the 2D barcode
that would make it very straightforward to parse:
[https://www.aamva.org/uploadedFiles/MainSite/Content/Solutio...](https://www.aamva.org/uploadedFiles/MainSite/Content/SolutionsBestPractices/BestPracticesModelLegislation\(1\)/BarCodeDataEncodingReqmtsBestPractice.pdf)
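
As a rough illustration of why the barcode route is simpler, here is a sketch that maps a few AAMVA three-letter element IDs to field names. It assumes the PDF417 barcode has already been decoded to text by some other tool, and that the file header and subfile designator have been stripped so each line is just an element ID followed by its value. The function name and output keys are my own; the exact set of elements and date formats depends on the version of the standard and the issuing jurisdiction.

```python
# A handful of element IDs from the AAMVA DL/ID data element list.
# The output field names on the right are arbitrary labels for this sketch.
ELEMENT_NAMES = {
    "DAQ": "license_number",
    "DCS": "family_name",
    "DAC": "first_name",
    "DBB": "date_of_birth",
    "DBA": "expiration_date",
    "DAJ": "state",
}


def parse_aamva_elements(payload):
    """Parse already-decoded barcode text, one 'IDvalue' element per line.

    Assumption: the header and subfile designator were removed beforehand.
    Unknown or too-short lines are skipped rather than treated as errors.
    """
    fields = {}
    for line in payload.splitlines():
        line = line.strip()
        if len(line) < 4:
            continue
        element_id, value = line[:3], line[3:]
        name = ELEMENT_NAMES.get(element_id)
        if name is not None:
            fields[name] = value.strip()
    return fields


decoded = "DAQX1234567\nDCSDOE\nDACJANE\nDBB01311990\n"
print(parse_aamva_elements(decoded))
```

Values are returned as raw strings; interpreting dates or validating the license number format is left out on purpose.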

------
prats226
Is "CNNs versus GCNs" necessarily the right framing? Wouldn't you need to apply
a GCN on top of a CNN to get the structure out of otherwise unstructured text?
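
One way to picture the "GCN on top of CNN" idea: treat each detected text box as a graph node whose feature vector comes from an upstream model (plus, say, its box coordinates), link boxes that sit near each other, and let a graph convolution mix information between neighbours. Below is a minimal NumPy sketch of a single layer using the commonly cited propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W). This is not the article's architecture; the graph, features, and weights are made-up placeholders.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One graph convolution: normalise the adjacency, aggregate, project, ReLU."""
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    degree = a_hat.sum(axis=1)                       # node degrees incl. self-loop
    d_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt       # symmetric normalisation
    return np.maximum(norm_adj @ features @ weights, 0.0)


# Three hypothetical text boxes: box 0 is next to box 1, box 1 is next to box 2.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)

# Placeholder node features: normalised (x, y, width, height) of each box.
# In practice these would be concatenated with CNN or text embeddings.
features = np.array([[0.1, 0.1, 0.3, 0.05],
                     [0.5, 0.1, 0.4, 0.05],
                     [0.1, 0.3, 0.3, 0.05]])

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8))                    # untrained, random projection

node_embeddings = gcn_layer(adjacency, features, weights)
print(node_embeddings.shape)
```

Each output row is a per-box embedding that a later classifier could label as "name", "address", and so on; training that classifier is outside the scope of this sketch.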

------
jumpskiphop
The article reviews the recent deep-learning-based approaches to
digitization and OCR, along with an explanation of how graph neural networks
work and how they can be applied to the problem of ID card digitization.

------
mister_hn
But why not build a card reader that reads the chip content (more accurate
than an image)?

~~~
darkhorn
The title says "ID card". First you need to read the printed text in order to
be able to read the chip of an ID card.

