Years ago I tried hacking together something like this - mainly to read labels on packaged food products - but the main roadblock I hit was that the FOSS OCR solutions weren't anywhere near good enough to be reliable.
Mind you, this was a few years ago and I was primarily testing with pytesseract. I'd be curious whether this team actually used the Google OCR API or an internally tuned one that isn't GA, and how that differs from FOSS Tesseract.
It's good to remember blog posts like this when people claim Google isn't innovating or investing in search. This is the type of infrastructure that goes into extracting useful information from the web.