Base64.ai – Extract text, data, photos and more from all types of docs

opheliate · on Feb 10, 2021

Somewhat confused by the naming choice here. Naming your company after something as fundamental as base64 encoding seems to inevitably lead to confusion down the line.

oebilgen · on Feb 10, 2021

Sorry you find it confusing. Our vision is to provide AI services for everything in base64 format: images, videos, sounds, etc.
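For readers unfamiliar with the encoding behind the name: base64 turns arbitrary bytes (an image, a video, a sound file) into ASCII text so they can travel inside a JSON body. A minimal sketch using only the Python standard library follows. The field name `document` is a hypothetical placeholder, not Base64.ai's actual request format.

```python
import base64
import json


def to_base64_payload(data: bytes, field: str = "document") -> str:
    """Wrap raw bytes as base64 text inside a JSON object.

    The field name is a made-up example, not a real API schema.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps({field: encoded})


def from_base64_payload(payload: str, field: str = "document") -> bytes:
    """Reverse of to_base64_payload: recover the original bytes."""
    return base64.b64decode(json.loads(payload)[field])


if __name__ == "__main__":
    original = b"hello"
    payload = to_base64_payload(original)
    print(payload)
    print(from_base64_payload(payload) == original)
```

Base64 output is roughly a third larger than the input, which is worth keeping in mind for large video files.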

arthurcolle · on Feb 10, 2021

Doesn't matter. Facebook will buy them for 2B in another year and a half and they will get folded into the mix.

voiper1 · on Feb 10, 2021

I looked into OCR a while ago for some hundreds of thousands of pages of PDF. All hosted offerings would end up costing quite a bit.

After looking at the options and running a few tests, I figured I'd use OCRmyPDF (https://github.com/jbarlow83/OCRmyPDF). It converts the PDF to an image for Tesseract and then recreates the PDF with the text copy-able.

It won't identify the address part of a driver's license, but that wasn't necessary for this project.
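A batch run like the one described above could be sketched as follows. This assumes the `ocrmypdf` command-line tool is installed and on `PATH`; the folder layout and the helper names (`build_ocrmypdf_command`, `ocr_folder`) are made up for illustration and are not the commenter's actual setup.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ocrmypdf_command(src: Path, dst: Path, language: str = "eng") -> List[str]:
    """Return the argv for one OCRmyPDF run (nothing is executed here)."""
    return [
        "ocrmypdf",
        "--language", language,  # Tesseract language pack to use
        "--skip-text",           # leave pages that already have a text layer alone
        str(src),
        str(dst),
    ]


def ocr_folder(in_dir: Path, out_dir: Path) -> List[Path]:
    """OCR every PDF in in_dir and write searchable copies to out_dir."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for src in sorted(in_dir.glob("*.pdf")):
        dst = out_dir / src.name
        # check=True raises CalledProcessError if OCRmyPDF fails on a file
        subprocess.run(build_ocrmypdf_command(src, dst), check=True)
        done.append(dst)
    return done
```

Calling `ocr_folder(Path("scans"), Path("searchable"))` would process the files one at a time; for hundreds of thousands of pages you would probably want to run several files in parallel.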

monkeydust · on Feb 10, 2021

A few years back I worked with someone to build an Android OCR app.

At the time there were not many apps out there, and we partnered with a third-party service that ran the OCR outside the app, so our conversion quality was (at the time) close to state of the art for mobile once people got comfortable with this method (which of course not everyone did).

We made some decent money as a side project from it but I also started to appreciate the sheer complexity of OCR.

We spent a lot of time fine-tuning pre-processing before hitting the OCR engine (e.g. orientation, shading); small changes here made a huge impact on performance. We also built various prompts to guide the user on how to take the photo. Managing expectations was something we were very conscious of, and it was tough.

The unexpected (but rewarding) use case was when we found that people who were blind had started to use the app to help with their daily lives - only a few, but it was making a real impact for them, so we prioritized a few features for this segment knowing we were drifting away from maximizing revenue. We were cool with this as it was not a primary income source.

In the end we all moved on to other things, more apps / services came on the market, and Google Lens became a thing, so we decided to sunset the product and did our best to manage customers through this process.

A rewarding experience overall - lots of lessons were learned that I have used elsewhere in my life since, and I ticked 'Build an app that made thousands of $' off my bucket list (which, yeah, I should probably review!).

Darkphibre · on Feb 10, 2021

Interesting!

I've been thinking of running OCR on video frames. I'd also like to do speech-to-text extraction for searching my archives later (I have about 4TB of video to trawl through, and want text-based search capabilities). It's an interesting space to explore, but everything's been moving to web services with cost-prohibitive pricing.

voiper1 · on Feb 10, 2021

You should be able to use ffmpeg[0] to extract a single frame each second/keyframe (doubtful it's worth doing every single frame) and then pass it to Tesseract.

For speech-to-text, if it's English, try Mozilla's DeepSpeech? https://github.com/mozilla/DeepSpeech

Might be fun to try.

[0] https://stackoverflow.com/questions/27568254/how-to-extract-...
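The frame-sampling idea above can be sketched roughly like this. It assumes the `ffmpeg` and `tesseract` command-line tools are installed; the function names, the `frame_%06d.png` naming scheme, and the one-frame-per-second default are illustrative choices, not something the commenters specified.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def ffmpeg_frames_command(video: Path, frames_dir: Path, fps: float = 1.0) -> List[str]:
    """argv that samples `fps` frames per second of video into numbered PNGs."""
    return [
        "ffmpeg",
        "-i", str(video),
        "-vf", f"fps={fps:g}",               # the fps filter drops all other frames
        str(frames_dir / "frame_%06d.png"),  # ffmpeg fills in the frame counter
    ]


def tesseract_command(image: Path, language: str = "eng") -> List[str]:
    """argv that OCRs one image and prints the recognized text to stdout."""
    return ["tesseract", str(image), "stdout", "-l", language]


def ocr_video(video: Path, frames_dir: Path, fps: float = 1.0) -> Dict[float, str]:
    """Map an approximate timestamp in seconds to the text found in that frame."""
    frames_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_frames_command(video, frames_dir, fps), check=True)
    text_by_second: Dict[float, str] = {}
    for index, frame in enumerate(sorted(frames_dir.glob("frame_*.png"))):
        result = subprocess.run(
            tesseract_command(frame), check=True, capture_output=True, text=True
        )
        text = result.stdout.strip()
        if text:  # skip frames with no recognizable text
            text_by_second[index / fps] = text
    return text_by_second
```

The returned dictionary could then be written into any full-text index so that a search hit points back to a position in the video.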

Darkphibre · on Feb 10, 2021

Yup, was planning to use ffmpeg (or, more likely, OpenCV), and a subset of the frames.

Thanks so much for the tip on DeepSpeech!

oebilgen · on Feb 10, 2021

@Darkphibre, we are happy to provide you with an AI that takes in a video and outputs OCR and speech-to-text. With Base64.ai, you don't have to worry about the implementation details and can focus on your projects. Shall we have a meeting to discuss more? https://base64.ai/meeting

ghgr · on Feb 10, 2021

For speech-to-text extraction you can try Silero [1].

Free software (AGPL-3.0 License), fast, highly accurate and extremely simple to deploy (I have no affiliation with them).

[1] https://github.com/snakers4/silero-models

Darkphibre · on Feb 10, 2021

Thanks for the heads up! Will definitely check it out.

danielmorozoff · on Feb 10, 2021

If you’re looking to index/process video, maybe we can help. Check out Vidrovr (https://vidrovr.com)

Full disclosure: I'm one of the founders.

m3nu · on Feb 10, 2021

It's not really working. Tried 2 English PDF invoices. Normal format. One came back empty, the other only had the amount right.

I'm assuming they only trained on some specific documents (passport of country X, etc) and all others don't work.

If someone processes the same document all the time, then my invoice2data project may work better and is open source. It's based on regex, rather than machine learning: https://github.com/invoice-x/invoice2data
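To illustrate the regex-template approach in general terms (this is not invoice2data's actual API or template format), here is a minimal sketch in plain Python. The field names, patterns, and sample text are all invented for the example.

```python
import re
from typing import Dict, Optional, Pattern

# One hypothetical "template": a named regex per field, each with one capture group.
TEMPLATE: Dict[str, Pattern[str]] = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)", re.IGNORECASE),
    "date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "amount": re.compile(r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}

# Invented sample of text as it might come out of an OCR or PDF-to-text step.
SAMPLE = """ACME Corp
Invoice No: INV-1042
Date: 2021-02-10
Total: $1,250.00
"""


def extract_fields(
    text: str, template: Dict[str, Pattern[str]] = TEMPLATE
) -> Dict[str, Optional[str]]:
    """Apply each field's regex to the text; fields with no match become None."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in template.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

The trade-off matches the comment: a template like this is cheap and predictable for a layout you see repeatedly, but it breaks as soon as a supplier changes their invoice format.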

solarkraft · on Feb 10, 2021

Stay tuned for my new biotech startup, http.ai.

What was the process resulting in this name?

oebilgen · on Feb 10, 2021

http.ai is a cool name too! Our vision is to provide AI services for everything in base64 format: images, videos, sounds, etc.

robarr · on Feb 10, 2021

What about the liabilities of sharing data with a third party? You are sending all kinds of data to a third-party processor.

Edit: I am not being critical, I am really asking.

lionkor · on Feb 10, 2021

All I can find on this is

> Base64.ai SOC 2 compliance certifies our bank-level security standards. Our API does not store your data to prevent possible data breaches. All API traffic must be authenticated and encrypted over HTTPS.

Sounds... Good enough? I mean, for what it is, it sounds like it's at least trying.

nemoniac · on Feb 10, 2021

In Europe that falls far short of the requirements of GDPR law for personal data.

jgtrosh · on Feb 10, 2021

that's not a question with an answer, it's a negative point for any third party processor such as this one.

alierkurt · on Feb 10, 2021

@robarr that's a fair question to ask. Briefly, Base64.ai stores neither the images you send nor their extracted data. We provide the power and extensibility of the cloud without the risks of a data breach. Base64.ai complies with GDPR requirements too. Our SOC-2 compliance report details the extent of the security measures we take for your data. Happy to share the report for your review under MNDA.

Farbklex · on Feb 10, 2021

Does your solution have any unique features or benefits in comparison to existing solutions like Acuant, MicroBlink or Regula? Those already classify various documents and extract the data pretty well.

https://www.acuant.com/idscan-data-capture-software/

https://microblink.com/products/blinkid

https://api.regulaforensics.com/

oebilgen · on Feb 10, 2021

They are good too, but we have products and services that match their offerings at a fraction of their cost.

We also offer products that they don't provide. Our AI is capable of analyzing sound data (speech to text). It is extensible to add your custom forms and document types. We provide a cloud API and RPA components for UiPath, Bardeen and other RPA providers. We built Base64.ai so that you won't need a new vendor for new document types and platforms.

Happy to meet over Zoom if you want to learn more https://base64.ai/meeting

emmelaich · on Feb 10, 2021

Also filingdb - https://filingdb.com/b/pdf-text-extraction

markdown · on Feb 10, 2021

A dollar per page? :O

mike_d · on Feb 10, 2021

Yeah, this will never take off unless they can get pricing below 1c per call.

Manual data entry for an _entire page_ of text is about 15c, or 10c at volume.
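To put the per-page figures in this subthread side by side, here is a small arithmetic sketch. The prices are the ones quoted in the comments; the 100,000-page volume is a hypothetical chosen only for illustration.

```python
# Per-page prices in cents, as quoted in the thread.
PRICES_CENTS_PER_PAGE = {
    "listed API price": 100,            # "A dollar per page"
    "manual data entry": 15,            # "about 15c"
    "manual data entry at volume": 10,  # "10c at volume"
    "suggested API ceiling": 1,         # "below 1c per call"
}

PAGES = 100_000  # hypothetical workload, not a figure from the thread


def total_cost_dollars(pages: int, cents_per_page: int) -> float:
    """Total cost in dollars for a page count at a given price in cents."""
    return pages * cents_per_page / 100


if __name__ == "__main__":
    for label, cents in PRICES_CENTS_PER_PAGE.items():
        print(f"{label}: ${total_cost_dollars(PAGES, cents):,.2f}")
```

At that volume the listed price is a factor of ten above the volume rate for manual entry, which is the gap the commenters are reacting to.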

kazinator · on Feb 10, 2021

Maybe the plan is to compete on latency? Say someone wants to regularly extract content from the same kind of document, and wants it fast, like 500 milliseconds.

purplecats · on Feb 10, 2021

perhaps it's cheaper than paying for 401k/benefits etc

djohnston · on Feb 10, 2021

generally people doing these sorts of tasks aren't full time employees, rather contractors

alierkurt · on Feb 10, 2021

We have startup plans that start free and run at 10 cents/page after volume discounts. We also offer prices in local currencies. Happy to work on a deal that works for you.

We are a pure AI company, i.e. there is no human-in-the-loop. We are and strive to be more accurate than manual labor, and our processing time is 1 second rather than minutes-to-hours. Also our AI is naturally unbiased and does not discriminate.

markdown · on Feb 10, 2021

> We are a pure AI company, i.e. there is no human-in-the-loop.

In that case, shouldn't it be a fraction of a penny rather than a whole dollar? Automation is supposed to mean lower costs.

aabhay · on Feb 10, 2021

That’s more expensive than manual data entry!

pseudosavant · on Feb 10, 2021

And has a 1 second response. That is worth something.

markdown · on Feb 10, 2021

But not a dollar.

oebilgen · on Feb 10, 2021

Appreciate the feedback. How much do you think would be fair?

woadwarrior01 · on Feb 10, 2021

The pricing makes me wonder if it's an AAI (Artificial Artificial Intelligence) service?

eejjjj82 · on Feb 10, 2021

with this pricing model I'd expect they're just reselling something like the GCP OCR APIs, most likely with some domain specific value adds

alierkurt · on Feb 10, 2021

yes but we have to cover the costs as well :) we're flexible on pricing based on the volumes. what you see there differs based on the needs but we always aim to find a common price point for all parties.

hartem_ · on Feb 10, 2021

Base64.ai has nailed time to value for customers. It’s pretty straightforward to integrate with and their extensive list of models makes it really easy to process a wide variety of document types. We used it at Bardeen.ai and couldn’t have been happier. Kudos for a great service!

alierkurt · on Feb 10, 2021

Base64.ai is a cloud API that can extract data, photos, and signatures from all types of documents. We have prebuilt models for IDs, driver licenses, passports, visas, invoices, and many more document types. The integration is only a single API call.

Radim · on Feb 10, 2021

Ali, you may want to add some "About us" page.

Sending such sensitive data "into the cloud" is no joke, for any company.

alierkurt · on Feb 10, 2021

thanks for the feedback. as replied earlier, Base64.ai stores neither the images you send nor the extracted data. We provide the power and extensibility of the cloud without the risks of a data breach. Base64.ai complies with GDPR requirements too. Our SOC-2 compliance report details the extent of the security measures we take for your data. Happy to share the report for your review under MNDA.

ZeroCool2u · on Feb 10, 2021

We've been working with Google Cloud on a very difficult data extraction problem for about 6 months now. Seeing very impressive results with their DocumentAI service. One of my teammates is planning to try this out on some of our data this afternoon though!

oebilgen · on Feb 10, 2021

Thank you! We're here to help. Please pick a time in our calendar https://base64.ai/meeting

_joel · on Feb 10, 2021

Really confusing name

cochne · on Feb 10, 2021

Sad that it doesn't seem to work for HTML! Maybe I will try taking a screenshot... Otherwise cool though, looks very promising.

je42 · on Feb 10, 2021

just tried the Android app. very slow. didn't return any result for simple basic text card (black text on white paper).

je42 · on Feb 10, 2021

another text. some warranty info of some product in multiple languages was recognized as "drivers license":

"First Name": "400 MHz ~2433,5MH:"

"Issuing authority": "0MHz~2833.5MH:"

maybe the requirements for the documents the system can accurately recognize need to be explained in the app.

oebilgen · on Feb 10, 2021

If you want our AI to learn warranty info documents, we're happy to work together. Let's meet http://base64.ai/meeting

m3nu · on Feb 10, 2021

Maybe a kind of artificial AI with lots of manual verification and templates? Hence the price.

llarsson · on Feb 10, 2021

It says that it does not store the submitted data. If true, then it's essentially just a trained model that we get to invoke for a dollar per API call to get the output from.

oebilgen · on Feb 10, 2021

This is accurate. We train every document model upfront and make it available via the API. We believe well-trained, high-quality models don't need retraining, just like humans don't need to re-learn reading every day.

tzfld · on Feb 10, 2021

The demos are not working for me. They never finish processing.