Show HN: EndType – Extract structured data from images, video and PDFs (endtype.com)
20 points by timm37 10 months ago | 5 comments



Hey everyone. As AI gets better and more multimodal, I believe one of the most common use cases will be extracting structured data from unstructured files: things like shipping labels, bank statements, invoices, patents, etc.

I plan to release workflows soon, which will take any file via email or form and save the structured content to a spreadsheet/CSV or a new PDF.

Let me know if you would be interested in trying the workflows and if you have a use case to extract/organize different files.


I got one. Say I gave it a corpus of structured[1] files that follow Schema X, then I gave it a pile of outputs (PDF, HTML) generated from that corpus, where StructuredFileName.xml = StructuredFileName.pdf. Could you see this doohickey being able to take in just the PDF/HTML/Word output, then output its best guess at chucking that into a Schema X file?

Pretty much everyone I work with are XML fetishists, and adore hard coded ontologies and taxonomies forged with many years of blood and sweat. I'm a bit more pragmatic and technology-minded. Even before AI I was pretty sure that using Python ML to generate a graph of keywords was a hell of a lot more useful than handcrafted ontology - doesn't cost hundreds of thousands of dollars in billable hours either. Now, with this stuff, we can get around hard coding all that structure itself, and maybe have source documents that normal people can read without about five zeroes worth of bespoke tools.

[1] And when I say "structured" I mean "completely frickin bananas".


Most likely you will be able to do this right away.

If not, we could easily fine-tune a custom model for this task, particularly if you already have a bunch of input/output pairs.
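For what it's worth, getting those pairs into shape is mostly a matter of matching file stems. Here is a minimal sketch, assuming the XML sources and their rendered outputs sit in one folder and share a name (as described above). The function names and the JSONL field names (`input_file`, `target_xml`) are made up for illustration and are not EndType's actual training format.

```python
import json
from pathlib import Path


def build_pairs(corpus_dir, output_exts=(".pdf", ".html")):
    """Pair each rendered output with the XML source that shares its file stem."""
    corpus = Path(corpus_dir)
    # Index the structured sources by stem, e.g. "report" -> report.xml
    sources = {p.stem: p for p in corpus.glob("*.xml")}
    pairs = []
    for path in sorted(corpus.iterdir()):
        # Keep only rendered outputs that have a matching XML source
        if path.suffix.lower() in output_exts and path.stem in sources:
            pairs.append({
                "input_file": path.name,  # hypothetical field name
                "target_xml": sources[path.stem].read_text(encoding="utf-8"),
            })
    return pairs


def write_jsonl(pairs, out_path):
    """Write one JSON object per line and return the number of pairs written."""
    with open(out_path, "w", encoding="utf-8") as fh:
        for pair in pairs:
            fh.write(json.dumps(pair) + "\n")
    return len(pairs)
```

Outputs without a matching XML file are skipped rather than guessed at, so the resulting set only contains pairs you can actually verify.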

Can you send me an email at support [at] endtype.com?


Great! Now do this for commercial notifications sent to my email. Things like bank transfers, USPS deliveries, shopping delivery notifications, food delivery.


I'm actually adding "workflows", which allow the platform to ingest files via API, a custom email address, and hosted forms.

You could set a forward rule on your email and auto-forward all those to the [workflow_id]@endtype.email.

If you are interested, I can set up an email address for you. Send me an email at support [at] endtype.com
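If your mail provider doesn't support forward rules, the same thing can be scripted. A rough sketch using only Python's standard `email` package: it wraps an incoming notification as an attachment addressed to the workflow inbox. The `build_forward` helper and the `abc123` workflow id are hypothetical; only the `[workflow_id]@endtype.email` address pattern comes from the comment above, and actually sending the message (e.g. via `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_forward(original: EmailMessage, workflow_id: str, sender: str) -> EmailMessage:
    """Wrap a notification email so it can be forwarded to a workflow inbox."""
    fwd = EmailMessage()
    fwd["From"] = sender
    # Address pattern taken from the thread; the id itself is a placeholder
    fwd["To"] = f"{workflow_id}@endtype.email"
    fwd["Subject"] = "Fwd: " + original.get("Subject", "")
    fwd.set_content("Forwarded notification attached.")
    # Attach the original message unmodified as message/rfc822
    fwd.add_attachment(original)
    return fwd


# Example with a made-up notification and a hypothetical workflow id
note = EmailMessage()
note["Subject"] = "USPS delivery"
note.set_content("Your package was delivered.")
forward = build_forward(note, "abc123", "me@example.com")
```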



