Hacker News
Ask HN: How to host an AI model that processes PDFs into structured data?
6 points by rco8786 3 months ago | 7 comments
I'm hoping there is a guide or tutorial out there for someone like me.

I'm an experienced software engineer, mostly backend. I have a general understanding of how AI works, and have used all the various tools (ChatGPT, Copilot, etc).

Where I'm totally ignorant is in how to self host my own models. In particular, I am looking to self host a model that can read PDFs and parse out structured data.

Any good starting points?




Jan is a good starting point: a desktop app that will guide you through downloading models you can run locally. Then see:

https://jan.ai/docs/tools/retrieval

A similar tool, also a local desktop app, is LM Studio. Same deal.

https://lmstudio.ai/docs/basics/rag


Thanks so much! Any resources for deploying something like this into a cloud/datacenter and standing up an API?


Both can be told via the command line to run headless and start an OpenAI-compatible server that can receive documents, and both can run in Docker. Those API modes are nevertheless not intended for concurrent use as a true server. Another project, vLLM (on GitHub), is designed to serve a concurrent OpenAI-compatible API, multiplexing requests over the available GPU.
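To make "OpenAI-compatible" concrete, here is a sketch of the kind of request such a server accepts. The model name, port, and prompt contents are placeholder assumptions; vLLM defaults to serving on port 8000 under the /v1 path.

```python
import json

# A chat-completions request in the OpenAI wire format, as accepted by an
# OpenAI-compatible server (e.g. one started with `vllm serve <model-name>`).
# The model name and message contents below are placeholder assumptions.
payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",  # whatever model the server loaded
    "messages": [
        {"role": "system",
         "content": "Extract the invoice number and total as JSON."},
        {"role": "user", "content": "<text extracted from the PDF>"},
    ],
    "temperature": 0,  # deterministic output helps structured extraction
}
body = json.dumps(payload)
# POST `body` to http://localhost:8000/v1/chat/completions with
# Content-Type: application/json; the response mirrors OpenAI's format.
print(body)
```

Because the format matches OpenAI's, the official OpenAI client libraries also work against these servers by pointing them at the local base URL.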


How expensive is it to run these in the cloud? Last time I looked into this, it seemed like you have to pay significant server costs to run a local LLM.



Modal.com can support this. A couple of existing PDF extraction startups use it. You could have a custom endpoint up in minutes. It's serverless, so it will be low-cost at your level of traffic.


Docling to parse PDFs into Markdown.
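Once Docling has produced Markdown, the structuring step downstream can be plain text processing. A minimal sketch, using a faked Markdown snippet; the Docling call in the comment and the key/value field layout are assumptions about one possible document:

```python
import re

# Docling converts a PDF to Markdown, roughly:
#   from docling.document_converter import DocumentConverter
#   md = DocumentConverter().convert("doc.pdf").document.export_to_markdown()
# Here we fake the Markdown output and show the structuring step.
md = """# Invoice 1042
**Date:** 2024-05-01
**Total:** $318.00
"""

def extract_fields(markdown: str) -> dict:
    """Pull bold 'Key: value' pairs plus the H1 title into a dict."""
    fields = {}
    title = re.search(r"^#\s+(.+)$", markdown, re.MULTILINE)
    if title:
        fields["title"] = title.group(1).strip()
    for key, value in re.findall(r"\*\*(.+?):\*\*\s*(.+)", markdown):
        fields[key.lower()] = value.strip()
    return fields

print(extract_fields(md))
```

For messier layouts, the same Markdown can instead be fed to an LLM with a prompt asking for JSON, which is where the locally hosted model comes in.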





