Hacker News
Ask HN: How to host an AI model that processes PDFs into structured data?
6 points by rco8786 3 months ago | 7 comments
I'm hoping there is a guide or tutorial out there for someone like me.

I'm an experienced software engineer, mostly backend. I have a general understanding of how AI works, and have used all the various tools (ChatGPT, Copilot, etc).

Where I'm totally ignorant is in how to self host my own models. In particular, I am looking to self host a model that can read PDFs and parse out structured data.

Any good starting points?




Jan is a good starting point: a desktop app that will guide you through downloading models you can run locally. Then see:

https://jan.ai/docs/tools/retrieval

A similar tool, also a local desktop app, is LM Studio. Same deal.

https://lmstudio.ai/docs/basics/rag


Thanks so much! Any resources for deploying something like this into a cloud/datacenter and standing up an API?


Both can be told via the command line to run headless and start an OpenAI-compatible server that can receive documents, and both can run in Docker. Those API modes are nevertheless not intended for concurrent use as a true server. Another project, vLLM (on GitHub), is designed to serve a concurrent OpenAI-compatible API, multiplexing requests over the available GPU.
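To make "OpenAI-compatible" concrete, here is a sketch of the kind of request such a server accepts. The model name, port, and prompt contents are placeholder assumptions; vLLM defaults to serving on port 8000 under the /v1 path.

```python
import json

# A chat-completions request in the OpenAI wire format, as accepted by an
# OpenAI-compatible server (e.g. one started with `vllm serve <model-name>`).
# The model name and message contents below are placeholder assumptions.
payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",  # whatever model the server loaded
    "messages": [
        {"role": "system",
         "content": "Extract the invoice number and total as JSON."},
        {"role": "user", "content": "<text extracted from the PDF>"},
    ],
    "temperature": 0,  # deterministic output helps structured extraction
}
body = json.dumps(payload)
# POST `body` to http://localhost:8000/v1/chat/completions with
# Content-Type: application/json; the response mirrors OpenAI's format.
print(body)
```

Because the format matches OpenAI's, the official OpenAI client libraries also work against these servers by pointing them at the local base URL.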


How expensive is it to run these in the cloud? Last time I looked into this, it seemed like you have to pay significant server costs to run a local LLM.



Modal.com can support this. A couple of existing PDF extraction startups use it. You could have a custom endpoint up in minutes. It's serverless, so it will be low-cost at your level of traffic.


Docling to parse PDFs into Markdown.
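Once Docling has produced Markdown, the structuring step downstream can be plain text processing. A minimal sketch, using a faked Markdown snippet; the Docling call in the comment and the key/value field layout are assumptions about one possible document:

```python
import re

# Docling converts a PDF to Markdown, roughly:
#   from docling.document_converter import DocumentConverter
#   md = DocumentConverter().convert("doc.pdf").document.export_to_markdown()
# Here we fake the Markdown output and show the structuring step.
md = """# Invoice 1042
**Date:** 2024-05-01
**Total:** $318.00
"""

def extract_fields(markdown: str) -> dict:
    """Pull bold 'Key: value' pairs plus the H1 title into a dict."""
    fields = {}
    title = re.search(r"^#\s+(.+)$", markdown, re.MULTILINE)
    if title:
        fields["title"] = title.group(1).strip()
    for key, value in re.findall(r"\*\*(.+?):\*\*\s*(.+)", markdown):
        fields[key.lower()] = value.strip()
    return fields

print(extract_fields(md))
```

For messier layouts, the same Markdown can instead be fed to an LLM with a prompt asking for JSON, which is where the locally hosted model comes in.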





