
I want to do this but for 30GB of PDFs



I steered a friend towards Paperless (and away from an LLM solution) as a way of searching/accessing GBs of architectural PDFs recently - so far, it’s apparently working well for them.

https://github.com/paperless-ngx/paperless-ngx


I have been playing with it for a while, but I miss a conversational interface where I can interrogate the PDFs and summarize them, or, say, find all the main events per year in a corpus of text and build a timeline of those events (context: a legal case with tons of text data to parse).
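A rough sketch of the per-year grouping described above, assuming the PDFs have already been converted to plain text (for example by an OCR step). The name `build_timeline` and the year regex are illustrative assumptions and not part of Paperless. It only buckets sentences by the years they mention; deciding which of them are the "main" events would still need a human or an LLM pass.

```python
import re
from collections import defaultdict

# Assumption: events are tied to four-digit years between 1800 and 2099.
YEAR_RE = re.compile(r"\b(?:18|19|20)\d{2}\b")
# Naive sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")


def build_timeline(pages):
    """Group sentences by every year they mention.

    `pages` is an iterable of plain-text strings (one per page or document).
    Returns a dict mapping year -> list of sentences, sorted by year.
    """
    timeline = defaultdict(list)
    for page in pages:
        for sentence in SENTENCE_RE.split(page):
            # Collapse line breaks and repeated spaces left over from extraction.
            sentence = " ".join(sentence.split())
            if not sentence:
                continue
            # A sentence that mentions several years is listed under each one.
            for year in sorted(set(YEAR_RE.findall(sentence))):
                timeline[int(year)].append(sentence)
    return dict(sorted(timeline.items()))


if __name__ == "__main__":
    sample = [
        "The contract was signed in 2015. Nothing happened.",
        "In 2017 the claim was filed. The 2015 contract was cited in 2019.",
    ]
    for year, sentences in build_timeline(sample).items():
        print(year)
        for s in sentences:
            print("  -", s)
```

The per-year buckets are small enough to hand to an LLM one year at a time for summarizing, which avoids pushing the whole corpus through a single prompt.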


Hi @rmdes, Sagar here from Joyspace AI. I recently made a Show HN post[0] about a document search engine.

We can do this very easily for you. We can provide search output with context that you can then feed to an LLM to extract events. Let me know if you are interested.

You can get in touch with me at sagar at joyspace dot ai.

[0] https://news.ycombinator.com/item?id=39980902


This shouldn't be too hard.



