I recently steered a friend towards Paperless (and away from an LLM solution) as a way of searching/accessing GBs of architectural PDFs. So far, it’s apparently working well for them.
I have been playing with it for a while, but I miss a conversational interface where I can interrogate the PDFs and summarize them or, let's say, find all the main events per year in a corpus of text and build a timeline of said events (context: a legal case with tons of text data to parse).
https://github.com/paperless-ngx/paperless-ngx