A few weeks ago I asked on Hacker News "I'm in the middle of a graduate degree and am reading lots of papers, how could I get ChatGPT to use my whole library as context when answering questions?"
And I was told, basically, "It's really easy! First you extract all of the text from the PDFs, parse it to separate content from style, then store that in a DuckDB database with zstd compression, then use some encoder model to load all of these texts into a Qdrant database. Then use Vicuna or Guanaco 30B GPTQ, with LangChain, and....."
I was like, ok... guess I won't be asking ChatGPT where I can find which paper talked about which thing after all.
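For what it's worth, stripped of the product names, the reply above describes a standard retrieval-augmented-generation loop: chunk the documents, embed the chunks, store the vectors, and at question time retrieve the nearest chunks to paste into the prompt. Here's a minimal stdlib-only sketch of the retrieval half, with a toy hashed bag-of-words embedding standing in for a real encoder model and a plain list standing in for Qdrant (all names and sample chunks here are illustrative, not from any of the tools mentioned):

```python
import math
import re

def embed(text, dim=256):
    # Toy stand-in for an encoder model: hash each word into a
    # fixed-size bag-of-words vector, then L2-normalize it.
    vec = [0.0] * dim
    for word in re.findall(r"[a-z]+", text.lower()):
        vec[hash(word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are unit-length, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# "Vector database": just a list of (chunk, embedding) pairs.
chunks = [
    "Transformers use self-attention to relate tokens in a sequence.",
    "DuckDB is an embedded analytical database engine.",
    "GPTQ quantizes large language models to fewer bits per weight.",
]
index = [(c, embed(c)) for c in chunks]

def retrieve(query, k=2):
    # Rank stored chunks by similarity to the query embedding;
    # the top-k chunks would then be pasted into the LLM prompt.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(retrieve("How does attention work in transformers?", k=1))
```

A real pipeline swaps the toy embedding for a proper encoder model and the list for a vector store, but the shape of the loop is the same.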
The simple answer is that I don't know how to do a semantic search over a bunch of documents, nor how to do "plain old vector distances." It's not my field.
The longer answer is that I think it will also be useful to have the background knowledge (however hallucination-prone) that ChatGPT has. I'd like to be able to have a conversation specifically grounded in the papers, and if I ask about a topic that none of the papers mention in those exact words, I'd like it to be able to say "none of the articles talk about X, but they do mention Y, which is related in this way." I'm not sure if this is expecting too much of it.
>This is a minimal package for doing question and answering from PDFs or text files (which can be raw HTML). It strives to give very good answers, with no hallucinations, by grounding responses with in-text citations.
We built https://github.com/trypromptly/LLMStack to serve exactly this use case: a low-code platform for quickly building RAG pipelines and other LLM applications.