Genuine question: at what point does the term RAG lose its meaning? Seems like L...

parsimo2010 · 2024-11-07T22:16:00 1731017760

RAG is a search step in an attempt to put relevant context into a prompt before performing inference. You are “augmenting” the prompt by “retrieving” information from a data set before giving it to an LLM to “generate” a response. The data set may be the internet, or a code base, or text files. The typical examples online use an embedding model and a vector database for the search step, but doing a web query before inference is also RAG. Perplexity.ai is a RAG system (and a fairly good-quality one). I would argue that Codebuff’s directory tree search to find relevant files is a search step. It’s not the same as a similarity search on vector embeddings, and it’s not PageRank, but it is a search step.
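
A minimal sketch of the retrieve, augment, generate flow described above, under loud assumptions: the word-overlap scoring is a crude stand-in for an embedding search or web query, `echo_llm` is a hypothetical placeholder for a real model call, and every name here (`retrieve`, `augment`, `rag_answer`, `DOCS`, `QUERY`) is made up for illustration.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for embeddings or a search index."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Search step: rank documents by word overlap with the query, keep the top k."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the query at all.
    return [doc for score, doc in ranked[:k] if score > 0]


def augment(query: str, context: list[str]) -> str:
    """Augment step: put the retrieved text into the prompt ahead of the question."""
    joined = "\n".join(f"- {item}" for item in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


def rag_answer(query: str, documents: list[str],
               generate: Callable[[str], str], k: int = 2) -> str:
    """Generate step: hand the augmented prompt to whatever model `generate` wraps."""
    return generate(augment(query, retrieve(query, documents, k)))


def echo_llm(prompt: str) -> str:
    """Hypothetical placeholder for an LLM call; it just returns the prompt."""
    return prompt


# Hypothetical data set and query, purely for illustration.
DOCS = [
    "The build script lives in tools/build.py",
    "Lunch menu: soup and bread",
    "Run the build with make all",
]
QUERY = "where is the build script"

if __name__ == "__main__":
    print(rag_answer(QUERY, DOCS, echo_llm))
```

Swapping `retrieve` for a vector search, a web query, or a directory tree walk changes the search step but not the overall shape, which is the point being made above.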

Things that aren’t RAG, but are also ways to get an LLM to “know” things that it didn’t know prior:

1. Fine-tuning with your custom training data, since it modifies the model weights instead of adding context.
2. LoRA with your custom training data, since it trains small low-rank adapter weights alongside a frozen foundation model instead of adding context.
3. Stuffing all your context into the prompt, since there is no search step being performed.

brandonchen · 2024-11-07T22:45:09 1731019509

Gotcha – so RAG broadly encompasses how we give external context to the LLM. Appreciate the extra note about vector databases, that's where I've heard this term used most, but I'm glad to know it extends beyond that. Thanks for explaining!

petesergeant · 2024-11-07T22:14:42 1731017682

Not RAG: asking the LLM to generate using its internal weights only

RAG: providing the LLM with contextual data you’ve pulled from outside its weights that you believe relate to a query

brandonchen · 2024-11-07T22:45:36 1731019536

Nice, super simple. We're definitely fitting into this definition of RAG then!

michaelmior · 2024-11-08T16:50:13 1731084613

I think parsimo2010 gave a good definition. If you're pulling context from somewhere using some search process to include as input to the LLM, I would call that RAG.

So I would not consider something like using a system prompt (which does add context, but does not involve search) to be RAG. Also, using an LLM to generate search terms before returning query results would not be RAG because the output of the search is not input to the LLM.

I would also probably not categorize a system similar to Codebuff that just adds the entire repository as context as RAG, since there's not really a search process involved. I could see that being a bit of a grey area though.
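
The test applied in this comment can be written down as a two-part check: there has to be a search step, and its output has to end up in the LLM's input. The sketch below is only one reading of that rule; the `Pipeline` type, the `is_rag` helper, and the example rows are hypothetical, and the whole-repository case is marked as not RAG even though the comment itself calls it a grey area.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical description of how a system feeds context to an LLM."""
    name: str
    has_search_step: bool          # is anything searched or selected at all?
    search_output_in_prompt: bool  # do the search results become LLM input?


def is_rag(pipeline: Pipeline) -> bool:
    """RAG under this reading: a search step whose output is input to the LLM."""
    return pipeline.has_search_step and pipeline.search_output_in_prompt


EXAMPLES = [
    Pipeline("vector search results added to the prompt", True, True),
    Pipeline("static system prompt", False, False),
    Pipeline("LLM writes search terms, results go straight to the user", True, False),
    # Grey area per the comment above: context is added, but nothing is searched.
    Pipeline("entire repository pasted in as context", False, False),
]

if __name__ == "__main__":
    for example in EXAMPLES:
        print(f"{example.name}: {'RAG' if is_rag(example) else 'not RAG'}")
```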