We've built this over the last few weeks, using vector search and LLMs (it's backed by GPT-3.5, though we're also testing Flan-T5), to answer questions over large sets of documents, with references. So far we've ingested the documentation for React and some key adjacent libraries (Redux, React-Redux, React-Router, MUI). You can ask natural language questions and the output is hopefully a relevant answer, with code examples where applicable, sourcing the original docs whenever possible.
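To give a rough idea of the shape of the pipeline, here's a minimal sketch of a retrieval-augmented QA loop. All the names (`embed`, `complete`, the chunk dicts) are illustrative stand-ins, not our actual API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def answer(question, chunks, embed, complete, top_k=3):
    """Embed the question, pull the top-k most similar doc chunks,
    and ask the LLM to answer using only those chunks as context.

    `embed` and `complete` are hypothetical callables wrapping an
    embedding model and a completion model respectively."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, c["vec"]), reverse=True)
    context = ranked[:top_k]
    prompt = "Answer using only the sources below, citing them.\n\n"
    for c in context:
        prompt += f"[{c['url']}]\n{c['text']}\n\n"
    prompt += f"Question: {question}"
    return complete(prompt), [c["url"] for c in context]
```

In practice the chunk embeddings live in a vector index rather than a Python list, but the ranking-then-prompting structure is the same.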
We're working on adding more documentation and supporting more "general" questions (e.g., querying your own Notion docs). Any feedback is appreciated at this stage; let us know what you think and whether there are any libs you'd like to see added!
This started as a pet project, as we were eager to put some LLM work into production. We're currently piggybacking off APIs, though self-hosted models are an option; as mentioned above, we've run some experiments with Flan-T5, with mixed results so far.
We have a bunch of plans for where to keep building on this, beyond improving the model/engine/prompting/etc. We're keen to integrate more libs that we use (FastAPI and Pandas come to mind, in terms of sprawling docs), and to add a "Search through StackOverflow answers" button, though we're still testing how well our similarity look-up works on that front.
On the non-code/technical aspects, everything we've tried has been pretty encouraging, but each use case has its own challenges. For documentation questions, we're trying to "ground" the model's knowledge: it probably knows a lot of React, but it can't quite reference this exact bit of the documentation, or it will use an outdated API, and so on. So we're trying to re-center the model's knowledge on the actual docs and improve the way it serves them.
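Much of that grounding happens in the prompt itself. As a hedged sketch of the kind of instruction we mean (the exact wording and helper names here are illustrative, not our production prompt):

```python
def grounding_prompt(question, doc_chunks):
    """Build a prompt that pushes the model to answer from the supplied
    docs rather than from its (possibly outdated) latent knowledge.

    `doc_chunks` is a list of dicts with hypothetical `url`/`text` keys."""
    sources = "\n\n".join(
        f"Source {i + 1} ({c['url']}):\n{c['text']}"
        for i, c in enumerate(doc_chunks)
    )
    return (
        "You answer questions about React using ONLY the sources below.\n"
        "Cite the source for every claim. If the sources do not cover\n"
        "the question, say that the docs don't cover it.\n\n"
        f"{sources}\n\nQuestion: {question}\nAnswer:"
    )
```

The "ONLY the sources" framing doesn't fully stop the model from reaching into its latent knowledge, but combined with citation requirements it noticeably reduces outdated answers.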
When we test the approach on in-house documentation (such as Notion), the problem is a bit different: in a lot of cases, _all_ the relevant information should be in the context, and we very much don't want the model to rely on whatever latent knowledge it has to answer "What was agreed with Joe about the framework swap?". We're not seeing much of this anymore, but even with synthetic data, we had some interesting situations where the relevant context wasn't found, some safeguards failed, and an entirely made-up Joe was cited as having approved the swap to Angular.
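One of the simpler safeguards against that failure mode is refusing to answer when retrieval comes back empty-handed. A minimal sketch, assuming cosine-similarity retrieval and a hand-tuned cutoff (the threshold value and names here are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

MIN_SIMILARITY = 0.75  # hypothetical cutoff, tuned empirically per corpus

def retrieve_or_refuse(q_vec, chunks):
    """Return the best-matching chunk, or None when nothing in the index
    is close enough. Better an honest "not found" than an invented Joe."""
    best = max(chunks, key=lambda c: cosine(q_vec, c["vec"]), default=None)
    if best is None or cosine(q_vec, best["vec"]) < MIN_SIMILARITY:
        return None
    return best
```

When this returns None, the app can reply "I couldn't find anything about that in your docs" instead of handing the model an empty context and hoping for the best.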
Happy to answer questions and discuss this further; using an LLM as the "logic" layer of document retrieval is definitely fascinating.