With LiteLLM the application can easily switch between different models to pick to most optimized model for Knowledge Graph generation and Knowledge Retrieval.
It practically supports every models out there, LiteLLM supports over 100 large language model services, including OpenAI, Claude, Gemini, WatsonsX, Mistral, Azure OpenAI, Sagemaker....
We see three major issues that different projects encounter:
1. Knowledge Graph quality - if you don't have a clean well defined Knowledge Graph then the end result will not be good.
2. Multi Graphs support - you want to break the large Knowledge Graph to small per domain Knowledge Graphs which really helps the LLM work with the data.
3. User/Agent memories - You want each user have a dedicated & effective long term memory AKA personal Knowledge Graph completely private and secured.
4. Latency/performance - you have to have a low latency Graph Database that can provide a good user experience.
If it's your personal assassinate and is helping you for months it means pretty fast it will start forget the details and only have a vogue view of you and your preferences. So instead of being you personal assassinate it practically cluster your personality and give you general help with no reliance on real historical data.
It looks great! Utilizing Knowledge Graph to store long term memory is probably the most accurate solution compared to using only Vector Store (same as with GraphRAG vs Vector RAG).
I think an important thing to point here that long term memory vs RAG doesn't represent the organizational knowledge but the chat history which should be private to the end user and shouldn't be kept in a completely isolated graph than the rest of the users
When using a graph database you can build a knowledge Graph out of the long term memory. Storing it only in a vector database means that you'll only find things that are similar to the user question and miss a lot of information that is an aggregation of different memories.
We've done a very similar procedure just with FalkorDB as a Graph Database.
Notice if you already have a schema/ontology it might be easier might you might miss some entities in the text you did realize exist.
So in our in GraphRAG-SDK we are running two phases, the first is sampling the data to suggest a schema and the second is using this schema to ground the LLM to this schema (as you suggested)
We took that one step further with the GraphRAG-SDK - https://github.com/FalkorDB/GraphRAG-SDK
reply