Hello HN! We (Steven, Hendrik and Stefan) built a real-time identity resolution system that can handle hundreds of millions of customer records, and recently launched a LangChain integration to use it as a RAG source for LLMs.
We built this while working at a European credit bureau, where we needed to deduplicate and match millions of monthly record updates from various sources. Traditional approaches using graph databases and Spark couldn't handle the scale, so we built our own solution using AWS Serverless.
Each identity is stored as its own graph, with records linked by rules-based and ML matching. Performance: <300ms ingest (tested to 5,000/sec) and <150ms search, regardless of graph size. Several fintech companies use it for fraud detection, KYC, and customer 360.
Unlike vector databases, which can blur similar entities together, IdentityRAG keeps customer identities distinct while pulling data from multiple systems, even when customer details differ across databases.
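To make the "one graph per identity" idea concrete, here is a toy Python sketch. It is not our production code or API: the sample records, the two matching rules, and the `resolve` function are made up for illustration, and the naive pairwise comparison would not scale to real-time ingest. It only shows how explicit match rules link records into separate identity graphs instead of ranking them by similarity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records for illustration, as they might arrive from different systems.
RECORDS = [
    {"id": "crm-1", "name": "Anna Schmidt", "email": "anna@example.com", "phone": "+49 30 1234"},
    {"id": "shop-7", "name": "A. Schmidt", "email": "Anna@Example.com", "phone": None},
    {"id": "bank-3", "name": "Anna Schmidt-Meyer", "email": None, "phone": "+49 30 1234"},
    {"id": "crm-2", "name": "Anna Schmid", "email": "a.schmid@example.org", "phone": "+49 89 9999"},
]


def normalize(value):
    """Lowercase and strip whitespace so trivially different values compare equal."""
    return "".join(value.lower().split()) if value else None


def same_field(field):
    """Build a rule that matches two records on one normalized, non-empty field."""
    def rule(a, b):
        left, right = normalize(a[field]), normalize(b[field])
        return left is not None and left == right
    return rule


# Hypothetical rule set: a real deployment would use richer rules plus ML scoring.
RULES = {"same_email": same_field("email"), "same_phone": same_field("phone")}


def resolve(records):
    """Group records into identities; each identity keeps its own nodes and edges."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for a, b in combinations(records, 2):
        for rule_name, rule in RULES.items():
            if rule(a, b):
                edges.append((a["id"], b["id"], rule_name))
                parent[find(a["id"])] = find(b["id"])

    identities = defaultdict(lambda: {"records": set(), "edges": []})
    for r in records:
        identities[find(r["id"])]["records"].add(r["id"])
    for a_id, b_id, rule_name in edges:
        identities[find(a_id)]["edges"].append((a_id, b_id, rule_name))
    return list(identities.values())


identities = resolve(RECORDS)
for identity in identities:
    print(sorted(identity["records"]), identity["edges"])
```

In this toy data, the three "Anna Schmidt" records end up in one identity because a shared email or phone links them, even though their names differ. "Anna Schmid" stays a separate identity despite the near-identical name, because no rule connects her records to the others.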
You can try it out with our sample chatbot in the GitHub repo (linked above). It is free to sign up, and we charge based on the number of unified customer records (playing and testing are free). We would love to hear your comments and questions.
There is also a demo video in the repo and you can find more details about us here: https://tilores.io/