I’ve spent the past year and a half constructing a Retrieval Augmented Generation (RAG) chatbot for a Fortune 500 manufacturing company, integrating over 50 million records across a dozen databases. Despite that scale, the system can return relevant info in 10–30 seconds, and it’s now at 90% five-star user approval internally.
After tons of trial and error—embedding huge datasets, mixing vector + text search, handling concurrency, and dodging hallucinations, I decided to document it all in a book. It’ll be live on Manning.com’s Early Access soon (March 27th). If you’re tackling large-scale RAG or have questions about my approach (the struggles, the successes), feel free to ask. I’m happy to share lessons, config ideas, or gotchas so you can avoid the pitfalls I hit along the way.
reply