We just built a production-grade multilingual RAG system and wrote up the design and what we learned in “Multilingual RAG That Actually Works: A Practical Architecture.”
Many multilingual RAG setups break in familiar ways: embedding spaces that do not line up, chunks cut mid-sentence, and translation that mangles names or IDs. We tried a different path. We keep the reasoning in one base language, translate only at the edges, and guard important terms with glossaries.
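For anyone curious what "guard important terms with glossaries" can look like, here is a minimal sketch, not our production code. It assumes the glossary is a plain dict from source term to the rendering you want in the output, and that `translate` is whatever translation call you already have. The names `protect_terms`, `restore_terms`, `translate_guarded` and the `[[T0]]` token format are made up for illustration.

```python
import re
from typing import Callable, Dict, Tuple


def protect_terms(text: str, glossary: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Swap glossary terms for opaque tokens so translation cannot alter them."""
    mapping: Dict[str, str] = {}
    # Longest terms first, so a longer term wins over a shorter one it contains.
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"[[T{i}]]"  # hypothetical token format
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = glossary[term]
    return text, mapping


def restore_terms(text: str, mapping: Dict[str, str]) -> str:
    """Put the protected renderings back after translation."""
    for token, rendering in mapping.items():
        text = text.replace(token, rendering)
    return text


def translate_guarded(
    text: str, glossary: Dict[str, str], translate: Callable[[str], str]
) -> str:
    """Translate at the edge while keeping glossary terms intact."""
    masked, mapping = protect_terms(text, glossary)
    return restore_terms(translate(masked), mapping)
```

The sketch assumes the translator passes the tokens through unchanged. If yours rewrites them, a different token shape or a post-check for missing tokens would be needed.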
We outlined four common patterns for embedding chatbots, from simple bubbles to multi-tab hubs, and looked at where each approach fits best.
Would love to hear what patterns the HN community has seen work well (or fail) in real-world use.
Author here. We run chatbots with a smart router: classify noise/PII first, call retrieval+LLM only when needed, and use deterministic flows for actions. This keeps latency and cost down and makes behavior explainable. What would you add or change?
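To make the routing order concrete, here is a rough sketch under stated assumptions. The route names, the email regex, and the action phrases are all hypothetical stand-ins. A real router would likely use proper classifiers rather than keyword checks.

```python
import re
from enum import Enum


class Route(Enum):
    NOISE = "noise"          # ignore or send a canned reply
    PII = "pii"              # mask before anything else sees the text
    ACTION = "action"        # deterministic flow, no LLM
    RETRIEVAL = "retrieval"  # retrieval + LLM


# Illustrative patterns only; real PII and intent detection need far more coverage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
ACTION_PHRASES = ("cancel my order", "reset password", "book a demo")


def route(message: str) -> Route:
    """Cheap checks first; retrieval + LLM only as the fallback."""
    text = message.strip().lower()
    if len(text) < 3 or not any(ch.isalnum() for ch in text):
        return Route.NOISE
    if EMAIL_RE.search(text):
        return Route.PII
    if any(phrase in text for phrase in ACTION_PHRASES):
        return Route.ACTION
    return Route.RETRIEVAL
```

Because each branch is an explicit rule, every decision can be logged with the rule that fired, which is where the explainability comes from.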
Author here (HoverBot team). How we keep PII out of LLMs in our chatbots: detect fast -> replace with typed placeholders -> call the model -> unmask on our server. We also cover region pinning and vendor privacy controls. Keen to hear trade-offs, failure modes, and better designs.
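The detect -> replace -> call -> unmask loop can be sketched roughly like this. It is a simplified illustration, not our actual detector: the two regexes, the `<EMAIL_1>` placeholder format, and the `call_model` parameter are assumptions made for the example.

```python
import re
from typing import Callable, Dict, Tuple

# Illustrative detectors only; production detection needs many more types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each detected value with a typed placeholder like <EMAIL_1>."""
    vault: Dict[str, str] = {}    # placeholder -> original value, stays server-side
    counters: Dict[str, int] = {}
    for kind, pattern in PATTERNS.items():
        def repl(match: "re.Match[str]", kind: str = kind) -> str:
            counters[kind] = counters.get(kind, 0) + 1
            token = f"<{kind}_{counters[kind]}>"
            vault[token] = match.group(0)
            return token
        text = pattern.sub(repl, text)
    return text, vault


def unmask(text: str, vault: Dict[str, str]) -> str:
    """Restore original values in the model's reply."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text


def answer(text: str, call_model: Callable[[str], str]) -> str:
    """The model only ever sees the masked prompt."""
    masked, vault = mask(text)
    return unmask(call_model(masked), vault)
```

Typed placeholders let the model still reason about what kind of value is present ("send it to <EMAIL_1>") without seeing the value itself.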
Great practical takes - agentic chatbots only work with real data and feedback loops. HoverBot is built this way: it automates RAG pipelines, includes confidence thresholds, and supports human-in-the-loop override. You can run vertical assistants (support, sales, HR) from a single dashboard with real-world reliability.
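As a generic illustration of a confidence threshold with human hand-off (not HoverBot's actual API), the idea is roughly the following. The `Draft` type, the 0.7 default, and the hand-off message are invented for the example, and the confidence score is assumed to be in the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Draft:
    text: str
    confidence: float  # assumed to be a score in [0, 1]


def respond(draft: Draft, threshold: float = 0.7) -> Tuple[str, str]:
    """Send the bot's answer only when confidence clears the threshold."""
    if draft.confidence < threshold:
        # Below the bar: back off and route to a person.
        return ("human", "Let me connect you with a teammate.")
    return ("bot", draft.text)
```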
Tough experience, but it doesn’t have to be. With HoverBot, if the bot is unsure, it backs off and brings in a human right away. And with built-in conversation analytics, you can keep fine-tuning your flows so they keep improving.