Balagan Agent is an open-source experiment applying chaos-engineering ideas to LLM agents.
Agent development feels fast, but reliability is mostly assumed. Agents depend on prompts, tools, APIs, and implicit coordination. When something breaks, behavior degrades in subtle ways and we usually find out too late.
Balagan Agent intentionally injects controlled “chaos” into agent workflows to surface failure modes early:
- Tool failures, latency, partial responses
- Prompt drift and unexpected decisions
- Hidden assumptions in sequencing and coordination
The goal is not load testing, but understanding how fragile an agent really is and where guardrails are needed.
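The injection idea can be sketched as a wrapper around a tool call that randomly adds latency, raises failures, or truncates responses. This is a minimal illustrative sketch, not the project's actual API; all names (`ChaosToolWrapper`, the rate parameters) are hypothetical.

```python
import random
import time

class ChaosToolWrapper:
    """Hypothetical sketch: wrap an agent tool and inject controlled chaos."""

    def __init__(self, tool_fn, failure_rate=0.2, max_delay_s=2.0,
                 truncate_rate=0.1, seed=None):
        self.tool_fn = tool_fn
        self.failure_rate = failure_rate    # chance of a simulated outage
        self.max_delay_s = max_delay_s      # upper bound on injected latency
        self.truncate_rate = truncate_rate  # chance of a partial response
        self.rng = random.Random(seed)      # seeded for reproducible runs

    def __call__(self, *args, **kwargs):
        # Inject latency before the real call, like a slow upstream API.
        time.sleep(self.rng.uniform(0, self.max_delay_s))
        # Randomly fail outright, like a flaky tool or network.
        if self.rng.random() < self.failure_rate:
            raise TimeoutError("chaos: simulated tool failure")
        result = self.tool_fn(*args, **kwargs)
        # Occasionally return only part of the response.
        if isinstance(result, str) and self.rng.random() < self.truncate_rate:
            return result[: len(result) // 2]
        return result
```

Running an agent against tools wrapped this way surfaces whether it retries, hallucinates around missing data, or silently degrades.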
This started as a side project to explore whether chaos-style testing makes sense for agents, similar in spirit to what Chaos Monkey did for distributed systems.