AgentBreak
Chaos proxy for testing how your agents handle failures.
Your agent works great — until the LLM times out, returns garbage, or an MCP tool fails. AgentBreak lets you test for that before production.
It sits between your agent and the real API, injecting faults like latency spikes, HTTP errors, and malformed responses so you can see how your agent actually handles failure.
Agent --> AgentBreak (localhost:5005) --> Real LLM / MCP server
               ^
               injects faults based on your scenarios
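The flow in the diagram can be pictured as a wrapper around the upstream call. The sketch below is only a conceptual illustration and not AgentBreak's implementation: `make_chaos_proxy`, `forward`, and `fault_rate` are hypothetical names, and the injected 503 response is an assumed example of a fault.

```python
import random


def make_chaos_proxy(forward, fault_rate, rng=None):
    """Wrap `forward` so that a fraction of calls get an injected fault.

    `forward` stands in for the real LLM or MCP call. All names here are
    hypothetical and chosen for illustration only.
    """
    rng = rng or random.Random()

    def proxy(request):
        if rng.random() < fault_rate:
            # Assumption for this sketch: an injected fault short-circuits
            # the call, so the upstream is never contacted.
            return {"status": 503, "body": '{"error": "injected fault"}'}
        # No fault chosen: pass the request through unchanged.
        return forward(request)

    return proxy
```

With `fault_rate=0.0` every request passes through untouched, and with `fault_rate=1.0` every request receives the injected fault, which makes both extremes easy to check in a unit test.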
Install
30-second demo
agentbreak init # creates .agentbreak/ with default configs
agentbreak serve # start the chaos proxy on port 5005
Point your agent at http://localhost:5005:
# OpenAI
export OPENAI_BASE_URL=http://localhost:5005/v1
# Anthropic
export ANTHROPIC_BASE_URL=http://localhost:5005
Run your agent, then check how it did:
No code changes needed — just swap the base URL.
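Swapping the base URL works without code changes when the client resolves its endpoint from the environment. A minimal standard-library sketch of that pattern follows; `resolve_base_url` and `chat_completions_url` are hypothetical helpers, and the fallback shown is the public OpenAI endpoint. The same idea applies to `ANTHROPIC_BASE_URL`.

```python
import os

# Used when no override is set in the environment.
DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(env=None):
    """Return the base URL, preferring OPENAI_BASE_URL when it is set."""
    env = os.environ if env is None else env
    # An empty value is treated the same as an unset one.
    base = env.get("OPENAI_BASE_URL") or DEFAULT_OPENAI_BASE_URL
    return base.rstrip("/")


def chat_completions_url(env=None):
    """Build the chat completions endpoint from the resolved base URL."""
    return resolve_base_url(env) + "/chat/completions"
```

Passing a plain dict as `env` makes the lookup easy to test without touching the real process environment.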
What can it do?
- Simulate failures: HTTP errors, latency spikes, timeouts, malformed JSON, schema violations
- Target specific things: scope faults to a model (gpt-4o) or MCP tool (search_docs)
- Control timing: faults on every request, randomly, or in periodic bursts
- Score resilience: get a 0-100 score with pass/degraded/fail outcome
- Track over time: compare runs to see if your agent is getting more resilient
- Test MCP: proxy and fault-inject MCP tool calls, resource reads, and prompt gets
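On the agent side, the failure kinds listed above tend to reduce to a few handling branches: retry on transient HTTP errors, timeouts, and unparseable bodies, and give up immediately on other client errors. The sketch below shows one way to structure that. It is not part of AgentBreak; `call_with_retries`, `GiveUp`, and the `send` callable are hypothetical names, and the set of retryable status codes is an assumption to tune for your provider.

```python
import json
import time

# Assumed set of transient statuses worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class GiveUp(Exception):
    """Raised when the call cannot succeed or all attempts are used up."""


def call_with_retries(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it yields parseable JSON or attempts run out.

    `send` is a hypothetical callable returning `(status, body_text)`;
    it may raise TimeoutError.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            status, body = send()
            if status in RETRYABLE_STATUS:
                last_error = f"HTTP {status}"
            elif status >= 400:
                # Other 4xx/5xx responses will not improve on retry.
                raise GiveUp(f"non-retryable HTTP {status}")
            else:
                # Malformed JSON raises JSONDecodeError, handled below.
                return json.loads(body)
        except TimeoutError:
            last_error = "timeout"
        except json.JSONDecodeError:
            last_error = "malformed JSON"
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise GiveUp(f"failed after {max_attempts} attempts: {last_error}")
```

Injecting `sleep` as a parameter keeps the backoff testable: a test can pass a list's `append` method to record the delays instead of waiting.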
Claude Code
If you use Claude Code, AgentBreak has a plugin:
Then use /agentbreak:init, /agentbreak:create-tests, and /agentbreak:run-tests — Claude walks you through codebase analysis, scenario generation, and resilience reporting.
What's coming
- Security scenarios (prompt injection, adversarial inputs)
- MCP server chaos (tool call validation, poisoned responses)
- Pattern-based and skill-based attacks
- Simulation of deprecated libraries and deprecated models
See the full roadmap.
Next steps
- Quickstart — full walkthrough
- Scenarios reference — all fault kinds, schedules, and match filters
- Testing methodology — how to design effective chaos tests
- CI/CD integration — run chaos tests in GitHub Actions, GitLab CI, etc.