Kalibr is an autonomous routing system that learns which agent execution paths actually succeed in production and routes traffic there in real time.
Most teams hardcode model choices based on benchmarks. Benchmarks don't reflect your data, your prompts, or your definition of success.
Kalibr runs continuous experiments in production.
No manual A/B tests. No spreadsheet tracking. No "we should try Claude for this."
A path isn't just a model - it's a complete execution configuration:

```
model + tools + parameters = path
```
Examples:

- `gpt-4o` + `calendar_api`
- `gpt-4o` + `google_calendar`
- `claude-sonnet-4-20250514` + `calendar_api`

Kalibr learns which full configuration works best for each goal.
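To make the idea concrete, here is a minimal sketch (not the SDK's internals) of why the full configuration matters: if outcomes are keyed by the whole path rather than the model alone, two configurations that share a model still accumulate separate success statistics. The `path_key`, `record`, and `stats` names are hypothetical, for illustration only.

```python
# Illustrative sketch only - not Kalibr's implementation.
# A "path" is the full execution configuration, so success statistics
# are keyed by (model, tools, parameters), not by model alone.
from collections import defaultdict

def path_key(path):
    """Normalize a path dict into a hashable key."""
    return (
        path["model"],
        tuple(sorted(path.get("tools", []))),
        tuple(sorted(path.get("params", {}).items())),
    )

stats = defaultdict(lambda: {"successes": 0, "trials": 0})

def record(path, success):
    """Attribute an outcome to one specific configuration."""
    key = path_key(path)
    stats[key]["trials"] += 1
    stats[key]["successes"] += int(success)

# Same model, different tools -> two distinct paths with separate stats.
record({"model": "gpt-4o", "tools": ["calendar_api"]}, True)
record({"model": "gpt-4o", "tools": ["google_calendar"]}, False)
```

This is why `gpt-4o` with `calendar_api` and `gpt-4o` with `google_calendar` are treated as different paths: one can succeed for a goal while the other fails.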
```python
from kalibr import Router

router = Router(
    goal="book_meeting",
    paths=[
        {"model": "gpt-4o", "tools": ["calendar_api"]},
        {"model": "gpt-4o", "tools": ["google_calendar"]},
        {"model": "claude-sonnet-4-20250514", "tools": ["calendar_api"]},
    ],
)

response = router.completion(messages=[...])
router.report(success=True)
```
```typescript
import { Router } from '@kalibr/sdk';

const router = new Router({
  goal: 'book_meeting',
  paths: [
    { model: 'gpt-4o', tools: ['calendar_api'] },
    { model: 'gpt-4o', tools: ['google_calendar'] },
    { model: 'claude-sonnet-4-20250514', tools: ['calendar_api'] },
  ],
});

const response = await router.completion(messages);
await router.report(true);
```
Kalibr picks the path, makes the call, and learns from the outcome.
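The pick-call-learn loop above can be sketched as a simple epsilon-greedy bandit. This is an illustrative sketch, not Kalibr's actual algorithm (the docs describe statistical methods, exploration, and a trust invariant); `TinyRouter`, its parameters, and the simulated success rates are all hypothetical.

```python
# Illustrative epsilon-greedy sketch of the pick -> call -> learn loop.
# Not Kalibr's real routing algorithm; all names here are hypothetical.
import random

class TinyRouter:
    def __init__(self, paths, epsilon=0.1, seed=None):
        self.paths = paths
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.stats = [{"successes": 0, "trials": 0} for _ in paths]
        self._last = None  # index of the most recently picked path

    def pick(self):
        """Explore with probability epsilon, else exploit the best success rate."""
        if self.rng.random() < self.epsilon or all(s["trials"] == 0 for s in self.stats):
            i = self.rng.randrange(len(self.paths))
        else:
            i = max(
                range(len(self.paths)),
                key=lambda j: self.stats[j]["successes"] / max(self.stats[j]["trials"], 1),
            )
        self._last = i
        return self.paths[i]

    def report(self, success):
        """Learn from the outcome of the most recently picked path."""
        s = self.stats[self._last]
        s["trials"] += 1
        s["successes"] += int(success)

router = TinyRouter(
    paths=[{"model": "gpt-4o"}, {"model": "claude-sonnet-4-20250514"}],
    seed=0,
)

# Simulate outcomes: path 0 succeeds 10% of the time, path 1 succeeds 90%.
rates = [0.1, 0.9]
sim = random.Random(1)
for _ in range(500):
    router.pick()
    router.report(sim.random() < rates[router._last])
```

After a few hundred simulated requests, traffic concentrates on the path with the higher observed success rate while a small exploration budget keeps re-testing the alternative.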
- Get Started → Get Kalibr working in 5 minutes.
- Goals, paths, outcomes, and how routing works.
- Statistical methods, exploration, and the trust invariant.
- `Router`, `completion()`, `report()`, `get_policy()`.
- Graceful degradation, trend monitoring, debugging.
- Common questions.
- Proof that Kalibr routes around failures automatically.
- Use Kalibr with CrewAI, LangChain, and OpenAI Agents.
- Common errors and how to fix them.