Basalt is the end-to-end platform to evaluate, monitor and improve your agents.
A/B test your entire agentic chain directly from your code, then evaluate and compare the results.
Use AI evaluators to look for errors. Create your own or browse through our 50+ templates.
Run your prompts on multiple test cases and automatically improve them with our AI co-pilot.
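As a rough illustration of what A/B testing over multiple test cases involves, here is a minimal sketch in plain Python. It does not use the Basalt SDK: `variant_a`, `variant_b`, `exact_match`, and `TEST_CASES` are hypothetical stand-ins for two versions of an agent chain, an evaluator, and a test set.

```python
from statistics import mean
from typing import Callable

# Hypothetical stand-ins for two versions of an agent chain.
# In practice each would call a model; here they are simple text transforms.
def variant_a(question: str) -> str:
    return question.strip().lower()

def variant_b(question: str) -> str:
    return question.strip().lower().rstrip("?")

# Hypothetical evaluator: 1.0 when the output matches the expected answer.
def exact_match(output: str, expected: str) -> float:
    return 1.0 if output == expected else 0.0

# Hypothetical test cases as (input, expected output) pairs.
TEST_CASES = [
    ("  Reset password? ", "reset password"),
    ("Cancel order", "cancel order"),
    ("Refund status?", "refund status"),
]

def score_variant(variant: Callable[[str], str], cases, evaluator) -> float:
    """Run one variant on every test case and average the evaluator scores."""
    return mean(evaluator(variant(inp), expected) for inp, expected in cases)

def compare(variants: dict, cases, evaluator) -> dict:
    """Score every variant on the same test cases so results are comparable."""
    return {name: score_variant(fn, cases, evaluator) for name, fn in variants.items()}

results = compare({"A": variant_a, "B": variant_b}, TEST_CASES, exact_match)
best = max(results, key=results.get)
print(results, "best:", best)
```

Scoring both variants on the same test cases with the same evaluator is what makes the comparison meaningful; swapping in a different evaluator only requires passing another function.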
Trace your agents and monitor usage directly in production.
Evaluate your workflow at each step to ensure agent quality.
Set criteria and get alerts for production errors.
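One way to picture an alert criterion is an error-rate threshold over the most recent calls. The sketch below is an assumption for illustration only, not Basalt's alerting mechanism: `ErrorRateAlert`, `threshold`, and `window` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ErrorRateAlert:
    """Hypothetical criterion: alert when too many recent calls have failed."""
    threshold: float  # fraction of failed calls above which the alert fires
    window: int       # number of most recent calls to consider
    _outcomes: deque = field(init=False)

    def __post_init__(self) -> None:
        # A bounded deque drops the oldest outcome once the window is full.
        self._outcomes = deque(maxlen=self.window)

    def record(self, ok: bool) -> bool:
        """Record one call outcome; return True if the alert should fire."""
        self._outcomes.append(ok)
        if len(self._outcomes) < self.window:
            return False  # not enough data yet to judge the error rate
        failures = self._outcomes.count(False)
        return failures / self.window > self.threshold

alert = ErrorRateAlert(threshold=0.5, window=4)
fired = [alert.record(ok) for ok in (True, False, False, False)]
print(fired)
```

Waiting until the window is full avoids firing on the very first failure; a production criterion would likely also consider time ranges and severity.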
Co-pilot
Multi-model
SDK
Versioning
Co-pilot guidance
Co-pilot improvements