AI Agent Evals

We've launched Evals, a testing environment that lets you validate how your AI agents handle different scenarios before you deploy them to real customers.
With Evals, you can now:
- Create evaluation suites: Organize your tests into comprehensive suites that cover all critical aspects of your AI agent's performance.
- Test individual responses: Use "Next Reply" evaluations to verify that your agent's next reply in a specific conversation context meets the success criteria you define.
- Validate tool usage: Set up "Tool Invocation" tests to ensure your agent calls the right tools at the right times during customer interactions.
- Simulate complete conversations: Run full "Conversation" evaluations that follow a defined script to test how your agent handles multi-turn interactions from start to finish.
Each evaluation type helps you catch issues early and ensure your AI agents deliver consistent, accurate responses across every customer interaction.
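To make the three evaluation types concrete, here is a minimal conceptual sketch in Python. It is not the product's API: Evals is configured in the app, and every name below (`AgentTurn`, `toy_agent`, `next_reply_eval`, `tool_invocation_eval`, `conversation_eval`, the `issue_refund` tool) is a hypothetical stand-in chosen only to illustrate what each kind of check verifies.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AgentTurn:
    """Hypothetical agent output: reply text plus the names of tools it called."""
    reply: str
    tools_called: List[str] = field(default_factory=list)


# Hypothetical agent interface: user messages so far -> the agent's next turn.
Agent = Callable[[List[str]], AgentTurn]
Criterion = Callable[[str], bool]


def next_reply_eval(agent: Agent, history: List[str], criterion: Criterion) -> bool:
    """'Next Reply' style check: does the next reply meet the success criterion?"""
    return criterion(agent(history).reply)


def tool_invocation_eval(agent: Agent, history: List[str], expected_tool: str) -> bool:
    """'Tool Invocation' style check: did the agent call the expected tool?"""
    return expected_tool in agent(history).tools_called


def conversation_eval(agent: Agent, script: List[Tuple[str, Criterion]]) -> bool:
    """'Conversation' style check: follow a script and test every turn in order."""
    history: List[str] = []
    for user_message, criterion in script:
        history.append(user_message)
        if not criterion(agent(history).reply):
            return False
    return True


def toy_agent(history: List[str]) -> AgentTurn:
    """Stand-in agent used only for this illustration."""
    if "refund" in history[-1].lower():
        return AgentTurn("I've started your refund.", ["issue_refund"])
    return AgentTurn("How can I help you today?")


# A "suite" here is simply a named collection of evaluation results.
suite = {
    "next_reply": next_reply_eval(
        toy_agent, ["Hi"], lambda reply: "help" in reply.lower()
    ),
    "tool_invocation": tool_invocation_eval(
        toy_agent, ["I want a refund"], "issue_refund"
    ),
    "conversation": conversation_eval(
        toy_agent,
        [
            ("Hi", lambda reply: "help" in reply.lower()),
            ("I want a refund", lambda reply: "refund" in reply.lower()),
        ],
    ),
}
```

In this sketch each evaluation reduces to a pass/fail result, and the suite groups those results by name. The real product may score or report evaluations differently.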
To start testing your agents, go to Evaluation Suites in the left navigation bar, create a new suite, then add your first evaluation to begin simulating real-world scenarios.

