Defining Success Criteria

Success looks different for different tasks. A research agent needs accuracy and completeness. A customer service agent needs speed and satisfaction. A code generation agent needs correctness and security. Start by understanding your task type, then define metrics that matter for that specific context.

Interactive: Task-Specific Metrics Explorer

Recommended success metrics depend on the task type. For example:

Research Agent

Gathers and synthesizes information from multiple sources

Primary Metrics (Must Track):

✓ Factual accuracy
✓ Source credibility
✓ Completeness

Secondary Metrics (Nice to Have):

• Time to completion
• Citation quality
• Readability
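
The task-to-metrics recommendation can be thought of as a simple lookup table. The sketch below is one possible way to encode it in Python. The names `MetricSet`, `TASK_METRICS`, and `recommended_metrics` are hypothetical, and only the research agent entry is taken in full from the lists above. The other two entries carry just the two qualities named in the introduction and would need their own secondary metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSet:
    """Primary (must track) and secondary (nice to have) metrics for one task type."""
    primary: tuple[str, ...]
    secondary: tuple[str, ...] = ()


# Hypothetical lookup table. Keys are illustrative identifiers, not an official taxonomy.
TASK_METRICS = {
    "research_agent": MetricSet(
        primary=("factual accuracy", "source credibility", "completeness"),
        secondary=("time to completion", "citation quality", "readability"),
    ),
    # Only the qualities mentioned in the introduction; secondary metrics left empty.
    "customer_service_agent": MetricSet(primary=("speed", "satisfaction")),
    "code_generation_agent": MetricSet(primary=("correctness", "security")),
}


def recommended_metrics(task_type: str) -> MetricSet:
    """Return the metric set for a known task type, or raise ValueError."""
    try:
        return TASK_METRICS[task_type]
    except KeyError:
        raise ValueError(f"unknown task type: {task_type!r}") from None
```

Keeping primary and secondary metrics in separate fields makes it harder to treat a nice-to-have measurement as a pass/fail requirement by accident.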

Interactive: Custom Success Criteria Builder

Define your own success metrics with target thresholds. Each criterion has three fields:

• Metric Name
• Target Threshold
• Priority
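
A criterion with a metric name, a target threshold, and a priority maps naturally onto a small record type. The following is a minimal sketch under several assumptions: the class and function names are hypothetical, priority is modeled as primary versus secondary to mirror the "Must Track" and "Nice to Have" split, a `higher_is_better` flag is added so that metrics such as time can be checked against an upper bound, and the threshold values in the usage lines are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    PRIMARY = "primary"      # must track
    SECONDARY = "secondary"  # nice to have


@dataclass(frozen=True)
class Criterion:
    name: str
    threshold: float
    priority: Priority
    higher_is_better: bool = True  # assumption: False for metrics like latency

    def passes(self, observed: float) -> bool:
        """Compare an observed value against the target threshold."""
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


def evaluate(criteria, observed):
    """Return {metric name: passed}. A missing measurement counts as a failure."""
    results = {}
    for c in criteria:
        value = observed.get(c.name)
        results[c.name] = value is not None and c.passes(value)
    return results


def task_succeeded(criteria, observed):
    """A run succeeds when every PRIMARY criterion passes."""
    results = evaluate(criteria, observed)
    return all(results[c.name] for c in criteria if c.priority is Priority.PRIMARY)


# Illustrative values only; pick thresholds that fit your own task.
criteria = [
    Criterion("factual_accuracy", 0.90, Priority.PRIMARY),
    Criterion("time_to_completion_s", 120, Priority.SECONDARY, higher_is_better=False),
]
observed = {"factual_accuracy": 0.95, "time_to_completion_s": 200}
results = evaluate(criteria, observed)
succeeded = task_succeeded(criteria, observed)
```

In this sketch a run still counts as successful when a secondary criterion misses its target, which is one reasonable reading of "nice to have". A stricter policy could require secondary metrics to pass as well.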

💡 Start with 3-5 Core Metrics

Don't try to measure everything at once. Start with 3-5 critical metrics that directly reflect whether your agent is doing its job. You can always add more metrics later, but starting focused keeps evaluation manageable and actionable.
