Error Handling in Tools

Build resilient AI agents through robust error handling and graceful degradation

Recovery Strategies: Bouncing Back

Once you've categorized an error, you need a recovery strategy. The right strategy depends on the error type: retry transient failures, use fallbacks for service outages, and fail fast on permanent errors.
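As a rough illustration of matching strategy to error type, the sketch below maps an exception to one of three strategies. The `Strategy` enum, the `ServiceUnavailable` and `ValidationFailed` classes, and `choose_strategy` are hypothetical names invented for this example, and treating unknown errors as fail-fast is an assumption, not a rule from this guide.

```python
from enum import Enum


class Strategy(Enum):
    RETRY = "retry with backoff"
    FALLBACK = "fallback"
    FAIL_FAST = "fail fast"


# Hypothetical error categories; a real tool would map its own exceptions.
class ServiceUnavailable(Exception):
    """The upstream service is down or degraded."""


class ValidationFailed(Exception):
    """The request itself is wrong; retrying will not help."""


def choose_strategy(error: Exception) -> Strategy:
    """Pick a recovery strategy from the error's category."""
    if isinstance(error, (TimeoutError, ConnectionError)):
        return Strategy.RETRY        # transient: may succeed on a later try
    if isinstance(error, ServiceUnavailable):
        return Strategy.FALLBACK     # outage: switch to an alternative
    # Assumption: anything else is treated as permanent and reported at once.
    return Strategy.FAIL_FAST
```

For instance, `choose_strategy(TimeoutError())` yields `Strategy.RETRY`, while a `ValidationFailed` error falls through to `Strategy.FAIL_FAST`.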

Recovery Strategy Comparison

🔄
Retry with Backoff
Exponentially increase wait time between retries
When to use:
Transient errors (network, timeout)
Example:
1s → 2s → 4s → 8s
✓ Handles temporary issues automatically
⚠ Can delay results, may waste resources
🔀
Fallback/Circuit Breaker
Switch to alternative tool or cached data
When to use:
Service degradation, repeated failures
Example:
Primary API down → Use secondary API
✓ Maintains functionality during outages
⚠ Requires alternative data sources
⬇️
Graceful Degradation
Return partial results, skip failed components
When to use:
Non-critical feature failures
Example:
Image generation fails → Return text-only
✓ User gets some value vs total failure
⚠ Reduced functionality may confuse users
⚡
Fail Fast
Return error immediately without retry
When to use:
Permanent errors, validation failures
Example:
404 Not Found → Return error message
✓ Quick feedback, no wasted resources
⚠ No recovery for transient issues

Retry Best Practices

Set Maximum Retries
Prevent infinite retry loops
Implementation:
Max 3-5 attempts before giving up
Use Exponential Backoff
Give systems time to recover
Implementation:
1s, 2s, 4s, 8s between attempts
Add Jitter
Prevent thundering herd
Implementation:
delay = baseDelay * 2^n + random(0, 1000ms), where n is the retry number starting at 0
Log All Attempts
Essential for debugging patterns
Implementation:
Log: attempt, delay, error, outcome
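The four practices above can be combined in one helper. This is a minimal sketch, not a definitive implementation: `retry_with_backoff`, its parameter names, the `TRANSIENT` tuple, and the logger name are all assumptions made for the example. The `sleep` parameter exists only so the wait can be replaced in tests.

```python
import logging
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("tool_retry")

# Assumption: these exception types are the "transient" ones worth retrying.
TRANSIENT = (TimeoutError, ConnectionError)


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_jitter: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            result = operation()
        except TRANSIENT as error:
            if attempt == max_attempts - 1:
                # Maximum retries reached: log the outcome and re-raise.
                logger.error("attempt=%d error=%r outcome=gave_up",
                             attempt + 1, error)
                raise
            # delay = baseDelay * 2^n + random jitter, with n starting at 0
            delay = base_delay * 2 ** attempt + random.uniform(0.0, max_jitter)
            logger.warning("attempt=%d delay=%.2fs error=%r outcome=retrying",
                           attempt + 1, delay, error)
            sleep(delay)
        else:
            logger.info("attempt=%d outcome=success", attempt + 1)
            return result
    raise ValueError("max_attempts must be at least 1")
```

Errors outside `TRANSIENT` are not caught, so they propagate on the first attempt, which matches the fail-fast strategy for permanent errors. With the defaults, the waits before jitter are 1s, 2s, and 4s.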