Error Handling in Tools

Build resilient AI agents through robust error handling and graceful degradation

Implementation Patterns: Code That Works

Now let's see how to implement these strategies in code. The patterns below, retry with exponential backoff and level-appropriate logging, are common in production systems that call external tools at high volume.

Basic Retry with Exponential Backoff

Standard retry pattern with increasing delays

import asyncio
import logging
import random

logger = logging.getLogger(__name__)

# execute_tool, TransientError and PermanentError are assumed to be
# defined elsewhere in your agent code.

async def call_tool_with_retry(
    tool_name: str,
    params: dict,
    max_retries: int = 3
) -> dict:
    """Execute a tool, retrying transient failures with exponential backoff.

    max_retries is the total number of attempts, including the first.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    base_delay = 1.0  # Delay before the first retry, in seconds

    for attempt in range(max_retries):
        try:
            result = await execute_tool(tool_name, params)
            logger.info(f"✓ Tool '{tool_name}' succeeded on attempt {attempt + 1}")
            return result

        except TransientError as e:
            if attempt == max_retries - 1:
                logger.error(f"✗ All retries exhausted for '{tool_name}'")
                raise

            # Exponential backoff (1s, 2s, 4s, ...) plus up to 10% jitter
            delay = base_delay * (2 ** attempt)
            jitter = random.uniform(0, 0.1 * delay)
            wait_time = delay + jitter

            logger.warning(
                f"⚠ Attempt {attempt + 1} failed: {e}. "
                f"Retrying in {wait_time:.2f}s..."
            )
            await asyncio.sleep(wait_time)

        except PermanentError as e:
            # Don't retry permanent errors
            logger.error(f"✗ Permanent error: {e}")
            raise
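
To make the delay schedule easier to see, here is a minimal sketch that isolates the backoff calculation from the retry loop. The function name backoff_delay and the max_delay cap are assumptions added for illustration; the pattern above has no cap, so its delays keep doubling.

```python
import random


def backoff_delay(attempt: int, base_delay: float = 1.0,
                  jitter_fraction: float = 0.1,
                  max_delay: float = 30.0) -> float:
    """Return the wait in seconds before retrying after `attempt` (0-based).

    max_delay is an assumed cap and is not part of the pattern above.
    """
    # Double the delay on each attempt, but never exceed the cap.
    delay = min(base_delay * (2 ** attempt), max_delay)
    # Add up to jitter_fraction of the delay so that many clients
    # failing at the same moment do not all retry in lockstep.
    return delay + random.uniform(0, jitter_fraction * delay)


if __name__ == "__main__":
    for attempt in range(5):
        print(f"attempt {attempt + 1}: wait {backoff_delay(attempt):.2f}s")
```

With the defaults, the waits are roughly 1s, 2s, 4s, 8s and 16s, each stretched by at most 10%.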

Logging Best Practices

Proper logging is essential for debugging errors in production

ERROR: Unrecoverable failures, exceptions

logger.error("Tool execution failed after all retries")

WARNING: Recoverable issues, fallback used

logger.warning("Using fallback after primary tool failed")

INFO: Normal operations, successful retries

logger.info("Tool succeeded on retry attempt 2")

DEBUG: Detailed trace information

logger.debug("Retry backoff delay: 2.5s")
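
The level guidance above can be combined with contextual fields in a single helper. The sketch below is one possible shape, not a prescribed API: the helper name log_tool_failure, the logger name "agent.tools", and the field names passed through extra are all hypothetical. It logs at WARNING while retries remain and at ERROR on the final attempt.

```python
import logging

logger = logging.getLogger("agent.tools")  # hypothetical logger name


def log_tool_failure(tool_name: str, params: dict, attempt: int,
                     max_retries: int, error: Exception) -> None:
    """Log one failed attempt (1-based) with the context needed to debug it."""
    # The last attempt is unrecoverable, so it is logged as ERROR;
    # earlier attempts will be retried, so they are only WARNINGs.
    is_final = attempt >= max_retries
    level = logging.ERROR if is_final else logging.WARNING
    logger.log(
        level,
        "Tool '%s' failed on attempt %d/%d: %s",
        tool_name, attempt, max_retries, error,
        # `extra` attaches these fields to the LogRecord so that a
        # structured formatter or handler can emit them.
        extra={"tool_name": tool_name, "tool_params": params, "attempt": attempt},
    )
```

If params can contain secrets or personal data, redact them before passing them to the logger.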

Production Readiness Checklist

✓ Implement retry logic with exponential backoff

✓ Add circuit breakers for external services

✓ Define fallback strategies for critical tools

✓ Log all errors with context (tool name, params, attempt number)

✓ Set up monitoring and alerts for error rates

✓ Test error scenarios in staging environment

✓ Document error handling behavior for your team
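
The checklist mentions circuit breakers and fallbacks without showing code, so here is a minimal synchronous sketch of how the two might fit together. Every name in it (CircuitBreaker, CircuitOpenError, call_with_fallback, failure_threshold, reset_timeout) is an assumption made for illustration, and a production version would need thread safety and an async variant to match the retry pattern above.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3,
                 reset_timeout: float = 30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so tests can control time
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: allow one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open (or re-open) when the trial call fails or the
            # consecutive-failure threshold is reached.
            if (self._opened_at is not None
                    or self._failures >= self.failure_threshold):
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def call_with_fallback(breaker, primary, fallback, *args):
    """Try primary through the breaker; use fallback if it fails or is open."""
    try:
        return breaker.call(primary, *args)
    except Exception:
        return fallback(*args)
```

While the circuit is open, the failing service is not called at all, which gives it time to recover and keeps the agent responsive by going straight to the fallback.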
