Building Your Own Framework
Master designing and building custom agentic AI frameworks from scratch
Production-Ready Patterns
Taking your custom framework to production requires robust observability, error handling, and deployment strategies. Here are essential patterns.
Structured Logging
Production Logger
```python
import json
import logging
from datetime import datetime, timezone
from typing import Any, Dict


class JsonFormatter(logging.Formatter):
    """Format log records as structured JSON lines."""

    def __init__(self, agent_id: str):
        super().__init__()
        self.agent_id = agent_id

    def format(self, record: logging.LogRecord) -> str:
        log_data = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": self.agent_id,
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(log_data)


class AgentLogger:
    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.logger = logging.getLogger(f"agent.{agent_id}")
        self.logger.setLevel(logging.INFO)
        # JSON formatter for structured logs
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(agent_id))
        self.logger.addHandler(handler)

    def log_action(self, action: str, tool: str, result: Any):
        """Log agent actions with context"""
        self.logger.info(
            "Agent action",
            extra={
                "action": action,
                "tool": tool,
                "result_type": type(result).__name__,
                "success": not isinstance(result, dict) or "error" not in result,
            },
        )

    def log_error(self, error: Exception, context: Dict):
        """Log errors with full context, including the traceback"""
        self.logger.error(
            f"Error: {error}",
            extra={"error_type": type(error).__name__, "context": context},
            exc_info=True,
        )
```

📊 Monitoring & Metrics
- Latency: track time per agent loop and per tool execution
- Token usage: monitor LLM costs per request
- Error rate: track tool failures and LLM errors
- Loop iterations: detect infinite loops early
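The metrics above can be collected with a minimal in-process recorder. This is a sketch; `MetricsRecorder` and its method names are illustrative, not part of any specific library:

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class MetricsRecorder:
    """Minimal in-process metrics: counters plus latency samples."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.latencies = defaultdict(list)

    def incr(self, name: str, amount: int = 1):
        self.counters[name] += amount

    @contextmanager
    def timed(self, name: str):
        """Record elapsed seconds of the enclosed block under `name`."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.latencies[name].append(time.perf_counter() - start)


metrics = MetricsRecorder()
metrics.incr("tokens_used", 1200)   # token usage per request
metrics.incr("loop_iterations")     # catch runaway loops
with metrics.timed("tool.search"):  # latency per tool execution
    pass  # tool call would go here
```

In production you would export these counters to a real backend (Prometheus, StatsD), but the same interface works for both.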
🛡️ Error Handling
- Retry with backoff: exponential retry for transient errors
- Circuit breakers: stop calling failing services
- Fallback responses: default behavior when tools fail
- Graceful degradation: partial results are better than none
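Retry with exponential backoff plus a fallback can be sketched with a small helper (illustrative, not part of the framework above):

```python
import random
import time


def retry_with_backoff(fn, max_attempts=4, base_delay=0.5, fallback=None):
    """Retry `fn` on exception with exponential backoff plus jitter.

    Returns `fallback` if every attempt fails (graceful degradation).
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                return fallback
            # 0.5s, 1s, 2s, ... plus jitter to avoid thundering herds
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)


# Example: a tool call that fails twice, then succeeds
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("transient")
    return "ok"

result = retry_with_backoff(flaky, base_delay=0.01)  # → "ok" on the third attempt
```

A circuit breaker adds one more layer: after N consecutive failures it returns the fallback immediately instead of retrying, giving the failing service time to recover.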
Rate Limiting & Cost Control
Token Budget Management
```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict


class CostController:
    def __init__(self, daily_token_limit: int = 1_000_000):
        self.daily_token_limit = daily_token_limit
        self.usage: Dict[str, int] = defaultdict(int)
        self.last_reset = datetime.now()

    def check_budget(self, estimated_tokens: int) -> bool:
        """Check whether a request fits within today's budget"""
        self._prune_old_days()
        today = datetime.now().strftime("%Y-%m-%d")
        return self.usage[today] + estimated_tokens <= self.daily_token_limit

    def record_usage(self, tokens: int):
        """Record token usage"""
        today = datetime.now().strftime("%Y-%m-%d")
        self.usage[today] += tokens

    def _prune_old_days(self):
        """Keep only the last 7 days of usage"""
        now = datetime.now()
        if (now - self.last_reset).days >= 1:
            cutoff = (now - timedelta(days=7)).strftime("%Y-%m-%d")
            # Rebuild as a defaultdict so missing days still read as 0
            self.usage = defaultdict(
                int, {k: v for k, v in self.usage.items() if k >= cutoff}
            )
            self.last_reset = now

    def get_usage_stats(self) -> Dict:
        """Get current usage statistics"""
        today = datetime.now().strftime("%Y-%m-%d")
        used = self.usage[today]
        return {
            "today_usage": used,
            "limit": self.daily_token_limit,
            "remaining": self.daily_token_limit - used,
            "percent_used": (used / self.daily_token_limit) * 100,
        }
```

Deployment Strategies
🐳 Docker
Benefits:
- Consistent environments
- Easy scaling
- Portable deployment
Use for: microservices, cloud platforms
☁️ Serverless
Benefits:
- Auto-scaling
- Pay per use
- No server management
Use for: burst workloads, low traffic
🖥️ VMs/K8s
Benefits:
- Full control
- Complex orchestration
- High availability
Use for: production systems, large scale
Configuration Management
Environment-Based Config
```python
# Pydantic v1 import; on Pydantic v2, use `from pydantic_settings import BaseSettings`
from pydantic import BaseSettings


class AgentConfig(BaseSettings):
    # LLM configuration
    model: str = "gpt-4"
    max_tokens: int = 2000
    temperature: float = 0.7

    # Agent configuration
    max_iterations: int = 10
    timeout_seconds: int = 30

    # Cost controls
    daily_token_limit: int = 1_000_000
    max_cost_per_request: float = 1.0

    # Observability
    log_level: str = "INFO"
    enable_tracing: bool = True

    # API keys (loaded from the environment)
    openai_api_key: str

    class Config:
        env_file = ".env"
        env_file_encoding = "utf-8"


# Load configuration
config = AgentConfig()

# Use in agent
agent = Agent(
    model=config.model,
    max_iterations=config.max_iterations,
    timeout=config.timeout_seconds,
)
```

🚀 Production Checklist
Must Have:
- ✓ Structured logging with trace IDs
- ✓ Error handling and retries
- ✓ Token/cost budgets
- ✓ Health check endpoints
Should Have:
- + Distributed tracing (OpenTelemetry)
- + Metrics dashboard (Grafana)
- + Alerts on error rates and costs
- + Load testing and benchmarks
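A health check endpoint from the must-have list can be as small as a stdlib HTTP handler. This is a sketch; the `/healthz` route and the payload shape are illustrative conventions, not requirements:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_payload(agent_ok: bool = True) -> dict:
    """Payload served at /healthz; extend with real dependency checks
    (LLM reachability, tool availability, budget remaining)."""
    return {"status": "ok" if agent_ok else "degraded"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            body = json.dumps(health_payload()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep load-balancer probes out of the logs


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

Orchestrators (Kubernetes liveness probes, load balancers) poll this endpoint; returning a non-200 status when a dependency check fails is what lets them restart or route around an unhealthy agent.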