Implementing Guardrails

Build robust safety mechanisms to protect your AI agents from misuse and failures

Implementation Patterns

The examples below show practical, real-world patterns for implementing guardrails in your AI agents; adapt them to your own use case rather than copying them verbatim.

Code Examples

Input Validation Guardrail

Basic input validation checking length, injection patterns, and PII

import re

def validate_input(user_input: str) -> tuple[bool, str]:
    """Validate and sanitize user input before processing."""
    
    # Check length
    if len(user_input) > 5000:
        return False, "Input exceeds maximum length (5000 chars)"
    
    # Check for SQL injection patterns (note: '--' and ';' are aggressive
    # and may produce false positives on ordinary prose)
    sql_patterns = ['DROP TABLE', 'DELETE FROM', '--', ';']
    if any(pattern.lower() in user_input.lower() for pattern in sql_patterns):
        return False, "Potential SQL injection detected"
    
    # Check for prompt injection
    injection_keywords = ['ignore previous', 'you are now', 'new instructions']
    if any(kw.lower() in user_input.lower() for kw in injection_keywords):
        return False, "Potential prompt injection detected"
    
    # Check for PII (US Social Security Number pattern)
    if re.search(r'\d{3}-\d{2}-\d{4}', user_input):
        return False, "PII detected in input (SSN)"
    
    return True, "Input valid"
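
The sketch below shows one way the validator might sit in front of an agent call. It assumes the validate_input function above is in scope; call_agent is a hypothetical placeholder for whatever LLM or agent invocation you use, not a real API.

def handle_request(user_input: str) -> str:
    """Run the guardrail before passing input to the agent (illustrative sketch)."""
    is_valid, reason = validate_input(user_input)
    if not is_valid:
        # Return a safe, generic message; keep the detailed reason for internal logs
        return f"Request blocked: {reason}"
    return call_agent(user_input)  # hypothetical agent call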

Implementation Checklist

  • Start simple: Begin with basic length/format validation before complex rules
  • Layer guardrails: Use a chain pattern to compose multiple checks (see the sketch after this list)
  • Log failures: Record when/why guardrails block requests for analysis
  • Test thoroughly: Include adversarial test cases in your test suite
  • Monitor performance: Track guardrail latency and false positive rates
  • Iterate based on data: Adjust rules based on real-world blocking patterns
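
A minimal sketch of the chain pattern with failure logging, assuming each guardrail is a callable that returns (passed, reason) like validate_input above; the run_guardrail_chain name and logging setup are illustrative, not from a specific library.

import logging
from typing import Callable

logger = logging.getLogger("guardrails")

# Each guardrail takes the raw input and returns (passed, reason)
GuardrailCheck = Callable[[str], tuple[bool, str]]

def run_guardrail_chain(user_input: str, checks: list[GuardrailCheck]) -> tuple[bool, str]:
    """Run guardrails in order; stop and log on the first failure."""
    for check in checks:
        passed, reason = check(user_input)
        if not passed:
            # Record which guardrail blocked the request and why, for later analysis
            logger.warning("Guardrail %s blocked request: %s", check.__name__, reason)
            return False, reason
    return True, "All guardrails passed"

# Usage: compose the input validator with any other checks you define
# is_valid, reason = run_guardrail_chain(text, [validate_input])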
💡 Pro Tip: Use Libraries

Don't build everything from scratch. Use established libraries like guardrails-ai, nemoguardrails, and llm-guard for production-grade implementations with extensive rule sets and optimizations.