
Memory Types

Understand how AI agents store, retrieve, and manage information across different memory systems

Smart Memory Retrieval

Having thousands of memories is useless if you can't find the right ones at the right time. Retrieval strategies determine which memories to surface when the agent needs context. The goal: maximum relevance with minimal cost.

Retrieval Strategy Comparison

Relevance-Based Retrieval (Semantic Search)

Finds the memories most semantically similar to the current query using vector embeddings. Higher quality, but more computationally expensive than simple sorting.

Algorithm:
# Embed the query, score it against every stored memory embedding,
# and keep the indices of the k most similar memories.
query_emb = embed(query)
similarities = cosine_sim(query_emb, memory_embs)
top_k = argsort(similarities)[::-1][:k]  # descending sort: most similar first
Pros
  • Finds truly relevant information
  • Works across time spans
  • Handles paraphrasing well
Cons
  • Requires embedding all memories
  • Slower than simple sorting
  • May ignore recent context shifts

Interactive: Similarity Threshold Explorer

Example query: "Book a flight to New York". Adjust the threshold (0.0 = loose, 1.0 = strict) to see which memories qualify. At a threshold of 0.70 with the relevance strategy, three memories are retrieved:

  • User prefers morning flights (age: 2 days, similarity: 0.95)
  • Budget limit is $500 (age: 5 days, similarity: 0.88)
  • Last trip was to NYC (age: 30 days, similarity: 0.75)

💡 Tip: Lower threshold = more results but less relevant. Higher = fewer but more precise.
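
A minimal sketch of threshold-filtered relevance retrieval like the explorer above, assuming memory embeddings are pre-computed and unit-normalized so a dot product equals cosine similarity. The Memory dataclass and retrieve_by_relevance helper are illustrative names, not a specific library's API.

# Sketch: relevance retrieval with a similarity threshold and a top-k cap.
# Assumes embeddings are pre-computed and L2-normalized (hypothetical helper).
from dataclasses import dataclass

import numpy as np

@dataclass
class Memory:
    text: str
    embedding: np.ndarray  # unit-normalized vector
    age_days: int

def retrieve_by_relevance(query_emb: np.ndarray,
                          memories: list[Memory],
                          threshold: float = 0.70,
                          k: int = 5) -> list[tuple[Memory, float]]:
    """Return up to k memories whose cosine similarity to the query meets the threshold."""
    if not memories:
        return []
    embs = np.stack([m.embedding for m in memories])   # (N, dims)
    sims = embs @ query_emb                            # cosine similarities, shape (N,)
    ranked = np.argsort(sims)[::-1]                    # most similar first
    hits = [(memories[i], float(sims[i])) for i in ranked if sims[i] >= threshold]
    return hits[:k]

The threshold plays the same role as the slider above (raising it trades recall for precision), while the top-k cap keeps the amount of retrieved context, and therefore cost, bounded.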

Advanced Retrieval Techniques

🔍 Multi-Index Search

Maintain multiple indexes: by time, by user, by topic, by importance. Query the relevant index based on context.

if topic_query: search(topic_index)
elif user_query: search(user_index)
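
A minimal sketch of routing a query to the narrowest applicable index, assuming each index object exposes a common search() method; route_search and the index and filter names are illustrative, not a specific library's API.

# Sketch: route a query to the most specific index available (hypothetical helper).
from typing import Any, Optional

def route_search(query: str,
                 indexes: dict[str, Any],
                 topic: Optional[str] = None,
                 user_id: Optional[str] = None,
                 k: int = 10) -> list[Any]:
    """Pick the narrowest index that applies, falling back to the global index."""
    if topic is not None and "topic" in indexes:
        return indexes["topic"].search(query, filter={"topic": topic}, k=k)
    if user_id is not None and "user" in indexes:
        return indexes["user"].search(query, filter={"user_id": user_id}, k=k)
    return indexes["global"].search(query, k=k)
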
⚡ Approximate Search (ANN)

Use approximate nearest neighbor (ANN) search, e.g. HNSW indexes or libraries like FAISS and ScaNN, for fast retrieval over large memory stores. Trade a little accuracy for speed at scale.

index = HNSW(dims=384, M=16)
results = index.search(query, k=10)
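
For a concrete version, here is a small sketch using FAISS's HNSW index (one possible choice; other ANN libraries look similar). It assumes faiss-cpu and numpy are installed; the embeddings are random stand-ins, L2-normalized so that L2 distance ranks results the same way as cosine similarity.

# Sketch: approximate nearest neighbor search over memory embeddings with FAISS HNSW.
import faiss
import numpy as np

dims = 384
rng = np.random.default_rng(0)

# Stand-in for real memory embeddings; FAISS expects float32.
memory_embs = rng.normal(size=(10_000, dims)).astype("float32")
faiss.normalize_L2(memory_embs)          # unit vectors: L2 ranking matches cosine ranking

index = faiss.IndexHNSWFlat(dims, 16)    # M=16 graph neighbors per node
index.hnsw.efSearch = 64                 # higher = more accurate, slower queries
index.add(memory_embs)

query_emb = rng.normal(size=(1, dims)).astype("float32")
faiss.normalize_L2(query_emb)

dists, ids = index.search(query_emb, 10) # approximate top-10 nearest memories
print(ids[0], dists[0])
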
💾 Caching Hot Memories

Cache frequently accessed memories in fast storage (Redis, in-memory). LRU eviction for less-used items.

cache = LRUCache(max_size=1000)
return cache.get_or_fetch(memory_id)
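
A minimal in-process sketch of the same idea using an OrderedDict-based LRU; MemoryCache and fetch_fn are illustrative names, with fetch_fn standing in for the slow backing store. A Redis-backed cache would follow the same get-or-fetch pattern.

# Sketch: LRU cache for hot memories (hypothetical MemoryCache class).
from collections import OrderedDict
from typing import Any, Callable

class MemoryCache:
    def __init__(self, fetch_fn: Callable[[str], Any], max_size: int = 1000):
        self._fetch_fn = fetch_fn
        self._max_size = max_size
        self._items: OrderedDict[str, Any] = OrderedDict()

    def get_or_fetch(self, memory_id: str) -> Any:
        """Return a cached memory, fetching and caching it on a miss."""
        if memory_id in self._items:
            self._items.move_to_end(memory_id)   # mark as most recently used
            return self._items[memory_id]
        value = self._fetch_fn(memory_id)        # cache miss: hit the slow store
        self._items[memory_id] = value
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)      # evict the least recently used entry
        return value

# Usage: wrap the slow lookup once, then read through the cache.
cache = MemoryCache(fetch_fn=lambda mid: {"id": mid, "text": f"memory {mid}"})
print(cache.get_or_fetch("m-42"))
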
🎛️ Metadata Filtering

Pre-filter by metadata before semantic search: "Only memories from user X in the last 30 days."

filtered = memories.where(
    user=user_id, age<30
).search(query)
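
A small sketch of the pattern: apply the cheap metadata filter first, then run similarity scoring only on the survivors. The dict-based memory records and the filtered_search helper are illustrative, with embeddings assumed pre-computed and unit-normalized as in the relevance example above.

# Sketch: metadata pre-filtering before semantic search (hypothetical helper).
import numpy as np

def filtered_search(memories: list[dict],
                    query_emb: np.ndarray,
                    user_id: str,
                    max_age_days: int = 30,
                    k: int = 5) -> list[dict]:
    # Cheap metadata filter first: shrinks the candidate set before any vector math.
    candidates = [m for m in memories
                  if m["user_id"] == user_id and m["age_days"] < max_age_days]
    if not candidates:
        return []
    # Expensive step only on the survivors: cosine similarity via dot product.
    embs = np.stack([m["embedding"] for m in candidates])
    sims = embs @ query_emb
    order = np.argsort(sims)[::-1][:k]
    return [candidates[i] for i in order]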