Vector Databases for Memory
Master how AI agents use vector databases to store, search, and retrieve embeddings for semantic memory
Similarity Metrics & Search
Once data is stored as embeddings, vector databases enable similarity search: finding vectors closest to a query vector. The choice of distance metric determines how "similarity" is calculated.
Three primary metrics: Cosine Similarity (direction), Dot Product (magnitude + direction), and Euclidean Distance (geometric distance).
Example: Similarity Metric Calculator
Vector 1 (Query): [0.80, 0.60]
Vector 2 (Document): [0.60, 0.80]
Cosine Similarity Score: 0.9600 (higher = more similar; range: -1 to 1)
📐 Comparing Distance Metrics
📏 Cosine Similarity
Measures the angle between vectors (direction, not magnitude)
✓ Range: -1 (opposite direction) to 1 (same direction)
✓ Best for: Text, normalized embeddings
✓ Most common for semantic search
⚡ Dot Product
Combines direction and magnitude
✓ Range: Unbounded
✓ Best for: Unit-normalized vectors (where it equals cosine similarity), speed
✓ Faster than cosine (no division by vector lengths)
📍 Euclidean Distance
Straight-line distance in space
✓ Range: 0 (identical) to ∞
✓ Best for: Coordinate data, images
✓ Lower values = more similar
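To make the comparison concrete, here is a minimal sketch of the three metrics in plain Python, applied to the two vectors from the calculator example. The helper names (dot, norm, cosine_similarity, euclidean_distance) are illustrative and not taken from any particular vector database.

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length (L2 norm) of a vector."""
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    """Dot product divided by both lengths; depends only on the angle."""
    return dot(a, b) / (norm(a) * norm(b))

def euclidean_distance(a, b):
    """Straight-line distance; 0 means identical, lower is more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [0.8, 0.6]     # Vector 1 from the calculator example
document = [0.6, 0.8]  # Vector 2 from the calculator example

print(f"cosine:    {cosine_similarity(query, document):.4f}")   # 0.9600
print(f"dot:       {dot(query, document):.4f}")                 # 0.9600
print(f"euclidean: {euclidean_distance(query, document):.4f}")  # 0.2828

# Doubling the document's length changes the dot product but not the cosine.
scaled = [2 * x for x in document]
print(f"cosine (scaled): {cosine_similarity(query, scaled):.4f}")  # 0.9600
print(f"dot (scaled):    {dot(query, scaled):.4f}")                # 1.9200
```

Both example vectors have length 1, which is why cosine similarity and dot product agree here. Once one vector is rescaled, only the cosine score stays the same.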
🔍 Nearest Neighbor Search
Vector databases find the K nearest neighbors to a query vector. The example below scores five words by cosine similarity to the query and shows how related terms cluster:
Example: Query = "cat" [0.8, 0.3]
cat [0.80, 0.30]: 100.0%
kitten [0.75, 0.35]: 99.7%
dog [0.70, 0.40]: 98.7%
laptop [0.25, 0.85]: 60.1%
computer [0.20, 0.90]: 54.6%
Observation: Animal terms (cat, kitten, dog) cluster together with high similarity. Tech terms (computer, laptop) form a separate cluster. This is semantic organization!
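The percentages in this example can be reproduced with a brute-force K-nearest-neighbor scan. This is a sketch only: it assumes the scores are cosine similarities (which matches the numbers shown), and nearest_neighbors is a hypothetical helper. Real vector databases typically use approximate indexes rather than comparing the query against every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 2-D embeddings copied from the example above.
vectors = {
    "cat": [0.80, 0.30],
    "kitten": [0.75, 0.35],
    "dog": [0.70, 0.40],
    "computer": [0.20, 0.90],
    "laptop": [0.25, 0.85],
}

def nearest_neighbors(query, vectors, k):
    """Score every stored vector against the query and keep the top k."""
    scored = [(word, cosine_similarity(query, vec)) for word, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Prints cat, kitten, dog with their similarity as a percentage.
for word, score in nearest_neighbors([0.8, 0.3], vectors, k=3):
    print(f"{word:<10}{score:.1%}")
```

With k=3 the scan returns only the animal terms. Raising k to 5 adds laptop and then computer at the bottom of the ranking.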
⚙️ Practical Considerations
1. Vector Normalization: Normalize embeddings to unit length so that dot product and cosine similarity give the same scores (many embedding models do this automatically).
2. Top K Results: Retrieve the K most similar vectors (e.g., top 5). Balance recall (retrieve enough relevant results) against precision (avoid padding the results with weak matches).
3. Similarity Threshold: Filter out results below a minimum similarity score (e.g., 0.7) to ensure quality.
4. Metadata Filtering: Combine vector search with attribute filters (e.g., "Find similar documents WHERE author = 'Alice'").
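The four considerations above can be combined in one small in-memory search function. This is a sketch under stated assumptions: the records, the search function, and its parameters (top_k, min_score, where) are hypothetical names chosen for illustration, not the API of any specific vector database.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in vec))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in vec]

# Hypothetical records: a normalized embedding plus metadata attributes.
records = [
    {"id": "doc1", "vector": normalize([0.75, 0.35]), "author": "Alice"},
    {"id": "doc2", "vector": normalize([0.70, 0.40]), "author": "Bob"},
    {"id": "doc3", "vector": normalize([0.20, 0.90]), "author": "Alice"},
]

def search(query, records, top_k=5, min_score=0.7, where=None):
    """Return up to top_k (id, score) pairs that pass both filters."""
    q = normalize(query)
    where = where or {}
    hits = []
    for rec in records:
        # Metadata filter: skip records whose attributes do not match.
        if any(rec.get(key) != value for key, value in where.items()):
            continue
        # Vectors are unit length, so this dot product is the cosine similarity.
        score = sum(a * b for a, b in zip(q, rec["vector"]))
        # Similarity threshold: drop weak matches.
        if score >= min_score:
            hits.append((rec["id"], score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_k]  # Top K results

# Only doc1 is returned: doc2 fails the author filter, doc3 falls below 0.7.
print(search([0.8, 0.3], records, where={"author": "Alice"}))
```

This sketch applies the metadata filter before scoring, which avoids computing similarities for records that would be discarded anyway. Whether filtering happens before or after the vector search in a real system depends on the database.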