Long-Term Memory

Master how AI agents store and retrieve knowledge across sessions using persistent memory systems

Vector Databases: Semantic Search at Scale

Traditional databases search for exact matches: you ask for "password reset," you get documents with those exact words. But AI agents need semantic search, which understands that "forgot credentials" means the same thing as "password reset."

Enter vector databases: systems designed to store and search high-dimensional vectors (embeddings) that capture meaning.

How Embeddings Work

Traditional Keyword Search

Query: "password reset"
Looks for exact words: ["password", "reset"]
Misses: "forgot credentials", "login help", "account recovery"
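
The limitation above can be reproduced in a few lines. This is only a minimal sketch of exact-word matching, not how any particular database implements search; the function name `keyword_search` and the sample documents are hypothetical.

```python
def keyword_search(query: str, documents: list[str]) -> list[str]:
    """Return documents that contain every word of the query (case-insensitive)."""
    terms = query.lower().split()
    return [
        doc for doc in documents
        if all(term in doc.lower().split() for term in terms)
    ]


# Hypothetical sample documents, chosen to mirror the phrases above.
docs = [
    "password reset instructions",
    "forgot credentials",
    "login help",
    "account recovery",
]

matches = keyword_search("password reset", docs)
print(matches)  # ['password reset instructions']
```

The three related documents are skipped because they share no words with the query, even though a person would consider them relevant.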

Vector Search (Semantic)

Query Embedding: [0.75, 0.55, 0.35, ...]
Finds semantically similar vectors
Matches: "forgot credentials" (0.92), "login help" (0.85), "account recovery" (0.88)

What is an Embedding?

An embedding is a list of numbers (a vector) that represents the meaning of text. Similar meanings produce similar vectors. Models like OpenAI's text-embedding-ada-002 convert text into 1536-dimensional vectors.

Interactive: Similarity Search

Query: "How do I reset my password?"
Embedding: [0.75, 0.55, 0.35]
Similarity threshold: 0.70 (slider range: 0.0 = All, 0.5 = Medium, 1.0 = Exact)

Search Results (3 / 3 documents)

Password Reset Guide
How to reset your password: Click forgot password, enter email...
Similarity: 100%
Embedding: [0.8, 0.6, 0.3]

Account Security Tips
Best practices for securing your account: use 2FA, strong passwords...
Similarity: 100%
Embedding: [0.7, 0.5, 0.4]

Recipe: Chocolate Cake
Ingredients: flour, sugar, cocoa powder. Bake at 350°F for 30 minutes...
Similarity: 90%
Embedding: [0.1, 0.2, 0.1]

Notice:

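The "Recipe: Chocolate Cake" document scores lowest against the password reset query because its meaning is unrelated, even if it happened to share words like "enter" or "click." It still reaches 90% here only because these toy 3-dimensional vectors have all-positive components and so cannot point very far apart. Real embeddings with hundreds or thousands of dimensions usually separate unrelated texts far more clearly. Vector search compares meaning, not shared words.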
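
The demo's numbers can be checked by hand. The page does not say which metric the widget uses; the sketch below assumes cosine similarity, which reproduces the displayed percentages from the listed 3-dimensional vectors. The function names `cosine_similarity` and `search` are illustrative, not part of any library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, docs, threshold):
    """Return (title, score) pairs at or above the threshold, best first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in docs.items()]
    hits = [(title, score) for title, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Vectors copied from the demo above.
query = [0.75, 0.55, 0.35]
documents = {
    "Password Reset Guide": [0.8, 0.6, 0.3],
    "Account Security Tips": [0.7, 0.5, 0.4],
    "Recipe: Chocolate Cake": [0.1, 0.2, 0.1],
}

results = search(query, documents, threshold=0.70)
for title, score in results:
    print(f"{title}: {score:.0%}")
```

At the 0.70 threshold all three documents pass, matching the "3 / 3 documents" shown in the demo; raising the threshold to 0.95 drops the recipe.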

Popular Vector Databases

Pinecone

Fully managed, serverless

  • Easy to set up
  • Auto-scales
  • Pay-as-you-go

pgvector

PostgreSQL extension

  • Use existing Postgres
  • No new infrastructure
  • Good for small/medium scale

Weaviate

Open-source, GraphQL API

  • Built-in vectorization
  • Hybrid search
  • Self-hosted or cloud

Chroma

Lightweight, Python-first

  • Easy local development
  • LangChain integration
  • Great for prototyping

RAG: Retrieval-Augmented Generation

Vector databases power RAG systems, one of the most common patterns for giving AI agents long-term memory.

1. User asks a question: "What's our refund policy?"
2. Embed the question: embed("What's our refund policy?") → [0.12, 0.45, ...]
3. Search vector DB: find the top 3 most similar documents
4. Inject into LLM prompt: "Answer using these docs: [retrieved context]"
5. Generate grounded answer: "According to our policy, refunds are available within 30 days..."

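The five steps above can be sketched end to end in plain Python. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model (so it matches shared words, not meaning), the in-memory list stands in for a vector database, the knowledge base entries are invented sample data, and step 5 is left as a comment because no LLM is called.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, store: list[str], top_k: int = 3) -> list[str]:
    """Steps 2 and 3: embed the question, then return the top_k most similar docs."""
    q = embed(question)
    ranked = sorted(store, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Step 4: inject the retrieved documents into the LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using these docs:\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge base (sample data for illustration only).
knowledge_base = [
    "Refund policy: refunds are available within 30 days of purchase.",
    "Shipping: orders ship within 2 business days.",
    "Support hours: Monday to Friday, 9am to 5pm.",
    "Password reset: click 'forgot password' and enter your email.",
]

question = "What's our refund policy?"      # Step 1
top_docs = retrieve(question, knowledge_base, top_k=3)
prompt = build_prompt(question, top_docs)
print(prompt)
# Step 5 would send `prompt` to an LLM to generate the grounded answer.
```

In a real system, `embed` would call an embedding model and `retrieve` would query one of the vector databases listed earlier; the prompt-building step stays essentially the same.
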
Key Insight

Vector databases enable agents to search by meaning, not just keywords. This is how ChatGPT with plugins, Notion AI, and customer support bots can answer questions about your specific documents: they retrieve relevant context, then generate answers.

←Previous