Haystack Agents
Master Haystack for building production-ready RAG agents and NLP pipelines
Pipeline Architecture & Core Components
Haystack uses a pipeline architecture where modular components are connected to form processing workflows. Each component has clear inputs/outputs and can be swapped or tested independently.
Building a Pipeline
Basic Pipeline Structure
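from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret
# Document store the retriever searches (assumed to already hold documents with embeddings)
document_store = InMemoryDocumentStore()
# Jinja2 template; its variables (documents, question) become prompt_builder inputs
prompt_template = """
Answer the question using only the context below.
Context:
{% for doc in documents %}{{ doc.content }}
{% endfor %}
Question: {{ question }}
"""
# Create pipeline
pipeline = Pipeline()
# Add components
pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template=prompt_template))
# The API key is passed as a Secret, here read from the OPENAI_API_KEY environment variable
pipeline.add_component("generator", OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")))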
# Connect components (data flow)
pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")
# Run pipeline
result = pipeline.run({
    "text_embedder": {"text": "What is RAG?"},
    "prompt_builder": {"question": "What is RAG?"}
})
💡 Pipeline Flow: Query → Embed → Retrieve → Build Prompt → Generate Answer
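To make the data flow concrete without installing anything, the sketch below mimics the same five-step flow in plain Python. It is not Haystack's implementation: `ToyPipeline` and the stub functions `embed`, `retrieve`, `build_prompt`, and `generate` are hypothetical names invented for illustration, and the "embedding" is a fake vowel count rather than a real model.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToyPipeline:
    """Library-free stand-in: named components joined by 'component.socket' connections."""

    def __init__(self) -> None:
        self.components: Dict[str, Callable[..., Dict[str, Any]]] = {}
        self.connections: List[Tuple[str, str, str, str]] = []

    def add_component(self, name: str, fn: Callable[..., Dict[str, Any]]) -> None:
        self.components[name] = fn

    def connect(self, sender: str, receiver: str) -> None:
        src, out_socket = sender.split(".")
        dst, in_socket = receiver.split(".")
        for name in (src, dst):
            if name not in self.components:
                raise ValueError(f"unknown component: {name}")
        self.connections.append((src, out_socket, dst, in_socket))

    def run(self, data: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
        # Simplification: components run in the order they were added,
        # and the outputs of every component are returned.
        inputs = {name: dict(data.get(name, {})) for name in self.components}
        outputs: Dict[str, Dict[str, Any]] = {}
        for name, fn in self.components.items():
            outputs[name] = fn(**inputs[name])
            # Forward each output to the input socket it is connected to.
            for src, out_socket, dst, in_socket in self.connections:
                if src == name:
                    inputs[dst][in_socket] = outputs[name][out_socket]
        return outputs


DOCS = ["RAG combines retrieval with generation.", "Pipelines connect components."]


def embed(text: str) -> Dict[str, Any]:
    # Fake embedding: one dimension per vowel count (a real embedder uses a model).
    return {"embedding": [float(text.lower().count(v)) for v in "aeiou"]}


def retrieve(query_embedding: List[float]) -> Dict[str, Any]:
    # Rank documents by dot product with the query's fake embedding, keep the best one.
    def score(doc: str) -> float:
        return sum(q * d for q, d in zip(query_embedding, embed(doc)["embedding"]))
    return {"documents": sorted(DOCS, key=score, reverse=True)[:1]}


def build_prompt(documents: List[str], question: str) -> Dict[str, Any]:
    context = "\n".join(documents)
    return {"prompt": f"Context:\n{context}\nQuestion: {question}"}


def generate(prompt: str) -> Dict[str, Any]:
    # Stand-in for an LLM call.
    return {"replies": [f"(stub answer based on {len(prompt)} prompt characters)"]}


pipeline = ToyPipeline()
pipeline.add_component("text_embedder", embed)
pipeline.add_component("retriever", retrieve)
pipeline.add_component("prompt_builder", build_prompt)
pipeline.add_component("generator", generate)
pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")

result = pipeline.run({
    "text_embedder": {"text": "What is RAG?"},
    "prompt_builder": {"question": "What is RAG?"},
})
print(result["generator"]["replies"][0])
```

Note how the question is supplied twice: once to the embedder for retrieval and once to the prompt builder, because no connection carries it from one to the other.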
Core Components
Retriever Component
Fetches the documents most similar to the query embedding from a document store
from haystack.components.retrievers import InMemoryEmbeddingRetriever
# Initialize retriever
retriever = InMemoryEmbeddingRetriever(
    document_store=document_store,
    top_k=5  # Return top 5 most relevant docs
)
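# Use in pipeline
pipeline.add_component("retriever", retriever)
# Connect the embedder's output to the retriever's query input
# (assumes a text embedder was added under the name "embedder")
pipeline.connect("embedder.embedding", "retriever.query_embedding")
✓ Semantic search using embeddings
✓ Configurable top_k for result count
✓ Each document store has its own matching retriever
✓ Filters by metadata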
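The features above (embedding similarity, `top_k`, metadata filtering) can be sketched in a few lines of plain Python. This is an illustration of the idea rather than Haystack's retriever: the `retrieve` and `cosine` functions, the dictionary-based documents, and the simple equality-only `filters` format are all assumptions made for this example, and Haystack's own filter syntax is richer.

```python
import math
from typing import Any, Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_embedding: List[float],
    documents: List[Dict[str, Any]],
    top_k: int = 5,
    filters: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    # 1. Keep only documents whose metadata matches every filter key exactly.
    candidates = [
        d for d in documents
        if all(d["meta"].get(k) == v for k, v in (filters or {}).items())
    ]
    # 2. Rank the remaining documents by similarity to the query embedding.
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query_embedding, d["embedding"]),
        reverse=True,
    )
    # 3. Return at most top_k results.
    return ranked[:top_k]


docs = [
    {"content": "Intro to RAG", "embedding": [1.0, 0.0], "meta": {"lang": "en"}},
    {"content": "Pipelines", "embedding": [0.6, 0.8], "meta": {"lang": "en"}},
    {"content": "RAG auf Deutsch", "embedding": [0.9, 0.1], "meta": {"lang": "de"}},
]

english_only = retrieve([1.0, 0.0], docs, top_k=2, filters={"lang": "en"})
unfiltered = retrieve([1.0, 0.0], docs, top_k=2)
print([d["content"] for d in english_only])  # ['Intro to RAG', 'Pipelines']
print([d["content"] for d in unfiltered])    # ['Intro to RAG', 'RAG auf Deutsch']
```

Filtering before ranking means `top_k` counts only documents that pass the filter, so a narrow filter can return fewer than `top_k` results.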
Document Stores
🗄️ InMemoryDocumentStore
Fast, lightweight store for development and testing. No external dependencies.
from haystack.document_stores.in_memory import InMemoryDocumentStore
store = InMemoryDocumentStore()
store.write_documents(documents)
⚡ ElasticsearchDocumentStore
Production-ready with BM25 and dense retrieval. Scales horizontally.
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore
store = ElasticsearchDocumentStore(hosts="http://localhost:9200")
🎯 PineconeDocumentStore
Managed vector database optimized for semantic search at scale.
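from haystack.utils import Secret
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore
# The API key is passed as a Secret, here read from the PINECONE_API_KEY environment variable
store = PineconeDocumentStore(api_key=Secret.from_env_var("PINECONE_API_KEY"), index="docs")
🔥 WeaviateDocumentStore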
Open-source vector database with hybrid search and multi-tenancy.
from haystack_integrations.document_stores.weaviate import WeaviateDocumentStore
store = WeaviateDocumentStore(url="http://localhost:8080")
🎯 Component Best Practices
- Single responsibility: Each component does one thing well (retrieve, rank, or generate)
- Testable in isolation: Components can be unit tested independently
- Easy to swap: Replace a retriever or LLM by changing a single add_component call, provided the replacement exposes the same inputs and outputs
- Clear data contracts: Know exactly what inputs/outputs each component expects
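The practices above can be illustrated with a small library-free sketch. The `Generator` protocol, the two stand-in generators, and the `answer` helper are hypothetical names made up for this example and are not part of Haystack. The point is only that a shared input/output contract lets a component be tested alone and swapped without touching the calling code.

```python
from typing import Dict, List, Protocol


class Generator(Protocol):
    """Data contract: takes a prompt string, returns {'replies': [...]}."""

    def run(self, prompt: str) -> Dict[str, List[str]]: ...


class UppercaseGenerator:
    """Hypothetical stand-in for one LLM backend."""

    def run(self, prompt: str) -> Dict[str, List[str]]:
        return {"replies": [prompt.upper()]}


class ReverseGenerator:
    """Hypothetical stand-in for a different backend with the same contract."""

    def run(self, prompt: str) -> Dict[str, List[str]]:
        return {"replies": [prompt[::-1]]}


def answer(question: str, generator: Generator) -> str:
    # The caller depends only on the contract, not on a concrete class.
    prompt = f"Question: {question}"
    return generator.run(prompt=prompt)["replies"][0]


# Testable in isolation: no pipeline or other component is needed.
assert UppercaseGenerator().run(prompt="hi") == {"replies": ["HI"]}

# Easy to swap: only the object passed in changes.
first = answer("What is RAG?", UppercaseGenerator())
second = answer("What is RAG?", ReverseGenerator())
print(first)   # QUESTION: WHAT IS RAG?
print(second)  # ?GAR si tahW :noitseuQ
```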