Vector Database Patterns

Patterns for using vector databases (Pinecone, Qdrant, pgvector, Chroma, and Weaviate) for similarity search and AI applications.


Updated 2026-04-05

CLAUDE.md

# Vector Database Patterns

You are an expert in vector databases, similarity search, and embedding infrastructure.

Choosing a Vector DB:
- Pinecone: fully managed, serverless, scales automatically, best for production SaaS
- Qdrant: open source, rich filtering, good for self-hosted deployments
- pgvector: PostgreSQL extension, best when you already use PostgreSQL
- Chroma: lightweight, embedded, ideal for prototyping and local development
- Weaviate: hybrid search (vector + keyword) built-in, good for complex queries

Indexing Strategies:
- HNSW (Hierarchical Navigable Small World): best recall/speed trade-off, default choice
- IVF (Inverted File Index): faster indexing, lower memory, slightly lower recall
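- Flat/brute-force: exact search with full recall; practical only for small datasets (roughly <100K vectors)
- For HNSW, tune ef_construction (build-time graph quality) and ef_search (query-time recall vs. latency)
- Rebuild indexes after large batch inserts for optimal performance; this matters most for IVF, whose clusters are fixed when the index is built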
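
To make the recall trade-off above concrete, here is a minimal sketch of flat (exact) search in plain Python, plus a helper that scores an approximate index against it. The names `flat_search` and `recall_at_k`, and the in-memory `{id: vector}` layout, are assumptions for illustration, not any particular database's API.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flat_search(query, vectors, top_k):
    """Exact search: score every stored vector and return the top_k (id, score) pairs."""
    scored = ((vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items())
    return heapq.nlargest(top_k, scored, key=lambda pair: pair[1])


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate index also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical toy data: ids mapped to 2-dimensional vectors.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
exact_ids = [vec_id for vec_id, _ in flat_search([1.0, 0.0], vectors, top_k=2)]
```

Running `flat_search` on a sample of real queries gives a ground truth; comparing it with what an HNSW or IVF index returns via `recall_at_k` is one way to judge whether ef_search or the IVF settings need adjusting.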

Query Patterns:
- Use metadata filtering to narrow search space before vector similarity
- Implement hybrid search: combine vector similarity with keyword/BM25 scores
- Use MMR (Maximal Marginal Relevance) for diverse results
- Set appropriate top_k: retrieve 2-3x what you need, then rerank
- Use score thresholds to filter low-confidence matches
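
The filtering, over-fetching, thresholding, and MMR steps above can be sketched together in plain Python. This is an illustrative in-memory version under stated assumptions: the item layout (`id`, `vector`, `meta`), the `search` function, and its `overfetch`, `min_score`, and `lambda_mult` parameters are hypothetical names, and a real deployment would push the filter and the similarity ranking into the database.

```python
import math


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k, lambda_mult=0.5):
    """Pick up to k ids from candidates ({id: vector}) by Maximal Marginal Relevance.

    Each step selects the candidate maximizing
    lambda_mult * sim(query, c) - (1 - lambda_mult) * max sim(c, already selected).
    """
    relevance = {cid: cosine(query, vec) for cid, vec in candidates.items()}
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(cid):
            redundancy = max(
                (cosine(candidates[cid], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance[cid] - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


def search(query, items, where, k, overfetch=3, min_score=0.0, lambda_mult=0.5):
    """Metadata filter, over-fetch k * overfetch, drop weak matches, then diversify."""
    # 1. Metadata pre-filter: keep items whose metadata matches every key in `where`.
    allowed = {
        item["id"]: item["vector"]
        for item in items
        if all(item["meta"].get(field) == value for field, value in where.items())
    }
    # 2. Over-fetch: take more candidates than needed, ranked by similarity.
    ranked = sorted(allowed, key=lambda cid: cosine(query, allowed[cid]), reverse=True)
    ranked = ranked[: k * overfetch]
    # 3. Score threshold: discard low-confidence matches.
    pool = {cid: allowed[cid] for cid in ranked if cosine(query, allowed[cid]) >= min_score}
    # 4. Rerank for diversity with MMR.
    return mmr(query, pool, k, lambda_mult)


# Hypothetical toy corpus: d2 nearly duplicates d1, d4 fails the filter, d5 is off-topic.
items = [
    {"id": "d1", "vector": [1.0, 0.0], "meta": {"lang": "en"}},
    {"id": "d2", "vector": [0.99, 0.01], "meta": {"lang": "en"}},
    {"id": "d3", "vector": [0.6, 0.8], "meta": {"lang": "en"}},
    {"id": "d4", "vector": [1.0, 0.0], "meta": {"lang": "de"}},
    {"id": "d5", "vector": [0.0, 1.0], "meta": {"lang": "en"}},
]
diverse = search([1.0, 0.0], items, {"lang": "en"}, k=2, min_score=0.3, lambda_mult=0.3)
```

A lower `lambda_mult` weights diversity more heavily, so the near-duplicate `d2` is passed over in favor of `d3`; values closer to 1.0 fall back to plain relevance ordering.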

pgvector Specifics:
- Install extension: CREATE EXTENSION vector
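- Define column: embedding vector(1536), matching 1536-dimensional OpenAI embeddings such as text-embedding-3-small; the dimension must equal your model's output size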
- Create HNSW index: CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
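- Query: SELECT * FROM items ORDER BY embedding <=> $1 LIMIT 10 (<=> is cosine distance, matching vector_cosine_ops)
- For ivfflat indexes, start with lists = rows / 1000 up to 1M rows and lists = sqrt(row_count) for datasets >1M rows; create the index after the table is populated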
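
As a small sketch of the ivfflat sizing rule, the helpers below compute a starting `lists` value and build the matching statement as a string. They assume the starting points suggested in the pgvector README (rows / 1000 up to 1M rows, sqrt(rows) above that); the function names are hypothetical, and the SQL is only generated here, not executed.

```python
import math


def ivfflat_lists(row_count):
    """Starting value for ivfflat `lists`: rows / 1000 up to 1M rows, sqrt(rows) above."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return max(1, round(math.sqrt(row_count)))


def to_pgvector_literal(embedding):
    """Format a Python sequence in pgvector's text input form, e.g. '[1.0,0.5]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def ivfflat_index_sql(table, column, row_count):
    """Build a CREATE INDEX statement for a cosine-distance ivfflat index.

    `table` and `column` are interpolated directly, so they must be trusted
    identifiers from your own schema, never user input.
    """
    return (
        f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) "
        f"WITH (lists = {ivfflat_lists(row_count)})"
    )


statement = ivfflat_index_sql("items", "embedding", 4_000_000)
query_param = to_pgvector_literal([1, 0.5])
```

The string from `to_pgvector_literal` can be bound as the `$1` parameter of the ORDER BY query (cast to `vector` if your driver does not infer the type). Treat the `lists` value as a starting point and adjust it after measuring recall on your own data.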

Production:
- Batch upserts for bulk ingestion (around 1000 vectors per batch, staying within the service's request size limits)
- Monitor query latency p50/p95/p99 and recall metrics
- Implement namespaces or collections for multi-tenant isolation
- Version embedding models; re-embed all data when switching models, since vectors from different models are not comparable
- Back up vector data alongside metadata; vectors alone are not useful
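
The batching and latency-monitoring points can be sketched with the standard library alone. In this sketch `upsert_batch` stands in for whatever bulk-write call your client exposes (a hypothetical callable, not a real SDK method), and `chunked` and `latency_percentiles` are illustrative helper names.

```python
import itertools
import statistics


def chunked(records, batch_size=1000):
    """Yield lists of at most batch_size records from any iterable."""
    iterator = iter(records)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def upsert_in_batches(records, upsert_batch, batch_size=1000):
    """Send records through upsert_batch (caller-supplied, hypothetical) one batch at a time.

    Returns the number of records sent.
    """
    sent = 0
    for batch in chunked(records, batch_size):
        upsert_batch(batch)
        sent += len(batch)
    return sent


def latency_percentiles(samples_ms):
    """Return p50/p95/p99 from a list of latency samples (needs at least two samples)."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Stand-in for a real client: record each batch instead of sending it anywhere.
received = []
total_sent = upsert_in_batches(range(2500), received.append, batch_size=1000)
```

Keeping the batch size as a parameter makes it easy to lower it when payloads are large (high-dimensional vectors or bulky metadata) and a service rejects oversized requests.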

Add to your project root CLAUDE.md file, or append to an existing one.
