RAG Implementation Patterns
Retrieval-Augmented Generation with chunking, embeddings, and hybrid search.
CLAUDE.md
# RAG Implementation Patterns

You are an expert in RAG systems, vector databases, and information retrieval.

Document Processing:
- Chunk documents by semantic meaning, not by a fixed character count
- Overlap chunks by 10-20% for context continuity
- Preserve document metadata (source, section, page) with each chunk
- Clean and normalize text before embedding
- Handle tables, lists, and structured content separately

Embeddings:
- Use an embedding model appropriate for your use case
- Normalize embeddings for cosine similarity search
- Batch embedding generation for efficiency
- Cache embeddings; re-embed only when content changes
- Test embedding quality with known query-document pairs

Retrieval:
- Use hybrid search (keyword + vector) for better recall
- Rerank retrieved documents before passing them to the LLM
- Filter by metadata before vector search for efficiency
- Retrieve more candidates than needed, then rerank
- Track which documents were cited in responses

Generation:
- Include source attribution in prompts
- Instruct the LLM to cite sources and say "I don't know" when uncertain
- Validate that generated answers are grounded in the retrieved context
- Implement feedback loops to improve retrieval quality

Infrastructure:
- Use a vector database: Pinecone, Weaviate, Qdrant, or pgvector
- Choose an appropriate index type (HNSW, IVF)
- Monitor retrieval latency and quality metrics
- Version your document indices alongside your code
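The chunking guidance above can be sketched as a sentence-aware splitter that carries the last sentence of each chunk into the next one and attaches source metadata throughout. This is a minimal illustration, not a production chunker: the `Chunk` type, parameter names, and the sentence-count overlap (standing in for the 10-20% overlap target) are all assumptions for the example.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)

def chunk_by_sentences(text, max_chars=400, overlap_sentences=1, metadata=None):
    """Group whole sentences into chunks of roughly `max_chars`,
    carrying the trailing sentence(s) forward so adjacent chunks overlap."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        current.append(sentence)
        if sum(len(s) for s in current) >= max_chars:
            chunks.append(Chunk(" ".join(current), dict(metadata or {})))
            current = current[-overlap_sentences:]  # overlap carried into next chunk
    # Flush the remainder, unless it is only the carried-over overlap
    if current and (not chunks or len(current) > overlap_sentences):
        chunks.append(Chunk(" ".join(current), dict(metadata or {})))
    return chunks
```

Splitting on sentence boundaries rather than raw character offsets keeps each chunk semantically whole; a fuller implementation would also split on headings and handle tables separately, as noted above.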
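One common way to implement the "hybrid search, then rerank" advice is reciprocal rank fusion (RRF), which merges a keyword ranking and a vector ranking without needing comparable scores. A minimal sketch, assuming the two input lists are already ranked doc ids:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids (e.g. one from BM25, one from
    vector search) into a single ranking. k=60 is the constant from the
    original RRF formulation; it damps the influence of top ranks."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that appear high in both lists float to the top, which is why hybrid retrieval improves recall; a cross-encoder reranker can then reorder the fused top candidates before they reach the LLM.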
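The "filter by metadata before vector search" point can be illustrated with a toy in-memory store: narrow the candidate set with metadata predicates first, then score only the survivors by cosine similarity. The class and method names here are invented for the example; real vector databases expose this as payload or metadata filters on the query.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorStore:
    """Toy store illustrating metadata pre-filtering before vector scoring."""
    def __init__(self):
        self.items = []  # (vector, metadata) pairs

    def add(self, vector, metadata):
        self.items.append((vector, metadata))

    def search(self, query_vec, top_k=5, filters=None):
        # Metadata filter first: cheap equality checks shrink the candidate set
        candidates = [
            (vec, meta) for vec, meta in self.items
            if all(meta.get(key) == val for key, val in (filters or {}).items())
        ]
        # Only the survivors pay the cost of similarity scoring
        scored = sorted(candidates, key=lambda it: cosine(query_vec, it[0]), reverse=True)
        return [meta for _, meta in scored[:top_k]]
```

Pre-filtering keeps the expensive similarity computation (or index traversal, in a real database) confined to documents that could actually be returned.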
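For the grounding check under Generation, a cheap first-pass validator can measure how much of the answer's vocabulary actually appears in the retrieved context. This lexical overlap is a crude proxy, used here only to show the shape of the check; production systems typically use an NLI model or an LLM judge instead.

```python
import re

def grounding_score(answer, context):
    """Fraction of the answer's content words (4+ letters) that appear
    in the retrieved context. 1.0 means every content word is covered."""
    tokens = lambda s: set(re.findall(r"[a-z]{4,}", s.lower()))
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 1.0  # nothing substantive to verify
    return len(answer_tokens & tokens(context)) / len(answer_tokens)
```

Answers scoring below a chosen threshold can be flagged for the feedback loop mentioned above, or regenerated with an explicit instruction to stick to the retrieved context.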
Add to your project root CLAUDE.md file, or append to an existing one.