Advanced SQL Query Patterns

Advanced SQL patterns including window functions, CTEs, indexing strategies, and query optimization.

Claude CodeCursorGitHub CopilotWindsurfClineCodex / OpenAIGemini CLI

Updated 2026-04-05

CLAUDE.md

# Advanced SQL Query Patterns

You are an expert in SQL across major databases (PostgreSQL, MySQL, SQLite, SQL Server) with deep knowledge of query optimization.

Common Table Expressions (CTEs):
- Use CTEs for readable multi-step queries: WITH active_users AS (SELECT ...) SELECT ... FROM active_users
- Recursive CTEs for hierarchical data: WITH RECURSIVE tree AS (base UNION ALL recursive)
- Use CTEs instead of subqueries for clarity — name each logical step
- In PostgreSQL, CTEs are optimization fences (materialized by default) — use NOT MATERIALIZED hint if needed
- Chain multiple CTEs: WITH step1 AS (...), step2 AS (SELECT ... FROM step1) SELECT ...

Window Functions:
- ROW_NUMBER() for pagination and deduplication: ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC)
- RANK() / DENSE_RANK() for ranking with ties
- LAG() / LEAD() for comparing with previous/next rows (time series, deltas)
- SUM() OVER (ORDER BY date) for running totals
- NTILE(4) for quartile bucketing
- Always define PARTITION BY and ORDER BY explicitly — defaults are unpredictable

Indexing Strategy:
- Index columns used in WHERE, JOIN, ORDER BY, GROUP BY clauses
- Composite indexes: leftmost prefix rule — (a, b, c) supports queries on (a), (a, b), (a, b, c) but NOT (b, c)
- Covering indexes: include all columns in SELECT to avoid table lookups
- Partial indexes (PostgreSQL): CREATE INDEX ... WHERE status = 'active' (smaller, faster)
- Don't over-index: each index slows down INSERT/UPDATE/DELETE
- Use EXPLAIN ANALYZE to verify index usage — the optimizer may ignore your index

Query Optimization:
- Read EXPLAIN ANALYZE output: look for Seq Scan on large tables, Nested Loop on large joins
- Replace correlated subqueries with JOINs — subqueries re-execute per row
- Use EXISTS instead of IN for large subquery results: WHERE EXISTS (SELECT 1 FROM ...)
- Avoid SELECT * — fetch only needed columns (reduces I/O, enables covering indexes)
- Use LIMIT early in CTEs/subqueries to reduce intermediate result sizes
- Batch large UPDATE/DELETE operations to avoid lock contention

Anti-Patterns:
- Functions on indexed columns disable index: WHERE YEAR(created_at) = 2024 (use range instead)
- Implicit type coercion: WHERE id = '123' may skip index on integer column
- SELECT DISTINCT as a fix for bad JOINs — fix the JOIN instead
- ORDER BY RAND() for random rows — use TABLESAMPLE or application-side randomization
- Missing WHERE on UPDATE/DELETE — always test with SELECT first

Data Integrity:
- Use foreign keys for referential integrity — never rely on application code alone
- Use CHECK constraints for domain validation: CHECK (age >= 0 AND age <= 150)
- Use UNIQUE constraints for business rules: UNIQUE (email) or UNIQUE (org_id, slug)
- Use transactions for multi-statement operations: BEGIN ... COMMIT (or ROLLBACK)
- Use SELECT ... FOR UPDATE for pessimistic locking when concurrent writes are expected

Add to your project root CLAUDE.md file, or append to an existing one.

Tags

Related Skills

PostgreSQL Query Optimization

MongoDB Best Practices

Drizzle ORM TypeScript-First Patterns

TypeScript Strict Mode