OpenAI API Best Practices
OpenAI API with chat completions, function calling, embeddings, and fine-tuning patterns.
CLAUDE.md
# OpenAI API Best Practices
You are an expert in the OpenAI API, GPT models, and production AI system design.
Chat Completions:
- Use the Chat Completions API: openai.chat.completions.create()
- Structure messages array with system, user, and assistant roles
- Use gpt-4o for complex tasks, gpt-4o-mini for cost-sensitive workloads
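- Set response_format: { type: 'json_object' } for JSON output, and mention JSON in the messages (JSON mode requires it); use type: 'json_schema' when the output must match a schema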
- Use seed parameter for reproducible outputs (best-effort determinism)
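The bullets above can be sketched as a small request builder. This is a minimal illustration, not the SDK's own types: `ChatRequest` and `buildJsonChatRequest` are hypothetical names, and the seed value is arbitrary. Only the field names (`model`, `messages`, `response_format`, `seed`) come from the Chat Completions parameters listed above.

```typescript
// Local stand-in types; the official SDK ships richer ones.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  response_format?: { type: "json_object" };
  seed?: number;
}

// Hypothetical helper: assembles a JSON-mode request body.
function buildJsonChatRequest(
  task: string,
  userInput: string,
  costSensitive = false,
): ChatRequest {
  return {
    // gpt-4o for complex tasks, gpt-4o-mini when cost matters more.
    model: costSensitive ? "gpt-4o-mini" : "gpt-4o",
    messages: [
      // JSON mode expects the word "JSON" to appear somewhere in the messages.
      { role: "system", content: `${task} Respond with a single JSON object.` },
      { role: "user", content: userInput },
    ],
    response_format: { type: "json_object" },
    seed: 42, // arbitrary value; determinism is best-effort only
  };
}

const request = buildJsonChatRequest(
  "Extract the city and country.",
  "I live in Lyon, France.",
  true,
);
// With the official SDK this object would be passed to
// client.chat.completions.create(request).
console.log(JSON.stringify(request, null, 2));
```

Keeping the builder separate from the network call makes the request shape easy to unit-test and log.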
Function Calling:
- Define functions with JSON Schema in the tools parameter
- Set tool_choice: 'auto' (default when tools are present), 'required', 'none', or a specific function via { type: 'function', function: { name: '...' } }
- Handle tool_calls in assistant message: extract function name and arguments
- Parse arguments (they come as JSON string); validate before execution
- Support parallel function calls (multiple tool_calls in one response)
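As a rough sketch of the handling steps above: parse each call's argument string, validate it, run the matching handler, and return one `tool` message per `tool_call_id`. The `ToolCall` shape mirrors the assistant message's `tool_calls` entries; the `get_weather` tool, the `handlers` registry, and the function names are hypothetical and stubbed for illustration.

```typescript
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown | Promise<unknown>;

// Hypothetical registry with one stubbed tool.
const handlers = new Map<string, ToolHandler>([
  [
    "get_weather",
    (args) => {
      // Validate before executing: the model can produce malformed arguments.
      if (typeof args.city !== "string" || args.city.length === 0) {
        throw new Error("city must be a non-empty string");
      }
      return { city: args.city, forecast: "sunny" }; // stubbed result
    },
  ],
]);

async function runToolCall(call: ToolCall): Promise<ToolMessage> {
  let content: string;
  try {
    const handler = handlers.get(call.function.name);
    if (!handler) throw new Error(`unknown tool: ${call.function.name}`);
    const parsed: unknown = JSON.parse(call.function.arguments);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error("arguments must be a JSON object");
    }
    content = JSON.stringify(await handler(parsed as Record<string, unknown>));
  } catch (err) {
    // Report the failure to the model instead of crashing the loop.
    content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
  }
  return { role: "tool", tool_call_id: call.id, content };
}

// Parallel tool calls: every call gets exactly one tool message back.
async function runToolCalls(calls: ToolCall[]): Promise<ToolMessage[]> {
  return Promise.all(calls.map(runToolCall));
}

const demoCalls: ToolCall[] = [
  { id: "call_1", type: "function", function: { name: "get_weather", arguments: '{"city":"Lyon"}' } },
  { id: "call_2", type: "function", function: { name: "get_weather", arguments: "{not json" } },
];
runToolCalls(demoCalls).then((messages) => console.log(messages));
```

The returned messages would be appended after the assistant message that carried the `tool_calls`, followed by another completion request.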
Streaming:
- Use stream: true for real-time responses
- Process chunks: for await (const chunk of stream) { ... }
- Each chunk contains delta with content or tool_calls
- Use the openai SDK stream helpers: runner.on('message', ...) pattern
- Implement abort controller for cancellation
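One way to picture the chunk handling above is an accumulator that joins `delta.content` fragments and stitches tool-call argument fragments together by `index`. The `StreamChunk` type is a trimmed local stand-in for the SDK's chunk type, and `collectStream` and `fakeStream` are hypothetical names; the fake generator replaces a real `stream: true` response so the sketch runs offline.

```typescript
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface StreamChunk {
  choices: Array<{
    delta: { content?: string | null; tool_calls?: ToolCallDelta[] };
    finish_reason?: string | null;
  }>;
}

interface Collected {
  content: string;
  toolCalls: Array<{ id: string; name: string; arguments: string }>;
  aborted: boolean;
}

async function collectStream(
  stream: AsyncIterable<StreamChunk>,
  signal?: AbortSignal,
  onText: (text: string) => void = () => {},
): Promise<Collected> {
  const result: Collected = { content: "", toolCalls: [], aborted: false };
  for await (const chunk of stream) {
    if (signal?.aborted) {
      result.aborted = true; // stop consuming once the caller cancels
      break;
    }
    const delta = chunk.choices[0]?.delta;
    if (!delta) continue;
    if (delta.content) {
      result.content += delta.content;
      onText(delta.content); // e.g. push to the UI as it arrives
    }
    for (const part of delta.tool_calls ?? []) {
      // Fragments of one call share an index; arguments arrive in pieces.
      if (!result.toolCalls[part.index]) {
        result.toolCalls[part.index] = { id: "", name: "", arguments: "" };
      }
      const call = result.toolCalls[part.index];
      if (part.id) call.id = part.id;
      if (part.function?.name) call.name = part.function.name;
      if (part.function?.arguments) call.arguments += part.function.arguments;
    }
  }
  return result;
}

// Offline stand-in for a streamed response.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { choices: [{ delta: { content: "Hel" } }] };
  yield { choices: [{ delta: { content: "lo" } }] };
  yield { choices: [{ delta: { tool_calls: [{ index: 0, id: "call_1", function: { name: "get_weather", arguments: '{"ci' } }] } }] };
  yield { choices: [{ delta: { tool_calls: [{ index: 0, function: { arguments: 'ty":"Lyon"}' } }] }, finish_reason: "tool_calls" }] };
}

collectStream(fakeStream()).then((r) => console.log(r.content, r.toolCalls));
```

With the real SDK, the same `AbortController` signal would typically also be passed in the request options so the underlying HTTP request is cancelled, not just the loop.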
Embeddings:
- Use text-embedding-3-small for cost-effective embeddings (1536 dimensions)
- Use text-embedding-3-large with dimensions parameter for quality/cost trade-off
- Batch embed up to 2048 texts per request
- Embeddings are returned normalized to unit length, so dot product equals cosine similarity; re-normalize only after truncating or otherwise modifying vectors yourself
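The batching limit and the similarity math can be sketched with a few pure helpers. The function names are hypothetical; the 2048 figure is the per-request input limit stated above. Each batch would be sent as the `input` array of one embeddings request.

```typescript
const MAX_INPUTS_PER_REQUEST = 2048;

// Split texts into batches that each fit in one embeddings request.
function chunkInputs<T>(items: T[], size: number = MAX_INPUTS_PER_REQUEST): T[][] {
  if (!Number.isInteger(size) || size < 1) throw new Error("size must be a positive integer");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Scale a vector to unit length; needed after manually truncating dimensions.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(dot(v, v));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// For unit-length vectors this equals dot(a, b); normalizing first makes it
// safe for vectors that were truncated or otherwise modified.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(normalize(a), normalize(b));
}

const batches = chunkInputs(Array.from({ length: 5000 }, (_, i) => `text ${i}`));
console.log(batches.map((b) => b.length)); // [2048, 2048, 904]
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```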
Cost Management:
- Count tokens before sending with tiktoken library
- Use max_tokens to cap response length (reasoning models such as the o-series take max_completion_tokens instead)
- Implement per-user spending limits with token tracking
- Use Batch API for 50% discount on non-time-sensitive requests
- Cache identical prompts; use prompt caching for repeated prefixes
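Per-user limits and identical-prompt caching could look roughly like the following in-memory sketch. `TokenBudget` and `cachedCompletion` are hypothetical names, and a production system would presumably persist both in a shared store; the `prompt_tokens` and `completion_tokens` fields match the `usage` object returned with a completion.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Tracks tokens spent per user against a fixed limit (in-memory only).
class TokenBudget {
  private readonly limitPerUser: number;
  private readonly spent = new Map<string, number>();

  constructor(limitPerUser: number) {
    this.limitPerUser = limitPerUser;
  }

  remaining(userId: string): number {
    return Math.max(0, this.limitPerUser - (this.spent.get(userId) ?? 0));
  }

  // Check before sending, using a token estimate (e.g. from tiktoken).
  canSpend(userId: string, estimatedTokens: number): boolean {
    return estimatedTokens <= this.remaining(userId);
  }

  // Record the actual usage reported in the API response.
  record(userId: string, usage: Usage): void {
    const total = usage.prompt_tokens + usage.completion_tokens;
    this.spent.set(userId, (this.spent.get(userId) ?? 0) + total);
  }
}

const completionCache = new Map<string, Promise<string>>();

// Reuse the in-flight or finished result for an identical request key.
function cachedCompletion(key: string, compute: () => Promise<string>): Promise<string> {
  let hit = completionCache.get(key);
  if (!hit) {
    hit = compute().catch((err) => {
      completionCache.delete(key); // do not cache failures
      throw err;
    });
    completionCache.set(key, hit);
  }
  return hit;
}

const budget = new TokenBudget(1000);
budget.record("user-1", { prompt_tokens: 300, completion_tokens: 200 });
console.log(budget.remaining("user-1")); // 500
```

A cache key built from the model name plus the serialized messages, for example `JSON.stringify([model, messages])`, keeps distinct prompts from colliding.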
Error Handling:
- Retry on 429 (rate limit) with exponential backoff respecting Retry-After header
- Handle 500/503 with retries (transient server errors)
- Catch content_filter finish_reason for policy violations
- Set explicit request timeouts (for example 60s, longer for complex tasks); the official SDKs otherwise default to 10 minutes
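The retry rules above can be sketched as a small wrapper. This is an illustration under assumptions: `RetryableError` is a hypothetical error shape (the caller is assumed to have copied the status code and any Retry-After value onto it), and the base delay, cap, and retry count are arbitrary. The official SDKs already retry some failures on their own, so a wrapper like this is mainly relevant for raw HTTP clients or custom policies.

```typescript
// Hypothetical error shape carrying the HTTP status and Retry-After value.
interface RetryableError {
  status?: number;
  retryAfterSeconds?: number;
}

const RETRYABLE_STATUSES = new Set([429, 500, 503]);

// Delay before the next attempt: honor Retry-After when present,
// otherwise double the base delay per attempt, capped at maxMs.
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  baseMs = 500,
  maxMs = 30_000,
): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return Math.min(maxMs, retryAfterSeconds * 1000);
  }
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  // Injectable so tests do not have to wait on real timers.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const e = err as RetryableError;
      const retryable = e?.status !== undefined && RETRYABLE_STATUSES.has(e.status);
      // Client errors such as 400 or 401 will not succeed on retry.
      if (!retryable || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, e.retryAfterSeconds));
    }
  }
}

console.log([0, 1, 2, 3].map((attempt) => backoffDelayMs(attempt))); // [500, 1000, 2000, 4000]
```

Adding random jitter to the computed delay is a common refinement when many clients may retry at the same moment.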
Add to your project root CLAUDE.md file, or append to an existing one.