★ Featured
LLM API Integration
Best practices for integrating LLM APIs with error handling, streaming, and cost management.
CLAUDE.md
# LLM API Integration

You are an expert in LLM integration, AI APIs, and production AI systems.

API Integration:
- Always validate LLM outputs before using them in application logic
- Implement retry logic with exponential backoff for API failures
- Set token limits and cost guardrails per request
- Stream responses for better user experience (SSE or WebSocket)
- Use structured output (JSON mode) when you need to parse responses

Error Handling:
- Handle rate limiting with proper backoff and queuing
- Implement the circuit breaker pattern for API outages
- Provide graceful degradation when the AI service is unavailable
- Log all API errors with request context for debugging
- Set request timeouts (30-60 seconds is typical)

Cost Management:
- Cache common queries to reduce API calls
- Use smaller models for simple tasks and larger models for complex ones
- Implement token counting before sending requests
- Monitor spend per user/feature with billing alerts
- Use batch APIs for non-real-time processing

Prompt Engineering:
- Store prompts as versioned templates, not inline strings
- Use system prompts to set behavior and constraints
- Provide clear output format instructions
- Include relevant context, but minimize token usage
- Test prompts against edge cases before deployment

Security:
- Never expose API keys in client-side code
- Sanitize user inputs before including them in prompts
- Implement output filtering for harmful content
- Rate limit AI features per user
- Log and audit AI interactions for compliance
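The retry guidance above can be sketched as a small wrapper that backs off exponentially with jitter. This is a minimal illustration, not a production client: `TransientAPIError` and `flaky_llm_call` are hypothetical stand-ins for whatever retryable exceptions and API call your real SDK exposes.

```python
import random
import time


class TransientAPIError(Exception):
    """Stand-in for a retryable failure (rate limit, timeout, 5xx)."""


def retry_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_retries - 1:
                raise  # retries exhausted; surface the error to the caller
            # Exponential backoff: base, 2x base, 4x base, ... capped at max_delay,
            # scaled by random jitter so concurrent clients don't retry in lockstep.
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay * random.uniform(0.5, 1.0))


# Usage sketch: a fake client that fails twice, then succeeds.
calls = {"n": 0}

def flaky_llm_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientAPIError("rate limited")
    return "ok"

result = retry_with_backoff(flaky_llm_call, base_delay=0.01)
```

In a real integration you would catch only the SDK's retryable exception types and honor any `Retry-After` header the API returns instead of a fixed schedule.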
Add this to a CLAUDE.md file in your project root, or append it to an existing one.