Claude API & Anthropic SDK

Claude API integration with Messages API, streaming, tool use, vision, and extended thinking.


Updated 2026-04-05

CLAUDE.md

# Claude API & Anthropic SDK

You are an expert in the Anthropic Claude API, SDK patterns, and production Claude deployments.

Messages API:
- Use the Messages API (not legacy Completions): client.messages.create()
- Set max_tokens explicitly on every request (required parameter)
- Use system parameter for persistent instructions (not a user message)
- Prefer claude-sonnet-4-20250514 for balanced quality/speed, claude-opus-4-20250514 for complex tasks
- Use temperature 0 for the most consistent outputs (results are still not guaranteed to be fully deterministic), 0.7-1.0 for creative tasks
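
As a rough illustration of the rules above, the sketch below builds a request body with max_tokens always set and the system prompt kept as a top-level parameter. The helper name `buildRequest` and its defaults are hypothetical, chosen only for this example; the field names follow the Messages API.

```typescript
// Minimal request shape covering only the fields discussed above.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface MessageRequest {
  model: string;
  max_tokens: number; // required on every request
  system?: string; // top-level parameter, not a message
  temperature?: number;
  messages: Message[];
}

// Hypothetical helper (not part of the SDK): fills in explicit defaults.
function buildRequest(
  userText: string,
  options: { system?: string; maxTokens?: number; temperature?: number; model?: string } = {}
): MessageRequest {
  const temperature = options.temperature ?? 0;
  if (temperature < 0 || temperature > 1) {
    throw new RangeError("temperature must be between 0 and 1");
  }
  const request: MessageRequest = {
    model: options.model ?? "claude-sonnet-4-20250514",
    max_tokens: options.maxTokens ?? 1024,
    temperature,
    messages: [{ role: "user", content: userText }],
  };
  if (options.system !== undefined) {
    request.system = options.system;
  }
  return request;
}

const exampleRequest = buildRequest("Summarize this changelog.", {
  system: "You are a concise release-notes writer.",
});
```

With the official TypeScript SDK, a body like this would typically be passed to client.messages.create(exampleRequest), where client is an Anthropic instance from the @anthropic-ai/sdk package.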

Streaming:
- Use client.messages.stream() for real-time token delivery
- Handle stream events: message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop
- Accumulate the text_delta chunks from content_block_delta events to build the full response
- Implement client-side buffering for smooth rendering
- Set up SSE endpoint to forward stream to browser clients
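
The accumulation step above can be sketched without any network code. The event types below are a deliberately small subset written out for this example (the SDK's own types are richer), and `TextAccumulator` is a hypothetical helper name.

```typescript
// Assumed minimal subset of stream event shapes, for illustration only.
type StreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number }
  | {
      type: "content_block_delta";
      index: number;
      delta:
        | { type: "text_delta"; text: string }
        | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "content_block_stop"; index: number }
  | { type: "message_delta" }
  | { type: "message_stop" };

// Hypothetical helper: collects text deltas and tracks completion.
class TextAccumulator {
  private chunks: string[] = [];
  private done = false;

  // Returns the newly received text, or "" for non-text events.
  push(streamEvent: StreamEvent): string {
    if (streamEvent.type === "content_block_delta" && streamEvent.delta.type === "text_delta") {
      this.chunks.push(streamEvent.delta.text);
      return streamEvent.delta.text;
    }
    if (streamEvent.type === "message_stop") {
      this.done = true;
    }
    return "";
  }

  text(): string {
    return this.chunks.join("");
  }

  isFinished(): boolean {
    return this.done;
  }
}

// Hand-written sample events standing in for a real stream.
const sampleEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_start", index: 0 },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "lo" } },
  { type: "content_block_stop", index: 0 },
  { type: "message_stop" },
];

const accumulator = new TextAccumulator();
for (const sampleEvent of sampleEvents) {
  accumulator.push(sampleEvent);
}
```

In a real integration the events would come from iterating the object returned by client.messages.stream(); the value returned by push() is what a server would forward to the browser over SSE.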

Tool Use (Function Calling):
- Define tools with JSON Schema in the tools parameter
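- Handle tool_use content blocks: extract the block id, tool name, and input
- Execute the tool, then send a tool_result block (referencing the tool_use_id) back in the next user message
- Handle several tool_use blocks in one response, and keep looping while stop_reason is 'tool_use' to support sequential calls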
- Validate tool inputs before execution; return error results for invalid calls
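
One possible shape for the tool loop described above is sketched here. The `get_weather` tool, its stubbed result, and the `runTools` helper are all hypothetical; the block shapes (tool_use, tool_result, is_error, input_schema) follow the Messages API.

```typescript
interface TextBlock { type: "text"; text: string; }
interface ToolUseBlock { type: "tool_use"; id: string; name: string; input: Record<string, unknown>; }
type ContentBlock = TextBlock | ToolUseBlock;

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Hypothetical tool definition, passed in the tools parameter of a request.
const weatherTool = {
  name: "get_weather",
  description: "Get the current forecast for a city.",
  input_schema: {
    type: "object",
    properties: { city: { type: "string" } },
    required: ["city"],
  },
};

type ToolHandler = (input: Record<string, unknown>) => string;

// Hypothetical registry: each handler validates its input before running.
const toolHandlers: Record<string, ToolHandler> = {
  get_weather: (input) => {
    const city = input["city"];
    if (typeof city !== "string" || city.length === 0) {
      throw new Error("city must be a non-empty string");
    }
    return JSON.stringify({ city, forecast: "sunny" }); // stubbed result
  },
};

// Builds the user message that answers every tool_use block in a response.
function runTools(content: ContentBlock[]): { role: "user"; content: ToolResultBlock[] } {
  const results: ToolResultBlock[] = [];
  for (const block of content) {
    if (block.type !== "tool_use") continue;
    if (!Object.prototype.hasOwnProperty.call(toolHandlers, block.name)) {
      results.push({ type: "tool_result", tool_use_id: block.id, content: `Unknown tool: ${block.name}`, is_error: true });
      continue;
    }
    try {
      const output = toolHandlers[block.name](block.input);
      results.push({ type: "tool_result", tool_use_id: block.id, content: output });
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      results.push({ type: "tool_result", tool_use_id: block.id, content: reason, is_error: true });
    }
  }
  return { role: "user", content: results };
}
```

The returned message is appended to the conversation after the assistant's response, and the request is repeated until stop_reason is no longer 'tool_use'.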

Extended Thinking:
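- Enable with thinking: { type: 'enabled', budget_tokens: N }, where N is at least 1024 and less than max_tokens
- Thinking tokens are billed as output tokens; the reasoning is returned in thinking content blocks ahead of the text blocks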
- Use for complex reasoning, math, code analysis, and planning tasks
- Thinking budget should be proportional to task complexity
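
A small sketch of how a thinking budget might be validated and how thinking blocks might be separated from the answer. The 1024-token minimum and the rule that the budget stays below max_tokens reflect the documented limits as understood at the time of writing and should be treated as assumptions; `thinkingParams` and `splitThinking` are hypothetical helper names.

```typescript
interface ThinkingConfig {
  type: "enabled";
  budget_tokens: number;
}

// Assumed documented minimum; check the current API reference before relying on it.
const MIN_THINKING_BUDGET = 1024;

// Hypothetical helper: returns the two request fields that must agree.
function thinkingParams(
  maxTokens: number,
  budgetTokens: number
): { max_tokens: number; thinking: ThinkingConfig } {
  if (budgetTokens < MIN_THINKING_BUDGET) {
    throw new RangeError(`budget_tokens must be at least ${MIN_THINKING_BUDGET}`);
  }
  if (budgetTokens >= maxTokens) {
    throw new RangeError("budget_tokens must be less than max_tokens");
  }
  return { max_tokens: maxTokens, thinking: { type: "enabled", budget_tokens: budgetTokens } };
}

// Minimal subset of response blocks relevant to thinking (assumed for this sketch).
type ResponseBlock =
  | { type: "thinking"; thinking: string; signature: string }
  | { type: "text"; text: string };

// Separates the reasoning from the final answer text.
function splitThinking(content: ResponseBlock[]): { thinking: string; answer: string } {
  const thinkingParts: string[] = [];
  const answerParts: string[] = [];
  for (const block of content) {
    if (block.type === "thinking") {
      thinkingParts.push(block.thinking);
    } else {
      answerParts.push(block.text);
    }
  }
  return { thinking: thinkingParts.join("\n"), answer: answerParts.join("") };
}
```

The result of thinkingParams can be spread into a request body alongside model and messages, which keeps the budget and max_tokens consistent in one place.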

Vision:
- Pass images as base64 in content blocks: { type: 'image', source: { type: 'base64', ... } }
- Support JPEG, PNG, GIF, WebP formats
- Resize images before sending to reduce token usage
- Use vision for document analysis, chart reading, UI screenshots
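
To make the image block shape concrete, here is a minimal sketch that validates the media type and pairs the image with a question. The helper name `imageMessage` is hypothetical, and the base64 string in the usage line is only a placeholder (the PNG signature bytes), not a complete image.

```typescript
const SUPPORTED_MEDIA_TYPES = ["image/jpeg", "image/png", "image/gif", "image/webp"] as const;
type ImageMediaType = (typeof SUPPORTED_MEDIA_TYPES)[number];

interface ImageBlock {
  type: "image";
  source: { type: "base64"; media_type: ImageMediaType; data: string };
}
interface QuestionBlock {
  type: "text";
  text: string;
}

function isSupportedMediaType(value: string): value is ImageMediaType {
  return (SUPPORTED_MEDIA_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: builds a user message with the image placed before the text.
function imageMessage(
  base64Data: string,
  mediaType: string,
  question: string
): { role: "user"; content: Array<ImageBlock | QuestionBlock> } {
  if (!isSupportedMediaType(mediaType)) {
    throw new Error(`Unsupported image type: ${mediaType}`);
  }
  if (base64Data.length === 0) {
    throw new Error("Image data must not be empty");
  }
  const image: ImageBlock = {
    type: "image",
    source: { type: "base64", media_type: mediaType, data: base64Data },
  };
  return { role: "user", content: [image, { type: "text", text: question }] };
}

// Placeholder data only; a real call would pass the full base64-encoded file.
const chartQuestion = imageMessage("iVBORw0KGgo=", "image/png", "What trend does this chart show?");
```

Resizing would happen before this step, so that the base64 payload handed to imageMessage is already the reduced image.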

Best Practices:
- Implement exponential backoff for 429 (rate limit) and 529 (overloaded) errors
- Cache responses for identical prompts when appropriate
- Use prompt caching (cache_control) for repeated system prompts
- Monitor token usage per request for cost tracking
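
Two of the practices above, sketched as pure functions: a capped exponential backoff schedule with jitter for 429 and 529 responses, and a system prompt block marked for prompt caching. The function names, the 500 ms base, the 30 s cap, and the attempt limit are assumptions made for this example.

```typescript
function isRetryableStatus(statusCode: number): boolean {
  return statusCode === 429 || statusCode === 529;
}

// Full-jitter backoff: a random delay between 0 and min(cap, base * 2^attempt).
// The random source is injectable so the schedule can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Returns the delay before the next retry, or null when the caller should give up.
function nextRetryDelay(
  statusCode: number,
  attempt: number,
  maxAttempts: number = 5,
  random: () => number = Math.random
): number | null {
  if (!isRetryableStatus(statusCode) || attempt >= maxAttempts - 1) {
    return null;
  }
  return backoffDelayMs(attempt, 500, 30000, random);
}

interface CachedSystemBlock {
  type: "text";
  text: string;
  cache_control: { type: "ephemeral" };
}

// Wraps a long, repeated system prompt so it can be sent as a cacheable block.
function cachedSystem(text: string): CachedSystemBlock[] {
  return [{ type: "text", text, cache_control: { type: "ephemeral" } }];
}
```

A request loop would call nextRetryDelay with the status of a failed call and the zero-based attempt number, wait for the returned delay, and rethrow once it gets null. The official SDKs also retry some errors on their own, so a custom loop like this is mainly useful when that built-in behavior is disabled or needs different limits.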

Add this to a CLAUDE.md file in your project root, or append it to an existing one.
