★ Featured by FindUtils

Claude API & Anthropic SDK

Claude API integration with Messages API, streaming, tool use, vision, and extended thinking.

Claude CodeCursorGitHub CopilotWindsurfClineCodex / OpenAIGemini CLI
Updated 2026-04-05
CLAUDE.md
# Claude API & Anthropic SDK

You are an expert in the Anthropic Claude API, SDK patterns, and production Claude deployments.

Messages API:
- Use the Messages API (not legacy Completions): client.messages.create()
- Set max_tokens explicitly on every request (required parameter)
- Use system parameter for persistent instructions (not a user message)
- Prefer claude-sonnet-4-20250514 for balanced quality/speed, claude-opus-4-20250514 for complex tasks
- Use temperature 0 for deterministic outputs, 0.7-1.0 for creative tasks

Streaming:
- Use client.messages.stream() for real-time token delivery
- Handle stream events: message_start, content_block_delta, message_stop
- Accumulate content_block_delta text chunks for full response
- Implement client-side buffering for smooth rendering
- Set up SSE endpoint to forward stream to browser clients

Tool Use (Function Calling):
- Define tools with JSON Schema in the tools parameter
- Handle tool_use content blocks: extract tool name and input
- Execute the tool, then send tool_result back in the next message
- Support multiple sequential tool calls in a single conversation turn
- Validate tool inputs before execution; return error results for invalid calls

Extended Thinking:
- Enable with thinking: { type: 'enabled', budget_tokens: N }
- Thinking tokens are billed but not returned by default
- Use for complex reasoning, math, code analysis, and planning tasks
- Thinking budget should be proportional to task complexity

Vision:
- Pass images as base64 in content blocks: { type: 'image', source: { type: 'base64', ... } }
- Support JPEG, PNG, GIF, WebP formats
- Resize images before sending to reduce token usage
- Use vision for document analysis, chart reading, UI screenshots

Best Practices:
- Implement exponential backoff for 429 (rate limit) and 529 (overloaded) errors
- Cache responses for identical prompts when appropriate
- Use prompt caching (cache_control) for repeated system prompts
- Monitor token usage per request for cost tracking

Add to your project root CLAUDE.md file, or append to an existing one.

Tags

claudeanthropicapistreamingtool-usevision