Claude Python SDK — Quickstart Guide

Get started with the Anthropic Python SDK in 2026: install, send messages, stream responses, use prompt caching, and estimate costs.

🔥 Launch tonight — Power Prompts PDF 50p (just 50p tonight)30 battle-tested Claude Code prompts · 8 pages · paste into CLAUDE.md · price reverts to £5

The official Anthropic Python SDK (anthropic on PyPI) wraps the Messages API with typed responses, automatic retries, and streaming helpers.

Install

pip install anthropic

Requires Python 3.8+. Set your key via environment variable:

export ANTHROPIC_API_KEY="sk-ant-api03-..."

Basic message

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain prompt caching in one paragraph."}]
)
print(message.content[0].text)

The SDK reads ANTHROPIC_API_KEY from the environment automatically. Passing no api_key argument is intentional — keeps keys out of code.

Streaming responses

with client.messages.stream(
    model="claude-sonnet-4-6-20250514",
    max_tokens=512,
    messages=[{"role": "user", "content": "Write a haiku."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Prompt caching (90% off repeated context)

message = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "You are a helpful assistant. " + long_system_prompt,
        "cache_control": {"type": "ephemeral"}   # cache this block
    }],
    messages=[{"role": "user", "content": user_query}]
)
# usage.cache_read_input_tokens shows tokens served from cache (90% discount)

Adding cache_control: ephemeral tells Anthropic to cache the block for 5 minutes. On the second request with the same cached prefix, the SDK response includes usage.cache_read_input_tokens — those tokens are billed at 10% of standard input price. See the prompt caching explainer for break-even math.

Async client

import asyncio
import anthropic

async def main():
    async with anthropic.AsyncAnthropic() as client:
        message = await client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=256,
            messages=[{"role": "user", "content": "Hello"}]
        )
        print(message.content[0].text)

asyncio.run(main())

Estimate your Python SDK costs

Use the Claude Cost Calculator to model token counts and monthly spend before scaling. The Prompt-Pricing Recommender tells you which model tier (Haiku / Sonnet / Opus) fits each request type.

Frequently asked questions

What Python version does the Anthropic SDK require?
Python 3.8 or higher. The SDK uses modern type hints and async/await syntax throughout. If you're on 3.7 or earlier, upgrade before installing.
How do I handle rate limit errors in Python?
The SDK automatically retries on 429 (rate limit) and 529 (overload) with exponential back-off. You can configure max_retries and timeout on the Anthropic() client constructor. For sustained high volume, upgrade your usage tier in console.anthropic.com.
Is there an async version of the Anthropic Python client?
Yes — use anthropic.AsyncAnthropic() with await for all methods. The streaming equivalent is AsyncAnthropic().messages.stream(). The async client is a drop-in replacement for code that already uses asyncio.
How do I pass a system prompt in the Python SDK?
Pass system as the second keyword argument to messages.create(), either as a plain string or as a list of content blocks (use the list form when you want to add cache_control to the system prompt).

Free tools

Cost Calculator → Prompt-Pricing Recommender → Diff Summarizer → Skills Browser →

Related

Claude Opus 4.7 vs Sonnet 4.6 Pricing (2026 Comparison)How Much Does Claude Cost? (2026 API Pricing Guide)Claude Prompt Caching: 90% Cost Savings Explained (2026)Claude API Cost Calculator: Estimate Your Anthropic BillClaude vs GPT-4 Pricing: 2026 API Cost Comparison