Claude Python SDK Quickstart (Anthropic API 2026)

Get started with the Anthropic Python SDK in 2026: install, send messages, stream responses, use prompt caching, and estimate costs.

The official Anthropic Python SDK (anthropic on PyPI) wraps the Messages API with typed responses, automatic retries, and streaming helpers.

Install

Install the package from PyPI:

    pip install anthropic

Basic message

    import anthropic

    # Reads ANTHROPIC_API_KEY from the environment
    client = anthropic.Anthropic()

    message = client.messages.create(
        model="claude-sonnet-4-6-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Explain prompt caching in one paragraph."}],
    )
    print(message.content[0].text)

The SDK reads ANTHROPIC_API_KEY from the environment automatically. Omitting the api_key argument is intentional: it keeps keys out of source code.
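As a small illustration of the environment-variable approach, the sketch below checks that the key is present before a client is created, so a missing key fails early with a clear message. The helper name require_api_key is hypothetical and not part of the SDK.

```python
import os


def require_api_key(env=None) -> str:
    """Return ANTHROPIC_API_KEY from the environment, or raise if it is unset.

    `require_api_key` is a hypothetical helper, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before creating the client"
        )
    return key
```

One way to use it is to call require_api_key() once at startup and then construct anthropic.Anthropic() with no arguments, as in the basic example.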

Streaming responses

    with client.messages.stream(
        model="claude-sonnet-4-6-20250514",
        max_tokens=512,
        messages=[{"role": "user", "content": "Write a haiku."}],
    ) as stream:
        # Print each text chunk as soon as it arrives
        for text in stream.text_stream:
            print(text, end="", flush=True)

Prompt caching (90% off repeated context)

    # long_system_prompt and user_query are strings supplied by your application.
    message = client.messages.create(
        model="claude-sonnet-4-6-20250514",
        max_tokens=1024,
        system=[{
            "type": "text",
            "text": "You are a helpful assistant. " + long_system_prompt,
            "cache_control": {"type": "ephemeral"},  # cache this block
        }],
        messages=[{"role": "user", "content": user_query}],
    )

    # Tokens served from cache (90% discount)
    print(message.usage.cache_read_input_tokens)

Setting cache_control to {"type": "ephemeral"} tells Anthropic to cache the block for 5 minutes. When a later request within that window repeats the same prefix, the response reports the number of tokens served from cache in usage.cache_read_input_tokens; those tokens are billed at 10% of the standard input price. See the prompt caching explainer for break-even math.
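To make the discount concrete, here is a minimal sketch that converts a usage object into "full-price input token equivalents". The Usage dataclass is a stand-in whose field names mirror the SDK's usage object. The 10% read multiplier comes from the text above; the 25% surcharge on cache writes is an assumption to verify against current pricing.

```python
from dataclasses import dataclass
from typing import Optional

CACHE_READ_MULTIPLIER = 0.10   # from the text: cached reads cost 10% of input price
CACHE_WRITE_MULTIPLIER = 1.25  # assumption: cache writes carry a 25% surcharge


@dataclass
class Usage:
    """Stand-in for message.usage, used so the sketch runs without an API call."""
    input_tokens: int
    cache_creation_input_tokens: Optional[int] = 0
    cache_read_input_tokens: Optional[int] = 0


def billed_input_equivalent(usage) -> float:
    """Input-side tokens weighted by how each category is billed."""
    written = usage.cache_creation_input_tokens or 0  # cache fields may be None
    read = usage.cache_read_input_tokens or 0
    return (
        usage.input_tokens
        + written * CACHE_WRITE_MULTIPLIER
        + read * CACHE_READ_MULTIPLIER
    )


first_call = Usage(input_tokens=50, cache_creation_input_tokens=2000)
second_call = Usage(input_tokens=50, cache_read_input_tokens=2000)
print(billed_input_equivalent(first_call), billed_input_equivalent(second_call))
```

Under these assumptions the first call costs slightly more than an uncached request because of the write surcharge, and each later cache hit costs far less. The same function should accept a real message.usage object.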

Async client

    import asyncio
    import anthropic

    async def main():
        # The async client is closed automatically when the block exits
        async with anthropic.AsyncAnthropic() as client:
            message = await client.messages.create(
                model="claude-haiku-4-5-20251001",
                max_tokens=256,
                messages=[{"role": "user", "content": "Hello"}],
            )
            print(message.content[0].text)

    asyncio.run(main())

Estimate your Python SDK costs
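A rough estimate can be computed from token counts and per-million-token prices. The sketch below is one way to do that. The function name estimate_cost_usd and the prices in the example call are placeholders for illustration, not published rates, and cache-write surcharges are left out for simplicity.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    cache_read_tokens: int = 0,
    cache_read_multiplier: float = 0.10,
) -> float:
    """Estimate spend in USD; prices are per million tokens.

    cache_read_tokens are billed at cache_read_multiplier times the input
    price (10% per the caching section). Cache-write surcharges are ignored.
    """
    per_token_in = input_price_per_mtok / 1_000_000
    per_token_out = output_price_per_mtok / 1_000_000
    return (
        input_tokens * per_token_in
        + cache_read_tokens * per_token_in * cache_read_multiplier
        + output_tokens * per_token_out
    )


# Placeholder prices for illustration only; look up current rates for your model.
monthly = estimate_cost_usd(
    input_tokens=2_000_000,
    output_tokens=500_000,
    input_price_per_mtok=3.00,
    output_price_per_mtok=15.00,
    cache_read_tokens=10_000_000,
)
print(f"Estimated monthly cost: ${monthly:.2f}")
```

The token counts can be taken from message.usage on real responses and summed over a representative sample of traffic.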

Frequently asked questions

What Python version does the Anthropic SDK require?

Python 3.8 or higher. The SDK uses modern type hints and async/await syntax throughout. If you're on 3.7 or earlier, upgrade before installing.

How do I handle rate limit errors in Python?

The SDK automatically retries on 429 (rate limit) and 529 (overload) with exponential back-off. You can configure max_retries and timeout on the Anthropic() client constructor. For sustained high volume, upgrade your usage tier in console.anthropic.com.
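The sketch below shows the two knobs mentioned above, plus an illustration of what an exponential back-off schedule looks like. The base and cap values in backoff_delays are illustrative assumptions, not the SDK's internal constants, and real clients typically add random jitter. Both function names are hypothetical.

```python
from typing import List


def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Delay in seconds before each retry: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def make_client(max_retries: int = 5, timeout_seconds: float = 30.0):
    """Build a client with custom retry and timeout settings."""
    import anthropic  # imported here so the schedule helper runs without the SDK

    return anthropic.Anthropic(max_retries=max_retries, timeout=timeout_seconds)


print(backoff_delays(5))
```

Raising max_retries lengthens the worst-case wait before an error reaches your code, so it is worth choosing it together with timeout.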

Is there an async version of the Anthropic Python client?

Yes. Use anthropic.AsyncAnthropic() and await each API call. For streaming, call messages.stream() on the async client inside an async with block and iterate with async for. The async client exposes the same methods as the synchronous one, so it fits naturally into code that already uses asyncio.
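A common reason to reach for the async client is to run several requests concurrently. The sketch below caps the number of calls in flight with a semaphore. run_limited is a hypothetical helper, and fake_call stands in for an API request so the example runs without network access.

```python
import asyncio


async def run_limited(jobs, limit: int):
    """Run zero-argument coroutine functions with at most `limit` in flight.

    Results come back in the same order as `jobs`.
    """
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        async with sem:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_call(n: int) -> int:
    """Stand-in for an API request."""
    await asyncio.sleep(0)
    return n * 2


results = asyncio.run(run_limited([lambda n=n: fake_call(n) for n in range(5)], limit=2))
print(results)
```

With a real AsyncAnthropic client, each job would be a lambda that returns client.messages.create(...) with your chosen arguments.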

How do I pass a system prompt in the Python SDK?

Pass system as a keyword argument to messages.create(), either as a plain string or as a list of content blocks. It is a top-level parameter, not a message with a "system" role. Use the list form when you want to add cache_control to the system prompt.
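The two forms can be produced by one small helper, sketched below. build_system is a hypothetical name; the block structure follows the prompt caching example earlier in this guide.

```python
from typing import Dict, List, Union

SystemParam = Union[str, List[Dict[str, object]]]


def build_system(text: str, cache: bool = False) -> SystemParam:
    """Return a system prompt as a plain string, or as a cacheable block list."""
    if not cache:
        return text
    return [
        {
            "type": "text",
            "text": text,
            "cache_control": {"type": "ephemeral"},  # mark the block for caching
        }
    ]


print(build_system("You are a helpful assistant."))
print(build_system("You are a helpful assistant.", cache=True))
```

The return value can be passed directly as system=build_system(prompt, cache=True) in a messages.create() call.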
