Get started with the Anthropic Python SDK in 2026: install, send messages, stream responses, use prompt caching, and estimate costs.
The official Anthropic Python SDK (anthropic on PyPI) wraps the Messages API with typed responses, automatic retries, and streaming helpers.
pip install anthropic
Requires Python 3.8+. Set your key via environment variable:
export ANTHROPIC_API_KEY="sk-ant-api03-..."
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-sonnet-4-6-20250514",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain prompt caching in one paragraph."}]
)
print(message.content[0].text)
The SDK reads ANTHROPIC_API_KEY from the environment automatically. Passing no api_key argument is intentional — keeps keys out of code.
with client.messages.stream(
model="claude-sonnet-4-6-20250514",
max_tokens=512,
messages=[{"role": "user", "content": "Write a haiku."}]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
message = client.messages.create(
model="claude-sonnet-4-6-20250514",
max_tokens=1024,
system=[{
"type": "text",
"text": "You are a helpful assistant. " + long_system_prompt,
"cache_control": {"type": "ephemeral"} # cache this block
}],
messages=[{"role": "user", "content": user_query}]
)
# usage.cache_read_input_tokens shows tokens served from cache (90% discount)
Adding cache_control: ephemeral tells Anthropic to cache the block for 5 minutes. On the second request with the same cached prefix, the SDK response includes usage.cache_read_input_tokens — those tokens are billed at 10% of standard input price. See the prompt caching explainer for break-even math.
import asyncio
import anthropic
async def main():
async with anthropic.AsyncAnthropic() as client:
message = await client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=256,
messages=[{"role": "user", "content": "Hello"}]
)
print(message.content[0].text)
asyncio.run(main())
Use the Claude Cost Calculator to model token counts and monthly spend before scaling. The Prompt-Pricing Recommender tells you which model tier (Haiku / Sonnet / Opus) fits each request type.