Claude API Async Python: AsyncAnthropic Client & asyncio Patterns

Using Claude API with Python asyncio (AsyncAnthropic)

Use AsyncAnthropic to call Claude in async Python applications. This guide covers concurrent requests with asyncio.gather, async streaming, FastAPI integration, and performance tips.

The Anthropic Python SDK ships an AsyncAnthropic client with the same interface as the sync version, except that its request methods return coroutines you must await. Use it whenever your code runs inside asyncio: FastAPI, Starlette, or any other event-loop-based server.

Basic async call

import asyncio

import anthropic


async def main():
    client = anthropic.AsyncAnthropic()
    message = await client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(message.content[0].text)


asyncio.run(main())

Concurrent requests with asyncio.gather

This is the main reason to use async: start N requests concurrently and collect the results, rather than waiting for each one in turn. asyncio.gather returns the results in the same order as its inputs.

import asyncio

import anthropic

client = anthropic.AsyncAnthropic()


async def summarize(text: str, label: str) -> str:
    msg = await client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=256,
        messages=[{"role": "user", "content": f"Summarise in 2 sentences: {text}"}],
    )
    return f"{label}: {msg.content[0].text}"


async def main():
    docs = [
        ("Long document one...", "doc-1"),
        ("Long document two...", "doc-2"),
        ("Long document three...", "doc-3"),
    ]
    results = await asyncio.gather(*[summarize(text, label) for text, label in docs])
    for r in results:
        print(r)


asyncio.run(main())
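To see what concurrency buys without spending API credits, here is a minimal sketch that swaps the real request for a hypothetical fake_claude_call coroutine that only sleeps. The function names and the 0.2-second latency are assumptions made for illustration, not part of the SDK. It runs the same three prompts one at a time and then through asyncio.gather, and reports both timings.

```python
import asyncio
import time


async def fake_claude_call(prompt: str, latency: float = 0.2) -> str:
    """Hypothetical stand-in for client.messages.create(); it only sleeps."""
    await asyncio.sleep(latency)  # simulates network and model latency
    return f"summary of {prompt!r}"


async def run_sequential(prompts: list[str]) -> tuple[list[str], float]:
    start = time.perf_counter()
    # Each call finishes before the next one starts.
    results = [await fake_claude_call(p) for p in prompts]
    return results, time.perf_counter() - start


async def run_concurrent(prompts: list[str]) -> tuple[list[str], float]:
    start = time.perf_counter()
    # gather schedules every coroutine up front and keeps input order.
    results = await asyncio.gather(*(fake_claude_call(p) for p in prompts))
    return list(results), time.perf_counter() - start


async def compare() -> tuple[float, float]:
    prompts = ["doc-1", "doc-2", "doc-3"]
    seq_results, seq_time = await run_sequential(prompts)
    con_results, con_time = await run_concurrent(prompts)
    assert seq_results == con_results  # same answers, same order
    return seq_time, con_time


if __name__ == "__main__":
    seq_time, con_time = asyncio.run(compare())
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

With simulated latency, the sequential run takes roughly the sum of the individual delays, while the concurrent run takes roughly as long as the slowest single call. Real requests vary more, but the shape of the saving is the same.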

Async streaming

import asyncio

import anthropic

client = anthropic.AsyncAnthropic()


async def stream_response():
    async with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Write a poem about asyncio."}],
    ) as stream:
        async for text in stream.text_stream:
            print(text, end="", flush=True)
    print()  # newline at end


asyncio.run(stream_response())

FastAPI integration

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import anthropic

app = FastAPI()
client = anthropic.AsyncAnthropic()


@app.post("/summarize")
async def summarize(body: dict):
    message = await client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=512,
        messages=[{"role": "user", "content": f"Summarise: {body['text']}"}],
    )
    return {"summary": message.content[0].text}


@app.post("/stream")
async def stream_chat(body: dict):
    async def generate():
        async with client.messages.stream(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            messages=[{"role": "user", "content": body["prompt"]}],
        ) as stream:
            async for text in stream.text_stream:
                yield text

    return StreamingResponse(generate(), media_type="text/plain")

Rate-limit-aware concurrent pool

import asyncio

import anthropic

client = anthropic.AsyncAnthropic()


async def bounded_call(sem: asyncio.Semaphore, prompt: str) -> str:
    async with sem:
        msg = await client.messages.create(
            model="claude-haiku-4-5",
            max_tokens=256,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text


async def main(prompts: list[str], concurrency: int = 10):
    sem = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*[bounded_call(sem, p) for p in prompts])


results = asyncio.run(main(["Summarise X", "Classify Y", "Extract Z"], concurrency=5))
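One way to convince yourself that the semaphore really caps in-flight requests is to instrument a fake call. The sketch below is an illustration under assumptions: ConcurrencyProbe, fake_call, and run_pool are hypothetical names, and the sleep stands in for a real client.messages.create() request. It records the highest number of simulated calls running at the same moment.

```python
import asyncio


class ConcurrencyProbe:
    """Tracks how many simulated calls are in flight at once."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak = 0

    async def fake_call(self, prompt: str) -> str:
        # Hypothetical stand-in for client.messages.create().
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        try:
            await asyncio.sleep(0.01)  # simulated request latency
            return prompt.upper()
        finally:
            self.in_flight -= 1


async def bounded(sem: asyncio.Semaphore, probe: ConcurrencyProbe, prompt: str) -> str:
    async with sem:  # waits here while `concurrency` calls are already running
        return await probe.fake_call(prompt)


async def run_pool(prompts: list[str], concurrency: int) -> tuple[list[str], int]:
    sem = asyncio.Semaphore(concurrency)
    probe = ConcurrencyProbe()
    results = await asyncio.gather(*(bounded(sem, probe, p) for p in prompts))
    return list(results), probe.peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_pool([f"p{i}" for i in range(20)], concurrency=5))
    print(f"{len(results)} results, peak concurrency {peak}")
```

All 20 coroutines are created immediately, but the recorded peak never exceeds the semaphore size. Note that a semaphore bounds concurrency, not requests per minute, so short requests can still hit a rate limit.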

Frequently asked questions

When should I use AsyncAnthropic vs Anthropic?

Use AsyncAnthropic whenever your code runs inside an async event loop (FastAPI, Starlette, asyncio scripts). Use the sync Anthropic client for plain scripts, Jupyter notebooks, or frameworks that don't use asyncio (Flask, Django without ASGI). Calling the sync client from inside a running event loop blocks the loop, so no other coroutine can make progress until that request returns.
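The blocking effect is easy to observe with a stand-in for a sync call. In this minimal sketch, blocking_call is a hypothetical function that just sleeps, and a background ticker counts how often the event loop gets a turn while the call runs. The second measurement uses asyncio.to_thread (Python 3.9+), which is one general workaround and an assumption here, not something the SDK requires. With Claude, switching to AsyncAnthropic is the more direct fix.

```python
import asyncio
import time


def blocking_call() -> str:
    """Hypothetical stand-in for a sync client call; blocks its thread."""
    time.sleep(0.2)
    return "done"


async def count_ticks(stop: asyncio.Event) -> int:
    """Counts how often the event loop gets a turn until stop is set."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def measure(run_in_thread: bool) -> int:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    if run_in_thread:
        await asyncio.to_thread(blocking_call)  # loop stays responsive
    else:
        blocking_call()  # loop is frozen until this returns
    stop.set()
    return await ticker


if __name__ == "__main__":
    blocked = asyncio.run(measure(run_in_thread=False))
    threaded = asyncio.run(measure(run_in_thread=True))
    print(f"ticks while blocked: {blocked}, ticks with to_thread: {threaded}")
```

When the call runs directly on the loop, the ticker barely advances. When it runs in a worker thread, the ticker keeps counting throughout.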

How many concurrent Claude API requests can I make with asyncio?

asyncio itself imposes no practical limit, but Anthropic enforces requests-per-minute (RPM) and tokens-per-minute (TPM) rate limits that depend on your usage tier. Use asyncio.Semaphore to cap in-flight requests (10–50 is a common starting range; lower it if you see 429 rate-limit errors). For large jobs that don't need an immediate response, use the Message Batches API instead.

Can I use AsyncAnthropic with Django?

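Yes, inside async views (available since Django 3.1), ideally served under ASGI. A standard WSGI deployment does not keep a long-running asyncio event loop, so there the sync client is the simpler option. asyncio.run() is fine in standalone scripts, but not inside a view, where it blocks the worker thread until the call finishes. If you are starting a new async service around Claude, FastAPI or Starlette is a simpler choice.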

How do I handle errors in asyncio.gather with Claude?

By default, asyncio.gather propagates the first exception to the caller. The other awaitables are not cancelled; they keep running, but their results are lost. Pass return_exceptions=True to collect every outcome (successful responses and exceptions) without stopping. Then filter the list: results = [r for r in raw if not isinstance(r, Exception)].
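Filtering alone discards the information about which request failed. The following sketch keeps successes and failures separate, keyed by prompt. It assumes a hypothetical fake_call coroutine that fails on one input in place of a real Claude request, and gather_with_errors is an illustrative name.

```python
import asyncio


async def fake_call(prompt: str) -> str:
    """Hypothetical stand-in for a Claude request; fails on one input."""
    await asyncio.sleep(0.01)
    if prompt == "bad":
        raise RuntimeError(f"simulated API failure for {prompt!r}")
    return f"ok: {prompt}"


async def gather_with_errors(
    prompts: list[str],
) -> tuple[dict[str, str], dict[str, BaseException]]:
    # return_exceptions=True puts raised exceptions into the result list
    # instead of propagating the first one to the caller.
    raw = await asyncio.gather(
        *(fake_call(p) for p in prompts), return_exceptions=True
    )
    successes: dict[str, str] = {}
    failures: dict[str, BaseException] = {}
    # gather preserves input order, so zip pairs each prompt with its outcome.
    for prompt, outcome in zip(prompts, raw):
        if isinstance(outcome, BaseException):
            failures[prompt] = outcome
        else:
            successes[prompt] = outcome
    return successes, failures


if __name__ == "__main__":
    ok, failed = asyncio.run(gather_with_errors(["a", "bad", "c"]))
    print("succeeded:", ok)
    print("failed:", {k: repr(v) for k, v in failed.items()})
```

Keeping the failures makes it straightforward to log them or retry only the prompts that failed.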
