Claude API Error Handling & Retry Logic (Python & Node.js)

Claude API Error Handling, Rate Limits & Retry Logic

Handle Anthropic API errors correctly: rate limits (429), overload (529), timeouts, and auth errors. Python and Node.js retry patterns with exponential backoff.

The Anthropic Python and TypeScript SDKs include automatic retry with exponential backoff by default. This page covers what each error code means, when to retry, and how to customize retry behavior for production workloads.

HTTP error codes

Status	Class	Meaning	Retry?
401	AuthenticationError	Invalid API key	No
403	PermissionDeniedError	Key lacks permission	No
404	NotFoundError	Bad URL or model name	No
413	BadRequestError	Request too large	No (reduce size)
429	RateLimitError	Too many requests	Yes (backoff)
500	InternalServerError	Anthropic server error	Yes
529	OverloadedError	Capacity exceeded	Yes (backoff)
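If you handle retries yourself, the table reduces to a small lookup. The sketch below is illustrative only: `is_retryable` is a hypothetical helper, not part of the SDK, and it assumes the whole 5xx range is transient, in line with the SDK's own behavior of retrying 429 and 5xx responses.

```python
# Hypothetical helper mirroring the table above; not part of the Anthropic SDK.
NON_RETRYABLE = {401, 403, 404, 413}  # fix the request or credentials instead


def is_retryable(status_code: int) -> bool:
    """Return True when a request that failed with this status is worth retrying."""
    if status_code in NON_RETRYABLE:
        return False
    # 429 (rate limit) and the 5xx range, including 529 (overloaded), are
    # treated as transient. Assumption: every 5xx is retryable.
    return status_code == 429 or 500 <= status_code <= 599
```

Statuses outside the table, such as 400, fall through to "do not retry", which is the conservative choice for client errors.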


Python: default retry (built-in)

import anthropic

# SDK auto-retries on 429 and 5xx with exponential backoff (2 retries by default)
client = anthropic.Anthropic(
    max_retries=4,  # increase for batch workloads
    timeout=60.0,   # seconds (connect + read)
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

Python: catch specific errors

import anthropic
from anthropic import APIStatusError, APITimeoutError, RateLimitError

client = anthropic.Anthropic()

try:
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError as e:
    # 429: the retry-after header, when present, gives the wait in seconds
    print(f"Rate limited. Retry-After: {e.response.headers.get('retry-after')} s")
    # implement wait logic here
except APITimeoutError:
    print("Request timed out: increase timeout or reduce max_tokens")
except APIStatusError as e:
    # Checking status_code avoids depending on a dedicated 529 exception class
    if e.status_code == 529:
        print("API overloaded (529): back off and retry")
    else:
        print(f"API error {e.status_code}: {e.message}")

Python: custom exponential backoff

import random
import time

import anthropic

client = anthropic.Anthropic(max_retries=0)  # disable SDK retries; handle manually

RETRYABLE_STATUS = {429, 529}  # rate limited, overloaded

def call_with_backoff(messages, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=messages,
            )
        except anthropic.APIStatusError as e:
            # Re-raise non-retryable errors and the final failed attempt
            if e.status_code not in RETRYABLE_STATUS or attempt == max_attempts - 1:
                raise
            wait = (2 ** attempt) + random.uniform(0, 1)  # 1s, 2s, 4s, ... plus jitter
            print(f"Attempt {attempt + 1} failed ({e.status_code}). Waiting {wait:.1f}s...")
            time.sleep(wait)

result = call_with_backoff([{"role": "user", "content": "Hello"}])

Node.js: error handling

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ maxRetries: 4, timeout: 60_000 }); // timeout in ms

try {
  const response = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello" }],
  });
  console.log(response.content[0].text);
} catch (err) {
  if (err instanceof Anthropic.RateLimitError) {
    // If your SDK version exposes headers as a Fetch Headers object,
    // use err.headers?.get("retry-after") instead.
    const retryAfter = err.headers?.["retry-after"];
    console.error(`Rate limited. Retry after ${retryAfter}s`);
  } else if (err instanceof Anthropic.APIConnectionTimeoutError) {
    console.error("Timeout: reduce max_tokens or increase timeout");
  } else if (err instanceof Anthropic.APIError && err.status === 529) {
    console.error("Overloaded (529): back off and retry");
  } else {
    throw err;
  }
}

Rate limit tiers

Anthropic enforces requests per minute (RPM) and tokens per minute (TPM) limits. Free-tier users get the lowest limits; paying users get higher defaults that increase with usage history. Check your current limits in the Anthropic Console under "Limits".
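Pacing requests on the client side is one way to stay under an RPM limit rather than reacting to 429s. The following is a minimal sketch of a sliding-window limiter, assuming a single process and covering RPM only (not TPM). `RequestsPerMinuteLimiter` is a hypothetical helper, not part of the SDK, and the `rpm` value you pass should come from your own limits in the Console.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestsPerMinuteLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not part of the SDK)."""

    def __init__(self, rpm: int, clock: Callable[[], float] = time.monotonic) -> None:
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.rpm = rpm
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.sent: Deque[float] = deque()  # timestamps of requests in the last 60 s

    def seconds_until_allowed(self) -> float:
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            return 0.0
        # The oldest request in the window must age out before the next send.
        return 60.0 - (now - self.sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self.sent.append(self.clock())
```

A caller would sleep for `seconds_until_allowed()` before each request and call `record()` right after sending it. Retries are still needed, because the server-side accounting may differ from this local estimate.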

Frequently asked questions

What does a 529 error mean from the Anthropic API?

529 OverloadedError means Anthropic's servers are temporarily at capacity. It is different from 429 RateLimitError (which means you've hit your account quota). Both are safe to retry with exponential backoff. The SDK's built-in retry logic handles both automatically.

How many retries does the Anthropic SDK do by default?

The Python and Node.js SDKs retry up to 2 times by default. Retried failures include connection errors and 408, 409, 429, and 5xx responses. You can increase this with max_retries=N (Python) or maxRetries: N (Node.js). Set it to 0 to disable retries and handle them yourself.

How do I read the retry-after header from a 429 error?

In Python: err.response.headers.get('retry-after'). In Node.js: err.headers['retry-after']. The value is the number of seconds to wait before retrying. If the header is absent, use exponential backoff starting at 1s.
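The rule in this answer (use the header when present, otherwise back off exponentially from 1s) can be sketched as a single function. `retry_delay` is a hypothetical helper, and both the 60-second cap and the jitter of up to 1 second are assumptions chosen for illustration.

```python
import random
from typing import Optional


def retry_delay(retry_after: Optional[str], attempt: int, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: prefers the server's retry-after value and falls back
    to exponential backoff (1 s, 2 s, 4 s, ...) plus up to 1 s of jitter.
    """
    if retry_after is not None:
        try:
            seconds = float(retry_after)
            if seconds >= 0:
                return min(seconds, cap)  # cap is an assumed safety limit
        except ValueError:
            pass  # unparseable header value: fall through to backoff
    backoff = min(2.0 ** attempt, cap)
    return backoff + random.uniform(0.0, 1.0)
```

In a retry loop, the first argument would be the value returned by `err.response.headers.get('retry-after')`, which is `None` when the header is missing.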

What timeout should I set for the Claude API?

Both the Python and Node.js SDKs default to a 10-minute (600s) timeout. For interactive workloads, set timeout=30.0 (30s) to fail fast. For large context windows or long generations (max_tokens > 4096), keep it at 120–300s. For streaming, the timeout applies to connection setup only.
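As a rough illustration, the guidance above can be written as a small decision function. `pick_timeout` is a hypothetical helper. Using the upper end of the 120–300s range for long generations, and letting that case take priority over the interactive one, are both assumptions.

```python
def pick_timeout(interactive: bool, max_tokens: int) -> float:
    """Suggest a request timeout in seconds (hypothetical helper)."""
    if max_tokens > 4096:
        # Long generations: upper end of the 120-300 s range (assumed choice).
        return 300.0
    if interactive:
        return 30.0  # fail fast when a user is waiting
    return 600.0  # otherwise keep the SDK default of 10 minutes
```

The result would be passed as the client's `timeout` argument in Python. The Node.js client expects milliseconds, so multiply by 1000 there.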
