Claude API Error 529 (Overloaded): Why It Happens and How to Handle It

What HTTP 529 means on the Anthropic API, why it spikes during peak hours, and the retry, fallback, and routing patterns that keep your service up.

A 529 Overloaded response from the Anthropic API means Claude itself is at capacity — not that your account hit a rate limit, and not that your request was malformed. It's the API's way of saying "try again in a moment." Unlike a 429 (which is per-account rate limiting and is your problem to fix), 529 is global to Anthropic's infrastructure and resolves on its own. But if you don't handle it deliberately, a 529 spike during a model launch or peak US working hours can take your service down.

The exact response

The type: "overloaded_error" field in the response body is the canonical signal. Branch on that, not just the status code, since transient infrastructure errors can also surface as 5xx responses with different bodies.
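
As a rough illustration of branching on the body rather than the status code, the sketch below checks a parsed error body for overloaded_error. It assumes the error envelope Anthropic documents (a top-level type of "error" with a nested error object); the isOverloaded helper and the sampleBody value are hypothetical names used only for this example, and the message text is illustrative.

```typescript
// Assumed error envelope: top-level type "error" with a nested error object.
// Real responses may carry extra fields (for example a request ID).
interface ApiErrorBody {
  type: "error";
  error: { type: string; message: string };
}

// Hypothetical helper: true only when the parsed body carries overloaded_error.
function isOverloaded(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  return (err as { type?: unknown }).type === "overloaded_error";
}

// Illustrative body for a 529 response.
const sampleBody: ApiErrorBody = {
  type: "error",
  error: { type: "overloaded_error", message: "Overloaded" },
};

console.log(isOverloaded(sampleBody)); // true
```

Accepting unknown rather than a typed body keeps the check safe when a proxy or gateway returns a 5xx with HTML or an empty body.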

When 529s spike

The retry pattern

The official SDKs retry 529 automatically, along with connection errors, 408, 409, 429, and other 5xx responses, using exponential backoff, up to 2 retries by default. That's adequate for a brief overload but insufficient for a sustained one. Raise the limit to 4–6 retries in production, and fall back to a cheaper model once those retries are exhausted:

    import { Anthropic, APIError } from "@anthropic-ai/sdk";

    const client = new Anthropic({
      maxRetries: 5,
      timeout: 90_000,
    });

    async function askWithFallback(messages) {
      try {
        return await client.messages.create({
          model: "claude-opus-4-7",
          max_tokens: 1024,
          messages,
        });
      } catch (err) {
        if (err instanceof APIError && err.status === 529) {
          // After SDK retries exhausted: fall back to Sonnet.
          return await client.messages.create({
            model: "claude-sonnet-4-6",
            max_tokens: 1024,
            messages,
          });
        }
        throw err;
      }
    }

Three patterns that scale

What not to do

Pricing implication

Tier fallback changes your bill: if 5% of Opus traffic falls back to Sonnet during peak hours, your effective cost per token shifts. Sonnet costs $3/$15 per million input/output tokens versus $15/$75 for Opus, so the fallback is cheaper, but your quality metrics may move. Track the fallback rate and alert when it crosses 10%.
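
To make the arithmetic concrete, here is a minimal sketch of the blended price and the alert gate. It uses the prices quoted above and assumes token counts per request are the same on both models; blendedPrice and shouldAlert are hypothetical helpers, not part of any SDK.

```typescript
interface Price {
  input: number; // USD per million input tokens
  output: number; // USD per million output tokens
}

// Prices as quoted in this article; check current pricing before relying on them.
const OPUS: Price = { input: 15, output: 75 };
const SONNET: Price = { input: 3, output: 15 };

// Effective per-million-token price when `fallbackRate` (0..1) of traffic
// is served by the fallback model instead of the primary.
function blendedPrice(primary: Price, fallback: Price, fallbackRate: number): Price {
  const keep = 1 - fallbackRate;
  return {
    input: keep * primary.input + fallbackRate * fallback.input,
    output: keep * primary.output + fallbackRate * fallback.output,
  };
}

// Hypothetical alert gate: fire when the observed fallback rate exceeds the threshold.
function shouldAlert(fallbacks: number, total: number, threshold = 0.1): boolean {
  return total > 0 && fallbacks / total > threshold;
}
```

At a 5% fallback rate the blended price works out to about $14.40 input and $72.00 output per million tokens, a saving of roughly 4%. The cost change is small; the quality shift is what the 10% alert is meant to catch.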

Frequently asked questions

What does HTTP 529 mean on the Claude API?

529 (Overloaded) means Claude's infrastructure is at capacity globally — not that your account hit a rate limit. It's transient. The official Anthropic SDKs retry 529 automatically with exponential backoff; the recommended response is to let the SDK retry and add a tier-fallback (e.g. Opus → Sonnet) for sustained overloads.

How long does a Claude 529 overload usually last?

Most 529 spikes resolve within 30–300 seconds. Model launch days (when new Opus or Sonnet versions ship) and peak US business hours (14:00–22:00 UTC) see longer events, occasionally 5–15 minutes. Build a 60-second circuit breaker plus tier fallback to ride through these without user impact.
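
One way to sketch the 60-second circuit breaker: after a few consecutive 529s, skip the primary model for a cooldown window and route straight to the fallback. The OverloadBreaker class, its threshold of 3, and the injectable clock are assumptions made for illustration, not a prescribed design.

```typescript
// Hypothetical breaker: opens after `threshold` consecutive overloads and
// stays open for `cooldownMs`. Once the cooldown elapses it closes again
// and starts counting afresh.
class OverloadBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold = 3, cooldownMs = 60_000, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock so the breaker can be tested without waiting
  }

  // True when the primary model should be skipped in favour of the fallback.
  isOpen(): boolean {
    if (this.openedAt === null) return false;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      this.openedAt = null;
      this.consecutiveFailures = 0;
      return false;
    }
    return true;
  }

  // Call when the primary model returns a 529 after SDK retries are exhausted.
  recordOverload(): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.threshold) this.openedAt = this.now();
  }

  // Call when the primary model answers successfully.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }
}
```

In use, check isOpen() before each request: when it returns true, send the request to the fallback model directly instead of spending retries on a model that is known to be overloaded.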

Should I retry on a Claude 529 error?

Yes — 529 is explicitly retriable. Use exponential backoff with full jitter and a retry ceiling of 4–6 attempts in production. The Anthropic SDKs do this automatically up to maxRetries (default 2). For latency-tolerant work, push to a retry queue rather than holding the request thread open.
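
If you retry outside the SDK (for example in a queue worker), exponential backoff with full jitter can be sketched as below. The 500 ms base, 30 s cap, and the names fullJitterDelayMs and retryWithJitter are assumptions for illustration; the SDK's internal schedule may differ.

```typescript
// Full jitter: a uniform random delay in [0, min(capMs, baseMs * 2^attempt)).
function fullJitterDelayMs(
  attempt: number, // 0 for the first retry
  baseMs = 500, // assumed starting delay
  capMs = 30_000, // assumed ceiling
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Generic retry wrapper; `shouldRetry` decides which errors are transient.
// Makes at most 1 + maxRetries calls to `fn`.
async function retryWithJitter<T>(
  fn: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries || !shouldRetry(err)) throw err;
      await sleep(fullJitterDelayMs(attempt));
    }
  }
}
```

Full jitter spreads retries across the whole window instead of clustering them at the exponential boundaries, which matters when many clients hit the same overload at once.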

Is 529 the same as a 429 rate limit on Claude?

No. 429 is per-account rate limiting — you're calling too fast and need to slow down or upgrade your tier. 529 is global overload at the model layer and is not specific to your account. Handle them differently: 429 means respect Retry-After and reduce your own concurrency; 529 means retry and fall back to a lower tier if it persists.
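
The difference in handling can be sketched as a small decision function. It assumes response headers arrive as a plain object with lowercase keys and that Retry-After is given in seconds (the HTTP-date form is not handled here); planFor and the Action type are hypothetical names.

```typescript
type Action =
  | { kind: "wait"; delayMs: number } // 429: honour Retry-After, reduce concurrency
  | { kind: "retry-then-fallback" } // 529: back off, then drop a tier
  | { kind: "fail" }; // everything else in this sketch

function planFor(status: number, headers: Record<string, string>): Action {
  if (status === 429) {
    const seconds = Number(headers["retry-after"]);
    // Use an assumed 1 s wait when the header is missing or not numeric.
    const delayMs = Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : 1000;
    return { kind: "wait", delayMs };
  }
  if (status === 529) return { kind: "retry-then-fallback" };
  return { kind: "fail" };
}
```

Keeping the two paths separate stops a 429 from triggering a model fallback, which would not help: the limit is on your account, not on the model.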
