Claude API Error 529 — Overloaded

What HTTP 529 means on the Anthropic API, why it spikes during peak hours, and the retry, fallback, and routing patterns that keep your service up.

🔥 Launch tonight — Power Prompts PDF 50p (just 50p tonight)30 battle-tested Claude Code prompts · 8 pages · paste into CLAUDE.md · price reverts to £5

A 529 Overloaded response from the Anthropic API means Claude itself is at capacity — not that your account hit a rate limit, and not that your request was malformed. It's the API's way of saying "try again in a moment." Unlike a 429 (which is per-account rate limiting and is your problem to fix), 529 is global to Anthropic's infrastructure and resolves on its own. But if you don't handle it deliberately, a 529 spike during a model launch or peak US working hours can take your service down.

The exact response

HTTP/1.1 529
Content-Type: application/json

{
  "type": "error",
  "error": {
    "type": "overloaded_error",
    "message": "Overloaded"
  }
}

The type: "overloaded_error" body field is the canonical signal — handle on that, not just the status code, since transient infrastructure errors can also surface as 5xx with different bodies.

When 529s spike

The retry pattern

The official SDKs retry 529 (and 429, 408, 500, 502, 503, 504) automatically with exponential backoff — up to 2 retries by default. That's adequate for a brief overload but insufficient for a sustained one. Bump to 4–6 retries with jitter in production:

import { Anthropic, APIError } from "@anthropic-ai/sdk";

const client = new Anthropic({
  maxRetries: 5,
  timeout: 90_000,
});

async function askWithFallback(messages) {
  try {
    return await client.messages.create({
      model: "claude-opus-4-7",
      max_tokens: 1024,
      messages,
    });
  } catch (err) {
    if (err instanceof APIError && err.status === 529) {
      // After SDK retries exhausted: fall back to Sonnet.
      return await client.messages.create({
        model: "claude-sonnet-4-6",
        max_tokens: 1024,
        messages,
      });
    }
    throw err;
  }
}

Three patterns that scale

  1. Tier fallback. If Opus 529s after retries, fall back to Sonnet. If Sonnet 529s, fall back to Haiku. Quality degrades gracefully; the user gets a response.
  2. Provider fallback. If Claude is unreachable for >60s, route to a backup (different model or a cached response). Use a circuit breaker so a long Anthropic incident doesn't keep retrying forever.
  3. Queue and replay. For non-latency-critical work, push failed requests to a retry queue (SQS, Redis Streams) with a 30–120 second visibility delay. Most 529 storms resolve within 5 minutes.

What not to do

Pricing implication

Tier fallback costs you money: if 5% of Opus traffic falls back to Sonnet during peak hours, your effective bill changes. Sonnet costs $3/$15 vs Opus $15/$75 — so the fallback is cheaper, but your quality metrics may shift. Track the fallback rate and gate alerts on it crossing 10%.

Pair this with the broader error handling and retry guide, and the production checklist.

Frequently asked questions

What does HTTP 529 mean on the Claude API?
529 (Overloaded) means Claude's infrastructure is at capacity globally — not that your account hit a rate limit. It's transient. The official Anthropic SDKs retry 529 automatically with exponential backoff; the recommended response is to let the SDK retry and add a tier-fallback (e.g. Opus → Sonnet) for sustained overloads.
How long does a Claude 529 overload usually last?
Most 529 spikes resolve within 30–300 seconds. Model launch days (when new Opus or Sonnet versions ship) and peak US business hours (14:00–22:00 UTC) see longer events, occasionally 5–15 minutes. Build a 60-second circuit breaker plus tier fallback to ride through these without user impact.
Should I retry on a Claude 529 error?
Yes — 529 is explicitly retriable. Use exponential backoff with full jitter and a retry ceiling of 4–6 attempts in production. The Anthropic SDKs do this automatically up to maxRetries (default 2). For latency-tolerant work, push to a retry queue rather than holding the request thread open.
Is 529 the same as a 429 rate limit on Claude?
No. 429 is per-account rate limiting — you're calling too fast and need to slow down or upgrade your tier. 529 is global overload at the model layer and is not specific to your account. Handle them differently: 429 means respect Retry-After and reduce your own concurrency; 529 means retry and fall back to a lower tier if it persists.

Free tools

Cost Calculator → Prompt-Pricing Recommender → Diff Summarizer → Skills Browser →

Related

Claude Opus 4.7 vs Sonnet 4.6 Pricing (2026 Comparison)How Much Does Claude Cost? (2026 API Pricing Guide)Claude Prompt Caching: 90% Cost Savings Explained (2026)Claude API Cost Calculator: Estimate Your Anthropic BillClaude vs GPT-4 Pricing: 2026 API Cost Comparison