Claude Context Window — 200K Tokens Explained

How Claude's 200K context window works in 2026: what fits, token counting, cost implications, and strategies for long-context workloads.

All three Claude tiers in 2026 — Opus 4.7, Sonnet 4.6, and Haiku 4.5 — share a 200,000-token context window. This is one of the largest offered by any commercial LLM API.

What fits in 200K tokens?

Content type	Approximate token count	Fits in 200K?
Average English novel (80K words)	~107K tokens	Yes (1.9×)
1,000-page PDF (250K words)	~333K tokens	No — split into chunks
10K-line codebase	~50–80K tokens	Yes (2.5–4×)
1-hour meeting transcript	~10–15K tokens	Yes (13–20×)
200-page technical doc	~60K tokens	Yes (3×)

Rule of thumb: 1 English word ≈ 1.3 tokens. 1 line of code ≈ 5–8 tokens.

Context window and pricing

Every token in the context window is a billable input token — including conversation history and documents. A full 200K-token context on Sonnet 4.6 costs $0.60 per request (200K × $3/M). On Opus 4.7, the same window costs $3.00 per request.

Use prompt caching for repeated large contexts: cache the static document portion once, pay 90% less on all subsequent reads. See the prompt caching guide for break-even math.

Strategies for large-context workloads

Cache the static part — prefix documents with cache_control: ephemeral. Only new user messages and model responses are billed at full rate.
Route by window size — use Haiku for short contexts (under 20K tokens), Sonnet for medium (20–100K), Opus only when maximum reasoning quality is required regardless of length.
Truncate conversation history — RAG and agentic workloads accumulate history fast. Keep only the last N turns of chat plus the cached system context.
Use the Batch API for bulk long-context jobs — 50% off standard pricing with a 24-hour SLA. Offline summarisation and extraction workloads are ideal. See the Batch API guide.

Estimate your long-context costs

Plug your expected context size and request volume into the Claude Cost Calculator to model monthly spend with and without caching enabled.

Frequently asked questions

Is Claude's 200K context window the same across all models?

Yes. Opus 4.7, Sonnet 4.6, and Haiku 4.5 all support 200K input tokens in 2026. The difference is quality and price, not context length.

Does a longer context make Claude slower?

Yes — time-to-first-token scales roughly linearly with input length. For latency-sensitive applications, keep context concise or use streaming so the user sees partial output immediately.

What happens if I exceed 200K tokens?

The API returns a 400 error (context_length_exceeded). You must truncate or summarise earlier turns. There is no automatic truncation.

How many pages of a PDF can Claude read at once?

Roughly 150–200 pages of dense English text fits in 200K tokens (assuming ~1,000–1,300 tokens per page). For larger documents, split into chunks with overlap, or use a retrieval layer to pass only the relevant sections.

Free tools

Cost Calculator → Prompt-Pricing Recommender → Diff Summarizer → Skills Browser →

Claude Opus 4.7 vs Sonnet 4.6 Pricing (2026 Comparison)How Much Does Claude Cost? (2026 API Pricing Guide)Claude Prompt Caching: 90% Cost Savings Explained (2026)Claude API Cost Calculator: Estimate Your Anthropic Bill Claude vs GPT-4 Pricing: 2026 API Cost Comparison