Claude Cost Per Million Tokens — Explained

What 'cost per million tokens' actually means for the Claude API. Per-tier breakdown, output multipliers, and a worked example.

Anthropic prices the Claude API in dollars per million tokens. This page unpacks what that means in practice and shows a worked example for a typical chat assistant.

What is a token?

A token is roughly 4 characters of English text, or about ¾ of a word. "Hello, world!" is roughly 3 tokens. The full text of Pride and Prejudice is about 165,000 tokens.
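The 4-characters-per-token rule of thumb can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function name `estimate_tokens` and the `chars_per_token` parameter are illustrative assumptions, and actual counts depend on the model's tokenizer.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (a heuristic, not a tokenizer)."""
    if not text:
        return 0
    # Any non-empty text costs at least one token.
    return max(1, round(len(text) / chars_per_token))


print(estimate_tokens("Hello, world!"))  # 13 characters / 4 -> 3
```

For billing-grade numbers, rely on the token counts the API reports rather than on this estimate.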

The four billable token types

Input tokens: the prompt you send to Claude.
Output tokens: what Claude generates. Priced at 5× the input rate for every model in the table below.
Cache write tokens: tokens written when a request first establishes a cached prefix. Billed at about 25% more than the input rate.
Cache read tokens: tokens read from the cached prefix on subsequent requests. Billed at 10% of the input rate.
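The four token types can be combined into a single per-request cost formula. The sketch below uses only the multipliers listed above (5× for output, roughly 1.25× for cache writes, 0.10× for cache reads); the function and constant names are hypothetical, and the cache-write multiplier is the approximate figure quoted here rather than an exact rate.

```python
# Multipliers relative to the input rate, as described in the list above.
OUTPUT_MULTIPLIER = 5.0
CACHE_WRITE_MULTIPLIER = 1.25  # "about 25% more than input" (approximate)
CACHE_READ_MULTIPLIER = 0.10


def request_cost(
    input_rate_per_million: float,
    input_tokens: int = 0,
    output_tokens: int = 0,
    cache_write_tokens: int = 0,
    cache_read_tokens: int = 0,
) -> float:
    """Cost in dollars of one request, given the model's input rate per million tokens."""
    # Convert every token type into "input-token equivalents" first.
    weighted_tokens = (
        input_tokens
        + output_tokens * OUTPUT_MULTIPLIER
        + cache_write_tokens * CACHE_WRITE_MULTIPLIER
        + cache_read_tokens * CACHE_READ_MULTIPLIER
    )
    return weighted_tokens * input_rate_per_million / 1_000_000


# One request with 2,000 input and 400 output tokens at a $3/M input rate.
print(f"${request_cost(3.00, input_tokens=2_000, output_tokens=400):.4f}")
```

Expressing everything as input-token equivalents makes it easy to see that 400 output tokens cost the same as 2,000 input tokens.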

2026 Anthropic rates per million tokens

Model	Input	Output	Cached read
Opus 4.7	$15.00	$75.00	$1.50
Sonnet 4.6	$3.00	$15.00	$0.30
Haiku 4.5	$1.00	$5.00	$0.10
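As a quick sanity check, the table can be loaded into a dictionary and compared against the stated multipliers. The rates below are copied from the table; the dictionary layout and the `price_ratios` helper are assumptions made for illustration.

```python
# Rates copied from the table above, in USD per million tokens.
RATES = {
    "Opus 4.7": {"input": 15.00, "output": 75.00, "cached_read": 1.50},
    "Sonnet 4.6": {"input": 3.00, "output": 15.00, "cached_read": 0.30},
    "Haiku 4.5": {"input": 1.00, "output": 5.00, "cached_read": 0.10},
}


def price_ratios(model: str) -> tuple[float, float]:
    """Return (output / input, cached read / input) for one model."""
    rates = RATES[model]
    return rates["output"] / rates["input"], rates["cached_read"] / rates["input"]


for model in RATES:
    output_ratio, cached_ratio = price_ratios(model)
    print(f"{model}: output is {output_ratio:.1f}x input, "
          f"cached read is {cached_ratio:.0%} of input")
```

Every row should report a 5.0x output ratio and a 10% cached-read ratio, matching the multipliers given for the billable token types.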

Worked example — 100k requests on Sonnet 4.6

Assume a chat workload: 2,000 input tokens per request, 400 output tokens. Across 100,000 requests:

Input: 100,000 × 2,000 = 200M tokens × $3/M = $600
Output: 100,000 × 400 = 40M tokens × $15/M = $600
Total: $1,200 with no caching (per month, if the 100,000 requests represent one month of traffic).
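The arithmetic above can be reproduced in a few lines. This is a sketch under the example's own assumptions (100,000 requests, 2,000 input and 400 output tokens each, Sonnet 4.6 rates from the table); the names are illustrative.

```python
REQUESTS = 100_000
INPUT_TOKENS_PER_REQUEST = 2_000
OUTPUT_TOKENS_PER_REQUEST = 400
INPUT_RATE = 3.00    # USD per million tokens (Sonnet 4.6, from the table)
OUTPUT_RATE = 15.00  # USD per million tokens


def workload_cost(requests: int, tokens_per_request: int, rate_per_million: float) -> float:
    """Dollar cost of one token type across the whole workload."""
    return requests * tokens_per_request * rate_per_million / 1_000_000


input_cost = workload_cost(REQUESTS, INPUT_TOKENS_PER_REQUEST, INPUT_RATE)
output_cost = workload_cost(REQUESTS, OUTPUT_TOKENS_PER_REQUEST, OUTPUT_RATE)
total = input_cost + output_cost

print(f"Input ${input_cost:,.2f}, output ${output_cost:,.2f}, total ${total:,.2f}")
# Input $600.00, output $600.00, total $1,200.00
```

Although output is only a sixth of the token volume, it accounts for half of the bill because of the 5× rate.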

Now cache the 1,500-token shared system prompt:

Cached portion: 100,000 × 1,500 × $0.30/M = $45 (vs. $450 uncached)
Variable input: 100,000 × 500 × $3/M = $150
Output: $600
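Total: $795/month with caching, a 34% reduction. This counts every request as a cache read; cache writes, billed at about 25% above the input rate, add to this depending on how often the prefix has to be rewritten.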
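The comparison between the cached and uncached scenarios can be sketched as a single function with a switch. It mirrors the example's simplification that every request reads the 1,500-token prefix from cache and that cache-write charges are ignored; the function name and parameters are hypothetical.

```python
def monthly_cost(
    requests: int,
    prefix_tokens: int,
    variable_tokens: int,
    output_tokens: int,
    use_cache: bool,
    input_rate: float = 3.00,        # Sonnet 4.6 rates from the table,
    output_rate: float = 15.00,      # in USD per million tokens
    cached_read_rate: float = 0.30,
) -> float:
    """Workload cost in dollars; cache-write charges are not modeled."""
    # The shared prefix is billed at the cached-read rate only when caching is on.
    prefix_rate = cached_read_rate if use_cache else input_rate
    per_request = (
        prefix_tokens * prefix_rate
        + variable_tokens * input_rate
        + output_tokens * output_rate
    )
    return requests * per_request / 1_000_000


baseline = monthly_cost(100_000, 1_500, 500, 400, use_cache=False)
cached = monthly_cost(100_000, 1_500, 500, 400, use_cache=True)
reduction = (baseline - cached) / baseline

print(f"Uncached ${baseline:,.2f}, cached ${cached:,.2f}, saving {reduction:.0%}")
```

The saving comes entirely from the prefix: its cost drops from $450 to $45, while the variable input and output charges are unchanged.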

Plug your numbers into the Claude Cost Calculator for a precise estimate including cache savings.

Frequently asked questions

Why is Claude output 5× more expensive than input?

Output generation is more computationally expensive than input processing: output tokens are generated one at a time, each requiring its own forward pass through the model, while input tokens are processed in parallel when the prompt is encoded. The 5× ratio holds across all three Claude tiers listed above.

How many tokens is a typical English word?

About 1.3 tokens per English word on average, so 1,000 tokens is roughly 750 words. Ratios vary for other content: code often needs more tokens per character than prose, and logographic languages like Chinese can take 1–2 tokens per character.
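For budgeting from a word count rather than raw text, the same rule of thumb can be written as a pair of converters. This sketch assumes the 750-words-per-1,000-tokens figure for English prose; the names are illustrative, and the ratio should be adjusted for code or non-English text.

```python
WORDS_PER_TOKEN = 0.75  # 1,000 tokens is roughly 750 English words


def words_to_tokens(words: int) -> int:
    """Approximate token count for a given number of English words."""
    return round(words / WORDS_PER_TOKEN)


def tokens_to_words(tokens: int) -> int:
    """Approximate English word count for a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


print(words_to_tokens(750))    # 1000
print(tokens_to_words(1_000))  # 750
```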

Do system prompts count as input tokens?

Yes. System prompts are billed at the input rate just like user messages. This is why caching long system prompts saves so much — they're billed at 10% of input price on cached reads.
