Anthropic's two flagship API tiers in 2026 are Claude Opus 4.7 (most capable) and Claude Sonnet 4.6 (balanced). Opus costs 5× as much as Sonnet on both input and output tokens, so deciding which model handles which request has a large effect on the bill.
Per-million-token pricing (USD)
| Model | Input | Output | Cached read | Cache write (5m) |
| --- | --- | --- | --- | --- |
| Opus 4.7 | $15 | $75 | $1.50 | $18.75 |
| Sonnet 4.6 | $3 | $15 | $0.30 | $3.75 |
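To turn these per-million rates into a per-request figure, a small calculator helps. The following is a minimal sketch, not an official tool: the dictionary keys `opus-4.7` and `sonnet-4.6` are hypothetical labels chosen for this example (not API model identifiers), and the rates are copied from the pricing listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens, copied from the pricing listed above."""
    input: float
    output: float
    cached_read: float
    cache_write_5m: float


# Hypothetical labels for this example, not real API model identifiers.
RATES = {
    "opus-4.7": Rates(15.00, 75.00, 1.50, 18.75),
    "sonnet-4.6": Rates(3.00, 15.00, 0.30, 3.75),
}


def request_cost(model, input_tokens, output_tokens,
                 cached_read_tokens=0, cache_write_tokens=0):
    """Return the cost in USD of one request.

    input_tokens counts only uncached input; cached reads and cache
    writes are billed separately at their own rates.
    """
    r = RATES[model]
    total = (input_tokens * r.input
             + output_tokens * r.output
             + cached_read_tokens * r.cached_read
             + cache_write_tokens * r.cache_write_5m)
    return total / 1_000_000


if __name__ == "__main__":
    for model in RATES:
        print(model, request_cost(model, input_tokens=2_000, output_tokens=500))
```

For a request with 2,000 input tokens and 500 output tokens, this works out to $0.0675 on Opus and $0.0135 on Sonnet, the same 5× ratio as the list prices.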
When Opus is worth 5×
Multi-step agentic flows where a single bad decision compounds across iterations.
Long-context reasoning over >100k tokens — Opus retrieves and chains better.
Code generation in unfamiliar repos where Sonnet hallucinates symbol names.
When Sonnet is the right call
Templated extraction or classification with a tight schema.
Short-form generation where output tokens dominate the bill. Even Sonnet's $15/M output rate adds up at scale, and Opus's $75/M costs five times as much for the same tokens.
High-volume retrieval-augmented chat. With prompt caching, Sonnet on a cached system prompt costs ~$0.30/M for the cached portion.
The Claude Prompt-Pricing Recommender takes a prompt and recommends Haiku, Sonnet, or Opus based on prompt complexity heuristics. Most teams find 70–85% of traffic safely routes to Sonnet or Haiku without a measurable quality drop.
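The effect of that traffic split on the bill can be estimated with a blended rate. The sketch below is illustrative arithmetic only. It assumes the routed share goes entirely to Sonnet, because Haiku pricing is not listed in this article, so it understates the savings if some traffic lands on Haiku. The function name `blended_rate` is made up for this example.

```python
# Input rates in USD per million tokens, from the pricing listed above.
OPUS_INPUT = 15.0
SONNET_INPUT = 3.0


def blended_rate(sonnet_share, sonnet_rate, opus_rate):
    """Average rate per million tokens when sonnet_share of the tokens
    go to Sonnet and the remainder go to Opus."""
    if not 0.0 <= sonnet_share <= 1.0:
        raise ValueError("sonnet_share must be between 0 and 1")
    return sonnet_share * sonnet_rate + (1.0 - sonnet_share) * opus_rate


if __name__ == "__main__":
    for share in (0.70, 0.85):
        rate = blended_rate(share, SONNET_INPUT, OPUS_INPUT)
        saving = 1.0 - rate / OPUS_INPUT
        print(f"{share:.0%} on Sonnet: ${rate:.2f}/M input, "
              f"{saving:.0%} below all-Opus")
```

At a 70% Sonnet share the blended input rate is $6.60/M (56% below all-Opus), and at 85% it is $4.80/M (68% below). Because the output rates have the same 5× ratio, the percentage savings on output tokens are identical.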
Frequently asked questions
How much cheaper is Claude Sonnet 4.6 than Opus 4.7?
Sonnet 4.6 costs one-fifth as much as Opus 4.7 on both input ($3 vs $15 per million tokens) and output ($15 vs $75 per million tokens). For most workloads the quality gap is smaller than the price gap.
Can I switch between Opus and Sonnet dynamically?
Yes. Most teams route at runtime based on prompt complexity — defaulting to Sonnet and escalating to Opus when a confidence check or task-type heuristic indicates it. The Prompt-Pricing Recommender automates this classification.
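A default-to-Sonnet, escalate-to-Opus router might look roughly like the sketch below. It is one possible shape, not how the Prompt-Pricing Recommender works internally. Everything here is a hypothetical stand-in: `call_model` is a function you would supply to wrap your API client, the confidence score is whatever check your application already computes, and the keyword markers and the 0.8 threshold are placeholder values.

```python
from typing import Callable, Tuple

# Hypothetical caller signature: (model_label, prompt) -> (answer, confidence in [0, 1]).
Caller = Callable[[str, str], Tuple[str, float]]


def route(prompt: str,
          call_model: Caller,
          threshold: float = 0.8,
          hard_task_markers: Tuple[str, ...] = ("refactor", "multi-step"),
          ) -> Tuple[str, str]:
    """Return (model_used, answer).

    1. Task-type heuristic: send prompts that look hard straight to Opus.
    2. Otherwise try Sonnet first and keep its answer if confidence is high.
    3. Escalate to Opus when the confidence check fails.
    """
    lowered = prompt.lower()
    if any(marker in lowered for marker in hard_task_markers):
        answer, _ = call_model("opus", prompt)
        return "opus", answer

    answer, confidence = call_model("sonnet", prompt)
    if confidence >= threshold:
        return "sonnet", answer

    # Escalation pays for both calls, so the threshold should be tuned
    # against how often Sonnet's first attempt is good enough.
    answer, _ = call_model("opus", prompt)
    return "opus", answer
```

Note the cost trade-off in the escalation path: an escalated request is billed for the Sonnet attempt plus the Opus retry, so this pattern only saves money when most requests clear the threshold on the first try.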
Does prompt caching change the Opus vs Sonnet comparison?
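Caching helps both models equally in percentage terms (cached reads cost 90% less than fresh input), but the absolute savings are 5× larger on Opus. For a workload with a large, reused system prompt, Opus reads the cached portion at $1.50/M, which is below Sonnet's uncached input rate of $3/M, so heavily cached Opus can approach or even undercut uncached Sonnet on input-heavy requests. Output tokens still cost 5× more on Opus, and Sonnet with the same caching remains 5× cheaper overall.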
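A worked example makes the comparison concrete. The sketch below uses a hypothetical workload (a 50,000-token system prompt, 500 new input tokens, and 300 output tokens per request) and ignores the one-time cache write charge, assuming it is amortized over many cache hits. The rates come from the pricing listed earlier in this article.

```python
# USD per million tokens, from the pricing listed earlier in this article.
OPUS = {"input": 15.00, "output": 75.00, "cached_read": 1.50}
SONNET = {"input": 3.00, "output": 15.00, "cached_read": 0.30}


def cost_per_request(rates, system_tokens, fresh_tokens, output_tokens, cached):
    """Cost in USD of one request. The system prompt is billed at the
    cached-read rate on a cache hit, otherwise at the normal input rate.
    Cache write charges are deliberately left out (assumed amortized)."""
    system_rate = rates["cached_read"] if cached else rates["input"]
    total = (system_tokens * system_rate
             + fresh_tokens * rates["input"]
             + output_tokens * rates["output"])
    return total / 1_000_000


# Hypothetical workload chosen for illustration.
WORKLOAD = dict(system_tokens=50_000, fresh_tokens=500, output_tokens=300)

if __name__ == "__main__":
    for name, rates in (("Opus", OPUS), ("Sonnet", SONNET)):
        for cached in (False, True):
            cost = cost_per_request(rates, cached=cached, **WORKLOAD)
            label = "cached" if cached else "uncached"
            print(f"{name:6} {label:8} ${cost:.3f}")
```

For this workload the per-request costs are $0.780 for Opus uncached, $0.105 for Opus cached, $0.156 for Sonnet uncached, and $0.021 for Sonnet cached. Cached Opus comes in under uncached Sonnet here because the system prompt dominates the token count. With a shorter prompt or longer outputs, the $75/M output rate would push Opus back above Sonnet.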