Before you ship a prompt to the Claude API, knowing its token count tells you the per-call cost across model tiers — and which model fits your latency and budget constraints.
How Claude counts tokens
Claude uses a byte-pair encoding (BPE) tokeniser similar to GPT-series models. Rough rules of thumb:
1 token ≈ 4 characters of English text
1 token ≈ 0.75 words for prose
Code tokenises at roughly 1 token per 3 characters (operators, braces, and newlines each count)
A typical 500-word paragraph = ∼670 tokens
Per-token cost at each tier (2026)
Model
Input ($/MTok)
Output ($/MTok)
Cache read ($/MTok)
Haiku 4.5
$1.00
$5.00
$0.10
Sonnet 4.6
$3.00
$15.00
$0.30
Opus 4.7
$15.00
$75.00
$1.50
Quick cost examples
Prompt size
Output
Haiku cost
Sonnet cost
Opus cost
500 tokens
200 tokens
$0.0006
$0.0018
$0.0090
2,000 tokens
500 tokens
$0.0045
$0.0135
$0.0675
10,000 tokens
1,000 tokens
$0.015
$0.045
$0.225
50,000 tokens
2,000 tokens
$0.060
$0.180
$0.900
Cache write vs. cache read
Anthropic charges a one-time cache-write fee (25% premium over input price) the first time a prompt block is cached, then 90% off on all subsequent reads within the 5-minute TTL. For system prompts used across many requests, the break-even is typically 2–3 requests.
Use the interactive token counter + cost estimator
The Claude Prompt-Pricing Recommender lets you paste any prompt and see token counts across models, live cost math (including cache scenarios), and a model-tier recommendation based on prompt complexity. No signup — runs in your browser, BYO API key for the live model call.
Using the Anthropic count_tokens API
For precise server-side token counts before sending, call the /v1/messages/count_tokens endpoint with your messages and system blocks. It returns exact counts for the model you specify — useful for batching or rate-limit planning.
POST /v1/messages/count_tokens
{
"model": "claude-sonnet-4-6-20250514",
"messages": [{"role": "user", "content": "Your prompt here"}],
"system": "Optional system prompt"
}
Frequently asked questions
How do I count tokens in a Claude prompt before sending?
Use Anthropic's /v1/messages/count_tokens endpoint, or paste your prompt into the Claude Prompt-Pricing Recommender at prompt-pricing.vercel.app — it estimates token count and cost across all three Claude tiers without an API call.
Does Claude count tokens differently than GPT-4?
Very similarly. Both use BPE tokenisation. Claude's tokeniser produces counts within 5–10% of GPT-4's for the same text. English prose and code tokenise at comparable rates across both.
Are tool call definitions included in the token count?
Yes. Tool definitions in the tools array count as input tokens on every request where they're included. A complex tool schema can add 100–500 tokens per request — factor this into cost estimates.
Do output tokens cost more than input tokens?
Yes, significantly. Output tokens are priced 5× higher than input tokens for Sonnet ($15 vs $3 per million) and Haiku ($5 vs $1). Minimising output length (tight instructions, structured output, max_tokens limits) is one of the fastest ways to cut API cost.