The Batch API discounts every Claude model by 50% in exchange for an asynchronous <24-hour SLA. Submit a JSONL file of requests, poll for completion, download results.
When batch is the right call
Eval runs — scoring outputs of another model.
Bulk extraction — pulling structured fields from a document corpus.
Workloads where one request's output feeds the next.
The actual savings
Model
Standard in/out
Batch in/out
Opus 4.7
$15 / $75
$7.50 / $37.50
Sonnet 4.6
$3 / $15
$1.50 / $7.50
Haiku 4.5
$1 / $5
$0.50 / $2.50
Stack with caching
Batch and prompt caching are independent discounts and stack. A cached prefix on a batch request still gets the 10%-of-input rate, then the batch 50% applies. Effective rate on stable prefixes: 5% of input pricing.
Cost example
Try a 1M-request eval workload in the Cost Calculator with batch on/off to see the gap.
Frequently asked questions
What is the turnaround time for the Claude Batch API?
Anthropic guarantees results within 24 hours. In practice, most batch jobs complete within 1–4 hours depending on queue length and job size.
Can I cancel a batch job after submission?
Yes — the Batch API supports a cancel endpoint. Jobs that are already in-flight (partially processed) will stop at the current point; only the processed portion is billed.
Is the Batch API available for all Claude models?
Yes, the Batch API works with Opus 4.7, Sonnet 4.6, and Haiku 4.5. All models receive the same 50% discount for batch mode. The model is specified per-request in the JSONL file, so a single batch can mix tiers.