Does Anthropic support Claude fine-tuning? Learn what custom training options exist in 2026, what to use instead, and how to get Claude to behave consistently without fine-tuning.
Short answer: Anthropic does not offer self-service fine-tuning for Claude API customers as of 2026. There is no endpoint to upload training data and produce a custom model checkpoint. This is a deliberate product decision, not an oversight.
| Technique | What it achieves | Cost |
|---|---|---|
| System prompt | Persona, tone, task constraints, output format | Included in input tokens |
| Prompt caching | Reuse a long system prompt or knowledge corpus at 10% input cost | $0.30/M cached read (Sonnet) |
| Few-shot examples | In-context demonstrations of desired output style | Input token cost only |
| Tool use / structured output | Force a specific output schema reliably | Standard API cost |
| RAG (retrieval-augmented generation) | Ground Claude on proprietary data without training | Infrastructure + API cost |
| Batch evaluation + prompt iteration | Systematic prompt engineering with test suite | Batch API at 50% standard price |
The most common reason developers want fine-tuning is to avoid sending a large system prompt on every call. Prompt caching solves this: cache a 20k-token knowledge base once, and every subsequent request pays only $0.30/M tokens (Sonnet 4.6) for the cached portion — 10× cheaper than uncached input. This achieves the "bake in your domain knowledge" goal without model training.
import anthropic
client = anthropic.Anthropic()
KNOWLEDGE_BASE = """...your 20,000 token domain context here..."""
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
system=[
{
"type": "text",
"text": KNOWLEDGE_BASE,
"cache_control": {"type": "ephemeral"} # cache this prefix
},
{
"type": "text",
"text": "You are a specialist assistant for our product. Use the above knowledge base."
}
],
messages=[{"role": "user", "content": "How do I set up X feature?"}]
)
If the goal is to have Claude answer questions about your proprietary data (docs, codebase, support tickets), RAG is more effective than fine-tuning anyway. Fine-tuning teaches style and format; it does not reliably inject facts. RAG retrieves the relevant document chunk at query time and puts it directly in context — Claude's in-context performance on retrieved text is excellent.
If you want Claude to always output JSON in a specific schema, use structured output (tool use) rather than fine-tuning. Defining a JSON Schema as a tool input and forcing a tool_use stop reason produces reliable schema-conformant output on every call without training.
If you have a genuine need for a fine-tuned checkpoint — specific domain dialect, safety overrides for your use case, or a distilled smaller model — contact Anthropic sales. Custom model agreements exist but are not self-service.
Estimate the cost of a prompt-caching or RAG-based approach with the Claude Cost Calculator, or get a model recommendation from the Prompt-Pricing Recommender.