Claude API Conversation History: Multi-Turn Chat Tutorial

How to implement multi-turn conversation history with the Claude API. Maintain context, manage token limits, and avoid common mistakes in Python and JavaScript.

The Claude API is stateless — it does not remember previous messages automatically. To build a multi-turn chat, you must send the full conversation history with every request. This page shows you how to do it correctly in Python and JavaScript.

How conversation history works

Each call to messages.create receives a messages array of alternating user and assistant turns. Claude sees the entire array and responds as if continuing the conversation.

JavaScript / TypeScript version

With a system prompt

Pass the system prompt as the top-level system parameter — not as the first message. The messages array should only contain user and assistant turns.

Managing token limits

Each new turn sends the full history. As conversations grow, you'll hit Claude's 200k context window or pay for tokens you don't need. Three strategies:

Sliding window example

Common mistakes

Frequently asked questions

Strategy	How	Trade-off
Sliding window	Keep only the last N turns: `messages[-20:]`	Cheapest; older context lost
Summarise old turns	Ask Claude to compress old history into a single summary message	Context preserved; one extra API call
Prompt caching	Mark a stable prefix (e.g. a long document context) with `cache_control`	90% cost reduction on the cached portion

Does Claude remember previous conversations automatically?

No. The Claude API is stateless — there are no server-side sessions. You must send the full conversation history in the messages array with every request. In-app memory (summarisation, retrieval) is built on top of this stateless API.

What is the maximum conversation length in Claude?

Claude's context window is 200,000 tokens. A typical back-and-forth exchange of 50 turns at 500 tokens per turn uses 25,000 tokens — well within limits. For very long sessions, use a sliding window or summarise old turns to stay under budget.

Can I mix text and images in Claude conversation history?

Yes. Each message's content field can be a string (text only) or an array of content blocks mixing TextBlock and ImageBlock. Both image types (base64 and URL) are supported in history. Note that cached image blocks are especially cost-effective in multi-turn vision workflows.

How do I persist conversation history between server restarts?

Serialize the history array to JSON and store it in a database (Redis, Postgres, or any key-value store) keyed by session ID. On each request, load the history, append the new user message, call the API, append the assistant reply, and save the updated history back.

Is sending the full history every request inefficient?

Only slightly. Anthropic's prompt caching feature reduces the effective cost of re-sending the same earlier turns. With cache_control markers, stable portions of your history (system prompt, loaded documents) cost 10% of the normal input price after the first request. The variable recent turns are always billed at full price.

Free tools