Claude Streaming TypeScript — Real-Time API Responses (2026)

Claude Streaming API with TypeScript (2026 Guide)

How to stream Claude API responses in TypeScript using the Anthropic SDK. Includes typed event handlers, SSE parsing, Next.js integration, and error handling patterns.

The Claude API supports server-sent events (SSE) streaming so your TypeScript application can display tokens as they arrive rather than waiting for the full response. This guide shows the production patterns.
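To make the wire format concrete, here is a minimal sketch of how an SSE body breaks down into named events. The SDK does this parsing for you, so you would rarely write it yourself. `parseSseFrames` and `SseFrame` are hypothetical names used only for illustration, and the sample body is an abbreviated stand-in, not a captured API response.

```typescript
// Hypothetical helper for illustration; the Anthropic SDK parses SSE internally.
interface SseFrame {
  event: string; // value of the "event:" field ("message" if absent)
  data: string; // all "data:" lines joined with "\n"
}

function parseSseFrames(raw: string): SseFrame[] {
  const frames: SseFrame[] = [];
  // Events are separated by a blank line; normalize CRLF line endings first.
  for (const block of raw.replace(/\r\n/g, "\n").split("\n\n")) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // drop one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without any data lines are skipped.
    if (dataLines.length > 0) frames.push({ event, data: dataLines.join("\n") });
  }
  return frames;
}

// Abbreviated sample body (assumed shape, not a captured response).
const sample =
  "event: content_block_delta\n" +
  'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hi"}}\n\n' +
  "event: message_stop\n" +
  'data: {"type":"message_stop"}\n\n';

const frames = parseSseFrames(sample);
```

This sketch assumes the whole body is already in memory. A real parser also has to buffer partial frames across network chunks, which is one reason to rely on the SDK.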

Basic streaming with full TypeScript types

import Anthropic from "@anthropic-ai/sdk";
import type { MessageStreamEvent } from "@anthropic-ai/sdk/resources/messages";

// Reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

const stream = client.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain TypeScript generics." }],
});

// Fires for each text delta as it arrives.
stream.on("text", (text: string) => {
  process.stdout.write(text);
});

// Resolves with the complete message once the stream ends.
const finalMessage = await stream.finalMessage();
console.log("\nUsage:", finalMessage.usage);

Type-safe event handling

// `stream` is a fresh MessageStream from client.messages.stream(). Iterating it
// yields MessageStreamEvent values, so no cast is needed.
for await (const event of stream) {
  switch (event.type) {
    case "message_start":
      console.log("Model:", event.message.model);
      break;
    case "content_block_delta":
      // Narrow on the delta type before reading delta.text.
      if (event.delta.type === "text_delta") {
        process.stdout.write(event.delta.text);
      }
      break;
    case "message_delta":
      if (event.delta.stop_reason === "end_turn") {
        console.log("\nTokens used:", event.usage.output_tokens);
      }
      break;
  }
}
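The switch above silently ignores any event type it does not list. If you would rather have the compiler flag unhandled variants, one common TypeScript pattern is a `never` check in the default branch. The sketch below uses a simplified, hypothetical `DemoEvent` union in place of the SDK's `MessageStreamEvent` so that it runs without a network call. `assertNever` and `render` are likewise illustrative names.

```typescript
// Simplified stand-in for the SDK's event union (hypothetical, for illustration).
type DemoEvent =
  | { type: "message_start"; model: string }
  | { type: "text_delta"; text: string }
  | { type: "message_stop" };

// Accepts only values the compiler has narrowed to `never`.
function assertNever(value: never): never {
  throw new Error(`Unhandled event: ${JSON.stringify(value)}`);
}

function render(events: DemoEvent[]): string {
  let out = "";
  for (const event of events) {
    switch (event.type) {
      case "message_start":
        out += `[${event.model}] `;
        break;
      case "text_delta":
        out += event.text;
        break;
      case "message_stop":
        break;
      default:
        // Compile error here if a variant is added to DemoEvent but not handled.
        assertNever(event);
    }
  }
  return out;
}

const rendered = render([
  { type: "message_start", model: "demo-model" },
  { type: "text_delta", text: "Hello" },
  { type: "text_delta", text: ", world" },
  { type: "message_stop" },
]);
```

With the real SDK union, this pattern means every event type must have a case, and an SDK upgrade that adds an event type will fail to compile until you handle it. Whether that strictness is worth it depends on how much of the event stream your code cares about.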

Next.js App Router streaming endpoint

// app/api/chat/route.ts
import Anthropic from "@anthropic-ai/sdk";
import { NextRequest } from "next/server";

const client = new Anthropic();

export async function POST(req: NextRequest) {
  const { message } = await req.json();
  const encoder = new TextEncoder();

  const readable = new ReadableStream({
    async start(controller) {
      try {
        const stream = client.messages.stream({
          model: "claude-sonnet-4-6",
          max_tokens: 1024,
          messages: [{ role: "user", content: message }],
        });

        // Forward each text delta to the HTTP response as UTF-8 bytes.
        stream.on("text", (text) => {
          controller.enqueue(encoder.encode(text));
        });

        await stream.finalMessage();
        controller.close();
      } catch (err) {
        // Surface upstream failures instead of leaving the response hanging.
        controller.error(err);
      }
    },
  });

  // The runtime handles chunked transfer itself; no Transfer-Encoding header is needed.
  return new Response(readable, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
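On the browser side, the plain-text body returned by this route can be read incrementally with the Fetch API's stream reader. The following is a minimal sketch, not a complete client: `readTextStream` and `ask` are hypothetical helper names, and the `/api/chat` path assumes the route file location shown above.

```typescript
// Hypothetical helper: reads a byte stream and reports decoded text as it arrives.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onText: (text: string) => void,
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder("utf-8");
  let full = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact when they straddle chunks.
      const text = decoder.decode(value, { stream: true });
      if (text) {
        full += text;
        onText(text);
      }
    }
    // Flush any bytes the decoder is still holding.
    const tail = decoder.decode();
    if (tail) {
      full += tail;
      onText(tail);
    }
  } finally {
    reader.releaseLock();
  }
  return full;
}

// Usage sketch for a client component (assumes the route is served at /api/chat).
async function ask(message: string, onText: (text: string) => void): Promise<string> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),
  });
  if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`);
  return readTextStream(res.body, onText);
}
```

In a React component, `onText` would typically append to a piece of state so the UI re-renders as each chunk arrives.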

Streaming with tool use

// `myTool` (a tool definition) and `messages` are assumed to be defined elsewhere.
const stream = client.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [myTool],
  messages,
});

for await (const event of stream) {
  // A tool_use block opens when the model decides to call a tool.
  if (event.type === "content_block_start" && event.content_block.type === "tool_use") {
    console.log("Tool invoked:", event.content_block.name);
  }
  // The tool arguments arrive as fragments of a JSON string.
  if (event.type === "content_block_delta" && event.delta.type === "input_json_delta") {
    process.stdout.write(event.delta.partial_json);
  }
}
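The `partial_json` fragments are not valid JSON on their own, so to act on a tool call you need to concatenate them per content block and parse once the block ends. Here is a minimal sketch of that accumulation. It uses simplified, hypothetical event shapes (`DemoToolEvent`) modeled on the fields used above rather than the SDK's own types, and `collectToolCalls` and the `get_weather` tool are made up for the example.

```typescript
// Simplified stand-ins for the streaming events (hypothetical, for illustration).
type DemoToolEvent =
  | { type: "content_block_start"; index: number; content_block: { type: "tool_use"; name: string } }
  | { type: "content_block_delta"; index: number; delta: { type: "input_json_delta"; partial_json: string } }
  | { type: "content_block_stop"; index: number };

interface ToolCall {
  name: string;
  input: unknown;
}

function collectToolCalls(events: Iterable<DemoToolEvent>): ToolCall[] {
  // One JSON buffer per content block, keyed by the block index.
  const pending = new Map<number, { name: string; json: string }>();
  const completed: ToolCall[] = [];
  for (const event of events) {
    if (event.type === "content_block_start") {
      pending.set(event.index, { name: event.content_block.name, json: "" });
    } else if (event.type === "content_block_delta") {
      const entry = pending.get(event.index);
      if (entry) entry.json += event.delta.partial_json;
    } else {
      // content_block_stop: the buffered JSON is complete, so parse it now.
      const entry = pending.get(event.index);
      if (entry) {
        // An empty buffer is treated as a call with no arguments.
        completed.push({ name: entry.name, input: JSON.parse(entry.json || "{}") });
        pending.delete(event.index);
      }
    }
  }
  return completed;
}

const toolCalls = collectToolCalls([
  { type: "content_block_start", index: 0, content_block: { type: "tool_use", name: "get_weather" } },
  { type: "content_block_delta", index: 0, delta: { type: "input_json_delta", partial_json: '{"city": "Par' } },
  { type: "content_block_delta", index: 0, delta: { type: "input_json_delta", partial_json: 'is"}' } },
  { type: "content_block_stop", index: 0 },
]);
```

If you only need the finished tool input and not the live fragments, reading the `tool_use` blocks from the final message is simpler than accumulating by hand.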

Abort a stream mid-flight

const controller = new AbortController();

const stream = client.messages.stream(
  { model: "claude-sonnet-4-6", max_tokens: 1024, messages },
  { signal: controller.signal },
);

// Cancel the request after 3 seconds.
setTimeout(() => controller.abort(), 3000);

Cost of streaming vs non-streaming

The Anthropic API bills streaming and non-streaming requests identically: the same input and output token counts apply. The trade-off is lower perceived latency versus a simpler implementation. Use the Claude Cost Calculator to model token costs.
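Because billing depends only on token counts, you can estimate the cost of a request from the `usage` object, whether or not the request was streamed. The sketch below shows the arithmetic. The `Pricing` shape and `estimateCostUsd` are hypothetical, and the prices are placeholders chosen to keep the numbers easy to follow, not real rates.

```typescript
// Mirrors the two usage fields used in this guide.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

// Hypothetical pricing shape: USD per million tokens.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const inputCost = (usage.input_tokens / 1_000_000) * pricing.inputPerMTok;
  const outputCost = (usage.output_tokens / 1_000_000) * pricing.outputPerMTok;
  return inputCost + outputCost;
}

// Placeholder prices for illustration only; look up the current rates for your model.
const demoPricing: Pricing = { inputPerMTok: 2, outputPerMTok: 10 };

// 1,500 input tokens and 500 output tokens:
// (1500 / 1e6) * 2 + (500 / 1e6) * 10 = 0.003 + 0.005, roughly 0.008 USD
const cost = estimateCostUsd({ input_tokens: 1_500, output_tokens: 500 }, demoPricing);
```

In a streaming handler you could pass `finalMessage.usage` to a function like this once the stream has finished.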

Frequently asked questions

What TypeScript types does the Anthropic SDK export for streaming?

The key types are MessageStreamEvent (union of all event types), MessageStream (the stream object), and MessageStreamParams. Import them from @anthropic-ai/sdk/resources/messages.

How do I stream Claude in a Next.js API route?

Create a ReadableStream that subscribes to client.messages.stream() events and enqueues each text chunk. Return it as a Response with Content-Type: text/plain. The App Router natively supports streaming responses.

Is streaming more expensive than non-streaming?

No. The Anthropic API charges the same token prices regardless of whether you use streaming. The choice is purely about user experience and server memory — streaming lets you display output progressively without buffering the full response.

Can I get the final message object after streaming?

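Yes. Call await stream.finalMessage(). It returns a promise that resolves once the stream has finished, so you can await it either after iterating the events or instead of iterating. The result is a standard Message object, including usage.input_tokens and usage.output_tokens.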
