How to call the Claude API from Go in 2026. Covers the official SDK, streaming, prompt caching, tool use, and concurrency patterns for production Go services.
Anthropic ships an official Go SDK: github.com/anthropics/anthropic-sdk-go. It aims for feature parity with the Python and TypeScript SDKs (messages, streaming, batch, prompt caching, tool use, vision). This page is a working quickstart plus the Go-idiomatic patterns for production services.
go get github.com/anthropics/anthropic-sdk-go@latest
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
	"github.com/anthropics/anthropic-sdk-go/option"
)

func main() {
	client := anthropic.NewClient(
		// Placeholder key. When this option is omitted, the SDK reads ANTHROPIC_API_KEY from the environment.
		option.WithAPIKey("sk-ant-..."),
	)

	// Request params are plain Go values: a model constant, an int64, and a slice of messages.
	msg, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeSonnet4_6,
		MaxTokens: 1024,
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(anthropic.NewTextBlock("Explain Go interfaces in 3 bullets.")),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Content is a list of typed blocks; print only the text blocks.
	for _, block := range msg.Content {
		if t, ok := block.AsAny().(anthropic.TextBlock); ok {
			fmt.Println(t.Text)
		}
	}
}
// ctx is the caller's context.Context, for example one built with context.WithTimeout.
stream := client.Messages.NewStreaming(ctx, anthropic.MessageNewParams{
	Model:     anthropic.ModelClaudeSonnet4_6,
	MaxTokens: 1024,
	Messages: []anthropic.MessageParam{
		anthropic.NewUserMessage(anthropic.NewTextBlock("Write a haiku.")),
	},
})
for stream.Next() {
	// Only content_block_delta events carry text; every other event type is skipped.
	if ev, ok := stream.Current().AsAny().(anthropic.ContentBlockDeltaEvent); ok {
		if delta, ok := ev.Delta.AsAny().(anthropic.TextDelta); ok {
			fmt.Print(delta.Text)
		}
	}
}
// Err reports any failure that ended the stream early.
if err := stream.Err(); err != nil {
	log.Fatal(err)
}
Cache a long system prompt by setting CacheControl on the prefix block:
// Pass this slice as the System field of MessageNewParams.
system := []anthropic.TextBlockParam{{
	Text: longSystemPrompt,
	// Marks the end of the cacheable prefix; the constructor fills in type "ephemeral".
	CacheControl: anthropic.NewCacheControlEphemeralParam(),
}}
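Cache reads are billed at 10% of the base input price. The first request writes the cache and pays a premium over the base input rate (25% for the default 5-minute TTL), so caching pays off from the second request onward. See the caching deep-dive.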
Reuse a single *Client across your service.
Bound concurrency with golang.org/x/sync/semaphore, sized from your tier's TPM divided by average tokens-per-call, so you do not stampede into a 429.
Wrap every call in context.WithTimeout; the SDK respects it for both streaming and non-streaming calls.
The Go SDK retries 429s and 5xx with exponential backoff out of the box. Configure with option.WithMaxRetries(int). For application-layer logic on 429, see 429 handling.
// Pass this slice as the Tools field of MessageNewParams.
tools := []anthropic.ToolUnionParam{{
	OfTool: &anthropic.ToolParam{
		Name:        "get_weather",
		Description: anthropic.String("Get current weather for a location."),
		// The schema type defaults to "object", so only the properties are listed.
		InputSchema: anthropic.ToolInputSchemaParam{
			Properties: map[string]any{
				"location": map[string]string{"type": "string"},
			},
		},
	},
}}
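When Claude decides to call a tool, the response's Content contains ToolUseBlock elements and its StopReason is tool_use. Run each requested tool, then send the output back as a tool result block (anthropic.NewToolResultBlock) in a user message on the next call, keyed by the tool use ID.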
Wrap calls in middleware that records Usage.InputTokens, Usage.OutputTokens, Usage.CacheCreationInputTokens, and Usage.CacheReadInputTokens from every response. Each counter is billed at its own per-MTok rate (input, output, cache write, cache read), so multiply each by the model's rate and sum the results to get cost-per-request. Estimate workload cost ahead of time with the Claude Cost Calculator.
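As a rough sketch of that arithmetic, the snippet below multiplies each counter by its own rate. The `usage` struct is a local stand-in that mirrors the four Usage fields named above, `rates` and `costUSD` are hypothetical names, and the dollar figures in `main` are placeholders chosen for illustration, not a price list.

```go
package main

import "fmt"

// usage mirrors the four token counters reported on each response.
type usage struct {
	InputTokens              int64
	OutputTokens             int64
	CacheCreationInputTokens int64
	CacheReadInputTokens     int64
}

// rates holds prices in USD per million tokens (MTok).
type rates struct {
	Input      float64
	Output     float64
	CacheWrite float64
	CacheRead  float64
}

// costUSD multiplies each counter by its own rate and converts from per-MTok to USD.
func costUSD(u usage, r rates) float64 {
	const perMTok = 1_000_000.0
	total := float64(u.InputTokens)*r.Input +
		float64(u.OutputTokens)*r.Output +
		float64(u.CacheCreationInputTokens)*r.CacheWrite +
		float64(u.CacheReadInputTokens)*r.CacheRead
	return total / perMTok
}

func main() {
	// Placeholder rates for illustration only; look up the real ones for your model.
	r := rates{Input: 3.00, Output: 15.00, CacheWrite: 3.75, CacheRead: 0.30}
	u := usage{InputTokens: 200, OutputTokens: 500, CacheReadInputTokens: 10_000}
	fmt.Printf("cost per request: $%.6f\n", costUSD(u, r))
}
```

In a service, a function like this would typically be called from the middleware after each response, with the result attached to a metric or log line tagged by model.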
For TypeScript and Python equivalents see TypeScript SDK and Python SDK.