Claude Model Routing 101: The Decision Tree for Haiku, Sonnet, and Opus

Last refreshed: May 15, 2026

Claude Opus 4.7 costs $25 per million output tokens. Claude Haiku 4.5 costs $5 per million output tokens. That is a 5× difference in list price, and in practice somewhat more once you account for Opus 4.7’s token inflation: it generates roughly 1.0–1.35× as many tokens per task as Haiku, depending on content type, which puts the effective gap between 5× and about 6.75× (5 × 1.35).

For the majority of tasks in a typical Claude workflow, that cost difference buys you nothing. Haiku and Opus produce indistinguishable output on sorting, classification, summarization, simple Q&A, format conversion, and first-pass drafting. The performance gap is real, but it only appears on tasks that genuinely require extended reasoning, complex code generation, nuanced judgment, or maximum creative quality. Most tasks don’t.

The Decision Tree

Use Haiku 4.5 when:

Classifying or tagging items (sentiment, category, priority, topic)
Summarizing documents where the summary template is well-defined
First-pass triage — deciding which items need deeper processing
Format conversion — JSON to markdown, CSV to structured output, etc.
Simple Q&A with factual answers from provided context
Extracting structured data from unstructured text
Generating short, templated outputs (subject lines, meta descriptions, titles)
Any high-volume, time-insensitive batch job

Use Sonnet 4.6 when:

Writing full articles, reports, or long-form content
Mid-complexity code generation and debugging
Research synthesis across multiple sources
Drafting emails, proposals, or documents requiring judgment
Multi-step reasoning where Haiku loses the thread
Any task where you’ve tested Haiku and found the output quality insufficient

Use Opus 4.7 when:

Architecture decisions with significant downstream consequences
Security-sensitive code review or vulnerability analysis
Complex multi-file refactoring with interdependencies
Tasks requiring the xhigh effort level (extended chain-of-thought)
Creative work where you need maximum quality judgment
Any task where Sonnet has failed and you need the ceiling
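
The last item in the Sonnet and Opus lists describes an escalation path: start on a cheaper tier and move up only when the output falls short. A minimal Python sketch of that ladder follows. The tier labels, `run_task`, and `is_good_enough` are hypothetical stand-ins, not Anthropic API names; in practice `run_task` would wrap a real model call and `is_good_enough` would be whatever quality check you trust.

```python
from typing import Callable, List, Tuple

# Tier labels, cheapest first. These are illustrative labels only,
# not real API model identifiers.
TIERS: List[str] = ["haiku", "sonnet", "opus"]


def run_with_escalation(
    prompt: str,
    run_task: Callable[[str, str], str],
    is_good_enough: Callable[[str], bool],
    start_tier: str = "haiku",
) -> Tuple[str, str]:
    """Try tiers from start_tier upward and return (tier_used, output).

    Stops at the first tier whose output passes the check. If none pass,
    the top tier's output is returned so the caller can decide what to do.
    """
    tier, output = start_tier, ""
    for tier in TIERS[TIERS.index(start_tier):]:
        output = run_task(tier, prompt)
        if is_good_enough(output):
            break
    return tier, output


# Stand-in for a real model call (assumption: it just echoes the tier).
def fake_run_task(tier: str, prompt: str) -> str:
    return f"[{tier}] {prompt}"


# Pretend only Sonnet-level output is acceptable for this task.
tier_used, result = run_with_escalation(
    "draft a client proposal",
    fake_run_task,
    is_good_enough=lambda out: out.startswith("[sonnet]"),
)
print(tier_used)
```

The quality check is the hard part in practice; the ladder itself only guarantees that Opus is reached after the cheaper tiers have been tried.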

The Cost Math at Scale

Assume a content operation running 500 Claude tasks per month at roughly 1,000 output tokens per task, or about 500,000 output tokens in total. Default behavior (everything on Opus): 500K tokens × $25/M = $12.50/month in output cost alone. Routed behavior (300 tasks on Haiku, 150 on Sonnet, 50 on Opus): (300K × $5/M) + (150K × $15/M) + (50K × $25/M) = $1.50 + $2.25 + $1.25 = $5.00/month. That is a 60% cost reduction with identical output quality on the Haiku and Sonnet tasks.
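
The same arithmetic can be reproduced in a few lines of Python, which makes it easy to plug in your own task mix. This is a sketch under the article's assumptions: output prices as quoted above, about 1,000 output tokens per task, and input-token costs ignored. The function and variable names are illustrative.

```python
from typing import Dict

# Output prices in USD per million tokens, as quoted in this article.
PRICE_PER_MILLION: Dict[str, float] = {"haiku": 5.00, "sonnet": 15.00, "opus": 25.00}

# Assumption: ~1,000 output tokens per task (500 tasks -> ~500K tokens).
TOKENS_PER_TASK = 1_000


def monthly_output_cost(tasks_by_model: Dict[str, int]) -> float:
    """Return the monthly output-token cost in USD for a given task mix."""
    total = 0.0
    for model, tasks in tasks_by_model.items():
        tokens = tasks * TOKENS_PER_TASK
        total += tokens / 1_000_000 * PRICE_PER_MILLION[model]
    return total


all_opus = monthly_output_cost({"opus": 500})
routed = monthly_output_cost({"haiku": 300, "sonnet": 150, "opus": 50})
savings = 1 - routed / all_opus

print(f"All Opus: ${all_opus:.2f}/month")
print(f"Routed:   ${routed:.2f}/month")
print(f"Savings:  {savings:.0%}")
```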

At enterprise scale, with thousands of tasks per day, the routing decision can be worth six figures annually, depending on how many tokens each task consumes. At individual scale, it is the difference between a Claude workflow that is financially sustainable and one that quietly drains budget.

How to Implement Routing

In Claude Code: the gateway model picker

Claude Code v2.1.126 (released May 1, 2026) ships a gateway model picker that lets you configure model routing per task type within a session. Set Haiku as the default for file reading, search, and summarization; route complex reasoning to Sonnet or Opus explicitly. The configuration lives in your Claude Code settings and applies automatically.

In the API: explicit model parameter

Every Anthropic API call takes a model parameter. Build a routing function in your application layer that maps task types to model strings. The routing logic can be as simple as a conditional or as sophisticated as a classifier (ironically, run on Haiku) that reads the task description and returns the appropriate model string.
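
The simple conditional version of that routing function might look like the sketch below. The task-type names are hypothetical labels drawn from the decision tree above, the model strings are placeholders rather than real model identifiers (look up the current ones in Anthropic's model documentation), and sending unknown task types to Sonnet is an assumption, not a recommendation from this article.

```python
from typing import Dict

# Placeholder model strings. These are NOT real model IDs; replace them
# with the identifiers listed in Anthropic's model documentation.
MODEL_FOR_TIER: Dict[str, str] = {
    "haiku": "<haiku-4.5-model-id>",
    "sonnet": "<sonnet-4.6-model-id>",
    "opus": "<opus-4.7-model-id>",
}

# Hypothetical task-type labels mapped to tiers, following the decision tree.
TIER_FOR_TASK: Dict[str, str] = {
    "classification": "haiku",
    "summarization": "haiku",
    "triage": "haiku",
    "format_conversion": "haiku",
    "extraction": "haiku",
    "long_form_writing": "sonnet",
    "code_generation": "sonnet",
    "research_synthesis": "sonnet",
    "architecture_decision": "opus",
    "security_review": "opus",
    "multi_file_refactor": "opus",
}

# Assumption: task types the table does not know go to the middle tier.
DEFAULT_TIER = "sonnet"


def choose_model(task_type: str) -> str:
    """Return the model string to pass as the API's model parameter."""
    tier = TIER_FOR_TASK.get(task_type, DEFAULT_TIER)
    return MODEL_FOR_TIER[tier]


print(choose_model("classification"))
print(choose_model("security_review"))
```

The returned string is what goes into the model parameter of the request, for example `client.messages.create(model=choose_model("triage"), max_tokens=..., messages=[...])` with the Anthropic Python SDK. Keeping the table in one place means a routing change is a one-line edit rather than a search through call sites.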

In Cowork and manual workflows: develop the habit

For non-programmatic use, routing is a habit built through one question before every Claude task: does this task actually need Opus? Run a two-week audit. For every task you run on Opus, note whether Haiku would have produced the same output. Most people discover that 60–70% of their Opus usage could move to Haiku or Sonnet with no quality loss.
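
If you keep the audit in a spreadsheet or a notes file, tallying it takes only a few lines. The sketch below assumes you record, for each task you ran on Opus, the cheapest tier that would have been good enough; the `AuditEntry` structure and the sample entries are made up for illustration and are not real audit results.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AuditEntry:
    task: str
    cheapest_sufficient: str  # "haiku", "sonnet", or "opus"


def movable_share(entries: List[AuditEntry]) -> float:
    """Fraction of audited Opus tasks that a cheaper tier could have handled."""
    if not entries:
        return 0.0
    movable = sum(1 for e in entries if e.cheapest_sufficient != "opus")
    return movable / len(entries)


# Made-up example entries, for illustration only.
audit_log = [
    AuditEntry("tag support tickets", "haiku"),
    AuditEntry("draft client proposal", "sonnet"),
    AuditEntry("review auth code for vulnerabilities", "opus"),
    AuditEntry("convert CSV to JSON", "haiku"),
]

share = movable_share(audit_log)
print(f"{share:.0%} of audited Opus tasks could move to a cheaper tier")
```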

Part of the Claude on a Budget series. Next: OpenRouter as the Budget Layer →
