Claude Opus 4.7 costs $25 per million output tokens. Claude Haiku 4.5 costs $5 per million output tokens. That is a 5× difference in list price, and in practice the gap can be wider once you account for Opus 4.7’s token inflation: it generates roughly 1.0–1.35× as many tokens per task as Haiku, depending on content type, which puts the effective per-task difference at roughly 5× to 6.75×.
For the majority of tasks in a typical Claude workflow, that cost difference buys you nothing. Haiku and Opus produce indistinguishable output on sorting, classification, summarization, simple Q&A, format conversion, and first-pass drafting. The performance gap is real, but it only appears on tasks that genuinely require extended reasoning, complex code generation, nuanced judgment, or maximum creative quality. Most tasks don’t.
The Decision Tree
Use Haiku 4.5 when:
- Classifying or tagging items (sentiment, category, priority, topic)
- Summarizing documents where the summary template is well-defined
- First-pass triage — deciding which items need deeper processing
- Format conversion — JSON to markdown, CSV to structured output, etc.
- Simple Q&A with factual answers from provided context
- Extracting structured data from unstructured text
- Generating short, templated outputs (subject lines, meta descriptions, titles)
- Any high-volume, time-insensitive batch job
Use Sonnet 4.6 when:
- Writing full articles, reports, or long-form content
- Mid-complexity code generation and debugging
- Research synthesis across multiple sources
- Drafting emails, proposals, or documents requiring judgment
- Multi-step reasoning where Haiku loses the thread
- Any task where you’ve tested Haiku and found the output quality insufficient
Use Opus 4.7 when:
- Architecture decisions with significant downstream consequences
- Security-sensitive code review or vulnerability analysis
- Complex multi-file refactoring with interdependencies
- Tasks requiring the xhigh effort level (extended chain-of-thought)
- Creative work where you need maximum quality judgment
- Any task where Sonnet has failed and you need the ceiling
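The two "escalate when the cheaper model falls short" rules in the lists above can be sketched as a small ladder. This is a minimal illustration, not a prescribed implementation: `run_model` and `is_good_enough` are hypothetical callables you would supply (the first wraps your actual model call, the second is whatever quality check you trust), and the tier names are labels rather than API model IDs.

```python
from typing import Callable

# Tiers ordered cheapest first. These are illustrative labels, not API model IDs.
TIERS = ["haiku", "sonnet", "opus"]


def run_with_escalation(
    task: str,
    run_model: Callable[[str, str], str],
    is_good_enough: Callable[[str], bool],
    start_tier: str = "haiku",
) -> tuple[str, str]:
    """Try the cheapest tier first and move up only when the output fails the check.

    Returns (tier_used, output). If every tier fails the check,
    the top tier's output is returned as the best available.
    """
    start = TIERS.index(start_tier)  # raises ValueError for an unknown tier
    tier, output = start_tier, ""
    for tier in TIERS[start:]:
        output = run_model(tier, task)  # hypothetical wrapper around the real call
        if is_good_enough(output):
            break
    return tier, output
```

A ladder like this spends extra on the failed cheaper attempts, so it probably only pays off when most tasks pass at the first tier. For task types you already know need Sonnet or Opus, setting `start_tier` accordingly skips the wasted calls.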
The Cost Math at Scale
Assume a content operation running 500 Claude tasks per month at roughly 1,000 output tokens per task. Default behavior (everything on Opus): ~500,000 output tokens × $25/M = $12.50/month at minimum, counting output tokens only. Routed behavior (300 Haiku, 150 Sonnet, and 50 Opus tasks, with Sonnet at $15/M): (300K × $5/M) + (150K × $15/M) + (50K × $25/M) = $1.50 + $2.25 + $1.25 = $5.00/month. That is a 60% cost reduction with no quality loss on the tasks moved to Haiku and Sonnet.
At enterprise scale, with thousands of tasks per day, the routing decision can be worth six figures annually. At individual scale, it is the difference between a Claude workflow that is financially sustainable and one that quietly drains budget.
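The arithmetic in this section can be reproduced with a few lines of Python, which also makes it easy to plug in your own task mix. The prices are the list prices quoted in this article; the figure of 1,000 output tokens per task is an assumption implied by "500 tasks, about 500,000 output tokens", and input-token costs are left out, as they are in the example above.

```python
# List prices in USD per million output tokens, as quoted in this article.
PRICE_PER_MILLION = {"haiku": 5.00, "sonnet": 15.00, "opus": 25.00}

# Assumption: 1,000 output tokens per task (500 tasks ~ 500,000 tokens).
TOKENS_PER_TASK = 1_000


def monthly_cost(task_counts: dict[str, int], tokens_per_task: int = TOKENS_PER_TASK) -> float:
    """Output-token cost in USD for a month's tasks, keyed by model tier."""
    total = 0.0
    for tier, count in task_counts.items():
        total += count * tokens_per_task * PRICE_PER_MILLION[tier] / 1_000_000
    return total


all_opus = monthly_cost({"opus": 500})
routed = monthly_cost({"haiku": 300, "sonnet": 150, "opus": 50})
savings = 1 - routed / all_opus

print(f"all Opus: ${all_opus:.2f}, routed: ${routed:.2f}, savings: {savings:.0%}")
# prints: all Opus: $12.50, routed: $5.00, savings: 60%
```

Changing `TOKENS_PER_TASK` scales both totals equally, so the percentage saved depends only on the task mix, not on how long the outputs are.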
How to Implement Routing
In Claude Code: the gateway model picker
Claude Code v2.1.126 (released May 1, 2026) ships a gateway model picker that lets you configure model routing per task type within a session. Set Haiku as the default for file reading, search, and summarization; route complex reasoning to Sonnet or Opus explicitly. The configuration lives in your Claude Code settings and applies automatically.
In the API: explicit model parameter
Every Anthropic API call takes a model parameter. Build a routing function in your application layer that maps task types to model strings. The routing logic can be as simple as a conditional or as sophisticated as a classifier (ironically, run on Haiku) that reads the task description and returns the appropriate model string.
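As a rough sketch of the simple conditional version, a routing function can be a pair of lookup tables. Everything named here is illustrative: the task-type labels are hypothetical and loosely follow the decision tree in this article, and the model strings are placeholders that should be replaced with the exact IDs from your provider's model list.

```python
# Placeholder model strings: assumptions for illustration only.
# Replace them with the exact IDs from your provider's model list.
MODEL_BY_TIER = {
    "haiku": "claude-haiku-4-5",
    "sonnet": "claude-sonnet-4-6",
    "opus": "claude-opus-4-7",
}

# Hypothetical task-type labels, loosely following the decision tree in this article.
TIER_BY_TASK_TYPE = {
    "classification": "haiku",
    "summarization": "haiku",
    "format_conversion": "haiku",
    "extraction": "haiku",
    "long_form_writing": "sonnet",
    "code_generation": "sonnet",
    "research_synthesis": "sonnet",
    "architecture_decision": "opus",
    "security_review": "opus",
    "multi_file_refactor": "opus",
}


def pick_model(task_type: str, default_tier: str = "sonnet") -> str:
    """Map a task type to a model string, using the default tier for unknown types."""
    tier = TIER_BY_TASK_TYPE.get(task_type, default_tier)
    return MODEL_BY_TIER[tier]
```

The returned string is what you would pass as the model parameter of each API call. Defaulting unknown task types to the middle tier is a design choice, not a rule: it avoids silently sending unclassified work to the most expensive model while limiting the quality risk of sending it to the cheapest.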
In Cowork and manual workflows: develop the habit
For non-programmatic use, routing is a habit built through one question before every Claude task: does this task actually need Opus? Run a two-week audit. For every task you run on Opus, note whether Haiku would have produced the same output. Most people discover that 60–70% of their Opus usage could move to Haiku or Sonnet with no quality loss.
Part of the Claude on a Budget series.
