Claude Model Routing 101: The Decision Tree for Haiku, Sonnet, and Opus

Abstract representation of Anthropic Claude model capabilities

Claude Opus 4.7 costs $25 per million output tokens. Claude Haiku 4.5 costs $5 per million output tokens. That is a 5× difference in list price — and in practice, closer to 20× when you account for Opus 4.7’s token inflation (it generates roughly 1.0–1.35× more tokens per task than Haiku at the same list price, depending on content type).

For the majority of tasks in a typical Claude workflow, that cost difference buys you nothing. Haiku and Opus produce indistinguishable output on sorting, classification, summarization, simple Q&A, format conversion, and first-pass drafting. The performance gap is real — but it only appears on tasks that genuinely require extended reasoning, complex code generation, nuanced judgment, or maximum creative quality. Most tasks don’t. Claude on a Budget pillar

The Decision Tree

Use Haiku 4.5 when:

  • Classifying or tagging items (sentiment, category, priority, topic)
  • Summarizing documents where the summary template is well-defined
  • First-pass triage — deciding which items need deeper processing
  • Format conversion — JSON to markdown, CSV to structured output, etc.
  • Simple Q&A with factual answers from provided context
  • Extracting structured data from unstructured text
  • Generating short, templated outputs (subject lines, meta descriptions, titles)
  • Any high-volume, time-insensitive batch job

Use Sonnet 4.6 when:

  • Writing full articles, reports, or long-form content
  • Mid-complexity code generation and debugging
  • Research synthesis across multiple sources
  • Drafting emails, proposals, or documents requiring judgment
  • Multi-step reasoning where Haiku loses the thread
  • Any task where you’ve tested Haiku and found the output quality insufficient

Use Opus 4.7 when:

  • Architecture decisions with significant downstream consequences
  • Security-sensitive code review or vulnerability analysis
  • Complex multi-file refactoring with interdependencies
  • Tasks requiring the xhigh effort level (extended chain-of-thought)
  • Creative work where you need maximum quality judgment
  • Any task where Sonnet has failed and you need the ceiling

The Cost Math at Scale

Assume a content operation running 500 Claude tasks per month. Default behavior (everything on Opus): ~500,000 output tokens × $25/M = $12.50/month at minimum. Routed behavior (300 Haiku, 150 Sonnet, 50 Opus): (300K × $5) + (150K × $15) + (50K × $25) = $1.50 + $2.25 + $1.25 = $5.00/month. That is a 60% cost reduction with identical output quality on the Haiku and Sonnet tasks.

At enterprise scale — thousands of tasks per day — the routing decision is worth six figures annually. At individual scale, it is the difference between a Claude workflow that is financially sustainable and one that quietly drains budget.

How to Implement Routing

In Claude Code: the gateway model picker

Claude Code v2.1.126 (released May 1, 2026) ships a gateway model picker that lets you configure model routing per task type within a session. Set Haiku as the default for file reading, search, and summarization; route complex reasoning to Sonnet or Opus explicitly. The configuration lives in your Claude Code settings and applies automatically.

In the API: explicit model parameter

Every Anthropic API call takes a model parameter. Build a routing function in your application layer that maps task types to model strings. The routing logic can be as simple as a conditional or as sophisticated as a classifier (ironically, run on Haiku) that reads the task description and returns the appropriate model string.

In Cowork and manual workflows: develop the habit

For non-programmatic use, routing is a habit built through one question before every Claude task: does this task actually need Opus? Run a two-week audit. For every task you run on Opus, note whether Haiku would have produced the same output. Most people discover that 60–70% of their Opus usage could move to Haiku or Sonnet with no quality loss.

Part of the Claude on a Budget series. Next: OpenRouter as the Budget Layer →

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *