Published: May 25, 2026 | Last fact-check: May 25, 2026 — current model lineup: Opus 4.7, Sonnet 4.6, Haiku 4.5
Quick Answer
A Claude Code router is any layer that decides which Claude model handles which request — Opus for hard reasoning, Sonnet for daily work, Haiku for fast cheap tasks. Anthropic ships some built-in routing, but the most leveraged users build their own routing rules on top to optimize cost and latency.
Built-in routing, manual model selection, and the third-party router landscape below.
“Claude Code router” is a phrase that means different things to different people in 2026, and the differences matter for what you should actually build or buy.
It can mean (1) Anthropic’s built-in logic that picks a model when you do not specify one, (2) third-party tools that route between Anthropic models and other LLMs through one Claude Code interface, or (3) custom routing rules you build yourself to match models to tasks. This guide walks through each, when each makes sense, and the trade-offs.
Why Routing Matters in the First Place
Claude is not one model. It is a family. As of 2026 the production tiers are roughly:
- Claude Opus 4.7 — $5/$25 per million tokens. Current flagship. Best for hard, ambiguous, multi-step reasoning and agentic coding.
- Claude Sonnet 4.6 — $3/$15 per million tokens. The workhorse. Within ~1 point of Opus on coding benchmarks at 40% less cost. Right answer for 80% of daily work.
- Claude Haiku 4.5 — $1/$5 per million tokens. Fast and cheap. Right answer for high-volume formulaic tasks: classification, extraction, formatting, routing, simple Q&A.
Output costs 5x input across all three tiers. Prompt caching cuts cached input costs by ~90%. Batch API cuts everything by 50% if you can wait up to 24 hours.
Using Opus for everything is wasteful. Using Haiku for everything is sloppy. Routing — matching the model to the task — is how you get the best output for the lowest cost. For someone running Claude Code several hours a day, intelligent routing is the difference between a $100/month Max bill and a $1,000/month API bill for the same work.
Anthropic’s Built-In Claude Code Routing
When you launch Claude Code without specifying a model, it picks a default. As of 2026 the default for most users is Sonnet, with Opus accessible via flags or settings, and Haiku used internally for some sub-tasks like tool selection and simple file operations.
You can override the default at session start:
# Start Claude Code with Opus for a tough refactor
claude --model claude-opus-4-7 # current flagship
# Or set it in your settings.json
{
"model": "claude-sonnet-4-6" // current workhorse
}
Anthropic also routes internally: when Claude Code uses sub-agents for parallel work, it can route those sub-agents to lighter models automatically. This routing is opaque to you and generally well-tuned. You usually do not need to think about it.
Manual Model Selection: The 80/20 Approach
For most users, manual routing beats automatic routing. The rule:
- Sonnet by default. Daily work, content drafts, code edits, file operations, debugging.
- Opus when you hit a wall. Architectural decisions, hard refactors, ambiguous specs, anything that requires real reasoning.
- Haiku for batch. Classification, taxonomy assignment, metadata generation, SEO meta descriptions, anything formulaic at volume.
This 80/20 split is achievable with two or three commands and zero infrastructure. It is the right starting point.
Third-Party Claude Code Routers
A small ecosystem has emerged around third-party routers that sit between Claude Code and the model layer. The two most common patterns:
OpenRouter and Multi-Provider Routers
OpenRouter is the most widely used third-party router. You point Claude Code at OpenRouter as the API endpoint, and OpenRouter routes your requests to Claude (or to GPT, Gemini, DeepSeek, Llama, etc.). Why use it:
- You want fallback when Anthropic has an outage.
- You want to mix Claude with other models on a per-task basis.
- You want a single billing surface across providers.
- You want BYOK (bring your own key) routing where you mix your own provider keys.
The trade-off: latency adds a few hundred milliseconds per call, and some Anthropic-specific features (prompt caching, certain beta tools) work less smoothly through the proxy.
Custom In-House Routers
Larger teams build their own routing layer. A typical pattern: a small Python or TypeScript service that inspects the incoming request, applies routing rules (length thresholds, task type detection, cost ceilings), picks a model, and forwards the call to Anthropic.
This is overkill for most individuals. It pays off when you have:
- Strict cost controls that need enforcement, not suggestion
- Multi-tenant usage where different customers get different models
- Compliance requirements that need request inspection and logging
- A real engineering team that can maintain the service
Routing Rules That Actually Work
If you are going to invest in any routing logic, these are the rules that pay back:
- By task type. Code review → Opus. New code generation → Sonnet. Format conversion → Haiku.
- By input length. Long context (40K+ tokens) where you need careful reasoning → Opus. Long context where you need extraction → Sonnet with prompt caching.
- By cost ceiling. Anything over a threshold token count gets a hard cap or downgrade.
- By time of day. Overnight batch jobs route to cheaper models. Interactive daytime work routes to your preferred quality tier.
- By failure recovery. If a Sonnet call returns a low-confidence or refused response, retry once with Opus before giving up.
Most of these rules are five lines of code each. The discipline is more about deciding the rules than implementing them.
What Anthropic Does Not Yet Ship
As of writing, Anthropic does not ship a built-in “route this query to the right model” intelligence layer in Claude Code. The model you set is the model you get for the session, with the exception of internal sub-agent routing.
This is likely to change. The shape of where Claude Code is going — more autonomy, longer sessions, more parallel agents — implies more sophisticated internal routing. For now, the routing decisions worth making are the ones you make yourself.
Costs: What Routing Actually Saves
Concrete example. An operator running a Claude Code content pipeline that:
- Drafts articles (Sonnet): 8,000 input + 4,000 output tokens per article
- Generates SEO meta and FAQ (Haiku): 2,000 + 500 tokens
- Reviews and edits (Opus): 10,000 + 2,000 tokens for trickier articles
Running everything on Opus would roughly triple the cost. Running everything on Sonnet would save vs Opus but produce noticeably weaker meta-generation than Haiku at similar quality. Routing by task type saves real money — often 40-60% versus a single-model approach — without sacrificing output quality.
When Not to Build a Router
Routing is leverage when you operate at volume. If you run Claude Code casually — a couple of hours a day, one task at a time — you do not need a router. You need to learn the three models well enough to pick the right one by feel. Build a router only when (a) cost is a real line item in your budget, (b) you are running multiple workflows that have genuinely different model needs, or (c) you want fallback infrastructure for resilience.
Frequently Asked Questions
What is a Claude Code router?
A Claude Code router is any layer — Anthropic’s built-in defaults, a third-party tool like OpenRouter, or custom code — that decides which Claude model handles a given request.
Does Claude Code have built-in routing?
Partial. Claude Code picks a default model (Sonnet) and routes internal sub-agent tasks to lighter models. It does not automatically promote your main session to Opus when a task gets hard.
What’s the difference between OpenRouter and a custom router?
OpenRouter is a hosted multi-provider gateway with billing and fallback built in. A custom router is something you build to enforce your own rules. OpenRouter is right for most teams. Custom routers are right for teams with strict requirements.
Should I use OpenRouter with Claude Code?
Useful if you want fallback, multi-provider mixing, or unified billing. Less useful if you only use Claude and want Anthropic-specific features like prompt caching to work optimally.
How do I pick the right Claude model for a task?
Default Sonnet. Opus for hard reasoning, architectural decisions, ambiguous specs. Haiku for high-volume formulaic tasks (classification, formatting, metadata).
How much can routing save me?
For volume users, 40-60% versus running everything on Opus, with no measurable drop in output quality if the routing rules are sensible.
Is there a cost to routing through OpenRouter?
OpenRouter adds a small markup on token pricing in exchange for the routing and aggregation features. For most users this is acceptable; for very high volume, going direct to Anthropic is cheaper.
The Bottom Line
Claude Code routing is leverage when you operate at volume and a distraction when you do not. Start by learning the three Claude models by feel and picking manually. Add OpenRouter if you want fallback. Build a custom router only when cost or compliance actually justifies the engineering. The router is not the goal; the right model on the right task is the goal.
Leave a Reply