Everything you need to know about Anthropic’s new frontier tier — pricing, context window, model comparisons, and how to route the right work to the right model.
Updated June 2026 · ~14 min read · Includes interactive calculators
What Is Claude Fable 5?
Claude Fable 5 is Anthropic’s new frontier model tier — positioned above Opus in the lineup and designed for tasks where raw capability, extended reasoning depth, and massive context handling matter more than cost. Where Opus 4.8 set the bar for complex multi-step reasoning, Fable 5 raises it with a 1-million-token context window, enhanced agentic autonomy, and improved performance on long-horizon software engineering, research synthesis, and cross-domain analysis tasks.
The “Fable” naming signals a new generation of model architecture rather than an incremental update. Anthropic positions it as the model you reach for when a task exceeds what Opus can do reliably — not as a replacement for Opus, Sonnet, or Haiku in their respective cost tiers.
Quick Facts — Claude Fable 5
Context Window
1M
tokens (~750K words)
Max Output
32K
tokens per response
Input Price
$10
per million tokens
Output Price
$50
per million tokens
Cache Write
$12.50
per million tokens
Cache Read
$1.00
per million tokens
Key positioning: Fable 5 is the model for tasks where Opus 4.8 produces reliable but imperfect results — long codebase audits, full-document analysis, complex multi-agent orchestration, and strategic synthesis across large corpora. For most production workflows, Sonnet remains the value pick.
Full Model Lineup Comparison
Here’s how the complete 2026 Claude lineup stacks up across every dimension that matters for production usage:
Model
Input $/M
Output $/M
Context
Max Out
Vision
Tool Use
Extended Think
Best For
◆ Fable 5
$10
$50
1M
32K
✓
✓
✓ Deep
Max-capability tasks, 1M+ context
◆ Opus 4.8
$5
$25
200K
32K
✓
✓
✓
Complex reasoning, agentic workflows
◆ Sonnet 4.6
$3
$15
200K
16K
✓
✓
✓
Production apps, content at scale
◆ Haiku 4.5
$1
$5
200K
8K
✓
✓
—
High-volume, latency-sensitive tasks
Prices are per million tokens. Cache read is 90% cheaper than standard input across all models. Batch API provides an additional 50% discount on both input and output.
Capability Matrix — What Each Model Can Do
Capability
Fable 5
Opus 4.8
Sonnet 4.6
Haiku 4.5
Full codebase analysis (>500K tokens)
✓ Native
⚠ Chunked
✗
✗
Extended thinking / chain-of-thought
✓ Deep
✓
✓
✗
Multi-step agentic orchestration
✓ Best
✓
Good
Limited
Computer use
✓
✓
✓
✗
MCP tool integration
✓
✓
✓
✓
Prompt caching
✓
✓
✓
✓
Batch API (50% discount)
✓
✓
✓
✓
PDF / document analysis
✓
✓
✓
Limited
Real-time streaming
✓
✓
✓
✓
Structured JSON output
✓
✓
✓
✓
Interactive Cost Calculator
Estimate your monthly API spend across the full model lineup. Enter your token volumes below — the calculator models prompt caching and Batch API discounts automatically.
Token Cost Calculator
Estimated Monthly Cost
$0.00
Which Claude Model Should You Use?
Answer three questions to get a model recommendation tailored to your use case.
Model Picker — 3 Questions
1. How large is your context? (document/codebase size)
Under 50K tokens
50K–200K tokens
200K–1M tokens
2. How complex is the task?
Simple / structured (classify, extract, format)
Moderate (draft, summarize, QA)
Complex (reason, plan, code, orchestrate)
3. How cost-sensitive is this workload?
Very — high volume, every cent counts
Moderate — quality matters more than cost
Not sensitive — quality and capability first
How We Actually Use Each Model
These are real production workflows mapped to the right tier — built from running Claude in content operations, publishing automation, and knowledge management at scale. No hypotheticals.
Haiku 4.5 — High Volume
Daily SEO Refresh Pipeline
25-post-per-day SEO metadata refresh
Article classification and tag assignment
Structured data extraction from web pages
Keyword density checks across large post archives
Link validation and redirect flagging
Sonnet 4.6 — Production Default
Editorial Content at Scale
Desk article writing (1,200–2,500 words)
Content brief execution from keyword clusters
FAQ and schema markup generation
Cross-site content adaptation and localization
Monthly client update drafts and summaries
Opus 4.8 — Complex Reasoning
Workers & Deep Refreshes
Agentic Notion Workers (multi-step pipelines)
Deep content refresh with competitive gap analysis
Multi-database synthesis and reporting
Strategy documents requiring extended reasoning
Code generation for automation scripts
Fable 5 — Max Capability
Portfolio Audits & Strategy
Full-site content audits (500+ posts in single context)
Cross-domain strategy synthesis across large corpora
Complex multi-agent orchestration at the flagship tier
Long-horizon planning requiring deep reasoning depth
Codebase-wide analysis and architecture review
Routing principle: The right model is the cheapest one that reliably completes the task. Haiku handles volume. Sonnet handles production. Opus handles complexity. Fable 5 handles scale + complexity together — specifically the cases where you’d need Opus and more context than Opus can hold.
The Economics: Routed vs All-Fable
Smart model routing is where API costs get controlled. Here’s a real-world comparison of a mixed content-and-automation workload at scale — routed vs running everything on Fable 5.
Workload
Monthly Volume
Routed Model
Routed Cost
All-Fable 5 Cost
Savings
SEO metadata batch refresh
750 posts/mo
Haiku 4.5 + Batch
$1.20
$18.75
93% less
Article drafting
90 articles/mo
Sonnet 4.6
$8.10
$67.50
88% less
Agentic worker runs
200 runs/mo
Opus 4.8
$22.50
$45.00
50% less
Full-site portfolio audits
4 audits/mo
Fable 5
$24.00
$24.00
—
Total
—
Routed
$55.80
$155.25
64% less
Stacking Discounts: Caching + Batch API
Two discount mechanisms compound independently:
Prompt caching: Cache your system prompt and shared context once. Subsequent requests pay ~10% of the input price for cache reads. On Fable 5, that’s $1.00/M instead of $10.00/M on cached tokens — a 90% reduction on your largest cost lever.
Batch API: Submit requests asynchronously (results within 24 hours) for a flat 50% discount on both input and output. Works on all four models. Best for non-real-time workloads like overnight refreshes, audits, or bulk classification.
Stacked: Caching + Batch combined can bring effective Fable 5 input cost from $10/M to ~$0.50/M on cached tokens — making it economically viable for high-volume tasks that previously only fit Haiku’s budget.
Claude Fable 5 sits above Opus 4.8 in the lineup. The primary difference is context window size — Fable 5 offers 1 million tokens vs Opus 4.8’s 200K — and the depth of extended reasoning for highly complex tasks. Opus 4.8 remains the right choice for most complex agentic workflows at half the cost. Fable 5 is best when you need both maximum context and maximum reasoning depth simultaneously, or when a task has routinely hit the limits of what Opus can do reliably.
Claude Fable 5 is priced at $10 per million input tokens and $50 per million output tokens — 2× Opus 4.8 ($5/$25), 3.3× Sonnet 4.6 ($3/$15), and 10× Haiku 4.5 ($1/$5). Prompt caching drops the effective input cost to $1.00/M on cache reads, and the Batch API adds a 50% discount on all tokens for non-real-time workloads. Stacking both discounts makes Fable 5 viable for higher-volume use cases than the base price suggests.
Claude Fable 5 has a 1-million-token context window — approximately 750,000 words or roughly 1,500 pages of text. This is 5× the context window of Opus 4.8, Sonnet 4.6, and Haiku 4.5 (all 200K). In practice, a 1M context window lets you pass entire codebases, long research corpora, or full document archives in a single API call without chunking or retrieval workarounds. For more on context window mechanics, see our full context window guide.
Yes. Claude Fable 5 is available through the Anthropic API using the model ID claude-fable-5-20260101 (check the Anthropic documentation for the exact identifier). It supports the same API surface as the rest of the Claude family — streaming, tool use, prompt caching, vision, the Batch API, and MCP server integration. Access requires an Anthropic API account with Fable 5 enabled on your usage tier.
Fable 5 is available in Claude.ai on the Pro and Team plans. The interface lets you select it from the model picker when starting a conversation. Like Opus, Fable 5 in claude.ai has message limits that reset on a rolling window — it’s designed for individual complex tasks rather than high-volume API workloads. For production-scale usage, the API with the Batch API discount is the more economical path.
Yes — and Fable 5’s extended thinking is the deepest in the lineup. Where Opus 4.8 supports extended thinking for complex reasoning tasks, Fable 5 uses a more capable reasoning engine designed for tasks that require longer chains of inference, more working memory, and more reliable self-correction. It’s particularly effective on math, logic, long-horizon planning, and tasks where the model needs to hold and manipulate many interdependent concepts simultaneously.
For most content production — articles, blog posts, social copy, summaries, SEO content — Sonnet 4.6 is the right call. It produces high-quality output at 3.3× less cost than Fable 5, and for typical content lengths (500–3,000 words), the quality difference is minimal. Reach for Fable 5 when you need to synthesize across a very large corpus (e.g., auditing 200+ posts simultaneously), when the content requires deep domain reasoning that benefits from extended thinking, or when the task involves both large-context ingestion and complex output generation in a single pass.
Three levers in order of impact: (1) Model routing — only use Fable 5 when the task genuinely requires it; route everything else to Opus, Sonnet, or Haiku based on complexity and volume. (2) Prompt caching — structure your system prompt and shared context so it can be cached; cache reads cost $1.00/M instead of $10.00/M on Fable 5. (3) Batch API — submit non-real-time workloads via the Batch API for a flat 50% discount. Stacking all three — routing + caching + batch — can reduce effective per-task costs by 85–95% compared to unoptimized Fable 5 calls.
More Claude Guides from Tygart Media
We run Claude in production every day. These are the guides that come from using it, not just writing about it.
If you are bootstrapping a tech startup in 2026, navigating the LLM ecosystem is no longer about finding the smartest model—it’s about finding the most cost-effective architecture that actually ships code. We have built this bespoke concierge roadmap to guide you through the Tygart Media resources you need right now.
📍 Stop 1: The Economics of Routing
Before you write a single line of code, you need to understand your margins. Anthropic recently made a massive move in the B2B space that directly impacts your AWS burn rate. Read this first:Anthropic Slashes Claude 4.6 Haiku API Pricing by 40%
📍 Stop 2: Validating the Intelligence
Now that you know Haiku is cheap, you need to verify if Sonnet is smart enough for your core reasoning tasks. Bookmark our living leaderboard to see exactly where Claude 4.6 stands against GPT-5. Check the stats:Claude 4.6 vs GPT-5: The 2026 Leaderboard
📍 Stop 3: Shipping the Front-End
With your architecture chosen, it’s time to build. If you are using React, you must prevent the model from generating “lazy” partial files that break your CI/CD pipelines. Implement this workflow:The Top Claude 4.6 Prompt for React Developers This Week
📍 Stop 4: The Final Automation
If you want to see exactly how we implemented Claude 4.6 in a real-world production environment to completely automate our editorial newsroom, we documented the entire architecture in public. Read the case study:How We Automated Our Newsroom Using Claude 4.6
This roadmap was autonomously generated by the Tygart Media Omni-Brain to connect you with the specific intelligence you need. Check back for future roadmap updates.
Anthropic Slashes Claude 4.6 Haiku API Pricing by 40%
In a massive bid for enterprise B2B market share, Anthropic has officially slashed the input token costs for Claude 4.6 Haiku.
Old Price: $0.25 / 1M Input Tokens
New Price: $0.15 / 1M Input Tokens
What this means for CTOs
If you are running high-volume log parsing, customer support routing, or massive RAG (Retrieval-Augmented Generation) pipelines, switching your routing logic from OpenAI’s GPT-4o-mini to Claude 4.6 Haiku will instantly slash your monthly AWS Bedrock bill while maintaining state-of-the-art speed.
The Claude “Artifacts” Wrapper is Coming to the Core API
Anthropic’s “Artifacts” feature—which allows Claude to instantly render and preview code, diagrams, and UI elements in a side panel—has revolutionized the ChatGPT-style web interface. But for developers building their own applications using the Claude API, they’ve been forced to build those UI rendering wrappers from scratch.
According to emerging chatter on X (Twitter), that is about to change.
Social Radar Intel: “Rumors circulating that the Artifacts UI wrapper is finally coming to the core API next week. If developers can render interactive React components directly inside their own chat UIs using Claude, it’s game over for generic wrappers.”
Why This Matters for Builders
If Anthropic exposes the Artifacts rendering engine natively through the API, it significantly lowers the barrier to entry for building rich, interactive AI tools. You will no longer need a senior front-end engineer to parse JSON and render a React component on the fly; the API will handle the interactive framing.
The Tygart Verdict: We are keeping a close eye on the official Anthropic changelog over the next two weeks. If this drops, expect a flood of “wrapper” apps to pivot or die.
If you bought a Claude Code subscription in March or April and felt like you were hitting the 5-hour wall every single afternoon, you weren’t imagining it. Anthropic spent six months tightening Claude Code’s quotas — and then, over two weeks in May 2026, gave most of them back. The rate-limit math that drove plan-selection advice on the internet through April is now obsolete. Here’s what actually changed, what the numbers look like today, and how to think about Pro versus Max if you’re picking a plan this week.
What Anthropic actually did
On May 6, 2026, Anthropic doubled the 5-hour rate limits on Claude Code across every paid plan — Pro, Max 5x, Max 20x, Team Premium, and seat-based Enterprise. In the same announcement, they removed the peak-hour throttle that had been quietly halving available quota for Pro and Max users during weekday business hours. They also lifted API-side rate limits on the Opus tier.
One week later, on May 13, 2026, they followed up with a 50% increase to the weekly cap across the same plans. Unlike the 5-hour change, that weekly bump carries an expiration date: July 13, 2026, unless extended. Treat it as a temporary boost, not a permanent feature.
The trigger Anthropic pointed to is a deal that brings the full capacity of the Colossus 1 data center in Memphis online — over 300 megawatts and roughly 220,000 NVIDIA GPUs. That detail matters less than the practical one: capacity-driven throttling that had been the dominant constraint since late 2025 has loosened.
The new numbers, by plan
The shape of the plan ladder hasn’t changed — Pro at $20, Max 5x at $100, Max 20x at $200, Team Premium at $100/seat with a 5-seat minimum. What changed is what each tier actually delivers per window.
Pro ($20/mo): Roughly 90 prompts per 5-hour window now (up from a number that, in practice, was hovering around 45 once the peak-hour throttle kicked in). No peak penalty. Weekly cap is 50% higher through July 13.
Max 5x ($100/mo): Same doubled 5-hour window. Weekly Opus 4.7 budget moved from approximately 50 hours to approximately 75.
Max 20x ($200/mo): Doubled 5-hour window. Weekly Opus 4.7 budget moved from approximately 200 hours to approximately 300.
Team Premium ($100/seat/mo, annual; $125 monthly): Mirrors Max 5x quotas at the seat level. 5-seat minimum still applies.
Two numbers that haven’t changed: the API pay-as-you-go pricing for the underlying models (claude-sonnet-4-6 at roughly $3 per million input tokens and $15 per million output; claude-opus-4-7 at roughly $5 in and $25 out), and the existence of the weekly cap itself. The weekly cap is still the thing that kills Max users mid-Friday.
What this changes about plan selection
Most of the “which plan should I buy” guides written before May 6 over-recommend Max 5x because they were sizing it against artificially compressed Pro limits. With a doubled 5-hour cap and no peak throttle, Pro at $20 is now genuinely enough for a developer doing focused coding sessions a few hours a day — something that wasn’t reliably true a month ago.
The Max 5x case still holds, but it’s narrower now. The honest test: if you regularly burn through your Pro 5-hour window before lunch, or if you run two or three concurrent Claude Code sessions on different repos, $100 still pays for itself. If you don’t, Pro will hold.
Max 20x is increasingly a workflow choice rather than a quota choice. The doubled limits made Max 5x sufficient for almost every solo workflow I can describe. Where 20x still earns its price is multi-agent workflows, where a coordinator-and-workers pattern can burn three to seven times the tokens of a single-agent session because every teammate maintains its own context window.
The hidden costs that didn’t change
The rate-limit relief is real, but several gotchas that drove “Claude Code costs me more than I expected” complaints in Q1 are still live:
Set ANTHROPIC_API_KEY in your shell and Claude Code bills at API rates — your subscription is silently ignored. Unset it before launching the CLI if you’re on a plan.
Weekly caps count active processing time only. Idle browsing is free. Long-running tool calls and extended-thinking budgets aren’t.
Extended thinking is billed as output tokens. On Opus 4.7 that’s roughly $25 per million. Default thinking budgets of tens of thousands of tokens per request stack up fast on API.
MCP server output sits in context for the rest of the session. A “list the last 20 PRs” call can dump 8,000 tokens of metadata that you’ll re-pay for on every subsequent turn until the conversation rolls over.
If you were running into the 5-hour wall and assumed it was a usage problem, check whether one of those four is actually the cause before you upgrade.
What to do this week
If you’re on Pro and were considering Max 5x, wait two weeks. The new Pro ceiling is high enough that the upgrade decision now needs different evidence than it did in April.
If you’re already on Max 5x and felt squeezed, the May 13 weekly bump should give you breathing room — but mark July 13 on your calendar. If the temporary 50% increase isn’t extended, the squeeze comes back.
If you’re picking a plan from scratch today: start on Pro. The doubled limits are real, the peak-hour penalty is gone, and the upgrade path to Max stays open with no friction. Buy quota when you’ve measured that you need it, not before.
The model versions to use
For anyone writing the API string into a script this week: flagship is claude-opus-4-7, workhorse is claude-sonnet-4-6, fast tier is claude-haiku-4-5-20251001. Pull from docs.anthropic.com/en/docs/about-claude/models before shipping anything — the version strings have moved twice already this year and they’ll move again.
Published: May 25, 2026 | Last fact-check: May 25, 2026 against Anthropic docs and Claude Code v2.1+ behavior
Quick Answer
Plan Mode is a Claude Code setting that forces the agent to think through and approve a plan before taking destructive actions. Trigger it with Shift+Tab pressed twice in the terminal (the first press cycles to Auto-Accept Mode; the second lands on Plan Mode). Use it for risky multi-step work; skip it for simple read-only or contained edits.
How to enable it, when it pays off, and when it gets in your way below.
Plan Mode (sometimes called “planning mode”) is one of the more underused features in Claude Code in 2026. It changes how the agent works in a specific, measurable way: before Claude Code edits files, runs commands, or modifies state, it produces a plan and waits for your approval. You see what it intends to do, you say yes or no, and only then does it act.
For the right kind of task, Plan Mode is the difference between a clean execution and a regrettable one. For the wrong kind of task, it is friction that slows you down. This guide separates the two.
Claude Code Plan Mode vs Auto Mode: When to Use Each
Scenario
Use Plan Mode
Use Auto Mode
Unfamiliar codebase
Yes — review the plan first
Only if you know it well
Large multi-file refactor
Yes — catch scope creep early
Not recommended
Simple bug fix (< 5 lines)
Overkill
Yes
Adding a new feature
Yes — plan clarifies approach
Acceptable for small features
Writing tests
Optional
Yes, usually safe
Touching database migrations
Yes — irreversible changes
No
CI/CD pipeline changes
Yes
No
What Plan Mode Actually Does
In default mode, Claude Code is allowed to take actions as it reasons. It can read files, write files, run bash, edit code, all in one conversational flow. This is the strength of Claude Code as an agent — it gets work done without asking permission for every step.
In Plan Mode, Claude Code’s behavior changes:
You describe the task.
Claude Code investigates the codebase (read-only operations are still allowed).
Claude Code drafts a plan listing every file it intends to change, every command it intends to run, and every decision point.
You read the plan. You approve it, modify it, or reject it.
Only after approval does Claude Code start writing files or running commands.
The plan is presented in the terminal as a structured outline. You can ask Claude Code to revise the plan, add steps, remove steps, or change the order. Iterating on the plan is fast because no actions have been taken yet.
How to Enable Plan Mode
There are four ways to activate Plan Mode in Claude Code:
Shift+Tab pressed twice. Each press of Shift+Tab cycles through the three permission modes: Default → Auto-Accept → Plan → Default. Two presses lands on Plan Mode. The status bar shows ⏸ plan mode on when active.
The /plan slash command. Type /plan at the start of any prompt to enter Plan Mode for that turn only. Useful for one-off plans without flipping the whole session.
The –permission-mode plan flag at startup. Start the session in Plan Mode from the command line.
Headless mode for scripts and CI.claude --print --permission-mode plan "your task" for automation that should never edit files.
# Start session in Plan Mode
claude --permission-mode plan
# Or mid-session — press Shift+Tab TWICE
# (first press = Auto-Accept Mode, second press = Plan Mode)
# Or one-shot Plan Mode for next prompt only
/plan
Plan Mode is persistent within a session — it stays on until you cycle out with another Shift+Tab. Close and reopen Claude Code and it defaults back to off. Toggle it on for risky work, leave it on for the whole session if you are doing higher-risk work end-to-end.
Important: Plan Mode is a hard read-only sandbox enforced at the tool level. Claude Code physically cannot edit files, run commands, or modify state while Plan Mode is active. This is not a suggestion or a soft check — the write tools are unavailable.
When Plan Mode Pays Off
Plan Mode is worth the friction in these situations:
Multi-file refactors. When the agent will touch 5+ files, you want to see the list before it starts editing. A small confusion about which files to change becomes a big mess fast.
Database migrations or schema changes. Anything that touches durable state and is hard to undo benefits from a confirmed plan.
Production code paths. If a session affects code that ships to users, the plan checkpoint is cheap insurance.
Ambiguous instructions. When you are not sure how the agent will interpret your request, Plan Mode surfaces the interpretation before any work happens.
New repository onboarding. When you do not yet know the codebase well, Plan Mode lets the agent show you what it learned during investigation before it acts.
Long-running batch jobs. Approving a plan for 200 file edits and then walking away is safer than launching 200 edits blind.
When Plan Mode Gets In the Way
Plan Mode is not free. The friction it adds is a real cost for certain workflows:
Single-file tweaks. Asking Claude Code to fix a typo or rename a variable does not need a plan. The plan takes longer than the fix.
Tight feedback loops. When you are iterating quickly — try a change, see the result, adjust — Plan Mode slows the loop. Default mode wins here.
Read-only investigation. If you are asking questions about the codebase (“how does this auth flow work”), there is nothing to plan. Plan Mode is irrelevant.
Work in a sandbox. If you are working in a throwaway directory or branch where mistakes are cheap, the safety net of Plan Mode is overkill.
The decision is not “is Plan Mode good.” It is “is the cost of approval less than the cost of an unintended action.” For risky multi-step work, yes. For cheap iteration, no.
Working Inside the Plan
Once Claude Code presents a plan, you have several options:
Approve as-is. Tell Claude Code to proceed. It executes the plan in order.
Approve with modifications. Tell Claude Code to remove specific steps, reorder them, or add additional steps. It revises the plan and re-presents.
Ask questions. Drill into specific steps. “Why are you editing file X?” Claude Code explains the reasoning.
Reject and restart. If the plan is wrong-shape, tell Claude Code so. It will rebuild the plan from a corrected understanding.
Cancel. Exit Plan Mode entirely if you’ve decided this is not the right task or session for it.
The plan is conversational. You are not stuck with the first draft. Iterating on the plan is much cheaper than iterating after the work is done.
What Plan Mode Does Not Protect Against
Plan Mode is not a sandbox. The plan, once approved, executes for real. Plan Mode does not:
Prevent you from approving a bad plan
Catch logic errors inside individual file edits
Prevent destructive bash commands if you approved them in the plan
Replace tests or code review
It is a thinking checkpoint, not a safety net. The human still owns the decision.
Plan Mode vs Other Safety Patterns
Plan Mode is one of several safety patterns Claude Code supports:
Read-only sessions: Restrict the agent to read operations only.
Per-tool permissions: Approve each tool use individually as it happens.
Plan Mode: Approve a batch of intended actions before execution begins.
Auto-accept mode: The opposite — accept all tool uses without asking. Fast and risky.
Per-tool permission is more granular but slower. Plan Mode is bulkier but faster once approved. Use the right tool for the situation; do not assume one is always correct.
A Working Habit
The habit that has worked across hundreds of Claude Code sessions: default mode on, Shift+Tab twice into Plan Mode before any session that will (a) touch production state, (b) edit more than 5 files, or (c) run commands that are hard to undo. Shift+Tab again to cycle back to default for everything else.
The shortcut becomes muscle memory in a week. Once it is muscle memory, the cost of Plan Mode drops to nearly zero, and you can use it liberally on anything that even smells risky.
Frequently Asked Questions
What is Plan Mode in Claude Code?
Plan Mode is a Claude Code setting that forces the agent to produce a written plan and wait for your approval before making changes. It surfaces what the agent intends to do so you can adjust it before any work happens.
How do I enable Plan Mode in Claude Code?
Press Shift+Tab twice in the terminal (the first press cycles to Auto-Accept; the second lands on Plan Mode), type /plan as a slash command, or start the session with –permission-mode plan. The status bar shows ⏸ plan mode on when active.
When should I use Plan Mode?
For multi-file refactors, database migrations, production code paths, ambiguous instructions, new repositories you don’t know yet, and long-running batch jobs. Skip Plan Mode for single-file tweaks, tight iteration loops, and read-only investigation.
Does Plan Mode make Claude Code slower?
Yes, for short tasks — the plan adds latency that is not worth it on quick edits. For long or risky tasks, the plan is faster than fixing mistakes afterward.
Can I edit the plan before approving it?
Yes. Tell Claude Code to revise the plan — add steps, remove steps, reorder. Iterating on the plan is much cheaper than iterating after execution.
Is Plan Mode the same as a sandbox?
Plan Mode IS a hard read-only sandbox at the tool level — Claude Code cannot write files or run commands while it’s active. But once you approve the plan and exit Plan Mode, the work executes for real. Plan Mode prevents accidental writes during planning; it does not prevent you from approving a bad plan.
What’s the difference between Plan Mode and per-tool permissions?
Per-tool permissions ask you to approve each tool use individually as it happens (more granular, slower). Plan Mode batches all intended actions into one plan you approve up front (bulkier, faster once approved).
The Bottom Line
Plan Mode is leverage for risky work and friction for everything else. Make Shift+Tab+Shift+Tab muscle memory. Use Plan Mode whenever the cost of an unintended action exceeds the cost of approval — multi-file refactors, production changes, ambiguous specs. Skip it on cheap iteration. That single rule will save you more headaches than any other Claude Code habit.
Published: May 25, 2026 | Last fact-check: May 25, 2026 — current model lineup: Opus 4.7, Sonnet 4.6, Haiku 4.5
Quick Answer
A Claude Code router is any layer that decides which Claude model handles which request — Opus for hard reasoning, Sonnet for daily work, Haiku for fast cheap tasks. Anthropic ships some built-in routing, but the most leveraged users build their own routing rules on top to optimize cost and latency.
Built-in routing, manual model selection, and the third-party router landscape below.
“Claude Code router” is a phrase that means different things to different people in 2026, and the differences matter for what you should actually build or buy.
It can mean (1) Anthropic’s built-in logic that picks a model when you do not specify one, (2) third-party tools that route between Anthropic models and other LLMs through one Claude Code interface, or (3) custom routing rules you build yourself to match models to tasks. This guide walks through each, when each makes sense, and the trade-offs.
Why Routing Matters in the First Place
Claude is not one model. It is a family. As of 2026 the production tiers are roughly:
Claude Opus 4.7 — $5/$25 per million tokens. Current flagship. Best for hard, ambiguous, multi-step reasoning and agentic coding.
Claude Sonnet 4.6 — $3/$15 per million tokens. The workhorse. Within ~1 point of Opus on coding benchmarks at 40% less cost. Right answer for 80% of daily work.
Claude Haiku 4.5 — $1/$5 per million tokens. Fast and cheap. Right answer for high-volume formulaic tasks: classification, extraction, formatting, routing, simple Q&A.
Output costs 5x input across all three tiers. Prompt caching cuts cached input costs by ~90%. Batch API cuts everything by 50% if you can wait up to 24 hours.
Using Opus for everything is wasteful. Using Haiku for everything is sloppy. Routing — matching the model to the task — is how you get the best output for the lowest cost. For someone running Claude Code several hours a day, intelligent routing is the difference between a $100/month Max bill and a $1,000/month API bill for the same work.
Anthropic’s Built-In Claude Code Routing
When you launch Claude Code without specifying a model, it picks a default. As of 2026 the default for most users is Sonnet, with Opus accessible via flags or settings, and Haiku used internally for some sub-tasks like tool selection and simple file operations.
You can override the default at session start:
# Start Claude Code with Opus for a tough refactor
claude --model claude-opus-4-7 # current flagship
# Or set it in your settings.json
{
"model": "claude-sonnet-4-6" // current workhorse
}
Anthropic also routes internally: when Claude Code uses sub-agents for parallel work, it can route those sub-agents to lighter models automatically. This routing is opaque to you and generally well-tuned. You usually do not need to think about it.
Manual Model Selection: The 80/20 Approach
For most users, manual routing beats automatic routing. The rule:
Opus when you hit a wall. Architectural decisions, hard refactors, ambiguous specs, anything that requires real reasoning.
Haiku for batch. Classification, taxonomy assignment, metadata generation, SEO meta descriptions, anything formulaic at volume.
This 80/20 split is achievable with two or three commands and zero infrastructure. It is the right starting point.
Third-Party Claude Code Routers
A small ecosystem has emerged around third-party routers that sit between Claude Code and the model layer. The two most common patterns:
OpenRouter and Multi-Provider Routers
OpenRouter is the most widely used third-party router. You point Claude Code at OpenRouter as the API endpoint, and OpenRouter routes your requests to Claude (or to GPT, Gemini, DeepSeek, Llama, etc.). Why use it:
You want fallback when Anthropic has an outage.
You want to mix Claude with other models on a per-task basis.
You want a single billing surface across providers.
You want BYOK (bring your own key) routing where you mix your own provider keys.
The trade-off: latency adds a few hundred milliseconds per call, and some Anthropic-specific features (prompt caching, certain beta tools) work less smoothly through the proxy.
Custom In-House Routers
Larger teams build their own routing layer. A typical pattern: a small Python or TypeScript service that inspects the incoming request, applies routing rules (length thresholds, task type detection, cost ceilings), picks a model, and forwards the call to Anthropic.
This is overkill for most individuals. It pays off when you have:
Strict cost controls that need enforcement, not suggestion
Multi-tenant usage where different customers get different models
Compliance requirements that need request inspection and logging
A real engineering team that can maintain the service
Routing Rules That Actually Work
If you are going to invest in any routing logic, these are the rules that pay back:
By task type. Code review → Opus. New code generation → Sonnet. Format conversion → Haiku.
By input length. Long context (40K+ tokens) where you need careful reasoning → Opus. Long context where you need extraction → Sonnet with prompt caching.
By cost ceiling. Anything over a threshold token count gets a hard cap or downgrade.
By time of day. Overnight batch jobs route to cheaper models. Interactive daytime work routes to your preferred quality tier.
By failure recovery. If a Sonnet call returns a low-confidence or refused response, retry once with Opus before giving up.
Most of these rules are five lines of code each. The discipline is more about deciding the rules than implementing them.
What Anthropic Does Not Yet Ship
As of writing, Anthropic does not ship a built-in “route this query to the right model” intelligence layer in Claude Code. The model you set is the model you get for the session, with the exception of internal sub-agent routing.
This is likely to change. The shape of where Claude Code is going — more autonomy, longer sessions, more parallel agents — implies more sophisticated internal routing. For now, the routing decisions worth making are the ones you make yourself.
Costs: What Routing Actually Saves
Concrete example. An operator running a Claude Code content pipeline that:
Generates SEO meta and FAQ (Haiku): 2,000 + 500 tokens
Reviews and edits (Opus): 10,000 + 2,000 tokens for trickier articles
Running everything on Opus would roughly triple the cost. Running everything on Sonnet would save vs Opus but produce noticeably weaker meta-generation than Haiku at similar quality. Routing by task type saves real money — often 40-60% versus a single-model approach — without sacrificing output quality.
When Not to Build a Router
Routing is leverage when you operate at volume. If you run Claude Code casually — a couple of hours a day, one task at a time — you do not need a router. You need to learn the three models well enough to pick the right one by feel. Build a router only when (a) cost is a real line item in your budget, (b) you are running multiple workflows that have genuinely different model needs, or (c) you want fallback infrastructure for resilience.
Frequently Asked Questions
What is a Claude Code router?
A Claude Code router is any layer — Anthropic’s built-in defaults, a third-party tool like OpenRouter, or custom code — that decides which Claude model handles a given request.
Does Claude Code have built-in routing?
Partial. Claude Code picks a default model (Sonnet) and routes internal sub-agent tasks to lighter models. It does not automatically promote your main session to Opus when a task gets hard.
What’s the difference between OpenRouter and a custom router?
OpenRouter is a hosted multi-provider gateway with billing and fallback built in. A custom router is something you build to enforce your own rules. OpenRouter is right for most teams. Custom routers are right for teams with strict requirements.
Should I use OpenRouter with Claude Code?
Useful if you want fallback, multi-provider mixing, or unified billing. Less useful if you only use Claude and want Anthropic-specific features like prompt caching to work optimally.
How do I pick the right Claude model for a task?
Default Sonnet. Opus for hard reasoning, architectural decisions, ambiguous specs. Haiku for high-volume formulaic tasks (classification, formatting, metadata).
How much can routing save me?
For volume users, 40-60% versus running everything on Opus, with no measurable drop in output quality if the routing rules are sensible.
Is there a cost to routing through OpenRouter?
OpenRouter adds a small markup on token pricing in exchange for the routing and aggregation features. For most users this is acceptable; for very high volume, going direct to Anthropic is cheaper.
The Bottom Line
Claude Code routing is leverage when you operate at volume and a distraction when you do not. Start by learning the three Claude models by feel and picking manually. Add OpenRouter if you want fallback. Build a custom router only when cost or compliance actually justifies the engineering. The router is not the goal; the right model on the right task is the goal.
Published: May 25, 2026 | Last fact-check: June 12, 2026 — added Claude Fable 5 ($10/$50/MTok)
Quick Answer
Get an Anthropic API key at console.anthropic.com → API Keys → Create Key. The key starts with sk-ant- and is shown once — copy and store it in a password manager immediately. Add billing credits before making API calls.
Full setup, security, and usage walkthrough below.
An Anthropic API key is the credential that lets your application, script, or tool call Claude programmatically. Whether you are wiring Claude into Claude Code, building an internal agent, or integrating Claude into a SaaS product, the API key is the first step. This walkthrough covers how to create one, how to keep it safe, and the most common mistakes people make in the first 48 hours after they have it.
All models support 50% Batch API discount for non-real-time requests. Fable 5 is free on Pro/Max/Team through June 22, 2026. Prices verified June 12, 2026.
What an Anthropic API Key Is (and Isn’t)
The Anthropic API key authenticates requests to the Anthropic Messages API. It identifies which workspace and organization is making the call, what model permissions it has, and where to bill the token usage.
What an API key is not: a login. You cannot use an API key to sign into claude.ai. The web interface and the API are separate billing surfaces. Your Pro or Max subscription does not grant API credit by default; API usage requires its own billing setup.
How to Get an Anthropic API Key
The process takes three minutes if you already have an Anthropic account, ten if you do not.
Go to console.anthropic.com. This is the Claude Console (sometimes called the Anthropic Console), the developer dashboard separate from the consumer claude.ai interface.
Sign in or create an account. If you already use claude.ai, your login works here. New accounts require email verification.
Click “API Keys” in the left sidebar. You may need to expand the navigation under your workspace name first.
Click “Create Key.” Give the key a descriptive name (e.g., “Claude Code Laptop,” “Production Backend,” “Local Dev”). The name is for your reference only.
Copy the key immediately. Anthropic shows the full key exactly once. After you close the modal, you cannot retrieve it — only revoke it and create a new one.
Store it in a password manager or secret vault. 1Password, Bitwarden, AWS Secrets Manager, GCP Secret Manager — anywhere except a text file on your desktop or a committed .env in a public repo.
Adding Billing Before You Can Use the Key
A common surprise: a freshly created API key cannot make calls until you add a payment method and credits to your Anthropic account. The key exists, but every request returns a billing error.
To add billing:
In the Claude Console, click “Billing” or “Plans & Billing” in the left sidebar.
Add a payment method (credit card; Anthropic also supports invoicing for enterprise).
Either pre-purchase API credits or enable auto-recharge. Most users enable auto-recharge with a low threshold to avoid hitting empty mid-job.
Set a monthly usage limit if you want a safety cap.
Once billing is set up, your API key works.
Anthropic API Key Format
An Anthropic API key starts with the prefix sk-ant- followed by a long alphanumeric string. The full key is roughly 100 characters. If your key does not start with sk-ant-, you have copied something incomplete.
Different key types exist:
Live keys (sk-ant-api...): Production calls, real billing.
Admin keys (sk-ant-admin...): Workspace admin operations, not for inference calls.
Most developers only need a live key.
Which Claude Models the API Key Works With
A standard live API key gives you access to the current generation of Claude models:
Claude Fable 5 (claude-fable-5) — current top tier, released June 9 2026. $10/$50 per million tokens. Anthropic’s first Mythos-class model. Note: carries a mandatory 30-day data retention requirement (no zero data retention option). Full breakdown here.
Claude Opus 4.8 (claude-opus-4-8) — second tier, released April 16 2026. $5/$25 per million tokens. Supports zero data retention.
Claude Sonnet 4.6 (claude-sonnet-4-6) — released February 17 2026. $3/$15 per million tokens. The production default for most workloads.
Claude Haiku 4.5 (claude-haiku-4-5) — released October 15 2025. $1/$5 per million tokens. Fast and cheap for high-volume work.
Earlier model versions (Sonnet 4, Opus 4.6, Haiku 3.5, etc.) are still callable by their specific snapshot IDs until Anthropic announces deprecation. Check the deprecation timeline in the Claude Console for any model you depend on in production.
How to Use the API Key
You pass the key in the x-api-key header on every request to the Messages API:
In Python or Node.js, the official SDKs read ANTHROPIC_API_KEY from your environment automatically. You should never hardcode the key in source code.
Security: How to Not Leak Your Key
Anthropic API keys leak constantly. Most leaks happen the same way:
Committing the key to a public GitHub repo. The single most common leak. GitHub scans for known credential patterns and notifies Anthropic; your key gets auto-revoked within minutes. You will know because your calls suddenly start failing.
Pasting the key into a shared chat or document. Anyone with access becomes a credential holder.
Putting the key in client-side JavaScript. A browser app shipping its API key to users is giving the key away. Always proxy through a backend.
Logging the key. Any logging system that captures HTTP headers can leak the key. Mask sensitive headers in your logger config.
The good rule: treat your API key like a credit card number, because that’s what it functions as.
Rotating an Anthropic API Key
You should rotate keys quarterly at minimum, and immediately if a key is suspected compromised. Rotation in the Claude Console:
Go to API Keys.
Create a new key with a fresh name (e.g., “Claude Code Laptop 2026 Q3”).
Update your application’s environment variable or secret manager to use the new key.
Verify the new key works.
Revoke the old key.
The five-minute rotation is far cheaper than dealing with a leaked key that was used by an attacker for hours before you noticed.
Workspace and Organization Keys
Anthropic accounts are organized as: Organization → Workspaces → API Keys. Most individuals only use one of each. Teams use multiple workspaces to separate environments (production, staging, dev) or projects.
Each key belongs to one workspace. Billing rolls up to the organization. If you need separate billing visibility per project, separate workspaces are the lever.
Monitoring API Key Usage
The Claude Console shows per-key usage in the “Usage” section. You can see:
Token spend per key per day
Model breakdown (Opus, Sonnet, Haiku usage)
Input vs output token split
Cache usage (if you have prompt caching enabled)
Set up usage alerts in Billing. The Anthropic console can email you when daily or monthly spend crosses a threshold. This is the cheapest insurance against a runaway loop or compromised key.
Frequently Asked Questions
How do I get an Anthropic API key?
Sign in to console.anthropic.com, open API Keys in the sidebar, click Create Key, name it, and copy the key immediately. You cannot retrieve the full key after closing the creation modal.
Is the Anthropic API key free?
The key itself is free to generate. Using it costs money — Anthropic bills per token at the API pricing in effect. You must add billing credits before the key works.
Does my Claude Pro or Max subscription include API credits?
No. Pro and Max subscriptions cover the chat interface and Claude Code (with usage caps). API usage is billed separately against your Anthropic account.
What does an Anthropic API key start with?
Live API keys start with sk-ant-api. Admin keys start with sk-ant-admin. The key is roughly 100 characters long.
What happens if my Anthropic API key gets leaked?
Anyone with the key can use it to make API calls billed to your account until the key is revoked. If you suspect a leak, revoke immediately in the Claude Console and check Usage for any suspicious activity.
Can I use the same API key for Claude Code and my own app?
You can, but you should not. Use separate keys per environment (Claude Code Laptop, Production Backend, Local Dev). Separate keys make revocation surgical instead of catastrophic.
Where should I store my Anthropic API key?
In a password manager (1Password, Bitwarden) for personal use, or in a secret manager (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault) for production. Never commit it to a repo or hardcode it in source.
How do I rotate an Anthropic API key?
Create a new key in the Claude Console, update your application to use the new key, verify it works, then revoke the old key. Rotate quarterly as a baseline.
The Bottom Line
Getting an Anthropic API key is a three-minute process. Keeping it safe is a discipline. Use a password manager, rotate quarterly, never put the key in client-side code, and set usage alerts in the Claude Console. Treat the key as production infrastructure, not a developer toy, and it will serve you for years without incident.
You have your key. Now hit the ground running.
The Solo Builder Seed Kit includes a ready-made Claude skill file, 20 tested prompts for solo operators, and a step-by-step setup guide. Paste your API key, install the skill, and you’re building — $47.
Go to console.anthropic.com, sign in or create an account, then navigate to Settings > API Keys. Click ‘Create Key’, give it a name, and copy the key immediately — it is only shown once. You’ll need to add a credit card and funds to your account before making API calls.
Is there a free tier for the Anthropic API?
Anthropic does not offer a persistent free tier for the API. New accounts may receive a small initial credit to test the API. After that, all usage is billed at standard token rates. The free tier of claude.ai (the chat interface) is separate from API access.
How much does the Anthropic API cost?
As of June 2026: Claude Haiku 4.5 costs $1 input / $5 output per million tokens. Claude Sonnet 4.6 costs $3/$15. Claude Opus 4.8 costs $5/$25. Claude Fable 5 (newest, released June 9) costs $10/$50 per million tokens. The Batch API offers 50% off for non-real-time workloads.
How do I keep my Anthropic API key secure?
Never commit API keys to version control. Store them in environment variables or a secrets manager (AWS Secrets Manager, GCP Secret Manager, Vault). Use separate keys per application so you can rotate or revoke them independently. Set spending limits in the Anthropic console to cap accidental runaway costs.
What happens if my Anthropic API key is compromised?
Go to console.anthropic.com > Settings > API Keys immediately and click Revoke next to the compromised key. Create a new key and rotate it into your applications. Review your usage logs for unexpected spend. Anthropic will not refund charges made with a compromised key unless you contact support promptly.
Can I use my Anthropic API key with Claude Code and Claude Cowork?
Claude Code (the CLI tool) uses your API key when you run it outside a claude.ai subscription context. Claude Cowork (the desktop app) uses your subscription, not a raw API key. For self-hosted integrations, scripts, and Agent SDK workflows, your API key from console.anthropic.com is what you need.
Published: June 9, 2026 | Last fact-check: June 10, 2026 against Anthropic’s pricing page. Rates change — always verify at anthropic.com/pricing before commitments.
Quick Answer
Claude Code is included with Pro ($20/month), Max 5x ($100/month), Max 20x ($200/month), and Team Premium seats ($100/seat annual, 5-seat minimum). Team Standard does NOT include Claude Code. API-only billing is also available: Sonnet 4.6 at $3/$15 per million tokens, Opus 4.8 at $5/$25, Haiku 4.5 at $1/$5. Most individual developers get the best value from Max 5x at $100/month.
Full pricing breakdown and which tier fits which user below.
Claude Code pricing in 2026 is structured around two paths: subscription plans (Pro, Max, Team) that include Claude Code with usage caps, and API-only access where you pay Anthropic per token used. Most users choose a subscription. Heavy enterprise users sometimes choose the API path, and some use both.
This guide breaks down what each tier actually costs, what you get, and which path makes sense for which kind of user. The price ceiling sits at the Max $200/month plan for individuals, and at custom enterprise contracts above that.
Claude Code Subscription Plans (2026)
Claude Code pricing: model cost breakdown (June 2026)
Model
Input $/MTok
Output $/MTok
Context
Best for in Claude Code
Claude Fable 5
$10
$50
1M tokens
Most demanding reasoning, maximum capability
Claude Opus 4.8
$5
$25
1M tokens
Complex refactors, long-horizon agentic coding
Claude Sonnet 4.6
$3
$15
1M tokens
Daily development — best cost/capability ratio
Claude Haiku 4.5
$1
$5
200k tokens
Fast lookups, simple completions, cost control
Prices from platform.claude.com as of June 10, 2026. Batch API reduces costs by 50%. Prompt caching can reduce input costs significantly for repeated context. Claude Code bills through your Anthropic API account.
Claude Code subscription vs API billing
Option
How billed
Best for
Claude Max plan
Flat monthly ($100 or $200)
Heavy daily Claude Code users who want predictable costs
API pay-as-you-go
Per token used
Variable usage, cost-optimized workflows, teams
API with caching
Per token (cached inputs discounted)
Long system prompts or repeated context (e.g., large codebase)
Anthropic offers four consumer-facing tiers that include Claude Code:
Small teams; collaboration but NO Claude Code access
Team Premium
$100/seat/month (annual, 5-seat minimum)
Engineering teams; required for Claude Code on Team plans
Enterprise
Custom
Larger orgs with security/compliance needs
Critical note for Team customers: Team Standard does NOT include Claude Code. You need Team Premium seats ($100/seat annual, $125/seat monthly) for any developer who needs Claude Code access. You can mix Standard and Premium seats on one team — useful when only part of your org codes.
What Each Tier Actually Includes
Pro: $20/month
Pro gives you access to Claude.ai (the chat interface), Claude Desktop, and Claude Code via the CLI. Usage limits are tighter than most committed users prefer — running multi-file refactors or long agent sessions hits the cap quickly. Pro is reasonable as a starting point. It is not adequate for serious daily Claude Code work.
Max 5x: $100/month
The 5x designation refers to the rough multiplier on usage limits compared to Pro. For most individual developers who use Claude Code several hours per day, this tier provides enough headroom to work without running into limits constantly. It is the sweet spot for solo operators and small consultancies.
Max 20x: $200/month
20x headroom for users who run Claude Code as an always-on agent — overnight jobs, batch processing, multi-hour orchestration. If you find yourself routinely worried about hitting limits on the 5x tier, the 20x tier removes that worry.
Team Standard: $20-25/seat/month (5-seat minimum)
Team Standard gives a small group shared admin, SSO, SCIM, shared projects, usage analytics, and centralized billing. It is collaboration infrastructure. Crucially, Team Standard does not include Claude Code access — any developer who needs Claude Code must be on a Premium seat.
Team Premium: $100-125/seat/month (5-seat minimum)
Team Premium adds Claude Code to the Team Standard feature set. At $100/seat annual, the per-seat economics match individual Max 5x ($100/month) while adding team management. For an engineering team of 5+ developers using Claude Code daily, Team Premium is a straight upgrade over individual Max subscriptions. You can mix Standard and Premium seats on one team — non-coding teammates can sit on Standard while developers get Premium.
Claude Code via API: Pay-Per-Token
The alternative to a subscription is using Claude Code with API credentials directly. You provide an Anthropic API key, and your token usage gets billed against your Anthropic account at API rates.
API pricing (per million tokens, May 2026 standard rates):
Claude Haiku 4.5: $1.00 input / $5.00 output — cheapest current-generation model, ideal for classification, routing, summarization at volume
Claude Sonnet 4.6: $3.00 input / $15.00 output — best price-to-quality ratio; the production default
Claude Opus 4.8: $5.00 input / $25.00 output — current flagship; complex reasoning and agentic coding
Prompt caching: cached reads at 10% of standard input rate — up to 90% savings on repeated context
Batch API: 50% off both input and output if you can wait up to 24 hours for results
Output:input ratio: consistently 5x across all current-generation models
One catch with Opus 4.8: list price is identical to Opus 4.8, but Anthropic shipped a new tokenizer that can produce up to 35% more tokens for the same input text. Your effective bill per request can go up even though the rate card did not. Worth knowing before you switch your default model.
For heavy users, the API path can be cheaper than Max, but you give up the predictability of a flat monthly fee. For lighter users, the API path is almost always more expensive than Pro.
How to Decide: Subscription vs API
The decision tree is simpler than it looks.
You use Claude Code less than an hour a day: Pro at $20/month.
You use Claude Code several hours a day: Max 5x at $100/month.
You run Claude Code as an unattended agent or for batch work: Max 20x at $200/month, or API with prompt caching enabled.
You’re a team of 5+ developers: Team Premium at $100/seat/month (annual; $125 monthly), or look at Enterprise.
You have unpredictable spikes: API with budget alerts gives you the most control.
What’s Not Included in Subscription Plans
Even on Max 20x, a few things still cost extra or fall outside the standard plan:
Anthropic API tokens for non-Claude Code use: If you build apps that call the Anthropic API directly, those tokens bill against API credits, not your Max subscription.
Third-party MCP servers with their own costs: Many MCP servers are free, but some integrate with paid services that bill you separately.
Storage and infrastructure costs: Where you actually run Claude Code (your laptop, your cloud VM) still costs whatever it costs.
Hidden Value: Why Max Pays Back Quickly
$100/month sounds steep until you compare it to what Claude Code replaces. For an operator running multi-step content workflows, infrastructure automation, or coding tasks that would otherwise require additional contracting hours, the Max plan typically pays back inside the first week of the month.
One concrete example: drafting and publishing a single SEO-optimized WordPress article with full schema, taxonomy, internal linking, and AEO/GEO optimization takes a human content team 3-5 hours. Running it through a Claude Code pipeline takes 15 minutes of supervised work. The output quality difference is small; the cost difference is large.
This is the framing that matters: Claude Code pricing is not “how much does the AI cost.” It is “how much labor does the AI replace.” On that framing, Max 5x is the cheapest line item in most knowledge-work budgets.
Annual vs Monthly Billing
Anthropic offers a discount for annual prepayment on Pro and Max tiers — generally around 20% off. If you are confident in your usage pattern, the annual prepay is the right call. If you are still evaluating, monthly gives you flexibility to change tiers as your needs shift.
New for June 15, 2026: the Agent SDK Credit Pool (Dual-Bucket Billing)
Starting June 15, 2026, Anthropic splits subscription usage into two buckets: interactive Claude Code sessions keep drawing from your normal plan limits, while unattended Agent SDK work (claude -p, cron jobs, CI pipelines, scripts) draws from a new monthly credit pool — Pro $20, Max 5x $100, Max 20x $200, Team Standard $20/seat, Team Premium $100/seat — with overage billed at standard API rates.
Practical impact: if you run any headless automation on a subscription today, that usage stops counting against your interactive limits and starts metering against the credit pool. Light automation — a nightly script or two — fits comfortably inside Pro’s $20 pool; sustained agent fleets will spill into API-rate overage, at which point a dedicated API key is usually easier to manage. Full mechanics, worked examples, and what to do before the cutover: Claude Agent SDK dual-bucket billing — what changes June 15, 2026. To model your own numbers, use the interactive calculator on our main Claude pricing page.
Frequently Asked Questions
How much does Claude Code cost per month?
Claude Code is included with Claude Pro ($20/month), Max 5x ($100/month), or Max 20x ($200/month). API-only usage is billed per token at separate rates.
Is there a free version of Claude Code?
No. Claude Code requires either a paid Claude subscription (Pro, Max, or Team) or API credentials with a funded account. The Claude free tier does not include Claude Code.
What’s the difference between Max 5x and Max 20x?
The numbers refer to roughly how much usage you get relative to Pro. Max 5x ($100/month) suits daily developers. Max 20x ($200/month) suits heavy users running agent workflows or long batch jobs.
Can I use Claude Code with just an API key instead of a subscription?
Yes. Claude Code accepts an Anthropic API key for authentication. You pay per-token usage at API rates instead of a flat subscription fee.
Is Claude Code cheaper than GitHub Copilot or Cursor?
At the entry level, Copilot ($10/month) and Cursor Pro ($20/month) cost less than Max. Per unit of output for serious work, Claude Code on Max often comes out cheaper because of how much it can do per session.
Does Team pricing include Claude Code?
Only Team Premium ($100/seat annual, $125/seat monthly, 5-seat minimum) includes Claude Code. Team Standard does NOT include Claude Code. You can mix Standard and Premium seats on the same team so non-coding teammates can sit on Standard while developers get Premium.
What happens if I hit my Claude Code usage limit?
On Pro and Max, Claude Code slows or pauses until your usage window resets (typically rolling 5-hour windows on Pro, longer reset cadences on Max). You can upgrade tiers anytime for immediate additional capacity.
The Bottom Line on Claude Code Pricing
For most serious users: Max 5x at $100/month. For light users: Pro at $20/month. For heavy agent workloads: Max 20x at $200/month or API with prompt caching. The pricing is competitive with other AI coding tools, and the value relative to labor it replaces makes Max the cheapest line item on most knowledge-work budgets.
More Claude Code Pricing Questions: Plans, Seats, and Limits
Is Claude Code free?
Claude Code is not free. It requires a paid subscription: Pro ($20/month), Max 5x ($100/month), Max 20x ($200/month), or Team Premium seats ($100/seat/month annual). The Free tier does not include Claude Code. API-only access is also available at standard token rates.
What is the cheapest plan that includes Claude Code?
Pro at $20/month is the cheapest Claude subscription that includes Claude Code. However, Pro has tighter usage limits and heavy Claude Code sessions will hit the cap quickly. For daily developer use, Max 5x at $100/month provides much more headroom.
Does Claude Code use API tokens from my subscription?
Claude Code usage counts against your subscription plan’s included usage, not against separate API credits. Subscription plans and API access are billed separately — a Pro subscription does not give you API credits. If you need programmatic API access alongside Claude Code, you need both.
How does Claude Code pricing compare to GitHub Copilot?
GitHub Copilot costs $10–$19/month for individuals. Claude Code starts at $20/month (Pro) with usage limits, or $100/month (Max 5x) for heavier use. Claude Code offers a larger context window and stronger reasoning for complex multi-file tasks; Copilot has tighter IDE integration. For pure code completion, Copilot is cheaper. For agentic coding and large-context work, Claude Code is more capable.
Can I use Claude Code on a Team Standard plan?
No. Team Standard ($25/seat/month annual) does not include Claude Code. Only Team Premium seats ($100/seat/month annual) include Claude Code. You can mix Standard and Premium seats on one Team plan — assign Premium only to developers who need Claude Code.
What happens to Claude Code usage when I hit my plan limit?
When you hit your included usage limit, you can continue on Pro, Max 5x, and Max 20x using extra usage billed at standard API rates with a spending cap you set. This prevents surprise overages while keeping Claude Code available for critical work beyond your plan ceiling.
Claude Code API and Model Questions
How much does Claude Code cost in 2026?
Claude Code bills through your Anthropic API account based on which model you use. As of June 2026: Claude Opus 4.8 costs $5/$25 per million input/output tokens; Claude Sonnet 4.6 costs $3/$15 per MTok; Claude Haiku 4.5 costs $1/$5 per MTok; Claude Fable 5 (the new June 2026 flagship) costs $10/$50 per MTok. There is no separate Claude Code subscription — usage is API-billed. Heavy users may find the Claude Max plan ($100–$200/month flat) more cost-effective.
What is the cheapest way to use Claude Code?
Use Claude Haiku 4.5 ($1/$5 per MTok) for simple tasks and Claude Sonnet 4.6 ($3/$15 per MTok) for most development work. Enable prompt caching for large codebases — repeated context (like a long system prompt or frequently referenced file) is cached and billed at a significant discount. Use the Message Batches API for non-real-time work to get 50% off standard rates. Reserve Opus 4.8 or Fable 5 for tasks that genuinely require maximum capability.
Does Claude Code have a subscription plan?
Claude Code itself does not have its own subscription — it bills through your Anthropic API account. However, the Claude Max plan ($100/month for 5x usage limits, or $200/month for 20x limits) can cover Claude Code usage. If you’re using Claude Code heavily every day, Max may be more cost-effective than pure pay-as-you-go API billing. Check platform.claude.com/docs/en/about-claude/pricing for current plan details.
Which Claude model should I use with Claude Code?
Claude Sonnet 4.6 is the best default for most Claude Code workflows — it offers near-Opus intelligence at half the price ($3 vs $5 per input MTok) and supports extended thinking. Use Claude Opus 4.8 for complex multi-file refactors or architecturally difficult problems where output quality is worth the premium. Claude Fable 5 (launched June 10, 2026) is available for maximum capability tasks. Use Haiku 4.5 for fast, cheap lookups and simple completions.
Does Claude Code support prompt caching?
Yes. Claude Code supports Anthropic’s prompt caching feature. For workflows where you repeatedly pass the same large context — a codebase system prompt, a long CLAUDE.md file, frequently referenced documentation — prompt caching stores that context and bills repeated reads at a discounted rate. This can significantly reduce costs for projects with large persistent context. See platform.claude.com/docs/en/build-with-claude/prompt-caching for implementation details.
How do I track my Claude Code API spending?
Monitor usage at platform.claude.com — the console shows token usage and cost by model, date range, and API key. Set spending limits on your API key to cap maximum monthly spend. For teams, use separate API keys per project or environment to attribute costs. The usage dashboard updates in near-real time so you can catch runaway spend before it compounds.
Once you stop asking what Claude is and start asking how to use it at scale, the limits become the conversation.
Once you stop asking “what is Claude” and start asking “how do I use Claude at scale,” you run into a different category of question. How big is the context window, actually, in this specific situation? What’s the file upload limit? What happens when one teammate burns through the Team plan? Where does the 1M context window apply and where doesn’t it? When does extra usage kick in and what does it cost?
The answers exist — they’re just spread across a dozen Anthropic Help Center articles, and the wrong combination of guesses can make you think you’ve hit a hard limit when you’ve actually just hit the wrong setting. This article is the consolidated map. Triple-sourced against Anthropic’s official documentation, verified May 15, 2026.
Claude Usage Limits by Plan (June 2026)
Plan
Messages/Day
Context Window
File Upload
Projects
Free
Limited (varies)
200K tokens
Up to 10MB per file
No
Pro ($20/mo)
~2,000 (Sonnet)
1M tokens (Opus/Sonnet)
Up to 30MB per file
Yes
Max 5x ($100/mo)
~10,000 (Sonnet)
1M tokens
Up to 30MB per file
Yes
Max 20x ($200/mo)
~40,000 (Sonnet)
1M tokens
Up to 30MB per file
Yes
Team ($25/seat/mo)
~2,000/seat
1M tokens
Up to 30MB per file
Yes
API (pay-per-token)
Rate-limited by tier
1M tokens (Opus/Sonnet)
Per API limits
Via system prompt
Message limits are approximate and vary by model. Anthropic adjusts limits based on system load. Verified June 9, 2026.
The four limits that matter most
If you’re running Claude in any sustained capacity, four limits will define your experience. Get these right and you have headroom. Get them wrong and you’ll think Claude is broken when it’s actually working as designed.
1. Context window — how much Claude can read in a single conversation. Varies by model and surface. The 1M window is real but only available in specific places.
2. File upload size — how big a single file can be. 30 MB cap per file across the board, with workarounds for larger files.
3. Usage limits — how much Claude work you can do per session/week. Per-user, not pooled. Different limits for chat vs Claude Code vs Agent SDK.
4. Extra usage / overage — what happens when you hit the cap. Either you’ve enabled it and you keep going at API rates, or you’re stopped until the limit resets.
Context window: where 1M tokens actually applies
Per Anthropic’s Help Center documentation (verified May 15, 2026), context window size depends on the model AND on the surface you’re using Claude through. This is the single most-misunderstood limit because the same model can have a different context window in chat than it does in Claude Code or the API.
Web and desktop chat (claude.ai):
Opus 4.8, Opus 4.6, Sonnet 4.6 — 500K tokens on all paid plans
All other models — 200K tokens on paid plans
Claude Code:
Opus 4.8 — 1M tokens on Pro, Max, Team, and Enterprise
Sonnet 4.6 — 1M tokens on all paid plans, but extra usage must be enabled to access it (except on usage-based Enterprise plans)
Claude API:
Opus 4.8, Opus 4.6, Sonnet 4.6 — 1M tokens at standard pricing (no long-context premium)
All other models — 200K tokens
The practical translation: if you need the full 1M token window, use Claude Code or the API with one of the supported models. The web chat tops out at 500K even on the most capable models. That difference matters when you’re trying to feed Claude an entire codebase, a long video transcript, or a multi-document research bundle.
File upload size: 30 MB per file, with workarounds
Per Anthropic’s Help Center, the maximum file size for both uploads and downloads is 30 MB per file. This applies whether you’re uploading a PDF, a CSV, an image, or any other supported file type.
For PDFs larger than 30 MB, Anthropic’s documentation notes that Claude can process them through its computing environment without loading them into the context window. That’s a real workaround for big PDFs but it doesn’t help you for other large file types.
If you regularly hit the 30 MB cap, the practical patterns are:
Split before upload — break the file into chunks under 30 MB, upload each, work with them as separate sources
Convert format — a 35 MB Word doc with embedded images may compress to under 30 MB as a PDF; CSVs can often be reduced by removing unused columns
Upload to GCS or S3 and let Claude read via tools — for the Agent SDK / API path, you can put the file in cloud storage and have Claude read it via web fetch or a custom tool, bypassing the upload cap entirely
Usage limits: per-user, not pooled
This is the limit that confuses teams the most. Per Anthropic’s Help Center documentation on the Team plan (verified May 15, 2026): each team member has their own set of usage limits. They are not shared across the team.
If one teammate burns through their session limit, the rest of the team is unaffected. There is no pooled team allowance that one user can drain on behalf of others. The math is per-seat, always.
The usage limits themselves vary by seat type:
Standard Team seats — 1.25x more usage per session than Pro plan. One weekly usage limit applies across all models. Resets seven days after the session starts.
Premium Team seats — 6.25x more usage per session than Pro plan. Two weekly limits: one across all models, plus a separate one for Sonnet models specifically. Both reset seven days after session start.
For the actual numeric token-per-session limits, Anthropic does not publish exact numbers — they describe relative multipliers vs Pro. This is intentional; the underlying math is calibrated against typical workloads rather than a hard token ceiling.
Extra usage: what happens when you hit the cap
When a user hits their weekly limit, two things can happen depending on whether the organization has enabled extra usage:
If extra usage is enabled: additional Claude requests continue to flow at standard API rates (the same per-token pricing published on Anthropic’s pricing docs — $5/$25 MTok for Opus 4.8, $3/$15 for Sonnet 4.6, $1/$5 for Haiku 4.5). Extra usage is billed separately from the subscription. Team and Enterprise admins can enable, cap, and monitor extra usage at the organization level.
If extra usage is not enabled: the user’s Claude requests stop until their limit resets at the start of the next session window (seven days from when the current session started, not a fixed weekly day).
The right setting depends on your team’s tolerance for surprise bills versus interrupted workflows. Most production teams enable extra usage with a hard organizational cap so individual users have continuity but the org has predictable spend ceiling.
Claude Code limits: a separate model
Claude Code has its own usage limit accounting that exists alongside chat usage limits. Per Anthropic’s Help Center on Claude Code models, usage, and limits (verified May 15, 2026):
Interactive Claude Code (typing in terminal/IDE) draws from your subscription’s usage limits, the same pool as web chat
Non-interactive claude -p mode currently also draws from subscription usage limits — until June 15, 2026
Starting June 15, 2026, non-interactive mode and Agent SDK usage move to a separate per-user monthly Agent SDK credit pool
The June 15 change is important enough that it gets its own breakdown in our Agent SDK Dual-Bucket Billing article. The short version: if you’re running unattended Claude Code work in cron jobs or CI, your billing model is changing. Plan capacity against the new credit pool.
The limits that aren’t really limits
Three things that get reported as limits but are actually configuration choices:
“My context window keeps filling up.” This is usually caused by long-running conversations accumulating history rather than the model’s actual context window being too small. Starting a new conversation (or running /clear in Claude Code) resets the working context. Long sessions are not a hard limit; they are a working-memory pressure that compounds over turns.
“Claude won’t read my whole repository.” Repository size is rarely the actual limit; the limit is how much you can load into the context window at once. Tools like Claude Code’s file reading and search work around this by loading files on demand rather than upfront. The 1M context window helps but is not a substitute for selective loading.
“My team keeps hitting limits even though we’re on Team.” Almost always one of two things: (a) people are mistakenly assuming the seat allowance is shared, when it’s strictly per-user; (b) someone is running heavy automation through a subscription seat instead of a Claude Developer Platform API key (which is the recommended path for sustained team-wide automation, especially after June 15).
Decision matrix: which limits affect which use case
Map your use case to the limits that actually apply:
Solo chat user on Pro — 500K context on Opus 4.8/4.6/Sonnet 4.6 in chat, weekly session limit, 30 MB upload cap. Hit your limit and you wait or pay extra usage.
Solo developer using Claude Code — 1M context on Opus 4.8 (1M on Sonnet 4.6 with extra usage on). Same weekly session limit. June 15 billing change applies if you use claude -p.
Small team on Team Standard — Per-seat limits at 1.25x Pro session capacity, not pooled. 30 MB upload cap. June 15 billing change applies per-seat.
Team running Claude Code in CI — All of the above plus separate Agent SDK credit pool starting June 15. Strongly consider a Developer Platform API key for the CI workload to get true pay-as-you-go billing.
Enterprise running large-scale automation — Subscription limits are the wrong tool. Move to a Developer Platform API key, monitor usage at the org level, set spend caps in the Console.
What to actually do this week
Identify which surface you’re using Claude through (web, Claude Code, API). Different surfaces have different context windows even for the same model.
If you’re hitting “limit” errors, check whether extra usage is enabled at the organization level before assuming it’s a hard cap.
If you’re a Team admin and your team is reporting hitting limits, audit per-seat usage rather than assuming you need to upgrade the plan — the issue is often one heavy user, not the plan tier.
If anyone on your team is running unattended Claude work, read the Agent SDK billing change before June 15.
If you need the full 1M context window, switch to Claude Code or the API. Web chat tops out at 500K.
For uploads larger than 30 MB, split, compress, or move the file to cloud storage and have Claude read it via tools.
Frequently Asked Questions
Is the Claude Team plan usage limit shared across team members?
No. Per Anthropic’s Help Center documentation, each team member has their own set of usage limits. If one team member reaches their seat’s included limit, other team members are unaffected and can keep working.
What is Claude’s file upload size limit?
30 MB per file for both uploads and downloads, per Anthropic’s official documentation. For PDFs larger than 30 MB, Claude can process them through its computing environment without loading them into the context window.
Where does the 1M token context window actually apply?
1M context is available on Claude Code with Opus 4.8 (Pro/Max/Team/Enterprise) and on the API with Opus 4.8, Opus 4.6, and Sonnet 4.6. Web chat tops out at 500K tokens even on the most capable models. Sonnet 4.6 in Claude Code requires extra usage to be enabled to access the 1M window (except on usage-based Enterprise plans).
What’s the difference between Standard and Premium Team seats?
Standard seats offer 1.25x Pro plan usage per session with one weekly limit across all models. Premium seats offer 6.25x Pro session usage with two weekly limits (one across all models, one Sonnet-specific). Both reset seven days after the session starts.
What happens when I hit my Claude usage limit?
If extra usage is enabled at your organization, you continue at standard API rates billed separately. If extra usage is not enabled, your requests stop until your limit resets at the next session window (seven days from session start, not a fixed weekly day).
Should I use a Team plan or the API for production automation?
For sustained shared automation (CI pipelines, cron jobs, background services), Anthropic recommends the Claude Developer Platform with an API key over subscription seats. Subscription seats are sized for individual interactive use; API keys give you predictable pay-as-you-go billing, no per-seat caps, and don’t compete with team members’ interactive usage.
Anthropic Help Center: Understanding usage and length limits, What is the Team plan?, How is my Team plan bill calculated?, Manage extra usage for Team and seat-based Enterprise plans, Models, usage, and limits in Claude Code, How large is the context window on paid Claude plans?, How large is the Claude API’s context window?, Upload files to Claude (primary sources for all limit specifics)
Anthropic platform documentation: Context windows at docs.claude.com (primary source for API context window behavior)
Anthropic Help Center: Use the Claude Agent SDK with your Claude plan (primary source for the June 15, 2026 billing change)
All limit numbers and policies are accurate as of May 15, 2026. Anthropic adjusts subscription mechanics regularly; if you’re making procurement decisions on this article more than 60 days from the date stamp, re-verify the per-seat multipliers and context window availability against the current Help Center.
Frequently Asked Questions
What is Claude’s message limit per day?
Message limits vary by plan. Free: limited daily messages (Anthropic adjusts based on load). Pro ($20/month): approximately 2,000 Sonnet-equivalent messages per day. Max 5x ($100/month): approximately 10,000. Max 20x ($200/month): approximately 40,000. API users are rate-limited by tier with no hard daily message cap, instead governed by tokens-per-minute limits.
What is Claude’s maximum context window in 2026?
Claude Opus 4.8 and Claude Sonnet 4.6 both support a 1 million token context window. Claude Haiku 4.5 supports 200,000 tokens. Anthropic eliminated long-context surcharges in March 2026, so large-context requests are billed at standard per-token rates. The Free plan is limited to 200K context even on Sonnet.
What is the maximum file size I can upload to Claude?
On Pro, Max, and Team plans: up to 30MB per file, up to 5 files per conversation. Supported formats include PDF, text, CSV, code files, and images. The Free tier supports up to 10MB per file. For API users, file uploads are handled via the Files API with a 32MB per file limit.
How do I scale Claude beyond subscription message limits?
For high-volume workloads, switch to the Claude API (pay-per-token, no daily message cap beyond rate limits). Enterprise plans offer higher rate limits and custom agreements. The Batch API processes large jobs at 50% off standard prices for non-real-time workloads. Claude Max 20x ($200/month) is the highest subscription tier for interactive use.
What are Claude’s API rate limits?
API rate limits depend on your usage tier. New API accounts start at Tier 1 with lower limits. Spending history and account age automatically promote accounts to higher tiers with increased requests-per-minute and tokens-per-minute. Current tier limits are published at console.anthropic.com/settings/limits. Enterprise customers can negotiate custom rate limits.
Does Claude have a token limit per message?
There is no enforced per-message token limit separate from the overall context window. A single message can use up to the full context window (1M tokens for Opus 4.8 / Sonnet 4.6, 200K for Haiku 4.5). However, very long single messages may be slower to process. The practical limit is the context window of whichever model you are using.