Claude Model Comparison

Q: What is the best Claude model in 2026?

Claude Sonnet 4.6 is the recommended default. Use Opus 4.8 for maximum reasoning depth or outputs over 64K tokens. Use Haiku 4.5 for high-volume speed-sensitive work.

Q: Is Claude Opus 4.8 better than Sonnet?

Opus 4.8 has a higher capability ceiling but Sonnet 4.6 uniquely supports extended thinking and costs 40% less. For most users Sonnet 4.6 is the better choice.

Q: Which Claude model supports extended thinking?

Claude Sonnet 4.6 and Haiku 4.5 both support extended thinking. Claude Opus 4.8 does not.

Updated June 12, 2026

Claude Fable 5 launched June 9, 2026 as a new tier above Opus 4.8 — priced at $10/$50/MTok (2× Opus). This guide now covers all four models. Full Fable 5 breakdown →

Anthropic’s Claude model lineup in 2026 now spans four tiers: Fable 5 at the top for maximum capability ($10/$50/MTok), Opus 4.8 for serious production work ($5/$25), Sonnet 4.6 for the best balance of performance and cost ($3/$15), and Haiku 4.5 for speed and high-volume work ($1/$5). Picking the wrong model costs money or performance — sometimes both. This guide covers every meaningful difference so you can make the right call.

Quick answer: Sonnet 4.6 handles 80–90% of tasks at a fraction of the cost of higher tiers. Use Fable 5 for the hardest engineering and long-horizon agentic work ($10/$50/MTok). Use Opus 4.8 for serious production work with zero data retention requirements ($5/$25). Use Sonnet 4.6 as your daily driver ($3/$15). Use Haiku 4.5 when speed and cost dominate ($1/$5).

The Current Claude Model Lineup (June 2026)

Claude Fable 5 vs Opus 4.8 vs Sonnet 4.6 vs Haiku 4.5: side-by-side

Feature	Claude Fable 5 🆕	Claude Opus 4.8	Claude Sonnet 4.6	Claude Haiku 4.5
Best for	Hardest engineering, long-horizon autonomy	Production work, zero-data-retention	Best speed/intelligence balance	Fastest responses, high-volume tasks
Input price	$10 / MTok	$5 / MTok	$3 / MTok	$1 / MTok
Output price	$50 / MTok	$25 / MTok	$15 / MTok	$5 / MTok
Context window	1M tokens	1M tokens	1M tokens	200k tokens
Max output	128k tokens	128k tokens	64k tokens	64k tokens
Extended thinking	No (adaptive always on)	No	Yes	Yes
Adaptive thinking	Always on	Yes	Yes	No
Zero data retention	No (30-day mandatory)	Yes	Yes	Yes
Latency	Slow–Moderate	Moderate	Fast	Fastest
API ID	claude-fable-5	claude-opus-4-8	claude-sonnet-4-6	claude-haiku-4-5

As of June 2026, Anthropic’s four current models are Claude Fable 5, Claude Opus 4.8, Claude Sonnet 4.6, and Claude Haiku 4.5. All four support text and image input, multilingual output, and vision processing. They differ significantly in pricing, context window, output limits, and capability.

Feature	Fable 5 🆕	Opus 4.8	Sonnet 4.6	Haiku 4.5
Input price	$10 / MTok	$5 / MTok	$3 / MTok	$1 / MTok
Output price	$50 / MTok	$25 / MTok	$15 / MTok	$5 / MTok
Context window	1M tokens	1M tokens	1M tokens	200K tokens
Max output	128K tokens	128K tokens	64K tokens	64K tokens
Extended thinking	No (adaptive always on)	No	Yes	Yes
Adaptive thinking	Always on	Yes	Yes	No
Latency	Slow–Moderate	Moderate	Fast	Fastest
Reliable knowledge cutoff	2026	Jan 2026	Aug 2025 (reliable)	Feb 2025 (reliable)

Pricing is per million tokens (MTok) via the Claude API. Source: Anthropic Models Overview, June 2026.

Claude Fable 5: The New Top Tier (June 9, 2026)

Fable 5 is Anthropic’s first Mythos-class model released for general availability. It landed June 9, 2026 and sits above Opus 4.8 in capability — scoring 95.0% on SWE-bench Verified (vs 88.6% for Opus 4.8) and 80.0% on SWE-bench Pro (vs 69.2%). On the Senior Engineer benchmark, Fable 5 scores 91/100 vs approximately 63/100 for Opus 4.8.

Key differentiators for Fable 5:

Adaptive thinking always on — Fable 5 doesn’t have an extended thinking toggle. It always reasons adaptively, scaling depth to task complexity.
128K max output — same as Opus 4.8, twice Sonnet’s 64K cap.
1M token context window — same as Opus 4.8 and Sonnet 4.6.

Two constraints that matter:

Mandatory 30-day data retention. Fable 5 is not available under zero data retention. If your use case requires ZDR (healthcare, legal, finance with strict data handling), use Opus 4.8.
Safety classifier routing. Prompts touching cybersecurity, biology, chemistry, and distillation route to an Opus 4.8 fallback — at Fable 5 pricing. If your workload is in these domains, the upgrade is less impactful.

Use Fable 5 for: large migrations or refactors, multi-agent orchestration at frontier quality, long-horizon agentic work, complex scientific analysis, and any task where quality on hard problems justifies 2x cost over Opus.

Skip Fable 5 for: well-scoped routine work, high-volume pipelines (2x cost compounds), ZDR-required use cases, or domains where the safety classifier fallback applies.

Claude Opus 4.8: The Production Standard

Opus 4.8 is Anthropic’s most capable model supporting zero data retention (ZDR) — the right default for most production API work. Fable 5 has since surpassed it in raw capability, but Opus 4.8 remains the better choice for ZDR workloads, cost-sensitive pipelines, and domains where Fable 5’s safety classifier routing applies. Anthropic describes it as a step-change improvement in agentic coding over Opus 4.8, with a new tokenizer that contributes to improved performance on a range of tasks. Note that this new tokenizer may use up to 35% more tokens for the same text compared to previous models — a cost consideration worth factoring in for high-volume workflows.

Key differentiators for Opus 4.8 over the other two models:

128K max output tokens — double Sonnet and Haiku’s 64K cap. This matters for generating long-form code, detailed reports, or complete document drafts in a single call.
1M token context window — same as Sonnet 4.6, meaning Opus can process entire codebases or book-length documents in a single session.
Adaptive thinking — Opus 4.8 and Sonnet 4.6 both support adaptive thinking, which lets the model adjust reasoning depth based on task complexity.
Most recent knowledge cutoff — January 2026, versus August 2025 (reliable) for Sonnet and February 2025 (reliable) for Haiku.

Opus does not support extended thinking — that capability lives on Sonnet 4.6 and Haiku 4.5 Extended thinking lets the model reason step-by-step before generating output, which is particularly useful for complex math, science, and multi-step logic problems.

Use Opus 4.8 for: complex architecture decisions, large codebase analysis, multi-agent orchestration tasks, outputs that require more than 64K tokens, tasks demanding the latest possible knowledge, and any work where you need Opus-tier reasoning with zero data retention (Fable 5 is the absolute frontier, but does not support ZDR).

Skip Opus 4.8 for: routine content generation, customer support pipelines, high-volume classification or extraction, real-time applications requiring low latency, or any task where Sonnet scores within your acceptable quality threshold.

Claude Sonnet 4.6: The Workhorse

Sonnet 4.6 is the model Anthropic recommends as the best combination of speed and intelligence. Released in February 2026, it delivers a 1M token context window at $3 input / $15 output per million tokens — the same context window as Opus at 40% lower cost.

Sonnet 4.6 also uniquely offers extended thinking, which Opus 4.8 does not. When extended thinking is enabled, Sonnet can perform additional internal reasoning before generating its response — useful for reasoning-heavy tasks like complex debugging, multi-step research, and technical problem-solving where chain-of-thought depth matters.

For developers and teams using Claude Code, Sonnet 4.6 is the standard daily driver. It handles tool calling, agentic workflows, and multi-file code reasoning reliably, at a price point that makes heavy daily use economically viable.

Use Sonnet 4.6 for: most production workloads, Claude Code sessions, long-document analysis, content generation, coding tasks, research synthesis, customer-facing applications, and any workflow requiring the 1M context window where Opus’s premium isn’t justified.

Skip Sonnet 4.6 for: high-volume pipelines where Haiku’s lower cost is acceptable, simple classification or extraction tasks, or real-time applications where Haiku’s faster latency is required.

Claude Haiku 4.5: Speed and Volume

Haiku 4.5 is the fastest model in the Claude family and the most cost-efficient at $1 input / $5 output per million tokens. It has a 200K token context window — smaller than Opus and Sonnet’s 1M, but still substantial for most single-task work. It supports extended thinking but not adaptive thinking.

The 200K context limit is the most important practical constraint. Most single-document, single-task workflows fit within 200K. Multi-file codebases, long books, or extended conversation histories that push past that threshold need Sonnet or Opus.

Haiku 4.5 has the oldest knowledge cutoff of the three: February 2025. For tasks requiring awareness of events or developments from mid-2025 onward, Haiku won’t have that context baked in.

Use Haiku 4.5 for: content moderation, classification pipelines, entity extraction, customer support triage, real-time chat interfaces, simple Q&A, high-volume API workflows where cost and speed dominate, and any task where quality requirements are modest.

Skip Haiku 4.5 for: complex reasoning, large codebase analysis, tasks requiring recent knowledge (post-February 2025), multi-step agent workflows, or any output requiring more than 200K tokens of input context.

Pricing: What the Numbers Actually Mean in Practice

All three models price output tokens at 5x the input rate — a ratio that holds across the entire Claude lineup. This means verbose, long-form outputs cost significantly more than short, targeted responses. Minimizing generated output length is the highest-leverage cost optimization available before you touch model routing or caching.

To put the pricing in concrete terms: generating one million output tokens (roughly 750,000 words of generated text) costs $25 on Opus, $15 on Sonnet, and $5 on Haiku. For input-heavy workloads like document analysis where you’re feeding in large amounts of text but getting shorter responses, the cost gap narrows.

Three additional pricing levers apply across all models:

Prompt caching: Cuts cache-read input costs by up to 90% for repeated system prompts or documents. If your application reuses a large system prompt across many requests, caching is the single highest-impact cost reduction available.
Batch API: Provides a 50% discount for non-time-sensitive workloads processed asynchronously. Combine with prompt caching for up to 95% savings on qualifying workflows.
Model routing: Running a mix of Haiku for simple tasks, Sonnet for production workloads, and Opus for complex reasoning — rather than using one model for everything — can reduce total API costs by 60–70% without meaningful quality loss on the tasks that don’t require a flagship model.

Context Windows: 1M Tokens vs. 200K

Opus 4.8 and Sonnet 4.6 both offer a 1M token context window at standard pricing — no premium surcharge for extended context. For reference, 1 million tokens is roughly 750,000 words, enough to hold a large codebase, a full academic textbook, or months of business communications in a single conversation.

Haiku 4.5 has a 200K token context window. That’s still roughly 150,000 words — sufficient for most single-document tasks, but it creates a hard ceiling for anything requiring multi-file code review, book-length document analysis, or lengthy conversation histories.

If your workflow consistently requires more than 200K tokens of input, Sonnet 4.6 is the cost-efficient choice. Opus 4.8 is the right call only when the input load requires the additional reasoning capability Opus provides, not just the context window size — because Sonnet gets you the same 1M window at 40% lower cost.

Extended Thinking vs. Adaptive Thinking

These are two distinct features that appear together in the comparison table but serve different purposes.

Extended thinking (available on Sonnet 4.6 and Haiku 4.5, not Opus 4.8) lets Claude perform additional internal reasoning before generating its response. When enabled, the model produces a “thinking” content block that exposes its reasoning process — step-by-step problem decomposition before the final answer. Extended thinking tokens are billed as standard output tokens at the model’s output rate. A minimum thinking budget of 1,024 tokens is required when enabling this feature.

Adaptive thinking (available on Opus 4.8 and Sonnet 4.6, not Haiku 4.5) adjusts reasoning depth dynamically based on task complexity — the model allocates more reasoning for harder problems and less for simpler ones, without requiring explicit configuration.

The practical implication: if you need transparent, controllable step-by-step reasoning that you can inspect and use in your application, Sonnet 4.6’s extended thinking is often the right tool — and at lower cost than Opus.

Which Claude Model Should You Choose?

The right framework for model selection in mid-2026 is a four-tier stack: Fable 5 for the hardest problems, Opus 4.8 as the production standard, Sonnet 4.6 as the daily driver, Haiku 4.5 for volume. Start with Sonnet 4.6 and escalate selectively. Most production workloads — coding, writing, analysis, customer-facing applications — are well-served by Sonnet. Opus 4.8 earns its premium when you need ZDR, outputs over 64K tokens, or the January 2026 knowledge cutoff. Fable 5 earns its 2x premium when the task is genuinely hard enough that 10+ percentage points on SWE-bench matters for your outcome.

Haiku 4.5 belongs in any pipeline where you’ve identified tasks that don’t require Sonnet’s capability. High-volume routing, triage, classification, and real-time response scenarios are Haiku’s natural territory. The optimal production routing split is roughly 70% Haiku 4.5, 20% Sonnet 4.6, 8% Opus 4.8, 2% Fable 5 — rather than using a single model for everything. That ratio cuts costs by 60–70% without meaningful quality loss on the tasks that don’t need a flagship model.

You picked your model tier. Now get the pre-built setup.

Claude Seed Kits are pre-configured skill files with 20 tested prompts and a setup guide for your specific use case. Pick the kit that matches how you work — $47 each.

Solo Builder
Creator & Independent
Local Operator
Field Operator
Regulated Specialist

Frequently Asked Questions

What is the difference between Claude Opus 4.8, Sonnet, and Haiku?

Opus is Anthropic’s most capable model, optimized for complex reasoning, large outputs, and agentic tasks. Sonnet offers a balance of capability and cost, handling most production workloads at lower price. Haiku is the fastest and cheapest option, suited for high-volume, lower-complexity tasks. All three share the same core Claude architecture and safety training.

Is Claude Opus 4.8 worth the extra cost over Sonnet?

For most tasks, no. Sonnet 4.6 handles the majority of coding, writing, and analysis work at 40% lower cost. Opus 4.8 is worth the premium when you need outputs longer than 64K tokens, maximum agentic coding capability, or the most recent knowledge cutoff (January 2026 vs. Sonnet’s August 2025).

Which Claude model is best for coding?

Sonnet 4.6 is the standard recommendation for most coding work, including Claude Code sessions. Opus 4.8 is preferred for large codebase analysis, complex architecture decisions, or multi-agent coding workflows where maximum reasoning depth is required. Haiku 4.5 can handle simple code edits and explanations at much lower cost.

What is the Claude context window?

Claude Opus 4.8 and Sonnet 4.6 both have a 1 million token context window — roughly 750,000 words of combined input and conversation history. Claude Haiku 4.5 has a 200,000 token context window. Context window size determines how much information Claude can hold and reference in a single conversation.

Does Claude Opus 4.8 support extended thinking?

No. Extended thinking is available on Claude Sonnet 4.6 and Claude Haiku 4.5, but not on Claude Opus 4.8 Opus 4.8 supports adaptive thinking instead, which dynamically adjusts reasoning depth based on task complexity.

What is the cheapest Claude model?

Claude Haiku 4.5 is the least expensive model at $1 per million input tokens and $5 per million output tokens. It is also the fastest Claude model, making it well-suited for high-volume, latency-sensitive applications.

Can I use Claude through Amazon Bedrock or Google Vertex AI?

Yes. All three current Claude models — Opus 4.8, Sonnet 4.6, and Haiku 4.5 — are available through Amazon Bedrock and Google Vertex AI in addition to the direct Anthropic API. Bedrock and Vertex AI offer regional and global endpoint options. Pricing on third-party platforms may vary from direct Anthropic API rates.

Claude vs GPT-4o: Which Model Wins for Everyday Work?

Claude Sonnet 4.6 and GPT-4o are the primary head-to-head competitors in 2026 for professional daily use. They price similarly ($3 vs $3.00 per MTok input) but perform differently depending on task type.

Task Type	Claude Sonnet 4.6	GPT-4o
Long-document analysis (200K+ tokens)	✓ 1M context window	128K limit
Multi-step reasoning	Extended thinking available	o1 series for reasoning
Code generation	Strong; Claude Code natively	Strong; GitHub Copilot integration
Instruction following	Very consistent	Consistent
API cost (output)	$15/MTok	$10/MTok
Context window	1M tokens	128K tokens

The clearest differentiator is context window size. If your workflow involves analyzing full codebases, long contracts, or book-length documents in a single call, Claude Sonnet 4.6’s 1M token window eliminates chunking overhead that GPT-4o requires at 128K. For shorter tasks, either model performs comparably.

Claude vs Gemini 2.5 Pro: How Do They Compare?

Google’s Gemini 2.5 Pro competes directly with Claude Sonnet 4.6 on price and capability. Key differences:

Feature	Claude Sonnet 4.6	Gemini 2.5 Pro
Input price	$3.00/MTok	$3.00/MTok (under 200K tokens)
Output price	$15.00/MTok	$10.00/MTok
Context window	1M tokens	1M tokens
Extended thinking	Yes	Yes (2.5 Pro)
Agentic coding	Claude Code native	Via Gemini API / IDX

Gemini 2.5 Pro is cheaper on paper, especially for prompts under 200K tokens. Claude Sonnet 4.6’s advantage is instruction-following consistency on complex multi-step tasks and the Claude Code ecosystem for engineering teams already in the Anthropic stack.

Which Claude Model Should You Use in Claude Code?

Claude Code supports all four models. The recommended routing for most teams:

Fable 5 — Use for the hardest agentic tasks: large migrations, complex multi-file refactors, long-horizon autonomous workflows. Enable with claude --model claude-fable-5.
Opus 4.8 — Default for serious work: multi-agent orchestration, large codebase analysis, outputs over 64K tokens.
Sonnet 4.6 — Daily driver. Best cost-to-performance ratio for most coding tasks. Extended thinking handles complex architecture decisions.
Haiku 4.5 — High-frequency, low-complexity tasks: formatting, renaming, boilerplate, pipeline steps where speed matters more than depth.

The Max plan (available on claude.ai) unlocks 1M token context in Claude Code at no additional charge, which is the practical differentiator for large codebase work.

Frequently Asked Questions: Claude Model Comparison

What is the best Claude model in 2026?

Claude Sonnet 4.6 is the recommended default for most tasks — it delivers 80-90% of Opus 4.8’s capability at 40% lower cost. Use Opus 4.8 when you need maximum reasoning depth, outputs longer than 64K tokens, or the most recent knowledge cutoff (January 2026). Use Haiku 4.5 for high-volume, speed-sensitive work.

Is Claude Opus 4.8 better than Sonnet?

Claude Opus 4.8 has a higher capability ceiling than Sonnet 4.6: larger output window (128K vs 64K tokens), the most recent knowledge cutoff, and stronger performance on complex agentic coding tasks. However, Sonnet 4.6 uniquely offers extended thinking which Opus does not support, and it costs 40% less. For most users, Sonnet 4.6 is the better practical choice.

What is Claude Haiku 4.5 used for?

Claude Haiku 4.5 is optimized for speed and cost efficiency at $1 input / $5 output per million tokens. It is best suited for high-volume pipelines, classification, metadata generation, social media content, and any task where fast response time matters more than maximum reasoning depth. It has a 200K token context window.

Which Claude model supports extended thinking?

Claude Sonnet 4.6 and Claude Haiku 4.5 both support extended thinking. Claude Opus 4.8 does not. Extended thinking allows the model to reason step-by-step internally before generating output, which improves performance on complex math, science, and multi-step logic problems.

Frequently Asked Questions

What is the difference between Claude Opus, Sonnet, and Haiku?

Claude Opus 4.8 is the most capable model in the standard tier — best for complex reasoning, long-horizon agentic coding, and tasks requiring high autonomy. Claude Sonnet 4.6 balances intelligence and speed for production workloads — it supports extended thinking and adaptive thinking while costing less than Opus. Claude Haiku 4.5 is the fastest and cheapest option, suited for high-volume tasks where speed and cost matter more than maximum capability.

Which Claude model should I use in 2026?

Start with Claude Sonnet 4.6 for most production applications — it offers near-Opus intelligence at $3/$15 per million tokens and supports extended thinking. Use Claude Opus 4.8 for complex multi-step reasoning, long-horizon agentic work, or tasks where quality is worth the higher cost ($5/$25 per MTok). Use Claude Haiku 4.5 for high-volume, latency-sensitive tasks where cost is the primary concern. For maximum capability above Opus 4.8, Claude Fable 5 launched June 9, 2026.

How much does Claude Opus 4.8 cost?

Claude Opus 4.8 is priced at $5 per million input tokens and $25 per million output tokens on the Claude API (per platform.claude.com as of June 2026). Batch API offers 50% discounts. For comparison: Claude Sonnet 4.6 is $3/$15 per MTok and Claude Haiku 4.5 is $1/$5 per MTok.

Does Claude Sonnet support extended thinking?

Yes. Claude Sonnet 4.6 supports both extended thinking and adaptive thinking (per platform.claude.com/docs/en/about-claude/models/overview). Extended thinking lets the model reason through complex problems before answering. Claude Haiku 4.5 also supports extended thinking. Claude Opus 4.8 does not use extended thinking but does support adaptive thinking.

What is Claude Fable 5 and how does it compare to Opus?

Claude Fable 5 (API ID: claude-fable-5) is Anthropic’s most capable widely-released model as of June 9, 2026. It uses adaptive thinking (always on), has a 1M token context window, 128k max output, and is priced at $10 input / $50 output per million tokens. Fable 5 is positioned above Opus 4.8 in the model lineup for the most demanding reasoning and long-horizon agentic work.

What is the context window for each Claude model?

Claude Opus 4.8 and Claude Sonnet 4.6 both support 1 million token context windows. Claude Haiku 4.5 supports 200,000 tokens. All three are dramatically larger than the 200k context window that was standard in previous generations. The 1M context window allows Opus and Sonnet to process entire codebases, long research documents, or extended conversations without truncation.

📖 Related Claude Guides

Get alerted when Claude pricing or limits change

We track Anthropic’s models, pricing, and limits daily and send a short note when something changes that affects what you pay or build. Occasional, no spam.

Tag: Claude Model Comparison

Claude Opus 4.7: 3× Vision Resolution, Task Budgets, and the xhigh Effort Level Explained

Vision Resolution: What 3× Actually Means

Task Budgets

The xhigh Effort Level

Pricing: Unchanged from 4.6

Claude Fable 5 vs Opus 4.8 vs Sonnet vs Haiku: Model Comparison (June 2026)

The Current Claude Model Lineup (June 2026)

Claude Fable 5 vs Opus 4.8 vs Sonnet 4.6 vs Haiku 4.5: side-by-side

Claude Fable 5: The New Top Tier (June 9, 2026)

Claude Opus 4.8: The Production Standard

Claude Sonnet 4.6: The Workhorse

Claude Haiku 4.5: Speed and Volume

Pricing: What the Numbers Actually Mean in Practice

Context Windows: 1M Tokens vs. 200K

Extended Thinking vs. Adaptive Thinking

Which Claude Model Should You Choose?

Frequently Asked Questions

What is the difference between Claude Opus 4.8, Sonnet, and Haiku?

Is Claude Opus 4.8 worth the extra cost over Sonnet?

Which Claude model is best for coding?

What is the Claude context window?

Does Claude Opus 4.8 support extended thinking?

What is the cheapest Claude model?

Can I use Claude through Amazon Bedrock or Google Vertex AI?

Claude vs GPT-4o: Which Model Wins for Everyday Work?

Claude vs Gemini 2.5 Pro: How Do They Compare?

Which Claude Model Should You Use in Claude Code?

Frequently Asked Questions: Claude Model Comparison

What is the best Claude model in 2026?

Is Claude Opus 4.8 better than Sonnet?

What is Claude Haiku 4.5 used for?

Which Claude model supports extended thinking?

Frequently Asked Questions

What is the difference between Claude Opus, Sonnet, and Haiku?

Which Claude model should I use in 2026?

How much does Claude Opus 4.8 cost?

Does Claude Sonnet support extended thinking?

What is Claude Fable 5 and how does it compare to Opus?

What is the context window for each Claude model?

📖 Related Claude Guides

Get alerted when Claude pricing or limits change