Tag: Claude AI Pricing

  • Claude Team Plan Usage Limits: What Doubled in May 2026 (and What Didn’t)

    Claude Team Plan Usage Limits: What Doubled in May 2026 (and What Didn’t)

    Last refreshed: May 15, 2026

    The Claude Team plan’s usage limits changed significantly in May 2026. If you’re a Team subscriber and you haven’t noticed yet, you’re now getting substantially more capacity than you were in April — and the free tier got left behind entirely. Here’s exactly what changed, what you have now, and what it means in practice.

    Updated May 9, 2026

    Rate limits doubled for Team plan subscribers following Anthropic’s SpaceX Colossus 1 compute deal (announced May 6, 2026). Free plan excluded from all increases. This page reflects current limits.

    What Changed in May 2026: The SpaceX Rate Limit Increase

    On May 6, 2026, Anthropic announced a compute partnership with SpaceX, giving it access to SpaceX’s Colossus 1 data center. The practical result for paying subscribers came fast: rate limits doubled. Here’s the breakdown by tier:

    • Claude Code Pro and Max: 5-hour rate limits doubled
    • Team plan (all seats): 5-hour rate limits doubled
    • Seat-based Enterprise: 5-hour rate limits doubled
    • Tier 1 API customers: Max input tokens per minute increased 1,500%; max output tokens per minute increased 900%
    • Peak-hours throttling: Eliminated entirely for Pro and Max subscribers
    • Free plan: No change. Explicitly excluded from all increases.

    Source: Anthropic’s official announcement at anthropic.com/news/higher-limits-spacex.

    The 1,500% input token figure for Tier 1 API is the one that didn’t get much press coverage. That’s a 15× ceiling increase for API users who’ve been running agent pipelines and hitting hard walls. If you’ve been rate-limited during multi-step Claude Code runs, this is the change that matters most.

    Team Plan Seat Structure (Still Current)

    The seat types haven’t changed — just the capacity within them. The Team plan still offers two seat types that can be mixed within the same organization:

    Seat Type Annual Price Monthly Price Usage vs Pro Claude Code
    Standard $25/seat/month $30/seat/month 1.25× more per session No
    Premium $100/seat/month $125/seat/month 6.25× more per session Yes

    Both seat types benefit from the May 2026 doubling of the 5-hour rate limit window. A Premium seat’s 6.25× multiplier now applies to a higher baseline than it did before May 6.

    How the 5-Hour Rate Limit Window Works

    Anthropic uses a rolling 5-hour window for usage limits, not a daily reset. Here’s what that means practically:

    • Usage is measured across a rolling 5-hour window, not midnight-to-midnight
    • If you hit the limit, you wait for the oldest usage to roll off — not for a fixed reset time
    • Heavy burst usage depletes your window faster than spread-out usage
    • The May 2026 doubling means the ceiling within that window is now twice as high

    Peak-hours throttling — the extra restriction that kicked in during high-demand periods — is now eliminated for Pro and Max. Team plan benefits from the doubled limit floor; the throttling elimination is Pro and Max specific.

    Current Models Available on Team Plan

    As of May 2026, the Claude model lineup (verified from Anthropic’s official models page):

    Model API String Context Window
    Claude Opus 4.7 claude-opus-4-7 1M tokens
    Claude Sonnet 4.6 claude-sonnet-4-6 1M tokens
    Claude Haiku 4.5 claude-haiku-4-5-20251001 200K tokens

    Deprecation notice: Claude Sonnet 4 and Opus 4 (original 4.0-generation, 20250514 date-string model IDs) are being retired June 15, 2026. Update any API integrations before that date.

    What the Free Plan Doesn’t Get

    The May 2026 rate limit increase does not apply to free accounts. Anthropic explicitly excluded the free tier from all capacity increases tied to the SpaceX deal. Paid plans now have a substantially higher ceiling while the free ceiling stays the same. If you’re hitting limits regularly on the free tier, the May 2026 changes are pressure toward upgrading — not relief.

    Team Plan vs Pro: Which Limit Structure Fits You?

    • Individual power user: Pro ($20/month) with throttling eliminated is a strong option.
    • Team with Claude Code needs: Team Premium seats ($100/seat/month annually) give Claude Code access, 6.25× multiplier, and the doubled 5-hour window.
    • Team without Claude Code needs: Standard Team seats ($20/seat/month annually) for shared access at higher limits than individual Pro.

    Frequently Asked Questions

    Did the Team plan rate limits actually double in May 2026?

    Yes. Anthropic confirmed the 5-hour rate limit doubled for Team plan subscribers following the SpaceX Colossus 1 compute deal announced May 6, 2026. This applies to both Standard and Premium seats.

    Does peak-hours throttling elimination apply to Team plan?

    The peak-hours throttling elimination was announced specifically for Pro and Max subscribers. Team plan benefits from the doubled rate limit floor; throttling elimination was not announced for Team.

    What happens when I hit a Team plan usage limit?

    Claude notifies you that you’ve reached your usage limit. With the 5-hour rolling window, you can continue once older usage rolls off — you’re not waiting for a midnight reset. Burst usage depletes the window faster than spread usage over the same period.

    Are Claude Sonnet 4 and Opus 4 still available on Team?

    They remain available but retire June 15, 2026. After that date, the active lineup is Fable 5, Opus 4.8, Sonnet 4.6, and Haiku 4.5.

    Does the 1,500% Tier 1 API increase apply to Team plan API usage?

    The 1,500% input and 900% output token increases apply to Tier 1 API customers specifically. Team plan through claude.ai uses the doubled 5-hour window. Both benefits apply in their respective contexts if you’re a Tier 1 API customer and a Team subscriber.

    Is the free plan getting any rate limit improvements?

    No. The free plan was explicitly excluded from all rate limit increases in the May 2026 SpaceX announcement.

  • Claude AI Pricing: Every Plan Explained (Free, Pro, Max, Team, Enterprise)

    Claude AI Pricing: Every Plan Explained (Free, Pro, Max, Team, Enterprise)

    🎁 Free: the Claude Cost Optimizer skill

    Paste it into Claude and it tells you the cheapest plan for your actual usage — drop your email and we’ll send it over.

    ⚡ Estimate your exact Claude cost in 20 seconds

    Skip the reading — the live calculator prices your real usage across every plan and the API.

    → Open the Claude Pricing Calculator

    Looking for quick answers? The FAQ version covers every common question directly.

    → Claude Pricing FAQ

    Anthropic’s Claude pricing covers six tiers — Free, Pro, Max 5x, Max 20x, Team, and Enterprise — plus a separate pay-per-token API. Choosing the wrong path can cost you significantly more than necessary. Here’s what each option actually includes in 2026.

    What Are Claude’s Subscription Plans and Prices?

    Claude offers six tiers: Free ($0), Pro ($20/month), Max 5x ($100/month), Max 20x ($200/month), Team (from $20/seat/month billed annually), and Enterprise (custom pricing).

    Plan Price Best For
    Free $0 Casual exploration
    Pro $20/month Individual power users
    Max 5x $100/month Developers hitting Pro limits
    Max 20x $200/month Full-day heavy usage
    Team Standard $20/seat/month (annual) · $25 monthly Collaborative teams
    Team Premium $100/seat/month (annual) · $125 monthly Developer teams needing Claude Code
    Enterprise Custom Large orgs with compliance needs

    What Does the Claude Free Plan Include?

    The Free plan gives you access to Claude on web, iOS, Android, and desktop with no credit card required, subject to rolling usage limits.

    The Free plan gives you access to Claude on web, iOS, Android, and desktop with no credit card required. It includes text, image, and code generation plus web search. Usage limits are intentionally opaque — Anthropic doesn’t publish exact message caps — but limits reset on a rolling 5-hour window. The Free tier is designed for exploration, not sustained daily work.

    Is Claude Pro Worth $20 a Month?

    Pro delivers substantially more usage than Free, plus Claude Code, unlimited projects, the Research feature, and Google Workspace integration — sufficient for most individual developers and writers.

    Pro delivers substantially more usage than Free, Claude Code in the terminal, unlimited projects, the Research feature, file creation, code execution, and Google Workspace integration. Usage still has limits — Anthropic does not publish exact message counts, but heavy sessions will reach the ceiling — but it’s sufficient for most individual developers and writers. Annual billing brings the effective rate to $17/month.

    What Is the Difference Between Claude Max 5x and Max 20x?

    Max 5x ($100/month) gives you 5x Pro’s per-session usage; Max 20x ($200/month) gives you 20x — enough that rate limits stop being a practical concern for full-day development work.

    Max 5x provides 5x Pro’s per-session headroom at $100/month. Max 20x at $200/month delivers 20x Pro usage — enough that rate limits stop being a practical concern for most full-day development work. Both tiers include Claude Code, with access to Claude Opus 4.8 and Sonnet 4.6, and a 1M token context window.

    Extra usage is available on Pro, Max 5x, and Max 20x — when you hit your included limit, you can continue at standard API-rate billing with a spending cap you set.

    How Does Claude Team Plan Pricing Work?

    Team requires a minimum of 5 seats: Standard seats at $20/seat/month billed annually ($25 monthly) include collaboration features but not Claude Code; Premium seats at $100/seat/month billed annually ($125 monthly) add Claude Code for developers.

    Team requires a minimum of 5 seats and comes in two flavors. Standard seats at $20/seat/month billed annually ($25 billed monthly) include 1.25x more usage per session than Pro with a weekly reset, plus collaboration features, central billing, SSO, and Microsoft 365 and Slack integrations. Standard seats do not include Claude Code.

    Premium seats at $100/seat/month billed annually ($125 monthly) add Claude Code, making them the right choice for engineering team members. You can mix Standard and Premium seats within one Team plan — so non-technical staff get Standard while developers get Premium.

    Enterprise Plan — Custom Pricing

    Enterprise is for organizations with compliance, data residency, or governance requirements. It includes access to the full 1M token context window, HIPAA readiness, SAML SSO, domain capture, spend controls, and dedicated support. Based on user reports, pricing starts around $60/seat with a 70-seat minimum, putting the floor near $50,000 annually — contact Anthropic sales for exact figures. Training on customer data is disabled contractually at this tier.

    How Much Does the Claude API Cost Per Token?

    As of June 2026: Claude Fable 5 costs $10.00 input / $50.00 output per million tokens; Claude Opus 4.8 costs $5.00 / $25.00; Claude Sonnet 4.6 costs $3.00 / $15.00; Claude Haiku 4.5 costs $1.00 / $5.00.

    The API is entirely separate from subscription plans. You pay per million tokens (MTok) with no monthly minimum. Current rates as of June 19, 2026 (verified June 19, 2026 from Anthropic’s official models page):

    • Claude Fable 5: $10.00 input / $50.00 output per MTok
    • Claude Opus 4.8: $5.00 input / $25.00 output per MTok
    • Claude Sonnet 4.6: $3.00 input / $15.00 output per MTok
    • Claude Haiku 4.5: $1.00 input / $5.00 output per MTok

    Prompt caching cuts input costs by up to 90% for repeated context. The Batch API processes requests within 24 hours at a flat 50% discount on all tokens — ideal for content pipelines, data enrichment, and any workload where real-time responses aren’t required. As of March 2026, Anthropic eliminated long-context surcharges, so a 900K-token request costs the same per-token rate as a 9K one.

    June 2026 — Professional Services Pricing

    Managed Agents

    Token rates + $0.08/session-hour active runtime. No surcharge for Orchestration or Outcomes (public beta).

    Claude Security Beta

    Included in Enterprise during beta. Powered by Opus 4.8 ($5/$25 per MTok at API rates).

    Claude Mythos Preview

    $25/$125 per MTok. Invitation-only via Project Glasswing.

    → Full Pricing FAQ · Managed Agents pricing deep-dive

    Which Claude Plan Is Right for You?

    Start with Pro for individual use, move to Max 5x if you regularly hit limits, choose Max 20x for full-day heavy use, and use Team for groups of 5+ where Standard seats cover non-technical staff and Premium covers developers.

    Start with Pro if you’re an individual who hits Free limits regularly. Move to Max 5x if you’re a developer doing focused coding sessions. Max 20x makes sense if Claude is your primary tool throughout the workday. For teams, buy Standard seats for non-technical staff and Premium seats for developers who need Claude Code. If you’re building an application or automation that calls Claude programmatically, use the API — subscription plans don’t provide API credits and don’t reduce API costs.

    Claude API Pricing: Pay-Per-Token Rates for Every Model

    The Claude API is priced separately from claude.ai subscriptions. You pay per million tokens (MTok) consumed — input and output priced separately. There is no monthly minimum; you add credits and they deplete as you use the API.

    Model Input (per MTok) Output (per MTok) Context Window
    Claude Opus 4.8 $5.00 $25.00 1M tokens
    Claude Sonnet 4.6 $3.00 $15.00 1M tokens
    Claude Haiku 4.5 $1.00 $5.00 200K tokens

    Prompt caching reduces costs significantly for repeated context: cache write is 25% of base input price, cache read is 10%. The Batch API offers 50% off all models for non-time-sensitive work. For a full breakdown of how to minimize token spend, see Claude on a Budget: the Complete Guide.

    How Does Claude Pricing Compare to GPT-4o and Gemini 2.0?

    Model Input (per MTok) Output (per MTok)
    Claude Sonnet 4.6 $3.00 $15.00
    Claude Haiku 4.5 $1.00 $5.00
    GPT-4o (OpenAI) $2.50 $10.00
    Gemini 2.0 Flash $0.075 $0.30
    Gemini 2.5 Pro $1.25 $10.00

    Claude Sonnet 4.6 sits above GPT-4o on price but competes at or above it on reasoning tasks. Claude Haiku 4.5 is the cost-competitive option for high-volume pipelines. Gemini 2.0 Flash is significantly cheaper for commodity tasks; the trade-off is reasoning depth and context handling on complex documents.

    How Much Does a Claude License Cost for Business?

    A Claude business license is sold per seat: Team Standard seats cost $20/seat/month billed annually ($25 monthly), Team Premium seats with Claude Code cost $100/seat/month billed annually ($125 monthly), with a 5-seat minimum. Enterprise licenses are custom-priced annual contracts.

    License typeAnnual billingMonthly billingMinimum seatsClaude Code
    Team Standard seat$20/seat/month$25/seat/month5No
    Team Premium seat$100/seat/month$125/seat/month5Yes
    Enterprise licenseCustom (annual contract — contact sales)~70 (reported)Yes

    If you’re writing a budget request or procurement document, here are the numbers that matter: a 10-person team with 7 Standard and 3 Premium seats runs $440/month on annual billing — $5,280/year. Licenses are managed centrally with consolidated billing, SSO, and admin controls, and you can mix Standard and Premium seats within one plan. A Claude license covers the claude.ai apps and (on Premium seats) Claude Code; it does not include API credits, which are billed separately per token. There is no perpetual or one-time license option — all Claude licensing is subscription-based.

    How Much Does Claude Code Cost?

    Claude Code has no standalone price — it’s included with Pro ($20/month), Max 5x ($100/month), Max 20x ($200/month), Team Premium seats ($100/seat/month annual), and Enterprise. Alternatively, run it against an API key and pay per token.

    PlanClaude Code included?Usage headroom
    FreeNo
    Pro ($20/mo)YesStandard Pro limits — enough for an hour or two of daily coding
    Max 5x ($100/mo)Yes5x Pro — sustained daily development
    Max 20x ($200/mo)Yes20x Pro — full-day heavy use and parallel sessions
    Team StandardNo
    Team Premium ($100/seat annual)YesPer-seat developer allocation
    EnterpriseYes (Premium seats)Custom
    API key (pay-per-token)YesNo plan limits — billed at standard model token rates

    For automation — cron jobs, CI pipelines, claude -p scripts — note the June 15, 2026 change: subscription plans get a monthly Agent SDK credit pool (Pro $20, Max 5x $100, Max 20x $200, Team Standard $20/seat, Team Premium $100/seat), with overage billed at API rates. Full details in the Agent SDK dual-bucket billing guide. For the complete tier-by-tier breakdown including API-key economics, see the full Claude Code pricing guide.

    What Are Claude’s Usage Limits and Extra Usage Costs?

    Every Claude plan has usage limits that reset on a rolling 5-hour window, plus weekly caps on paid tiers. When you hit a paid plan’s limit, you can either wait for the reset or buy extra usage at standard API token rates with a spending cap you control.

    PlanRelative usageReset windowExtra usage available?
    FreeBaseline (light use)Rolling 5 hoursNo — upgrade required
    Pro~5x FreeRolling 5 hours + weekly capYes — API rates, capped by you
    Max 5x5x ProRolling 5 hours + weekly capYes
    Max 20x20x ProRolling 5 hours + weekly capYes
    Team Standard1.25x Pro per seatWeekly resetYes (admin-controlled)
    Team PremiumHigher, includes Claude CodeWeekly resetYes (admin-controlled)

    Anthropic intentionally doesn’t publish exact message counts — limits are measured in compute, so long conversations, large file uploads, and Opus-heavy sessions consume your window much faster than short Haiku chats. For the full mechanics, see Claude Team plan usage limits and Claude API rate limits.

    Claude Pricing by Country: UK, Australia, India, and Canada

    Anthropic charges the same USD list price in every country — Claude Pro is $20/month worldwide. Your bank converts to local currency, and applicable local tax (VAT or GST) is added at checkout.

    CountryClaude Pro (approx. local)Claude Max 5x (approx. local)Tax added at checkout
    United Kingdom≈ £16/month≈ £79/month20% VAT
    Australia≈ A$31/month≈ A$153/month10% GST
    India≈ ₹1,700/month≈ ₹8,600/month18% GST
    Canada≈ C$27/month≈ C$137/monthGST/HST (5–15% by province)
    New Zealand≈ NZ$33/month≈ NZ$166/month15% GST

    Local-currency figures are approximate conversions at June 2026 exchange rates — your card statement reflects your bank’s rate plus any foreign-transaction fee. There is no region-specific discount pricing for claude.ai plans, and API token rates are likewise USD-denominated everywhere. Prices shown on Anthropic’s pricing page exclude applicable tax.

    Frequently Asked Questions: Claude Pricing

    How much does Claude cost per month?

    Claude costs $0 (Free), $20/month (Pro), $100/month (Max 5x), or $200/month (Max 20x) for individual plans. Team plans start at $20/seat/month (annual billing, 5-seat minimum). API access is pay-per-token with no monthly minimum.

    Is there a free version of Claude?

    Yes. The Free plan gives access to Claude on web, iOS, Android, and desktop with no credit card required. Usage limits apply and reset on a rolling 5-hour window. The Free tier is suitable for light, exploratory use but not sustained daily work.

    What does Claude Pro include at $20/month?

    Pro includes approximately 5x the usage of Free, Claude Code in the terminal, unlimited projects, the Research feature, file creation, code execution, and Google Workspace integration. Annual billing brings the effective rate to $17/month.

    What is the cheapest way to use Claude?

    The Free plan is the cheapest at $0. For API access, Claude Haiku 4.5 at $1 input / $5 output per MTok is the most cost-efficient model. Combined with the Batch API (50% discount) and prompt caching, high-volume workflows can run at a fraction of standard API cost.

    What is Claude Max and is it worth $100–$200 per month?

    Claude Max comes in two tiers: Max 5x at $100/month gives 5x Pro’s per-session usage, and Max 20x at $200/month gives 20x. Max is worth it if you’re hitting Pro limits regularly during development or coding sessions. Both include Claude Code and the full 1M token context window with Claude Opus 4.8 and Sonnet 4.6.

    How does Claude Team pricing work?

    Team plans require a minimum of 5 seats. Standard seats cost $20/seat/month billed annually ($25 monthly) and include collaboration features. Premium seats cost $100/seat/month billed annually ($125 monthly) and add Claude Code — the right choice for developers on the team. You can mix Standard and Premium seats within the same Team plan.

    Does Claude Pro give you access to Claude Opus 4.8?

    Pro gives you access to Claude’s models including Opus 4.8 for complex tasks, Sonnet 4.6, and Haiku 4.5, subject to usage limits. The Max tiers give you significantly more headroom to use Opus 4.8 for extended sessions. For unlimited, predictable API access to Opus 4.8, use the API directly at $5 input / $25 output per million tokens.

    What is the Claude API cost per million tokens in 2026?

    As of June 2026 (verified from Anthropic’s official docs): Claude Opus 4.8 costs $5.00 input / $25.00 output per million tokens; Claude Sonnet 4.6 costs $3.00 input / $15.00 output; Claude Haiku 4.5 costs $1.00 input / $5.00 output. The Batch API offers 50% off all models for non-real-time work.

    Does Claude have a student discount?

    There is no individual self-serve student discount, but Anthropic now offers an Education plan with discounted rates for universities and their members — check whether your institution participates. Otherwise students can use the Free tier without a credit card, and the cheapest paid path is Pro at $17/month with annual billing.

    Can I use Claude without a subscription by paying per use?

    Not directly through claude.ai — the website only offers Free, Pro, Max, and Team subscription plans. Pay-per-use access is available only through the Claude API, which requires a developer account. API pricing starts at $1 input / $5 output per million tokens for Haiku 4.5 with no monthly minimum charge.

    How much does the Anthropic Console (Claude Console) cost?

    The Anthropic Console itself is free — it’s the developer dashboard for managing API keys, tracking usage, and testing prompts in the Workbench. You only pay for the API tokens you consume, starting at $1 input / $5 output per million tokens for Haiku 4.5. You add prepaid credits to get started; there is no monthly platform fee.

    How much is a Claude license for business?

    Claude business licensing is per-seat: Team Standard seats cost $20/seat/month billed annually ($25 monthly), and Team Premium seats with Claude Code cost $100/seat/month billed annually ($125 monthly), with a 5-seat minimum. Enterprise licenses are custom annual contracts. There is no perpetual license — all Claude licensing is subscription-based.

    Does the Claude desktop app cost extra?

    No. The Claude desktop app for Windows and macOS is included with every plan, including Free. Desktop, web, and mobile all share the same account and the same usage limits — there is no separate desktop pricing.

    Is Claude cheaper in India, the UK, or Australia?

    No — Anthropic charges the same USD list price worldwide. Claude Pro is $20/month everywhere; your bank converts it to local currency (roughly £16, A$31, or ₹1,700) and local VAT or GST is added at checkout where applicable. There is no regional discount pricing.

    Is Claude available on Azure, AWS, or Google Cloud?

    Yes. Claude models are available through Amazon Bedrock and the Claude Platform on AWS, Google Cloud’s Vertex AI, and Microsoft Foundry. Cloud-platform pricing is token-based and aligned with Anthropic’s API rates, billed through your existing cloud account — useful if your organization has cloud spend commitments to draw down.

    Does Anthropic offer nonprofit pricing?

    Anthropic doesn’t list a standing nonprofit discount on its pricing page as of June 2026. Nonprofits typically start with Team at standard rates or contact Anthropic sales about Enterprise terms. An Education plan with discounted rates does exist for universities and their members.

    May 2026: Managed Agents & Claude Security Pricing

    Updated June 19, 2026

    Anthropic’s professional services now include Managed Agents and Claude Security. Pricing for both is API-based, not subscription-based.

    Claude Managed Agents Pricing

    Managed Agents pricing follows the standard API token rates for whichever Claude model you use inside the agent pipeline — there’s no separate Managed Agents surcharge on top of model costs. You pay for the tokens the models consume:

    Component Model Used Input / Output per MTok Status
    Multiagent Orchestration Your choice Model rate applies Public beta
    Outcomes Your choice Model rate applies Public beta
    Dreaming (memory refinement) Advisor model (short plan) + executor model Billed separately by role Developer preview

    The Dreaming advisor tool uses a short-plan generation (typically 400–700 tokens) at the advisor model’s rate, while the executor handles full output at its lower rate — keeping combined cost well below running the advisor model end-to-end. Use max_uses to cap advisor calls per request. Requires beta header: anthropic-beta: advisor-tool-2026-03-01. Docs: platform.claude.com/docs/en/managed-agents/dreams

    Claude Security Beta Pricing

    Claude Security is currently in public beta for Enterprise customers. Anthropic has not published a standalone per-scan or per-seat price for Claude Security Beta — access is included as part of Enterprise during the beta period. Underlying model is Claude Opus 4.8 ($5 input / $25 output per million tokens at API rates). For Enterprise pricing including Claude Security, contact Anthropic sales.

    Claude Mythos Preview Pricing (Project Glasswing)

    Claude Mythos Preview is not available via standard API or any subscription tier. Through Project Glasswing (invitation-only, defensive cybersecurity workflows): $25 per million input tokens, $125 per million output tokens. No self-serve access — contact Anthropic for Glasswing information at anthropic.com/glasswing.

    What to do next

    Now that you have the price — here’s how to actually run it

    Knowing the cost is step one. The harder questions are whether Managed Agents is the right architecture for your use case, how it compares to building on the raw API, and what a realistic monthly bill looks like at scale.


    Claude Pricing Calculator (Updated June 19, 2026)

    Use this tool to figure out which Claude plan actually fits your usage, what you’d pay on the API equivalent, and how the new June 15, 2026 Agent SDK billing change affects your costs. All rates verified against Anthropic’s official pricing documentation as of June 19, 2026.

    Tell us how you use Claude





    2 = roughly 30 hours of normal Claude use per month


    Output is typically ~25% of input for chat work


    $ value of unattended Claude work (cron jobs, scripts, GitHub Actions). 0 if you only chat.

    Email me this breakdown

    Get your numbers in your inbox so you can compare plans later — or forward them to whoever approves the budget.

    This calculator uses Anthropic’s published API rates as of June 19, 2026. Subscription pricing reflects current public plans. The Agent SDK monthly credit pool launches June 15, 2026 — Pro $20, Max 5x $100, Max 20x $200, Team Standard $20/seat, Team Premium $100/seat.

    What Claude Actually Costs: Six Worked Examples (June 2026)

    The calculator above is interactive; these are the same calculations worked through for six common usage profiles, using Anthropic’s published rates as of June 19, 2026. API-equivalent figures assume standard rates with no prompt caching or batch discounts.

    ProfileMonthly usageBest planPlan costAPI equivalent
    Casual user — questions a few times a week0.5M in / 0.13M out (Sonnet 4.6)Free, or Pro for headroom$0–$20≈ $3.45/mo
    Individual writer or analyst — daily use2M in / 0.5M out (Opus 4.8)Pro$20 ($17 annual)≈ $22.50/mo
    Developer — focused daily coding with Claude Code10M in / 2.5M out (Opus 4.8)Max 5x$100≈ $112.50/mo
    Power user — Claude open all day, parallel sessions30M in / 7.5M out (Opus 4.8)Max 20x$200≈ $337.50/mo
    5-person team — 3 non-technical, 2 developersMixedTeam: 3 Standard + 2 Premium$260/mo (annual billing)Varies by usage
    High-volume pipeline — classification or enrichment50M in / 10M out (Haiku 4.5, Batch API)API direct≈ $50/mo (after 50% batch discount)

    The pattern: subscriptions beat the API whenever usage is steady and interactive — Pro pays for itself at roughly 2M input tokens a month on Opus 4.8. The API wins for spiky automated workloads, anything that can use the Batch API, and pipelines that run on Haiku 4.5. A reasonable rule of thumb: if your monthly API equivalent lands more than about 50% above a subscription price, take the subscription.

    Next Steps: What to Read After This

    You came here for pricing. Depending on what you actually need to do next, these are the right places to go:

    If you’re deciding whether to subscribe

    Is Claude Free? What You Actually Get Without Paying

    Walk through the free tier limits and decide if you need to pay at all.

    If you’re working at a team or company

    Claude Team Plan: When to Upgrade and What You Get

    Per-seat pricing, shared usage limits, admin controls, and when Team beats individual Pro.

    If you’re running automation or scripts

    Claude Agent SDK Dual-Bucket Billing: What Changes June 15, 2026

    The new Agent SDK credit pool, what it covers, and what to do before the cutover.

    If you want to actually start building

    Anthropic Console: The Complete Guide to Getting Started

    Set up an API key, navigate the console, and run your first request.

    If you’re a student looking to save

    Claude Student Discount: The Honest Guide to Getting Claude for Less

    No public student discount exists, but here are the legitimate paths to free or reduced access.

    If you’re choosing which model to use

    Claude Models Roadmap May 2026: Opus 4.8, Knowledge Cutoffs, the 1M Context Window

    The current lineup, what each tier costs, and what’s actually verified about Claude 5.

    For the broader operating philosophy of how Claude fits alongside the rest of a working AI stack, see The Three-Legged Stack: Why I Run Everything on Notion, Claude, and Google Cloud.

    Related Claude pricing guides

  • Claude Student Discount: Every Legit Route to Cheaper Claude (2026)

    Claude Student Discount: Every Legit Route to Cheaper Claude (2026)

    Last verified: June 13, 2026

    There is no public “Claude student discount” code, and as of June 13, 2026 Anthropic does not publish a percentage-off student price on Claude Pro. What actually exists is better than a coupon for many students: free Claude through a participating university, a paid campus program, free API credits to test against, and a genuinely capable free tier. Below is every route we could verify against a primary source — who qualifies, what you get, the real cost, and how to claim it. Anything we could not confirm from an official Anthropic or GitHub page is listed at the end as “not verified,” not in the tables.

    The routes at a glance

    Lift any single row. Each route is verified against the source linked in the last column’s footnote. “Cost” is the price to the student, not the institution.

    Route Who qualifies What you get Cost How to get it
    Claude for Education Students, faculty & staff at a partner university Claude’s premium features, incl. Learning Mode & Claude Code, provided institution-wide Free to the student (institution buys a university-wide plan) Sign in to claude.ai with your school email; access is provisioned by your school
    Claude Campus Program — Ambassadors Selected students at eligible campuses Claude Pro access, API credits, paid stipend; lead AI initiatives on campus Free + paid (you are paid a stipend) Apply during an open cohort at claude.com/programs/campus (Spring 2026 round closed)
    Claude Campus Program — Builder Clubs Students starting/joining an Anthropic-supported campus club Claude Pro access and monthly API credits for members; run hackathons & workshops Free Apply via claude.com/programs/campus when a cohort is open
    Free API credits Anyone with a new Claude Console account “A small amount of free credits to test the API” (no fixed amount published by Anthropic) Free, one-time Create an account at console.anthropic.com / platform.claude.com
    Claude free tier Anyone, no enrollment needed Web/mobile/desktop chat, web search, file creation, code execution, extended thinking, connectors $0 Sign up at claude.ai
    Academic / research API discount Academic & research users (case-by-case) “Academic and research discounts may be available” on API usage Negotiated Contact Anthropic sales

    The “discount” that isn’t — avoid these

    Most “Claude student discount code” pages rank for a deal that does not exist. There is no Anthropic-issued promo code that takes a percentage off Claude Pro for individual students. Do not enter a code from a coupon aggregator, and do not buy “discounted Claude Pro” from a third-party reseller — shared or resold accounts violate Anthropic’s terms and can be revoked.

    Claim you’ll see Reality
    “Use this Claude Pro student promo code for X% off” Not real. Anthropic publishes no individual-student discount code on Pro as of June 13, 2026. Verify with your university route instead.
    “Buy cheap shared Claude Pro / Max accounts” Avoid. Reselling and account-sharing breach Anthropic’s terms; access can be terminated. Not a legitimate route.

    Route detail: free Claude through your university

    Claude for Education is Anthropic’s official higher-education program. When a university buys in, eligible students, faculty, and staff get Claude’s premium capabilities — including Learning Mode (which Anthropic describes as working “like a tutor — it asks the questions that help you find the answers yourself”) and Claude Code for teaching programming. The student does not pay; the institution licenses a university-wide plan and provisions accounts, typically tied to your school email domain. If your school is not yet a partner, the only action available to you is to ask your IT or student-services team to contact Anthropic’s education team — there is no individual sign-up for this plan.

    Route detail: the Claude Campus Program

    The Campus Program runs in cohort rounds and has two student tracks. Campus Ambassadors work directly with Anthropic to lead AI-education efforts on campus and receive Claude Pro access plus API credits and a paid stipend. Builder Clubs let students set up an Anthropic-supported organization for AI builders on their campus; members get Claude Pro access and monthly API credits and run hackathons, workshops, and demo nights. Applications open and close by cohort — the Spring 2026 round is in session and closed; watch claude.com/programs/campus for the next intake.

    Route detail: free API credits and the free tier

    If you want to build with Claude rather than chat, create a Claude Console account: Anthropic’s pricing documentation states that “new users receive a small amount of free credits to test the API.” Anthropic does not publish a fixed dollar figure on that page, so treat any specific number you see elsewhere as unverified. Separately, the no-cost Claude free tier covers a lot of student work on its own — chat across web, iOS, Android, and desktop, plus web search, file creation, code execution, extended thinking, and connectors. For heavier API use, Anthropic also notes that “academic and research discounts may be available” — a sales conversation, not a self-serve coupon.

    A note on GitHub Copilot (read before you rely on it)

    Many guides still claim verified students get free GitHub Copilot Pro — which includes Anthropic’s Claude models — through the GitHub Student Developer Pack. As of June 13, 2026, GitHub’s own documentation tells a narrower story: the two ways to qualify for free Copilot Pro are being a verified teacher on GitHub Education or a maintainer of a popular open-source repository. GitHub’s docs also state that, starting April 20, 2026, “new sign-ups for Copilot Pro, Copilot Pro+, Copilot Max, and student plans are temporarily paused,” and the Student Pack page itself shows Copilot sign-ups paused. Because the student-Copilot path is in flux, we are keeping it out of the verified routes table — check GitHub Education for current status before counting on Claude-via-Copilot.

    For comparison: what Claude costs without a discount

    These are the standard consumer prices (USD), so you can judge whether a route is worth the effort. Prices verified from claude.com/pricing on June 13, 2026.

    Plan Price Notable inclusions
    Free $0 Chat, web search, file creation, code execution, extended thinking, connectors
    Pro $17/mo billed annually ($200 upfront), or $20/mo monthly Higher usage, Claude Code, unlimited projects, Research access, more model options
    Max From $100/mo 5x or 20x Pro usage, elevated output limits, early features, priority during peak
    Team $20/seat/mo annual ($25 monthly); 5–150 people Enterprise search, SSO, admin controls, central billing

    FAQ

    Is there a Claude student discount code?

    No. As of June 13, 2026 Anthropic does not publish an individual-student discount code for Claude Pro. The legitimate ways to save are free Claude through a partner university (Claude for Education), the Claude Campus Program, free API credits, and the free tier. Treat any “promo code” from a coupon site as not real.

    How do I get free Claude Pro as a student?

    Through your school. If your university participates in Claude for Education, sign in to claude.ai with your school email and your account is provisioned with premium features at no cost to you. If your school isn’t a partner, ask IT or student services to contact Anthropic’s education team — there is no individual self-serve sign-up for this plan.

    Do I still get Claude through the GitHub Student Developer Pack?

    It’s uncertain right now. GitHub’s documentation currently lists only verified teachers and popular open-source maintainers as qualifying for free Copilot Pro, and states that new student-plan sign-ups are temporarily paused as of April 20, 2026. Check GitHub Education for current status before relying on Claude-via-Copilot.

    How much free API credit does a new account get?

    Anthropic’s pricing docs say new users receive “a small amount of free credits to test the API” but do not publish a fixed dollar amount on that page. Any specific figure you see elsewhere is not officially confirmed. Create an account at console.anthropic.com to see your current credit.

    What does the Claude Campus Program pay?

    Campus Ambassadors receive Claude Pro access, API credits, and a paid stipend for leading AI-education work on campus; Builder Club members get Claude Pro access and monthly API credits. Applications run in cohorts — the Spring 2026 round is closed; watch claude.com/programs/campus for the next one. The exact stipend amount is not published on the official program page.


  • Claude API Access from Singapore and China: What Actually Works in 2026

    Claude API Access from Singapore and China: What Actually Works in 2026

    Last refreshed: May 15, 2026

    If you are a developer in Singapore or China trying to use Claude, you have already noticed that the standard instructions don’t quite apply to you. The console.anthropic.com onboarding assumes a US billing address. The latency numbers assume you are pinging from a US data center. And for developers in mainland China, the direct API doesn’t work at all without a workaround.

    This is a practical guide to what actually works in 2026, written for the Asian developer market that is increasingly one of Claude’s most active audiences.

    Singapore: What Works Directly

    Singapore is a fully supported country for the Anthropic API. You can create an account at console.anthropic.com, add a payment method, and generate API keys with no restrictions. Most major international credit cards work without issues. If you are at a company with a Singapore entity, Anthropic accepts international wire transfers for enterprise contracts.

    Latency from Singapore to Anthropic’s US API endpoints typically runs 180–250ms round-trip depending on your ISP and the model you are calling. For most application use cases this is acceptable. For latency-sensitive real-time applications — voice interfaces, live coding assistants — you will want to route through a closer compute layer, which is where Vertex AI becomes relevant.

    Vertex AI: The Regional Solution for Both Markets

    Google Cloud’s Vertex AI hosts Claude models (Sonnet and Haiku tiers as of mid-2026) and has a data center in Singapore: asia-southeast1. This is the cleanest solution for developers in both Singapore and the broader Asia-Pacific region who want lower latency and enterprise-grade SLAs.

    The practical difference: instead of calling api.anthropic.com, you call a Vertex AI endpoint scoped to asia-southeast1. Your tokens are processed in Singapore, not Virginia. For regulated industries — fintech, healthcare, legal — this also means your data doesn’t leave the region, which is a compliance requirement in several Singapore regulatory frameworks (MAS TRM guidelines being the primary one).

    To get started with Claude on Vertex AI from Singapore:

    1. Create a GCP project and enable the Vertex AI API
    2. Request access to Claude models via the Vertex AI Model Garden (approval is typically same-day for Singapore accounts)
    3. Set your region to asia-southeast1 in all API calls
    4. Authenticate via a GCP service account rather than an Anthropic API key

    The pricing on Vertex AI is comparable to direct Anthropic API pricing, with GCP committed use discounts available at higher volumes.

    AWS Bedrock: The Other Regional Option

    Amazon Bedrock also hosts Claude models and has a Singapore region (ap-southeast-1). If your infrastructure is already on AWS, this is often the simpler path. The setup mirrors Vertex AI: enable Bedrock in your AWS console, request Claude model access, and specify the Singapore region in your SDK calls.

    The practical consideration: as of mid-2026, model availability on Bedrock sometimes lags behind the direct Anthropic API by a few weeks when new versions ship. If being on the latest Claude version immediately matters for your use case, the direct API or Vertex AI are more current.

    China: The Honest Situation

    The direct Anthropic API is not accessible from mainland China without a VPN. Console.anthropic.com is not blocked at the DNS level in the same way Google is, but connectivity is unreliable and payment processing from Chinese-issued cards through Stripe (Anthropic’s payment processor) fails for most users.

    The workarounds that Chinese developers are actually using in 2026:

    VPN plus international card. Developers with access to a VPN and an international payment card (Hong Kong or Singapore bank account) use the direct API without issues. This is the most common setup among individual developers and small teams.

    Hong Kong entity. Companies with a Hong Kong subsidiary or registered office use that entity for the Anthropic API account. Hong Kong is a fully supported region with no connectivity issues.

    Third-party API proxies. Several API aggregators operating out of Hong Kong and Singapore re-sell Anthropic API access to mainland China developers. Quality and terms vary significantly — vet carefully before using in production.

    Vertex AI via a non-China GCP account. Some development teams maintain a GCP account registered to a Singapore or Hong Kong entity, then call the Vertex AI Claude endpoint from within China via GCP’s global network. Google Cloud has limited but operational connectivity from within China through its global backbone. This is the most enterprise-appropriate solution for teams that need a compliant path.

    Latency Reality Check by Access Method

    Access Method From Singapore From China (with VPN)
    Direct Anthropic API (us-east) 180–250ms 300–500ms+
    Vertex AI (asia-southeast1) 30–60ms 150–300ms via GCP backbone
    AWS Bedrock (ap-southeast-1) 25–55ms Not directly accessible

    Latency figures are representative ranges based on typical ISP routing. Your numbers will vary.

    Payment and Billing Notes

    For Singapore developers on the direct Anthropic API: Visa, Mastercard, and American Express issued by Singapore banks work reliably. PayNow and local payment rails are not supported — you need an international card.

    For enterprise: Anthropic’s sales team handles invoiced billing for Singapore and other APAC markets. If you are spending meaningfully on the API, contact sales rather than running on a credit card — the invoiced route gives you better cost predictability and eliminates card limit friction.

    The Bottom Line

    If you are in Singapore, the direct API works and Vertex AI’s asia-southeast1 region gives you a lower-latency, compliance-friendly alternative worth evaluating for production workloads.

    If you are in mainland China, the direct API requires a workaround. A Hong Kong entity plus Vertex AI is the cleanest enterprise path. For individual developers, VPN plus an international card is the practical reality.

    The Asian developer market is using Claude at scale. The tooling is there — it just requires knowing which path to take from where you are sitting.

    Based in Singapore or Asia-Pacific?

    I can help you pick the right access path for your stack and region.

    Email me your setup — direct API, Vertex AI, or Bedrock — and I’ll give you a straight answer on what makes sense.

    Email Will → will@tygartmedia.com

  • Claude Context Window Size 2026: What 1 Million Tokens Actually Means

    Claude Context Window Size 2026: What 1 Million Tokens Actually Means

    Last refreshed: June 20, 2026

    Looking for quick answers? The FAQ version covers every common question directly.

    → Context Window FAQ

    Claude’s context window is one of those specs that sounds simple until you actually need to use it. “1 million tokens” means almost nothing without a frame of reference. This is the guide we wish existed when we started building on Claude — written from our own experience running it in production, with numbers pulled directly from Anthropic’s official documentation.

    Quick Definition

    The context window is Claude’s working memory for a conversation. It holds everything Claude can see and reason about at once: your messages, Claude’s responses, any documents you’ve shared, and system prompts. When the window fills up, earlier content drops out.

    Current Context Window Sizes by Model (June 2026)

    These numbers come directly from Anthropic’s official models page, fetched May 9, 2026. Model strings are exact API identifiers:

    Model API String Context Window Max Output
    Claude Fable 5 claude-fable-5 1,000,000 tokens 128,000 tokens
    Claude Opus 4.8 claude-opus-4-8 1,000,000 tokens 128,000 tokens
    Claude Sonnet 4.6 claude-sonnet-4-6 1,000,000 tokens 64,000 tokens
    Claude Haiku 4.5 claude-haiku-4-5-20251001 200,000 tokens 64,000 tokens

    Fable 5, Opus 4.8, and Sonnet 4.6 all have the full 1M token context window. Haiku 4.5 is 200K. The key difference between Opus 4.7 and Sonnet 4.6 in this table is the max output — Opus 4.7 can write up to 128K tokens in a single response, Sonnet 4.6 caps at 64K.

    What Does 1 Million Tokens Actually Hold?

    Token counts are an abstraction. Here’s what 1 million tokens translates to in practical terms:

    • About 750,000 words of English text — roughly 10 full-length novels, or 1,500 average blog posts
    • A full mid-size codebase — a 50,000-line Python project with comments fits comfortably
    • Hours of meeting transcripts — a full workday of recorded calls, transcribed, fits in one context window
    • Multiple large documents simultaneously — 10 research PDFs at 30 pages each, all in the same conversation
    • Long conversation histories — hundreds of back-and-forth exchanges before anything starts dropping off

    We’ve loaded entire Notion exports, full project histories, and multi-document research packs into a single Claude session. At 1M tokens, you’re unlikely to hit the ceiling in a normal working session. You hit it when you’re doing things like: loading your entire codebase plus documentation plus conversation history and then asking Claude to do a full architectural review.

    Context Window vs. Memory: What’s the Difference?

    This is where a lot of people get confused. The context window and memory are not the same thing:

    • Context window: What Claude can see right now, in this session. Once a session ends, it’s gone.
    • Memory (in claude.ai): A separate system that extracts and stores key information from past sessions. It surfaces relevant facts into future conversations as a snippet in the context.
    • Managed Agents memory stores: A developer-layer construct where agents maintain and update knowledge bases across sessions — distinct from both the context window and the consumer memory feature.

    The 1M token context window is your working memory for one session. It doesn’t persist. Memory systems are what carry information across sessions — but they work by injecting a summary into the context window of the new session, not by giving Claude access to the full history.

    Does a Bigger Context Window Mean Better Performance?

    Mostly yes, with one important nuance. More context means Claude has more information to reason about, which generally produces better outputs for tasks that benefit from full context — code reviews, document synthesis, long-form writing, multi-document comparison.

    The nuance: performance can degrade on tasks involving specific information buried deep in a very long context. This is sometimes called the “lost in the middle” problem — models tend to pay more attention to the beginning and end of a long context than the middle. Anthropic has worked on this with Claude’s architecture, and it performs well on long-context tasks, but it’s worth structuring important information at natural reference points rather than burying it in the middle of a 500-page document.

    How We Actually Use the 1M Token Window

    We run Claude in production for content operations, site management, and agentic coding workflows. Here’s where the 1M context window makes a concrete difference in our work:

    • Full site audits: Loading every post from a WordPress site (200+ posts worth of content) into one session for comprehensive SEO analysis — without having to chunk and re-prompt
    • Cross-session context: Pasting in long Notion briefings, prior session transcripts, and the current task in one go. The window is large enough that we don’t have to decide what to leave out.
    • Codebase-wide reasoning: In Claude Code, having the full project context means Claude can make changes that account for how files interact rather than reasoning only about the current file
    • Multi-document synthesis: Research projects where we load 10-15 source documents and ask Claude to synthesize across them — something that was impossible at 100K context windows

    The practical shift from 200K to 1M tokens wasn’t just “more room.” It changed what we could ask Claude to do in a single session.

    Context Window on the API: Batch Output Extension

    For API users: on the Message Batches API, Fable 5, Opus 4.8, and Sonnet 4.6 support up to 300K output tokens using the output-300k-2026-03-24 beta header. This is relevant for batch generation tasks where you need very long outputs — documentation generation, large codebases, book-length content.

    Frequently Asked Questions

    What is Claude’s context window in 2026?

    Claude Fable 5, Claude Opus 4.8, and Claude Sonnet 4.6 all have 1,000,000 token (1M token) context windows as of June 2026. Claude Haiku 4.5 has a 200,000 token context window. These are the current generally available models.

    How many pages can Claude read at once?

    At 1M tokens, Claude can hold roughly 750,000 words of English text — equivalent to approximately 3,000 average pages. In practice, a typical 20-page PDF is roughly 10,000-15,000 tokens, so you could load 60-100 such documents in a single session before approaching the limit.

    Does the context window reset between messages?

    No — the context window accumulates across an entire conversation session. Every message you send and every response Claude gives adds to the total. The window doesn’t reset between individual messages; it resets when you start a new conversation.

    What happens when Claude hits the context window limit?

    When a conversation reaches the context window limit, earlier messages begin to drop out of the active context. Claude can no longer reference information from those earlier messages — it effectively forgets that part of the conversation. In the claude.ai interface, you’ll see a notification when you’re approaching the limit.

    Is the 1M context window available on the free plan?

    The model available to free plan users has access to the 1M context window. However, free plan usage limits mean long-context sessions hit rate limits faster than paid plans. The window is technically available, but sustained heavy use of it is more practical on paid tiers.

    What’s the difference between Claude Opus 4.8 and Sonnet 4.6 context windows?

    Both have the same 1M token input context window. The difference is max output: Opus 4.8 can generate up to 128,000 tokens in a single response; Sonnet 4.6 caps at 64,000 tokens. For most tasks this distinction doesn’t matter, but for very long document generation or large code outputs, Opus 4.8 has the higher output ceiling.

  • Claude Fable 5 vs Opus 4.8 vs Sonnet vs Haiku: Model Comparison (June 2026)

    Claude Fable 5 vs Opus 4.8 vs Sonnet vs Haiku: Model Comparison (June 2026)

    Updated June 12, 2026

    Claude Fable 5 launched June 9, 2026 as a new tier above Opus 4.8 — priced at $10/$50/MTok (2× Opus). This guide now covers all four models. Full Fable 5 breakdown →

    

    Anthropic’s Claude model lineup in 2026 now spans four tiers: Fable 5 at the top for maximum capability ($10/$50/MTok), Opus 4.8 for serious production work ($5/$25), Sonnet 4.6 for the best balance of performance and cost ($3/$15), and Haiku 4.5 for speed and high-volume work ($1/$5). Picking the wrong model costs money or performance — sometimes both. This guide covers every meaningful difference so you can make the right call.

    Quick answer: Sonnet 4.6 handles 80–90% of tasks at a fraction of the cost of higher tiers. Use Fable 5 for the hardest engineering and long-horizon agentic work ($10/$50/MTok). Use Opus 4.8 for serious production work with zero data retention requirements ($5/$25). Use Sonnet 4.6 as your daily driver ($3/$15). Use Haiku 4.5 when speed and cost dominate ($1/$5).

    The Current Claude Model Lineup (June 2026)

    Claude Fable 5 vs Opus 4.8 vs Sonnet 4.6 vs Haiku 4.5: side-by-side

    Feature Claude Fable 5 🆕 Claude Opus 4.8 Claude Sonnet 4.6 Claude Haiku 4.5
    Best for Hardest engineering, long-horizon autonomy Production work, zero-data-retention Best speed/intelligence balance Fastest responses, high-volume tasks
    Input price $10 / MTok $5 / MTok $3 / MTok $1 / MTok
    Output price $50 / MTok $25 / MTok $15 / MTok $5 / MTok
    Context window 1M tokens 1M tokens 1M tokens 200k tokens
    Max output 128k tokens 128k tokens 64k tokens 64k tokens
    Extended thinking No (adaptive always on) No Yes Yes
    Adaptive thinking Always on Yes Yes No
    Zero data retention No (30-day mandatory) Yes Yes Yes
    Latency Slow–Moderate Moderate Fast Fastest
    API ID claude-fable-5 claude-opus-4-8 claude-sonnet-4-6 claude-haiku-4-5

    As of June 2026, Anthropic’s four current models are Claude Fable 5, Claude Opus 4.8, Claude Sonnet 4.6, and Claude Haiku 4.5. All four support text and image input, multilingual output, and vision processing. They differ significantly in pricing, context window, output limits, and capability.

    Feature Fable 5 🆕 Opus 4.8 Sonnet 4.6 Haiku 4.5
    Input price $10 / MTok $5 / MTok $3 / MTok $1 / MTok
    Output price $50 / MTok $25 / MTok $15 / MTok $5 / MTok
    Context window 1M tokens 1M tokens 1M tokens 200K tokens
    Max output 128K tokens 128K tokens 64K tokens 64K tokens
    Extended thinking No (adaptive always on) No Yes Yes
    Adaptive thinking Always on Yes Yes No
    Latency Slow–Moderate Moderate Fast Fastest
    Reliable knowledge cutoff 2026 Jan 2026 Aug 2025 (reliable) Feb 2025 (reliable)

    Pricing is per million tokens (MTok) via the Claude API. Source: Anthropic Models Overview, June 2026.

    Claude Fable 5: The New Top Tier (June 9, 2026)

    Fable 5 is Anthropic’s first Mythos-class model released for general availability. It landed June 9, 2026 and sits above Opus 4.8 in capability — scoring 95.0% on SWE-bench Verified (vs 88.6% for Opus 4.8) and 80.0% on SWE-bench Pro (vs 69.2%). On the Senior Engineer benchmark, Fable 5 scores 91/100 vs approximately 63/100 for Opus 4.8.

    Key differentiators for Fable 5:

    • Adaptive thinking always on — Fable 5 doesn’t have an extended thinking toggle. It always reasons adaptively, scaling depth to task complexity.
    • 128K max output — same as Opus 4.8, twice Sonnet’s 64K cap.
    • 1M token context window — same as Opus 4.8 and Sonnet 4.6.

    Two constraints that matter:

    • Mandatory 30-day data retention. Fable 5 is not available under zero data retention. If your use case requires ZDR (healthcare, legal, finance with strict data handling), use Opus 4.8.
    • Safety classifier routing. Prompts touching cybersecurity, biology, chemistry, and distillation route to an Opus 4.8 fallback — at Fable 5 pricing. If your workload is in these domains, the upgrade is less impactful.

    Use Fable 5 for: large migrations or refactors, multi-agent orchestration at frontier quality, long-horizon agentic work, complex scientific analysis, and any task where quality on hard problems justifies 2x cost over Opus.

    Skip Fable 5 for: well-scoped routine work, high-volume pipelines (2x cost compounds), ZDR-required use cases, or domains where the safety classifier fallback applies.

    Claude Opus 4.8: The Production Standard

    Opus 4.8 is Anthropic’s most capable model supporting zero data retention (ZDR) — the right default for most production API work. Fable 5 has since surpassed it in raw capability, but Opus 4.8 remains the better choice for ZDR workloads, cost-sensitive pipelines, and domains where Fable 5’s safety classifier routing applies. Anthropic describes it as a step-change improvement in agentic coding over Opus 4.8, with a new tokenizer that contributes to improved performance on a range of tasks. Note that this new tokenizer may use up to 35% more tokens for the same text compared to previous models — a cost consideration worth factoring in for high-volume workflows.

    Key differentiators for Opus 4.8 over the other two models:

    • 128K max output tokens — double Sonnet and Haiku’s 64K cap. This matters for generating long-form code, detailed reports, or complete document drafts in a single call.
    • 1M token context window — same as Sonnet 4.6, meaning Opus can process entire codebases or book-length documents in a single session.
    • Adaptive thinking — Opus 4.8 and Sonnet 4.6 both support adaptive thinking, which lets the model adjust reasoning depth based on task complexity.
    • Most recent knowledge cutoff — January 2026, versus August 2025 (reliable) for Sonnet and February 2025 (reliable) for Haiku.

    Opus does not support extended thinking — that capability lives on Sonnet 4.6 and Haiku 4.5 Extended thinking lets the model reason step-by-step before generating output, which is particularly useful for complex math, science, and multi-step logic problems.

    Use Opus 4.8 for: complex architecture decisions, large codebase analysis, multi-agent orchestration tasks, outputs that require more than 64K tokens, tasks demanding the latest possible knowledge, and any work where you need Opus-tier reasoning with zero data retention (Fable 5 is the absolute frontier, but does not support ZDR).

    Skip Opus 4.8 for: routine content generation, customer support pipelines, high-volume classification or extraction, real-time applications requiring low latency, or any task where Sonnet scores within your acceptable quality threshold.

    Claude Sonnet 4.6: The Workhorse

    Sonnet 4.6 is the model Anthropic recommends as the best combination of speed and intelligence. Released in February 2026, it delivers a 1M token context window at $3 input / $15 output per million tokens — the same context window as Opus at 40% lower cost.

    Sonnet 4.6 also uniquely offers extended thinking, which Opus 4.8 does not. When extended thinking is enabled, Sonnet can perform additional internal reasoning before generating its response — useful for reasoning-heavy tasks like complex debugging, multi-step research, and technical problem-solving where chain-of-thought depth matters.

    For developers and teams using Claude Code, Sonnet 4.6 is the standard daily driver. It handles tool calling, agentic workflows, and multi-file code reasoning reliably, at a price point that makes heavy daily use economically viable.

    Use Sonnet 4.6 for: most production workloads, Claude Code sessions, long-document analysis, content generation, coding tasks, research synthesis, customer-facing applications, and any workflow requiring the 1M context window where Opus’s premium isn’t justified.

    Skip Sonnet 4.6 for: high-volume pipelines where Haiku’s lower cost is acceptable, simple classification or extraction tasks, or real-time applications where Haiku’s faster latency is required.

    Claude Haiku 4.5: Speed and Volume

    Haiku 4.5 is the fastest model in the Claude family and the most cost-efficient at $1 input / $5 output per million tokens. It has a 200K token context window — smaller than Opus and Sonnet’s 1M, but still substantial for most single-task work. It supports extended thinking but not adaptive thinking.

    The 200K context limit is the most important practical constraint. Most single-document, single-task workflows fit within 200K. Multi-file codebases, long books, or extended conversation histories that push past that threshold need Sonnet or Opus.

    Haiku 4.5 has the oldest knowledge cutoff of the three: February 2025. For tasks requiring awareness of events or developments from mid-2025 onward, Haiku won’t have that context baked in.

    Use Haiku 4.5 for: content moderation, classification pipelines, entity extraction, customer support triage, real-time chat interfaces, simple Q&A, high-volume API workflows where cost and speed dominate, and any task where quality requirements are modest.

    Skip Haiku 4.5 for: complex reasoning, large codebase analysis, tasks requiring recent knowledge (post-February 2025), multi-step agent workflows, or any output requiring more than 200K tokens of input context.

    Pricing: What the Numbers Actually Mean in Practice

    All three models price output tokens at 5x the input rate — a ratio that holds across the entire Claude lineup. This means verbose, long-form outputs cost significantly more than short, targeted responses. Minimizing generated output length is the highest-leverage cost optimization available before you touch model routing or caching.

    To put the pricing in concrete terms: generating one million output tokens (roughly 750,000 words of generated text) costs $25 on Opus, $15 on Sonnet, and $5 on Haiku. For input-heavy workloads like document analysis where you’re feeding in large amounts of text but getting shorter responses, the cost gap narrows.

    Three additional pricing levers apply across all models:

    • Prompt caching: Cuts cache-read input costs by up to 90% for repeated system prompts or documents. If your application reuses a large system prompt across many requests, caching is the single highest-impact cost reduction available.
    • Batch API: Provides a 50% discount for non-time-sensitive workloads processed asynchronously. Combine with prompt caching for up to 95% savings on qualifying workflows.
    • Model routing: Running a mix of Haiku for simple tasks, Sonnet for production workloads, and Opus for complex reasoning — rather than using one model for everything — can reduce total API costs by 60–70% without meaningful quality loss on the tasks that don’t require a flagship model.

    Context Windows: 1M Tokens vs. 200K

    Opus 4.8 and Sonnet 4.6 both offer a 1M token context window at standard pricing — no premium surcharge for extended context. For reference, 1 million tokens is roughly 750,000 words, enough to hold a large codebase, a full academic textbook, or months of business communications in a single conversation.

    Haiku 4.5 has a 200K token context window. That’s still roughly 150,000 words — sufficient for most single-document tasks, but it creates a hard ceiling for anything requiring multi-file code review, book-length document analysis, or lengthy conversation histories.

    If your workflow consistently requires more than 200K tokens of input, Sonnet 4.6 is the cost-efficient choice. Opus 4.8 is the right call only when the input load requires the additional reasoning capability Opus provides, not just the context window size — because Sonnet gets you the same 1M window at 40% lower cost.

    Extended Thinking vs. Adaptive Thinking

    These are two distinct features that appear together in the comparison table but serve different purposes.

    Extended thinking (available on Sonnet 4.6 and Haiku 4.5, not Opus 4.8) lets Claude perform additional internal reasoning before generating its response. When enabled, the model produces a “thinking” content block that exposes its reasoning process — step-by-step problem decomposition before the final answer. Extended thinking tokens are billed as standard output tokens at the model’s output rate. A minimum thinking budget of 1,024 tokens is required when enabling this feature.

    Adaptive thinking (available on Opus 4.8 and Sonnet 4.6, not Haiku 4.5) adjusts reasoning depth dynamically based on task complexity — the model allocates more reasoning for harder problems and less for simpler ones, without requiring explicit configuration.

    The practical implication: if you need transparent, controllable step-by-step reasoning that you can inspect and use in your application, Sonnet 4.6’s extended thinking is often the right tool — and at lower cost than Opus.

    Which Claude Model Should You Choose?

    The right framework for model selection in mid-2026 is a four-tier stack: Fable 5 for the hardest problems, Opus 4.8 as the production standard, Sonnet 4.6 as the daily driver, Haiku 4.5 for volume. Start with Sonnet 4.6 and escalate selectively. Most production workloads — coding, writing, analysis, customer-facing applications — are well-served by Sonnet. Opus 4.8 earns its premium when you need ZDR, outputs over 64K tokens, or the January 2026 knowledge cutoff. Fable 5 earns its 2x premium when the task is genuinely hard enough that 10+ percentage points on SWE-bench matters for your outcome.

    Haiku 4.5 belongs in any pipeline where you’ve identified tasks that don’t require Sonnet’s capability. High-volume routing, triage, classification, and real-time response scenarios are Haiku’s natural territory. The optimal production routing split is roughly 70% Haiku 4.5, 20% Sonnet 4.6, 8% Opus 4.8, 2% Fable 5 — rather than using a single model for everything. That ratio cuts costs by 60–70% without meaningful quality loss on the tasks that don’t need a flagship model.

    You picked your model tier. Now get the pre-built setup.

    Claude Seed Kits are pre-configured skill files with 20 tested prompts and a setup guide for your specific use case. Pick the kit that matches how you work — $47 each.

    Solo Builder
    Creator & Independent
    Local Operator
    Field Operator
    Regulated Specialist

    Frequently Asked Questions

    What is the difference between Claude Opus 4.8, Sonnet, and Haiku?

    Opus is Anthropic’s most capable model, optimized for complex reasoning, large outputs, and agentic tasks. Sonnet offers a balance of capability and cost, handling most production workloads at lower price. Haiku is the fastest and cheapest option, suited for high-volume, lower-complexity tasks. All three share the same core Claude architecture and safety training.

    Is Claude Opus 4.8 worth the extra cost over Sonnet?

    For most tasks, no. Sonnet 4.6 handles the majority of coding, writing, and analysis work at 40% lower cost. Opus 4.8 is worth the premium when you need outputs longer than 64K tokens, maximum agentic coding capability, or the most recent knowledge cutoff (January 2026 vs. Sonnet’s August 2025).

    Which Claude model is best for coding?

    Sonnet 4.6 is the standard recommendation for most coding work, including Claude Code sessions. Opus 4.8 is preferred for large codebase analysis, complex architecture decisions, or multi-agent coding workflows where maximum reasoning depth is required. Haiku 4.5 can handle simple code edits and explanations at much lower cost.

    What is the Claude context window?

    Claude Opus 4.8 and Sonnet 4.6 both have a 1 million token context window — roughly 750,000 words of combined input and conversation history. Claude Haiku 4.5 has a 200,000 token context window. Context window size determines how much information Claude can hold and reference in a single conversation.

    Does Claude Opus 4.8 support extended thinking?

    No. Extended thinking is available on Claude Sonnet 4.6 and Claude Haiku 4.5, but not on Claude Opus 4.8 Opus 4.8 supports adaptive thinking instead, which dynamically adjusts reasoning depth based on task complexity.

    What is the cheapest Claude model?

    Claude Haiku 4.5 is the least expensive model at $1 per million input tokens and $5 per million output tokens. It is also the fastest Claude model, making it well-suited for high-volume, latency-sensitive applications.

    Can I use Claude through Amazon Bedrock or Google Vertex AI?

    Yes. All three current Claude models — Opus 4.8, Sonnet 4.6, and Haiku 4.5 — are available through Amazon Bedrock and Google Vertex AI in addition to the direct Anthropic API. Bedrock and Vertex AI offer regional and global endpoint options. Pricing on third-party platforms may vary from direct Anthropic API rates.

    Claude vs GPT-4o: Which Model Wins for Everyday Work?

    Claude Sonnet 4.6 and GPT-4o are the primary head-to-head competitors in 2026 for professional daily use. They price similarly ($3 vs $3.00 per MTok input) but perform differently depending on task type.

    Task Type Claude Sonnet 4.6 GPT-4o
    Long-document analysis (200K+ tokens) ✓ 1M context window 128K limit
    Multi-step reasoning Extended thinking available o1 series for reasoning
    Code generation Strong; Claude Code natively Strong; GitHub Copilot integration
    Instruction following Very consistent Consistent
    API cost (output) $15/MTok $10/MTok
    Context window 1M tokens 128K tokens

    The clearest differentiator is context window size. If your workflow involves analyzing full codebases, long contracts, or book-length documents in a single call, Claude Sonnet 4.6’s 1M token window eliminates chunking overhead that GPT-4o requires at 128K. For shorter tasks, either model performs comparably.

    Claude vs Gemini 2.5 Pro: How Do They Compare?

    Google’s Gemini 2.5 Pro competes directly with Claude Sonnet 4.6 on price and capability. Key differences:

    Feature Claude Sonnet 4.6 Gemini 2.5 Pro
    Input price $3.00/MTok $3.00/MTok (under 200K tokens)
    Output price $15.00/MTok $10.00/MTok
    Context window 1M tokens 1M tokens
    Extended thinking Yes Yes (2.5 Pro)
    Agentic coding Claude Code native Via Gemini API / IDX

    Gemini 2.5 Pro is cheaper on paper, especially for prompts under 200K tokens. Claude Sonnet 4.6’s advantage is instruction-following consistency on complex multi-step tasks and the Claude Code ecosystem for engineering teams already in the Anthropic stack.

    Which Claude Model Should You Use in Claude Code?

    Claude Code supports all four models. The recommended routing for most teams:

    • Fable 5 — Use for the hardest agentic tasks: large migrations, complex multi-file refactors, long-horizon autonomous workflows. Enable with claude --model claude-fable-5.
    • Opus 4.8 — Default for serious work: multi-agent orchestration, large codebase analysis, outputs over 64K tokens.
    • Sonnet 4.6 — Daily driver. Best cost-to-performance ratio for most coding tasks. Extended thinking handles complex architecture decisions.
    • Haiku 4.5 — High-frequency, low-complexity tasks: formatting, renaming, boilerplate, pipeline steps where speed matters more than depth.

    The Max plan (available on claude.ai) unlocks 1M token context in Claude Code at no additional charge, which is the practical differentiator for large codebase work.

    Frequently Asked Questions: Claude Model Comparison

    What is the best Claude model in 2026?

    Claude Sonnet 4.6 is the recommended default for most tasks — it delivers 80-90% of Opus 4.8’s capability at 40% lower cost. Use Opus 4.8 when you need maximum reasoning depth, outputs longer than 64K tokens, or the most recent knowledge cutoff (January 2026). Use Haiku 4.5 for high-volume, speed-sensitive work.

    Is Claude Opus 4.8 better than Sonnet?

    Claude Opus 4.8 has a higher capability ceiling than Sonnet 4.6: larger output window (128K vs 64K tokens), the most recent knowledge cutoff, and stronger performance on complex agentic coding tasks. However, Sonnet 4.6 uniquely offers extended thinking which Opus does not support, and it costs 40% less. For most users, Sonnet 4.6 is the better practical choice.

    What is Claude Haiku 4.5 used for?

    Claude Haiku 4.5 is optimized for speed and cost efficiency at $1 input / $5 output per million tokens. It is best suited for high-volume pipelines, classification, metadata generation, social media content, and any task where fast response time matters more than maximum reasoning depth. It has a 200K token context window.

    Which Claude model supports extended thinking?

    Claude Sonnet 4.6 and Claude Haiku 4.5 both support extended thinking. Claude Opus 4.8 does not. Extended thinking allows the model to reason step-by-step internally before generating output, which improves performance on complex math, science, and multi-step logic problems.

    Frequently Asked Questions

    What is the difference between Claude Opus, Sonnet, and Haiku?

    Claude Opus 4.8 is the most capable model in the standard tier — best for complex reasoning, long-horizon agentic coding, and tasks requiring high autonomy. Claude Sonnet 4.6 balances intelligence and speed for production workloads — it supports extended thinking and adaptive thinking while costing less than Opus. Claude Haiku 4.5 is the fastest and cheapest option, suited for high-volume tasks where speed and cost matter more than maximum capability.

    Which Claude model should I use in 2026?

    Start with Claude Sonnet 4.6 for most production applications — it offers near-Opus intelligence at $3/$15 per million tokens and supports extended thinking. Use Claude Opus 4.8 for complex multi-step reasoning, long-horizon agentic work, or tasks where quality is worth the higher cost ($5/$25 per MTok). Use Claude Haiku 4.5 for high-volume, latency-sensitive tasks where cost is the primary concern. For maximum capability above Opus 4.8, Claude Fable 5 launched June 9, 2026.

    How much does Claude Opus 4.8 cost?

    Claude Opus 4.8 is priced at $5 per million input tokens and $25 per million output tokens on the Claude API (per platform.claude.com as of June 2026). Batch API offers 50% discounts. For comparison: Claude Sonnet 4.6 is $3/$15 per MTok and Claude Haiku 4.5 is $1/$5 per MTok.

    Does Claude Sonnet support extended thinking?

    Yes. Claude Sonnet 4.6 supports both extended thinking and adaptive thinking (per platform.claude.com/docs/en/about-claude/models/overview). Extended thinking lets the model reason through complex problems before answering. Claude Haiku 4.5 also supports extended thinking. Claude Opus 4.8 does not use extended thinking but does support adaptive thinking.

    What is Claude Fable 5 and how does it compare to Opus?

    Claude Fable 5 (API ID: claude-fable-5) is Anthropic’s most capable widely-released model as of June 9, 2026. It uses adaptive thinking (always on), has a 1M token context window, 128k max output, and is priced at $10 input / $50 output per million tokens. Fable 5 is positioned above Opus 4.8 in the model lineup for the most demanding reasoning and long-horizon agentic work.

    What is the context window for each Claude model?

    Claude Opus 4.8 and Claude Sonnet 4.6 both support 1 million token context windows. Claude Haiku 4.5 supports 200,000 tokens. All three are dramatically larger than the 200k context window that was standard in previous generations. The 1M context window allows Opus and Sonnet to process entire codebases, long research documents, or extended conversations without truncation.

    Get alerted when Claude pricing or limits change

    We track Anthropic’s models, pricing, and limits daily and send a short note when something changes that affects what you pay or build. Occasional, no spam.

    Subscription Form

  • How Claude Managed Agents Handles Idle Time (And Why It Matters for Your Bill)

    How Claude Managed Agents Handles Idle Time (And Why It Matters for Your Bill)

    Tygart Media Strategy
    Volume Ⅰ · Issue 04Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    The most counterintuitive thing about Claude Managed Agents pricing is what you don’t pay for. Most people, when they hear “$0.08 per session-hour,” mentally model a virtual machine running continuously. That’s the wrong mental model. Here’s the right one, and why it matters for your bill.

    The Core Distinction: Active vs. Idle

    Managed Agents session runtime only accrues while your session’s status is running. The session can exist — open, initialized, capable of continuing — without accumulating runtime charges when it’s not actively executing.

    The specific states that do not count toward your $0.08/hr charge:

    • Time spent waiting for your next message
    • Time waiting for a tool confirmation
    • Time waiting on an external API response your tool is calling
    • Rescheduling delays
    • Terminated session time

    This is a meaningful architectural decision by Anthropic. They’re billing on what actually taxes their compute — active execution — not on session existence or wall-clock time.

    Why This Is Different From How You Might Expect Billing to Work

    Compare three billing models:

    Virtual machine billing (what this is not): You pay for every hour the instance exists, whether it’s idle or saturated. A VM running 24/7 with 10% actual utilization still costs 24 hours/day.

    Lambda/function billing (closer analogy): AWS Lambda bills on execution duration and invocation count — you pay when code actually runs, not when a function is “available.” Idle Lambda functions cost nothing.

    Managed Agents billing (what this actually is): Closer to Lambda than VM. You pay $0.08 per hour of active execution. A session that runs for 2 hours of wall-clock time but has 90 minutes of waiting costs $0.08 × 1.5 hours = $0.12, not $0.08 × 2 hours = $0.16.

    A Real Scenario: The Human-in-the-Loop Agent

    Consider an agent that processes your inbox for action items and waits for your approval before sending replies. Wall-clock time: 4 hours open during your workday. Actual active execution: 20 minutes of processing across that 4-hour window, with the rest spent waiting for your review decisions.

    • VM billing equivalent: 4 hours × rate = significant charge
    • Managed Agents billing: 20 minutes × $0.08/hr = $0.027

    The difference is real. For interaction-heavy agents where the agent frequently waits for human decisions, the idle-time exclusion significantly reduces costs versus a naive per-hour model.

    A Real Scenario: The Autonomous Batch Agent

    Now consider an agent running a fully autonomous content pipeline — no human checkpoints, just continuous execution through a queue. Wall-clock time and active execution time are nearly identical because the agent never waits.

    • A 2-hour autonomous batch: 2 hours × $0.08 = $0.16

    Here, the idle-time model provides no benefit — the agent has no idle time. The billing is effectively equivalent to per-hour pricing because execution is continuous.

    Code Execution Containers Are Included

    One more billing nuance worth knowing: when your agent runs code, the execution happens in sandboxed Linux containers. These containers are not separately billed on top of session runtime. The $0.08/hr covers both the session runtime and the container execution. This is explicitly documented by Anthropic and represents meaningful savings if your agent is doing significant code execution work — you’re not paying twice.

    What This Means for Workload Design

    If you’re designing agent workflows and have the choice between architectures, the billing model creates a useful signal:

    • Agents that wait on humans: Metered billing is favorable — you only pay for the actual reasoning and execution time, not the human decision time
    • Fully autonomous agents: Billing approaches equivalent to per-hour rates — optimize these on token efficiency, not idle reduction
    • Scheduled batch agents: Natural fit — run when needed, terminate when done, no idle accumulation

    The 24/7 Agent Math

    For anyone doing the 24/7 always-on calculation: the maximum theoretical runtime exposure is 24 hrs × $0.08 × 30 days = $57.60/month in session fees. But a 24/7 agent with zero idle time is rare in practice. Agents that sleep between triggers, wait on external data, or hold for human decisions have meaningful idle windows that reduce the actual charge below the theoretical ceiling.

    Full monthly cost analysis: The Real Monthly Cost of Running Claude Managed Agents 24/7. Pricing reference: Complete Pricing Guide. All questions: FAQ Hub.

  • Claude Managed Agents — Every Question Answered (Complete FAQ 2026)

    Claude Managed Agents — Every Question Answered (Complete FAQ 2026)

    Tygart Media Strategy
    Volume Ⅰ · Issue 04Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    Everything people actually ask about Claude Managed Agents, answered straight. No preamble about “the exciting world of AI agents.” If you’re here, you already know why this matters — you just need answers.

    This page covers pricing, setup, capabilities, limits, comparisons, and the specific questions that don’t have obvious homes in Anthropic’s documentation. It updates as the beta evolves.

    Context

    Claude Managed Agents launched April 8, 2026 as a public beta. All answers reflect current documentation as of April 2026. Beta details change — verify specifics at platform.claude.com/docs.

    Pricing Questions

    What does Claude Managed Agents cost?

    Two charges: standard Claude API token rates (same as calling the Messages API directly) plus $0.08 per session-hour of active runtime. That’s the complete formula. See the complete pricing reference for worked examples by workload type.

    What exactly is a “session-hour” and when does it start billing?

    A session-hour is one hour of active session runtime — time when your session’s status is running. Billing is metered to the millisecond. It does not accrue during idle time, time waiting for your input, time waiting for tool confirmations, or after session termination.

    What’s included in the $0.08/session-hour charge?

    The session runtime charge covers Anthropic’s managed infrastructure: sandboxed code execution containers, state management, checkpointing, tool orchestration, error recovery, and scaling. You are not separately billed for container hours on top of session runtime.

    Does the $0.08/hr apply even if my agent is just waiting?

    No. Time spent waiting for your message, waiting for tool confirmations, or sitting idle does not accumulate runtime charges. Only active execution time counts.

    What does web search cost inside a Managed Agents session?

    $10 per 1,000 searches ($0.01 per search), billed separately from session runtime and token costs. This is the same rate as web search through the standard API.

    Are there volume discounts?

    Yes, negotiated case-by-case for high-volume users. Contact [email protected] or through the Claude Console.

    How does Managed Agents pricing compare to running my own agent infrastructure?

    The $0.08/session-hour is almost always cheaper than equivalent provisioned compute — but you trade infrastructure control and data locality for that simplicity. For a full comparison: Build vs. Buy: The Real Infrastructure Cost.

    What’s the real monthly cost if I run an agent 24/7?

    Maximum theoretical session runtime: 24 hrs × $0.08 × 30 days = $57.60/month. In practice, no production agent has zero idle time. Token costs become the dominant cost driver long before you hit the runtime ceiling. Detailed breakdown: The Real Monthly Cost of Running Claude Managed Agents 24/7.

    Setup and Access Questions

    How do I get access to Claude Managed Agents?

    Available to all Anthropic API accounts in public beta — no separate signup. You need the managed-agents-2026-04-01 beta header in your API requests. The Claude SDK adds this header automatically.

    Does it work with my existing API key?

    Yes. Same API key you’re already using for the Messages API. Same authentication. The beta header is the only new requirement.

    What three ways can I access Managed Agents?

    Via the Claude SDK (recommended — handles the beta header automatically), via direct API calls with the beta header, or via the Claude Console’s new Managed Agents section for no-code agent configuration and session tracing.

    Can I use Managed Agents through AWS Bedrock or Google Vertex AI?

    Managed Agents runs on Anthropic-managed infrastructure. This is distinct from Bedrock and Vertex AI deployments. Check Anthropic’s current documentation for multi-cloud availability status — this is an area of active development.

    Capability Questions

    What can Claude Managed Agents actually do?

    Run long autonomous sessions with persistent state, execute code in sandboxed Linux containers, use tools including web search and MCP servers, coordinate multiple Claude instances via Agent Teams, and maintain checkpoints for crash recovery. The session can last minutes or hours without you staying in the loop.

    What’s the difference between Agent Teams and subagents?

    Agent Teams coordinate multiple Claude instances with independent contexts, direct agent-to-agent communication, and a shared task list — suited for complex parallel tasks. Subagents operate within the same session as the main agent and only report results upward — more economical for sequential targeted tasks but less capable of true parallelism.

    Does it support MCP servers?

    Yes. MCP servers can be integrated as tool sources in Managed Agents sessions, extending what the agent can access and act on.

    How long can a session run?

    Anthropic’s documentation currently references session durations of minutes to hours. Claude Code’s longest autonomous sessions have reached 45 minutes. Managed Agents is architected for longer-running work. Check current documentation for specific session duration limits as the beta matures.

    What happened to Claude Code — is it the same as Managed Agents?

    No. Claude Code is a separate local coding workflow product. Anthropic’s docs explicitly note partners should not conflate the two. Managed Agents is a hosted API runtime service. Claude Code is a developer tool. Different products, different use cases, different billing.

    Rate Limit Questions

    What are the rate limits for Managed Agents?

    60 requests per minute for create endpoints; 600 requests per minute for read endpoints. Organization-level API limits still apply on top of these. For higher limits, contact Anthropic enterprise sales. Detailed breakdown: Claude Managed Agents Rate Limits Explained.

    Do standard Claude API rate limits still apply inside a session?

    Organization-level limits apply. The session runtime and create/read endpoint limits are Managed Agents-specific. If you’re running many parallel Agent Teams, model token throughput limits will become relevant.

    Comparison Questions

    How does Managed Agents compare to OpenAI’s Agents API?

    Both offer hosted agent infrastructure. Key differences: Managed Agents is Claude-native (no multi-model flexibility), sessions bill on runtime + tokens vs. OpenAI’s different pricing model, and lock-in dynamics differ. Full comparison: Claude Managed Agents vs. OpenAI Agents API.

    Should I use Managed Agents or the Claude Agent SDK?

    Use Managed Agents when you want Anthropic to host the runtime — less infrastructure work, faster to production. Use the SDK when you need tighter loop control, on-premise execution, or multi-cloud flexibility. Anthropic’s own migration docs draw this line clearly: SDK runs in your environment; Managed Agents runs in theirs.

    What companies are already using Managed Agents in production?

    Notion, Asana, Rakuten, Sentry, and Vibecode were launch partners. Rakuten deployed five enterprise agents within a week. Allianz is using Claude for insurance agent workflows. Anthropic’s run-rate from the agent developer segment exceeds $2.5 billion. How Rakuten did it in a week →

    Data and Security Questions

    Where does my data go when running in Managed Agents?

    Execution runs on Anthropic’s infrastructure. This is the explicit trade-off: you get managed infrastructure; they manage the compute. For companies with strict data sovereignty requirements, this is the key constraint to evaluate. On-premise or native multi-cloud deployment is not currently available.

    What are the sandboxing guarantees?

    Anthropic uses disposable Linux containers — “decoupled hands” in their terminology. Each container is a fresh sandboxed environment for code execution. State persistence is managed separately from the execution environment.

    Strategic Questions

    Is this a bet worth making?

    That depends on your switching cost tolerance. Lock-in is real: once your agents run on Anthropic’s infrastructure with their tools, session format, and sandboxing, switching providers isn’t trivial. The counter-argument: the infrastructure you’d otherwise build to match this is months of engineering. One developer’s reaction at launch was blunt: “there goes a whole YC batch.” That captures both the opportunity and the risk. Our take on why we’re staying our course →

    What does this mean for AI citation and visibility?

    Agents running on Anthropic’s infrastructure make decisions about what content to surface, cite, and synthesize. As agent workloads grow, being present in the knowledge sources agents draw from becomes a search strategy question in itself. What AI citation monitoring looks like →

  • Claude Managed Agents — Complete Pricing Reference + Dreaming Update (May 2026)

    Claude Managed Agents — Complete Pricing Reference + Dreaming Update (May 2026)

    Last refreshed: May 15, 2026

    May 2026 Update — Dreaming Feature + Beta Status

    Anthropic introduced Dreaming at Code w/ Claude (May 6, 2026) — a new Managed Agents capability where agents review their own session history overnight to improve future performance. Harvey (legal AI) reported a roughly 6× task completion rate increase after implementing it. Dreaming is developer-access preview only. Multiagent Orchestration and Outcomes are now in public beta. See the new Dreaming section below.

    What Is Claude Managed Agents? (Current Status, May 2026)

    Claude Managed Agents is Anthropic’s framework for long-running, stateful AI agents — agents that can maintain context across sessions, hand off between sub-agents, and now, improve themselves by reviewing their own work history. Here’s the current status of each component:

    Component Status Who Has Access
    Multiagent Orchestration Public Beta All API developers
    Outcomes Public Beta All API developers
    Dreaming Developer Preview Selected developers only

    Dreaming: The Feature the Press Mostly Missed

    Announced at Code w/ Claude on May 6, 2026, Dreaming is a Managed Agents capability that lets agents review and reorganize their own memory between sessions. The mechanism:

    1. After a session ends, the agent reads its existing memory store alongside the session transcripts
    2. It produces a new, reorganized memory store: duplicates merged, stale entries replaced, new patterns surfaced
    3. The next session starts with a higher-quality knowledge base — capturing insights no single session could hold

    This is meaningfully different from simply persisting conversation history. The agent isn’t just remembering what happened — it’s synthesizing what it learned. Think of it as the difference between taking notes and actually reviewing and reorganizing your notes the next morning.

    The Harvey Result

    Harvey, the legal AI company, reported approximately a 6× task completion rate increase after implementing Dreaming in their Managed Agents workflow. Harvey’s use case — complex legal research that spans multiple sessions with evolving context — is exactly the kind of work Dreaming was designed for. Sessions build on each other rather than starting fresh each time.

    Dreaming is developer-access preview as of May 2026. Docs: platform.claude.com/docs/en/managed-agents/dreams.

    What Dreaming Is Not

    A few clarifications worth making explicit:

    • Dreaming is not available to end users — it’s a developer-layer capability requiring implementation
    • It’s not persistent memory in the claude.ai chat interface
    • It’s not available to free or standard Pro subscribers through any interface
    • It’s a developer preview, not GA — expect it to evolve before full release

    Our Take: Why This Architecture Matters

    We run Managed Agents in our own Cowork workflows. The Dreaming announcement is the first time Anthropic has shipped something that resembles how expert human knowledge actually compounds over time — not by accumulating raw notes, but by periodically synthesizing and reorganizing what’s been learned into a cleaner structure.

    The Harvey 6× result is a real-world data point from a production legal AI workflow. That’s not a benchmark number — it’s a deployed system showing measurable improvement from session-to-session memory refinement. Whether that 6× figure holds across different use cases is unknown, but the direction of the effect is the signal: agents that learn from their own history outperform agents that don’t.

    For non-developer users watching this space: Dreaming is the preview of what agentic AI will look like when it becomes mainstream. The groundwork being laid now in developer preview will eventually surface in subscription-tier products.

    Model Accuracy Note — Updated May 2026

    Current flagship: Claude Opus 4.7 (claude-opus-4-7). Current models: Opus 4.7 · Sonnet 4.6 · Haiku 4.5. Claude Opus 4.7 (claude-opus-4-7) is the current flagship as of April 16, 2026. Where this article references Opus 4.6 or earlier models, those references are historical. See current model tracker →. See current model tracker →

    Tygart Media Strategy
    Volume Ⅰ · Issue 04Quarterly Position
    By Will Tygart
    Long-form Position
    Practitioner-grade

    You opened this tab because you need a number you can actually use. Not a vibe, not “it depends.” A real pricing breakdown you can put in a spreadsheet, a budget request, or a Slack message to your CTO.

    This is that page. Every pricing variable for Claude Managed Agents in one place, verified against Anthropic’s current documentation as of April 2026. Bookmark it. The beta will update; so will this.

    Quick Reference: The Formula

    Total Cost = Token Costs + Session Runtime ($0.08/hr) + Optional Tools
    Session runtime only accrues while status = running. Idle time is free.

    The Two Cost Dimensions

    Claude Managed Agents bills on exactly two dimensions: tokens and session runtime. Every pricing question you have collapses into one of these two buckets.

    Dimension 1: Token Costs

    These are identical to standard Claude API pricing. You pay the same rates you’d pay calling the Messages API directly. No Managed Agents markup on tokens. Current rates for the models most commonly used in agent work:

    • Claude Sonnet 4.6: ~$3/million input tokens, ~$15/million output tokens
    • Claude Opus 4.7: higher rates apply — check platform.claude.com/docs/en/about-claude/pricing for current figures
    • Prompt caching: same multipliers as standard API — cache hits dramatically reduce input token costs on long sessions with stable system prompts

    The implication: a token-heavy agent with a large system prompt that runs the same context repeatedly benefits significantly from prompt caching, and that benefit carries over unchanged into Managed Agents.

    Dimension 2: Session Runtime — $0.08/Session-Hour

    This is the Managed Agents-specific charge. You pay $0.08 per hour of active session runtime, metered to the millisecond.

    The critical word is active. Runtime only accrues while your session’s status is running. The following do not count toward your bill:

    • Time spent waiting for your next message
    • Time waiting for a tool confirmation
    • Idle time between tasks
    • Rescheduling delays
    • Terminated session time

    This is not how you’d bill a virtual machine. It’s closer to how AWS Lambda bills — you pay for execution, not reservation. An agent that “runs” for 8 hours but spends 6 of those hours waiting on human input has a very different bill than one running continuous autonomous loops.

    Optional Tool Costs

    Web Search: $10 per 1,000 Searches

    If your agent uses web search, each search costs $10/1,000 — that’s $0.01 per search. For most agents, this is negligible. For a research agent running hundreds of searches per session, it becomes a line item worth modeling separately.

    Code Execution: Included in Session Runtime

    Code execution containers are included in your $0.08/session-hour charge. You’re not separately billed for container hours on top of session runtime. This is explicitly stated in Anthropic’s docs and represents meaningful savings versus provisioning your own compute.

    Worked Cost Examples

    Example 1: Daily Research Agent

    Runs once per day. 30 minutes of active execution. Processes 10 documents, outputs a summary report. Moderate token volume.

    • Session runtime: 0.5 hrs × $0.08 = $0.04/day (~$1.20/month)
    • Tokens (estimate): 50K input + 5K output with Sonnet 4.6 = ~$0.23/run (~$7/month)
    • Total: ~$8–10/month

    Example 2: Weekly Batch Content Pipeline

    Runs 3x/week. 2-hour active sessions. Processes multiple documents, generates structured outputs.

    • Session runtime: 2 hrs × $0.08 × 12 sessions/month = $1.92/month
    • Tokens: depends on content volume — typically $10–40/month
    • Total: ~$12–42/month

    Example 3: Customer Support Agent (Business Hours)

    Active during business hours, handling tickets. 8 hours/day active, 5 days/week.

    • Session runtime: 8 hrs × $0.08 × 22 days = $14.08/month in runtime
    • Tokens: highly variable by ticket volume — the dominant cost driver at scale
    • Runtime cost alone: ~$14/month — tokens are likely 5–20x this depending on volume

    Example 4: 24/7 Always-On Agent

    The maximum theoretical runtime exposure. Continuous operation, no idle time.

    • Session runtime: 24 hrs × $0.08 × 30 days = $57.60/month
    • In practice, no agent has zero idle time — real cost will be lower
    • Token costs at this scale become the dominant factor by a wide margin

    Anthropic’s Official Example (from their docs)

    A one-hour coding session using Claude Opus 4.7 consuming 50,000 input tokens and 15,000 output tokens: session runtime = $0.08. With prompt caching active and 40,000 of those tokens as cache reads, the token costs drop significantly. The runtime charge stays flat at $0.08 regardless of caching.

    What’s Not Billed in Managed Agents

    A few things that might seem like costs but aren’t:

    • Infrastructure provisioning: Anthropic handles hosting, scaling, and monitoring at no additional charge
    • Container hours: Explicitly not separately billed on top of session runtime
    • State management and checkpointing: Included in the session runtime charge
    • Error recovery and retry logic: Anthropic’s infrastructure problem, not yours

    Rate Limits

    Managed Agents has specific rate limits separate from standard API limits:

    • Create endpoints: 60 requests/minute
    • Read endpoints: 600 requests/minute
    • Organization-level limits still apply
    • For higher limits, contact Anthropic enterprise sales

    How to Access Managed Agents Pricing

    Managed Agents is available to all Anthropic API accounts in public beta. No separate signup, no premium tier gate. You need the managed-agents-2026-04-01 beta header in your API requests — the Claude SDK adds this automatically.

    For high-volume agent applications, Anthropic’s enterprise sales team negotiates custom pricing arrangements. Contact them at [email protected] or through the Claude Console.

    The Pricing Signals Worth Noting

    Anthropic recently ended Claude subscription access (Pro/Max) for third-party agent frameworks, requiring those users to switch to pay-as-you-go API pricing. This signals a deliberate strategy: consumer subscriptions are for human-paced interactions; agent workloads route through the API. The $0.08/session-hour rate exists in that context — it’s infrastructure pricing for compute that runs beyond human attention spans.

    The session-hour model also signals something about Anthropic’s infrastructure cost structure. They’re pricing on active execution time because that’s what actually taxes their systems. Idle sessions don’t cost them much; active agents do. The billing model follows the actual resource consumption pattern.

    Frequently Asked Questions

    Is the $0.08/session-hour charge in addition to token costs, or does it replace them?

    In addition to. You pay both: standard token rates for all input and output tokens, plus $0.08 per hour of active session runtime. They’re separate line items.

    Does prompt caching work in Managed Agents sessions?

    Yes. Prompt caching multipliers apply identically to Managed Agents sessions as they do to standard API calls. If your agent has a large, stable system prompt, caching it can significantly reduce input token costs.

    What happens if my session crashes? Am I billed for the crashed time?

    Runtime accrues only while status is running. Terminated sessions stop accruing. Anthropic’s infrastructure handles checkpointing and crash recovery — the session state is preserved even if the session terminates unexpectedly.

    Can I use Managed Agents on the free API tier?

    Managed Agents is available to all Anthropic API accounts in public beta, but standard tier access and rate limits apply. Free API tier users receive a small credit for testing.

    How does this compare to running agents on my own infrastructure?

    See our full breakdown: Build vs. Buy: The Real Infrastructure Cost of Claude Managed Agents. Short version: the $0.08/hour is almost certainly cheaper than provisioning and maintaining equivalent compute, but you trade control and data locality for that simplicity.

    Are there volume discounts?

    Volume discounts are available for high-volume users but negotiated case-by-case. Contact Anthropic enterprise sales.

    Does web search billing count against the $10/1,000 rate if the search returns no results?

    Anthropic’s current docs don’t explicitly address failed searches. Treat any triggered search as billable until confirmed otherwise.

    For the full session-hour math worked out by workload type, see: Claude Managed Agents Pricing, Decoded: What a Session-Hour Actually Costs You. For the build-vs-buy infrastructure comparison: Build vs. Buy: The Real Infrastructure Cost. For enterprise deployment patterns: Rakuten Stood Up 5 Enterprise Agents in a Week.